Those assertions will prevent the backup functions from overwriting
the dirstate file in case both: suffix and prefix are empty.
(foozy suggested making that change and I agree with him)
The issue with using actualfilename is that dirstate saved during transaction
with "pending" in filename will be impossible to recover from outside of the
transaction because the recover method will be looking for the name without
"pending".
As the copymap is short-lived object regenerated from dirstate on each
read this didn't affect us in any serious way. But since I've started working
on permanent storage of copymap in my experiments with sqldirstate[1] I've seen
this bug leaving the copy information in copymap after reverting the file
moves and copies.
[1] https://www.mercurial-scm.org/wiki/SQLDirstatePlan
This would allow the code explicitly copying dirstate to use this method instead.
Use of this method will increase encapsulation (the dirstate class will be sole
owner of its on-disk storage).
The recommended way to check default value (when None is not as option) is a
token object. Identity testing to integer is less explicit and not guaranteed to
work in all implementations.
When we introduce the develwarning, we did not had an official deprecation API
and infrastructure. We can now officially deprecate the old way with a version
deadline.
Before this patch, we were using python code for computing the nonnormal
dirstate entries. This patch makes us use the C implementation of the function
when it is available.
Using the nonnormal set in hgwatchman improves hg status performance. Below
are the numbers for mozilla-central.
with the changes:
$ hg perfstatus
! wall 0.010632 comb 0.000000 user 0.000000 sys 0.000000 (best of 246)
without the changes:
$ hg perfstatus
! wall 0.036442 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)
On mozilla-central the improvement to hg status is ~20% (0.25s to 0.2s),
on our big repos at Facebook, the win is ~40% (1.2s to 0.72s).
Before this patch, we were only populating the non-normal set when parsing
or packing the dirstate. This was not enough to keep the non-normal set up to
date at all time as we don't write and read the dirstate whenever a change
happens. This patch solves this issue by updating the non-normal set when it
should be updated. note: pack_dirstate changes the dmap and we have it keep
it unchanged for retrocompatibility so we are forced to recompute the
non-normal set after calling it.
This patch adds a new python function in the dirstate to compute the set of
non-normal files from the dmap. These files are useful to compute the repository
status.
Rather than sleep for 2 seconds, we sleep until the next even-numbered
second, which has the same effect, but makes tests faster. This
removes test-largefiles-update as the long pole of the test suite.
When debugrebuilddirstate --minimal is called, rebuilding the dirstate was done
outside of the appropriate rebuild function. This patch makes
debugrebuilddirstate use dirstate.rebuild.
This was done to allow our extension to become aware debugrebuilddirstate
--minimal
We've globablly forced stat to return integer times which agrees with
our extension code, so this is no longer needed.
This speeds up status on mozilla-central substantially:
$ hg perfstatus
! wall 0.190179 comb 0.180000 user 0.120000 sys 0.060000 (best of 53)
$ hg perfstatus
! wall 0.275729 comb 0.270000 user 0.210000 sys 0.060000 (best of 36)
The _filefoldmap is not updated in when files are deleted from dirstate. In the
case where the file with the same but differently cased name is added afterwards
it renders _filefoldmap incorrect. Those steps must occur to for a problem to
reproduce:
- call status (with listunknown=True),
- update working rectory to a commit which does a casefolding change (A -> a)
- call status again (it will show the file "a" as deleted)
Unfortunately I'm unable to write a test for it because I don't know any
core-mercurial command able to reproduce those steps.
The bug was originally spotted when hgwatchman was enabled. It caused the
changeset contents change during hg rebase (one file unrelarted to changeset
was deleted in it after rebase).
The hgwatchman is able to hit it because when hgignore changes the hgwatchmans
overridestatus is calling original status with listunknown=True.
This violation, which passes repo object to dirstate, was introduced
by dc3ddedecae7.
This patch uses 'False' instead of 'None' as default value of 'tr'
argument, to distinguish "None as repo.currenttransaction() result"
from "legacy invocation without explicit tr passing".
True/False value of '_pendingmode' means whether 'dirstate.pending' is
used to initialize own '_map' and so on. When it is None, neither
'dirstate' nor 'dirstate.pending' is read in yet.
This is used to keep consistent view between '_pl()' and '_read()'.
Once '_pendingmode' is determined by reading one of 'dirstate' or
'dirstate.pending' in, '_pendingmode' is kept even if 'invalidate()'
is invoked. This should be reasonable, because:
- effective 'invalidate()' invocation should occur only in wlock scope, and
- wlock can't be gotten under HG_PENDING mode
'_trypending()' is defined as a normal function to factor similar code
path (in bookmarks and phases) out in the future easily.
This patch delays writing in-memory changes out, if transaction is
running.
'_getfsnow()' is defined as a function, to hook it easily for
ambiguous timestamp tests (see also fakedirstatewritetime.py)
'if tr:' code path in this patch is still disabled at this revision,
because there is no client invoking 'dirstate.write()' with repo
object.
BTW, this patch changes 'dirstate.invalidate()' semantics around
'dirstate.write()' in a transaction scope:
before:
with repo.transaction():
dirstate.CHANGE('A')
dirstate.write() # change for A is written out here
dirstate.CHANGE('B')
dirstate.invalidate() # discards only change for B
after:
with repo.transaction():
dirstate.CHANGE('A')
dirstate.write() # change for A is still kept in memory
dirstate.CHANGE('B')
dirstate.invalidate() # discards changes for A and B
Fortunately, there is no code path expecting the former, at least, in
Mercurial itself, because 'dirstateguard' was introduced to remove
such 'dirstate.invalidate()'.
Some comments in this patch assume that subsequent patch changes
'dirstate.write()' like as below:
def write(self, repo):
if not self._dirty:
return
tr = repo.currenttransaction()
if tr:
tr.addfilegenerator('dirstate', (self._filename,),
self._writedirstate, location='plain')
return # omit actual writing out
st = self._opener('dirstate', "w", atomictemp=True)
self._writedirstate(st)
This patch makes '_savebackup()' write in-memory changes out, and it
causes clearing 'self._dirty'. If dirstate isn't changed after
'_savebackup()', subsequent 'dirstate.write()' never invokes
'tr.addfilegenerator()' because 'not self._dirty' is true.
Then, 'tr.writepending()' unintentionally returns False, if there is
no other (e.g. changelog) changes pending, even though dirstate
changes are already written out at '_savebackup()'.
To avoid such situation, this patch makes '_savebackup()' explicitly
invoke 'tr.addfilegenerator()', if transaction is running.
'_savebackup()' should get awareness of transaction before 'write()',
because the former depends on the behavior of the latter before this
patch.
This can centralize the logic to write in-memory changes out correctly
according to transaction activity into dirstate.
Passing 'repo' object to newly added functions is needed to examine
current transaction activity in subsequent patches, because 'dirstate'
itself doesn't have direct reference to it.
On recent OS, 'stat.st_mtime' has a double precision floating point
value to represent nano seconds, but it is not wide enough for actual
file timestamp: nowadays, only 52 - 32 = 20 bit width is available for
decimal places in sec.
Therefore, casting it to 'int' may cause unexpected result. See also
changeset 8102a3981272 fixing issue4836 for detail.
For example, changed file A may be treated as "clean" unexpectedly in
steps below. "rounded now" is the value gotten by rounding via
'int(st.st_mtime)' or so.
---------------------+--------------------+------------------------
"now" | | timestamp of A (time_t)
float rounded time_t| action | FS dirstate
------ ------- ------+--------------------+-------- ---------------
N+.nnn N N | | --- ---
| update file A | N
| dirstate.normal(A) | N
N+.999 N+1 N | |
| dirstate.write() | N (*1)
| : |
| change file A | N
| : |
N+1.00 N+1 N+1 | |
| "hg status" (*2) | N N
------ ------- ------+--------------------+-------- ---------------
Timestamp N of A in dirstate isn't dropped at (*1), because "rounded
now" is N+1 at that time, even if 'st_mtime' in 'time_t' is still N.
Then, file A is unexpectedly treated as "clean" at (*2) in this case.
For consistent handling of 'stat.st_mtime', this patch makes
'pack_dirstate()' take 'now' argument not in floating point but in
integer.
This patch makes 'PyArg_ParseTuple()' in 'pack_dirstate()' use format
'i' (= checking type mismatch or overflow), even though it is ensured
that 'now' is in the range of 32bit signed integer by masking with
'_rangemask' (= 0x7fffffff) on caller side.
It should be cheaper enough than packing itself, and useful to
detect that legacy code invokes 'pack_dirstate()' with 'now' in
floating point value.
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.
For great justice.
It's useless to handle file patterns as relative to the cwd of the server
process. The only sensible way in hgweb is to resolve paths relative to the
repository root.
It seems dirstate.getcwd() isn't used to get a real file path, so this patch
won't cause problem.
Previously, importing a case-only rename patch on a case insensitive filesystem
caused the original file to be marked as '!' in status. The source was being
forgotten properly in patch.workingbackend.close(), but the call it makes to
scmutil.marktouched() then put the file back into the 'n' state (but it was
still missing from the filesystem).
The cause of this was scmutil._interestingfiles() would walk dirstate,
and since dirstate was able to lstat() the old file via the new name,
was treating this as a forgotten file, not a removed file.
scmutil.marktouched() re-adds forgotten files, so dirstate got out of
sync with the filesystem.
This could be handled with less code in the "kind == regkind or kind
== lnkkind" branch of dirstate._walkexplicit(), but this avoids
filesystem accesses unless case collisions occur. _discoverpath() is
used instead of normalize(), since the dirstate case is given first
precedence, and the old file is still in it. What matters is the
actual case in the filesystem.
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".
This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.
This patch was produced by running `2to3 -f except -w -n .`.
Python 2.6 introduced a new octal syntax: "0oXXX", replacing "0XXX". The
old syntax is not recognized in Python 3 and will result in a parse
error.
Mass rewrite all instances of the old octal syntax to the new syntax.
This patch was generated by `2to3 -f numliterals -w -n .` and the diff
was selectively recorded to exclude changes to "<N>l" syntax conversion,
which will be handled separately.
This uses a simple heuristic to avoid expensive resizes.
On a real-world repo with around 400,000 files, perfdirstate:
before: ! wall 0.155562 comb 0.160000 user 0.150000 sys 0.010000 (best of 64)
after: ! wall 0.132638 comb 0.130000 user 0.120000 sys 0.010000 (best of 75)
On another real-world repo with around 250,000 files:
before: ! wall 0.098459 comb 0.100000 user 0.090000 sys 0.010000 (best of 100)
after: ! wall 0.089084 comb 0.090000 user 0.080000 sys 0.010000 (best of 100)
Default value was not tested with 'is None', this made empty list seen as
default value and result the invalidation of every single entry in the
dirstate. On repos with hundred of thousand of files, this results in minutes
of lookup time instead nothing.
This is a text book example of why we should test 'is None' if this is what we
mean.
This prevents immediate string `dirstate` from multiplying. This is also
a preparation for making dirstate aware of PENDING mechanism in
subsequent patches.
Now that the matcher supports 'include:' rules, let's change the dirstate.ignore
creation to just create a matcher with a bunch of includes. This allows us to
completely delete ignore.py.
I moved some of the syntax documentation over to readpatternfile in match.py so
we don't lose it.
Previously we would always pass the root .hgignore path to the ignore parser.
The parser then had to be aware that the first path was special, and not warn if
it didn't exist.
In preparation for making the ignore file parser more generically usable, let's
make the parse logic not aware of this special case, and instead just not pass
the root .hgignore in if it doesn't exist.