Currently hgweb is not streaming its output -- it accumulates the
entire response before sending it. This patch restores streaming
behaviour. To avoid having to synchronously write many tiny fragments,
this patch also adds buffering to the template generator. Local
testing of a fetch of a 100,000 line file with wget produces a slight
slowdown overall (up from 6.5 seconds to 7.2 seconds), but instead of
waiting 6 seconds for headers to arrive, output begins immediately.
util module implements two versions of statfiles function
_statfiles calls lstat per file
_statfiles_clustered takes advantage of optimizations in osutil.c, stats all
files in directory at once when new directory is hit and caches the results
util.statfiles dispatches to appropriate version during module loading
The speedup on directory tree with 2k directories and 63k files is about
factor of 1.8 (1.3s -> 0.8s for hg diff - hg startup overhead about .2s)
At this point only Win32 now benefit from this patch.
Rest of OSes use the non clustered implementation.
add _checklink var to dirstate
introduce dirstate.flagfunc
switch users of util.execfunc/linkfunc to flagfunc
change manifestdict.set to take a flags string
change ctx.fileflags to ctx.flags
change gitmode func to a dict
remove util.execfunc/linkfunc
The function, given a relative filename and a root, returns the filename
modified to use the case actually stored in the filesystem (or None if the
file does not exist). The returned name is relative to the root, but retains
the path separators used in the input path. (This is not strictly necessary,
but retaining the path separators minimises misleading test suite failures).
A win32-specific implementation (using win32api.FindFiles) is possible, but it
has not been implemented as testing seems to demonstrate that the
win32-specific code is not significantly faster (thanks to the caching of
results in the generic code).
datestr:
- add format specifiers %1 and %2 for timezone hours and minutes
- remove timezone and timezone format options
- correctly find timezone hours and minutes for fractional and negative timezones
- update users
strdate:
- correctly find timezone hours and minutes for fractional and negative timezones
Moves ssh argument building to platform specific utils code.
The win32 version looks for plink in ssh command string and
uses '-P' in lieu of '-p' for specifying a port
Allow adding to dirstate files that clash with previously existing
but marked for removal. Protect from reintroducing clashes by revert.
This change doesn't address related issues with update. Current
workaround is to do "clean" update by manually removing conflicting
files/dirs from working directory.
commit (aborts _after_ typing in a commit message)
backout (aborted after the initial revert)
tag (edited .hgtags and couldn't commit)
import (patch applied, then commit fails)
qnew (aborts on bad dates, but writes any valid date into the # Date header)
qrefresh (like qnew)
sign (like tag)
fetch (merge, merge, merge, merge, abort)
This could happen e.g. in group writable local repositories where a file
should become executable on update.
(Patch by Benoit Boissinot attached to issue530)
On Linux VFAT execution mode can be modified, but changes don't
persist a filesy stem remount. The current test can be trickled by
this. We can help with the det ection of VFAT checking whether new
files get created with the execution bits on
(as usually these partitions are mounted with the exec option, for
convenience)
.
We use a separate cache to avoid problems with
audit = path_auditor(repo.root)
audit("subrepo")
audit("subrepo/file")
whitelisting "subrepo" (which is fine) and then using the same whitelist
with "subrepo/file" (which is not fine).
Since we create a separate path_auditor for every path on the command line,
a "hg add dir/a dir/b dir/c" will still lstat dir 3 times just to audit
the paths.
The following properties of a path are now checked for:
- under top-level .hg
- starts at the root of a windows drive
- contains ".."
- traverses a symlink (e.g. a/symlink_here/b)
- inside a nested repository
If any of these is true, the path is rejected.
The check for traversing a symlink is arguably stricter than necessary;
perhaps we should be checking for symlinks that point outside the
repository.
Simply use find_exe('hg') as the default value for $HG and require to manually
set it if you have special requirements.
While the default will not always be 100% correct (i.e. the identical hg
version) for many users it is and for the others the hg executable found in
the PATH should do most things correctly.
Developers or other users with multiple installs can set $HG or run something
like util.set_hgexecutable in their shell or python scripts.
Additionally util.hgexecutable() is now available so extensions can access
the value with a public interface, too.
Differences from os.symlink:
- the symlink name is relative to the opener base directory
- if a file with that name already exists, it's removed
- if necessary, parent directories are created
- if the system (OS or filesystem) doesn't support symlinks, a
regular file is created. Its contents are the symlink target.
The original idea might have been to prevent circular references, but
as this assignment would have created another reference, this makes
no difference.
In python 2.4+ on darwin, locale.getpreferredencoding() returns
mac-roman regardless of what LC_CTYPE, LANG etc are set to. This can
produce hard-to-notice conversion errors if input text is not in
mac-roman. So this patch overrides it with setlocale/getlocale if the
environment has been customized, on the assumption that the user has
done so deliberately.
The interface provided by opener(atomic=True) is inherently unsafe:
if an exception is raised in the code using the atomic file, the
possibly incomplete file will be renamed to its final destination,
defeating the whole purpose of atomic files.
To get around this, we would either need some bad hacks involving
sys.exc_info (to make sure things work in except: blocks), or an
interface to say "file is complete; rename it".
This is the exact interface provided by atomictempfile. Since there
are no remaining users of the atomicfile class, just remove it.
The behaviour of find_in_path was broken for config options containing
path names, because it always searched the given path, even when not
necessary. The find_exe function is more polite: if the name passed
to it contains a path component, it just returns it.
In cdc7e3627e1b, Benoit made a change that substantially slows matching
when a big .hgignore file is in play, because it calls into the regexp
matching engine potentially hundreds of times per file to be matched.
I've partly rolled back his change, so that we only call into the matcher
once per file, but preserved the ability to report a meaningful error
message if there's a syntax error in the regexp.
Right now, surprisingly enough, if you request an atomic file but the
file still doesn't exist, you get a regular file. AFAICS, the only time
this happens is during the initial creation of the dirstate.
With this change, you have to use "hg locate 'hgweb/**'" to locate
all the files in directories named hgweb. OTOH, "hg locate '*l'"
will locate only files that end with "l" - e.g. a file called "hg.py"
will not be matched just because it's in a directory whose name ends
with "l" (e.g. "mercurial").
With that changeset, it's impossible to use a glob: pattern to match
e.g. all files ending in .py - glob:**.py would also match all files
in a directory called dir.py.
This makes the behaviour of glob: patterns more consistent:
hg status glob:dir and hg status -I glob:dir will match
the same files.
It's also consistent with the fact that {rel,}path patterns
recursively match the contents of a directory.
This will trigger every time somebody runs something like "hg diff"
or "hg status" without any arguments.
The important part here is returning util.always as the match function,
which is a much simpler (and faster) function than the usual return
value, and allows other code to just skip the filtering if it knows
all files will match.
This makes its default behaviour useful again (issue108), and
changes it search the entire repository by default (instead
of just the cwd), just like all other commands.
It also hides issue204 by default, but you'll still see the
same behaviour if you give it a relpath: pattern.
This removes a hack where we appended '/' to a dirname so that:
- it would not appear on the "dc" dict
- it would always be matched by the match function
This was a contorted way of checking if the directory was matched by
some hgignore pattern, and it would still fail with some uses of
--include/--exclude patterns.
Things would still work fine if we removed the check altogether and
just appended things to "work" directly, but then we would end up
walking ignored directories too, which could be quite a bit of work.
This allows further simplification of the match function returned by
util._matcher, and fixes walking the working directory with a
--include pattern that matches only the end of a name.