- Add patchmeta.copy() and emit copies from iterhunks. Modifying patchmeta
instances in applydiff() makes things simpler.
- Rename selectfile() into makepatchmeta(). It is responsible for creating
patchmeta for regular patches.
- Pass patchmeta objects to patchfile() directly
patchmeta instances were associated with git patches, for regular patches we
had to pass additional variables to tell the patch intent to patchfile().
Instead, we generate patchmeta for regular patches and pass them. This will
also help with patch filtering by matcher objects.
git patches may require copies to be handled out-of-order. For instance, take
the following sequence:
* modify a
* copy a into b
Here, we have to generate b from a before its modification. To do so,
applydiff() was scanning for copy metadata and performing the copies before
processing the other changes in-order. While smart and efficient, this approach
complicates things by handling file copies and file creations at different
places and times. While a new file must not exist before being patched a copied
file already exists before applying the first hunk.
Instead of copying the files at their final destination before patching, we
store them in a temporary file location and retrieve them when patching. The
filestore always stores file content in real files but nothing prevents adding
a cache layer. The filestore class was kept separate from fsbackend for at
least two reasons:
- This class is likely to be reused as a temporary result store for a future
repository patching call (entries just have to be extended to contain copy
sources).
- Delegating this role to backends might be more efficient in a repository
backend case: the source files are already available in the repository itself
and do not need to be copied again. It also means that third-parties backend
would have to implement two other methods. If we ever decide to merge the
filestore feature into backend, a minimalistic approach would be to compose
with filestore directly. Keep in mind this copy overhead only applies for
copy/rename sources, and may even be reduced to copy sources which have to
handled ahead of time.
The patcher has to know if a file is being created or removed to check if the
target already exists, or to actually unlink the file when a hunk emptying it
is applied. This was done by embedding the creation/removal information in the
first (and only) hunk attached to the file.
There are two problems with this approach:
- creation/removal is really a property of the file being patched and not its
hunk.
- for regular patches, file creation cannot be deduced at parsing time: there
are case where the *stripped* file paths must be compared. Modifying hunks
after their creation is clumsy and prevent further refactorings related to
copies handling.
Instead, we delegate this job to selectfile() which has all the relevant
information, and remove the hunk createfile() and rmfile() methods.
Most filesystem calls are already isolated in patchfile but this is not enough:
renames are performed before patchfile is available and some chmod calls are
even done outside of the applydiff call. Once all these calls are extracted
into a backend class, we can provide cleaner APIs to write to a working
directory context directly into the repository.
These leaks may occur in environments that don't employ a reference
counting GC, i.e. PyPy.
This implies:
- changing opener(...).read() calls to opener.read(...)
- changing opener(...).write() calls to opener.write(...)
- changing open(...).read(...) to util.readfile(...)
- changing open(...).write(...) to util.writefile(...)
Before the additional datefilters (utcdate, svnisodate, svnutcdate)
were used when kwtemplater was initialized. Now they always be used
once the extension is enabled.
* remove obsolete reference to potential problems with merge and import
* emphasize that running kwshrink before configuration changes which
affect active/expanded keywords is mandatory
1) hg cp symlink copy -> copy is a symlink.
2) cp symlink copy; hg cp -A symlink copy -> copy is a regular file.
In the second case we have to follow the symlink to its target
to find out whether we have to unexpand keywords in the copy.
Add test covering the case where the copied link's target is ignored
by keyword but has content which would match the regex for expanded
keywords to check whether we indeed leave the destination alone.
- dirstate of overwritten files must be forced to normal
with kwexpand/kwshrink, not commit.
- recorded files must be weeded before overwriting.
- add test cases.
- move preselection of expansion candidates for rollback
and record into helper function
- same overwrite order in rollback and record:
1. modified, 2. added
- self.wlock() inside kwrepo class instead of repo.wlock()
Comparing sizes is cheaper than comparing file contents, as it does not
involve reading the file on disk or from the filelog.
It is however not always possible: some extensions, or encode filters,
change data when extracting it to the working directory.
There are only 2 patterns to choose, and so far only 1 case
where kwtemplater.re_kw.subn is applied on data read from
the working directory: when recording added files.
With this change the code reflects more closely the boolean
character of the switch and underlines the special case.
More safeguarding against accidental (un)expansion:
Reading filelog: act only on \$(kw1|kw2|..)\$ as keywords are always
stored unexpanded.
Reading wdir: act only on \$(kw1|kw2|..): [^$\n\r]*? \$ as we only
are interested in expanded keywords in this situation.
Note: we cannot use ..): [^$\n\r]+? \$ because e.g.
the {branch} template might be empty.
hg record is a special case as we read from the working directory and
need one regex each for modified and added files. Therefore test
recording an added file.
This way we finally also forbid sequences like $Id: $ being treated
as keywords.
copy/rename destinations being unversioned and possibly ignored by
the extension should not contain expanded keywords.
Files copied/renamed from an ignored source are not touched.
Add tests covering both of the above cases, plus the corner case of
cp symlink foo; hg cp -A symlink foo (where foo becomes a regular file).
Make kwexpand, kwshrink restricted commands - i.e. read from
filelog without expansion for substition in kwtemplater.overwrite,
and set/unset restricted mode for overwrite() in in kwcommitctx
and the dorecord wrapper.
Preselect candidates when working on changed files (rollback, record)
outside kwtemplater class, and remove 6th argument from overwrite().
Avoid duplicate substitution/search in overwrite():
Only go into restricted read mode when reading from filelog.
rollback and record read from the working directory, where
restricted mode would already shrink keywords before overwrite()
either expands or shrinks them again.
This ensures that the usual automatic operations on keywords
are turned off during overwrite() and only overwrite() itself
acts on them.
Reduce manifest calculation to the cases where it is needed.
Move helper function for expansion removal outside kwtemplater class.
Always shrink and never expand keywords during a diff operation.
Avoid user distraction e.g. because of spurious differences
appearing in the commit editor.
Even though just enforcing expansion after overwriting files in
the working directory caused no problems that we know of, this avoids
a potential source of problems (e.g. in collaboration other extensions)
at no costs.
When cloning, prevent [keyword] filename patterns configured locally
in the source directory to persist during the update in the destination.
a) move [keyword] retrieval (back) to reposetup
b) remove the corresponding global kwtools attributes
Add test cases.
We can check for file existence in the working directory (needed
in case of recording) by simply using the given context and
calculate the manifest only when there are in fact candidates
for expansion/shrinking.
this helps users to know what kind of option is:
- no value is required(flag option)
- value is required
- value is required, and multiple occurrences are allowed
each kinds are shown as below:
-f --force force push
-e --ssh CMD specify ssh command to use
-b --branch BRANCH [+] a specific branch you would like to push
if one or more 3rd type options are shown, explanation for '[+]' mark
is also shown as footnote.