hgsql: add a test demonstrating issues we saw with treeonly pushrebases

Summary:
This happens if, while the prepushrebase hook is running, another process
completes an hgsql repo sync (db -> local). In that case `repo.manifestlog`
is not invalidated correctly when the repo is treeonly.
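
As a rough illustration of this class of bug (not hgsql's or Mercurial's actual
code), the Python sketch below models a cached object that is reloaded only when
the stat of one tracked file changes. The class `StatCachedValue`, the file names
`tracked.i` and `actual.i`, and the premise that the cache watches a different
file from the one being appended to are all hypothetical; the commit itself only
states that `repo.manifestlog` is not invalidated correctly.

```python
import os
import tempfile


class StatCachedValue:
    """Toy cache: the value is reloaded only when the tracked file's stat changes."""

    def __init__(self, tracked_path, loader):
        self.tracked_path = tracked_path  # hypothetical: the file the cache watches
        self.loader = loader              # callable that recomputes the value
        self._loaded = False
        self._stat = None
        self._value = None

    def _current_stat(self):
        st = os.stat(self.tracked_path)
        return (st.st_size, st.st_mtime_ns, st.st_ino)

    def get(self):
        current = self._current_stat()
        if not self._loaded or current != self._stat:
            self._value = self.loader()
            self._stat = current
            self._loaded = True
        return self._value


def demo():
    """Return (first read, read after external append, real size on disk)."""
    with tempfile.TemporaryDirectory() as store:
        tracked = os.path.join(store, "tracked.i")  # watched, never modified
        actual = os.path.join(store, "actual.i")    # the file that really grows
        for path in (tracked, actual):
            with open(path, "wb") as f:
                f.write(b"x" * 64)

        cache = StatCachedValue(tracked, lambda: os.path.getsize(actual))
        before = cache.get()

        # Stand-in for another process appending during a hook.
        with open(actual, "ab") as f:
            f.write(b"y" * 64)

        stale = cache.get()  # tracked.i is unchanged, so nothing is reloaded
        return before, stale, os.path.getsize(actual)
```

In this toy model the second `get()` still reports the old size because the
watched file never changed, which is the kind of stale in-memory state the
summary describes.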

The issue was partially detected by a C program, adapted from the fanotify (2)
manpage example, that monitors changes to `00manifesttree.i`:

  [00:32:35.780] pid 7734 opens 00manifesttree.i (size 1000264)                       # First open.
  [00:32:35.930] pid 7734 reads closes (no write) 00manifesttree.i (size 1000264)
  [00:32:38.685] pid 9175 opens 00manifesttree.i (size 1000264)
  [00:32:38.885] pid 9175 reads 00manifesttree.i (size 1000264)
  [00:32:38.886] pid 9175 closes (no write) 00manifesttree.i (size 1000264)
  [00:32:39.235] pid 9175 opens 00manifesttree.i (size 1000264)
  [00:32:39.235] pid 9175 closes (no write) 00manifesttree.i (size 1000264)
  [00:32:39.236] pid 9175 opens 00manifesttree.i (size 1000264)
  [00:32:39.236] pid 9175 modifies closes 00manifesttree.i (size 1000328)             # Appended by another process.
  [00:32:41.169] pid 10759 opens 00manifesttree.i (size 1000328)
  [00:32:41.355] pid 10759 reads 00manifesttree.i (size 1000328)
  [00:32:41.355] pid 10759 closes (no write) 00manifesttree.i (size 1000328)
  [00:32:41.537] pid 10759 opens closes (no write) 00manifesttree.i (size 1000328)
  [00:32:41.537] pid 10759 opens 00manifesttree.i (size 1000392)
  [00:32:41.537] pid 10759 modifies closes 00manifesttree.i (size 1000392)            # Appended by another process.
  [00:32:44.930] pid 7734 opens closes (no write) 00manifesttree.i (size 1000392)     # Main process picked up changes.
  [00:32:44.930] pid 7734 opens 00manifesttree.i (size 1000392)
  [00:32:44.930] pid 7734 reads 00manifesttree.i (size 1000392)
  [00:32:44.930] pid 7734 modifies closes 00manifesttree.i (size 1000456)             # Main process wrote data.
  [00:32:45.275] pid 7734 opens 00manifesttree.i (size 1000456)
  [00:32:45.459] pid 7734 reads 00manifesttree.i (size 1000456)
  [00:32:45.459] pid 7734 closes (no write) 00manifesttree.i (size 1000456)
  [00:32:45.550] pid 7734 opens closes (no write) 00manifesttree.i (size 1000456)
  [00:32:45.550] pid 7734 opens 00manifesttree.i (size 1000264)
  [00:32:45.550] pid 7734 closes 00manifesttree.i (size 1000264)                      # Main process truncated to the wrong position.

Pid 7734 failed with an "IntegrityError: 1062 (23000): Duplicate entry" error.
The fanotify log shows that it truncated the revlog to the wrong offset, which
indicates that an outdated revlog was kept in memory.
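
The truncation pattern in the log can be reproduced with a small Python model.
This is a sketch under stated assumptions, not the revlog or transaction code:
`simulate`, `ENTRY_SIZE`, and the `refresh_before_write` flag are hypothetical
names, and the only detail taken from the log is that each append grew the file
by 64 bytes.

```python
import os
import tempfile

ENTRY_SIZE = 64  # each append in the log above grew the file by 64 bytes


def simulate(refresh_before_write):
    """Return the file size after a write that is rolled back.

    The rollback truncates to the size remembered in memory. If that size
    was read before another process appended, the truncation also discards
    the other process's entries.
    """
    with tempfile.TemporaryDirectory() as store:
        path = os.path.join(store, "index.i")
        with open(path, "wb") as f:
            f.write(b"\0" * (4 * ENTRY_SIZE))

        # Main process loads the file and remembers its size.
        cached_size = os.path.getsize(path)

        # Other processes append two entries (stand-in for the db -> local sync).
        with open(path, "ab") as f:
            f.write(b"\1" * (2 * ENTRY_SIZE))

        if refresh_before_write:
            # Correct behavior: drop the stale in-memory state first.
            cached_size = os.path.getsize(path)

        rollback_offset = cached_size

        # Main process appends its own entry, then aborts.
        with open(path, "ab") as f:
            f.write(b"\2" * ENTRY_SIZE)
        with open(path, "r+b") as f:
            f.truncate(rollback_offset)

        return os.path.getsize(path)
```

With these sizes, `simulate(False)` ends at 256 bytes, losing the two entries
appended by the other processes, while `simulate(True)` ends at 384 bytes and
removes only the aborted write.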

The C program was sent as D10418991.

Reviewed By: DurhamG

Differential Revision: D10417797

fbshipit-source-id: 7ccc0a976d05efbca5b3ed6fb5ff7886766d06d2
Jun Wu 2018-10-17 20:02:12 -07:00 committed by Facebook Github Bot
parent 3f06e4734e
commit 5a3842e136
3 changed files with 120 additions and 3 deletions

@@ -21,9 +21,13 @@ Config::
syncinterval = -1
# Enable faster "need sync or not" check. It could be 6x faster, and
# removes some time in the critial section.
# removes some time in the critical section.
# (default: true)
fastsynccheck = true
# Whether to do an initial "best-effort" pull from the database.
# (default: true)
initialsync = true
"""
@@ -91,6 +95,7 @@ configitem("hgsql", "bypass", default=False)
configitem("hgsql", "enabled", default=False)
configitem("hgsql", "engine", default="innodb")
configitem("hgsql", "fastsynccheck", default=True)
configitem("hgsql", "initialsync", default=True)
configitem("hgsql", "locktimeout", default=60)
configitem("hgsql", "maxcommitsize", default=52428800)
configitem("hgsql", "maxinsertsize", default=1048576)
@@ -245,6 +250,10 @@ def extsetup(ui):
)
)
global initialsync
if not ui.configbool("hgsql", "initialsync"):
initialsync = INITIAL_SYNC_DISABLE
# Directly examining argv seems like a terrible idea, but it seems
# necessary unless we refactor mercurial dispatch code. This is because
# the first place we have access to parsed options is in the same function
@@ -252,7 +261,6 @@ def extsetup(ui):
# the sync operation in which the lock is elided unless we set this.
if "--forcesync" in sys.argv:
ui.debug("forcesync enabled\n")
global initialsync
initialsync = INITIAL_SYNC_FORCE

@@ -0,0 +1,109 @@
#require no-windows
Test:
1. Process X is handling a pushrebase request.
2. While running prepushrebase hooks, the local repo and the database were updated.
3. Process X enters the critical section and thinks the local repo is
up-to-date while some internal states might not be.
$ shorttraceback
$ . "$TESTDIR/hgsql/library.sh"
$ enable treemanifest remotefilelog remotenames pushrebase
$ setconfig hgsql.initialsync=false treemanifest.treeonly=1 treemanifest.sendtrees=1 remotefilelog.reponame=x remotefilelog.cachepath=$TESTTMP/cache ui.ssh="python $TESTDIR/dummyssh" pushrebase.verbose=1 experimental.bundle2lazylocking=True
$ newrepo state1
$ echo remotefilelog >> .hg/requires
$ hg debugdrawdag << 'EOS'
> A
> EOS
$ newrepo state2
$ echo remotefilelog >> .hg/requires
$ hg debugdrawdag << 'EOS'
> B
> |
> A
> EOS
$ newrepo state3
$ echo remotefilelog >> .hg/requires
$ hg debugdrawdag << 'EOS'
> B C
> |/
> A
> EOS
$ cd $TESTTMP
$ initserver serverrepo master
Update the server repo and the database to state1.
$ cd $TESTTMP/serverrepo
$ setconfig treemanifest.server=1
$ hg pull -r tip $TESTTMP/state1 -q
$ hg bookmark -r tip master
Prepare the prepushrebase hook to update the server repo and the database.
$ cat > $TESTTMP/update-to-state2.sh <<EOF
> # Bypass pushrebase logic that enforces a bundle repo
> unset HG_HOOK_BUNDLEPATH
> # Update the server repo and the database to state2
> hg pull --cwd $TESTTMP/serverrepo -R $TESTTMP/serverrepo -r tip $TESTTMP/state2
> EOF
Another prepushrebase hook just to warm up in-memory repo states (changelog and
manifest).
$ cat > $TESTTMP/prepushrebase.py <<EOF
> def warmup(ui, repo, *args, **kwds):
> # Just have some side-effect loading the changelog and manifest
> data = repo['tip']['A'].data()
> ui.write_err('prepushrebase hook called. A = %r\n' % data)
> EOF
00manifest.i needs to exist so manifestlog is cacheable. Windows has a
different filecache implementation and is excluded.
$ cd $TESTTMP/serverrepo
$ touch .hg/store/00manifest.i
$ hg dbsh -c 'repo.manifestlog; print(repo._filecache["manifestlog"]._entries[0].cacheable())'
True
$ cat >> .hg/hgrc << EOF
> [hgsql]
> verbose=1
> [hooks]
> prepushrebase.step1=python:$TESTTMP/prepushrebase.py:warmup
> prepushrebase.step2=bash $TESTTMP/update-to-state2.sh
> EOF
Do the push!
$ cd $TESTTMP/state3
$ hg push -r C --to master ssh://user@dummy/serverrepo
pushing rev dc0947a82db8 to destination ssh://user@dummy/serverrepo bookmark master
searching for changes
remote: prepushrebase hook called. A = 'A'
remote: [hgsql] got lock after * seconds (read 1 rows) (glob)
remote: pulling from $TESTTMP/state2
remote: searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
remote: new changesets 112478962961
remote: (run 'hg update' to get a working copy)
remote: [hgsql] held lock for * seconds (read 6 rows; write 7 rows) (glob)
remote: checking conflicts with 426bada5c675
remote: pushing 1 changeset:
remote: dc0947a82db8 C
remote: [hgsql] got lock after * seconds (read 1 rows) (glob)
remote: rebasing stack from 426bada5c675 onto 426bada5c675
remote: transaction abort!
remote: rollback completed
remote: [hgsql] held lock for * seconds (read 4 rows; write 0 rows) (glob)
remote: IntegrityError: 1062 (23000): Duplicate entry 'master-00manifesttree.i-1-0' for key 'PRIMARY'
abort: stream ended unexpectedly (got 0 bytes, expected 4)
[255]

@@ -58,7 +58,7 @@ drawdag() {
# This is useful to match error messages without the traceback.
shorttraceback() {
enable errorredirect
setconfig errorredirect.script='printf "%s" "$TRACE" | tail -1'
setconfig errorredirect.script='printf "%s" "$TRACE" | tail -1 1>&2'
}
# Set config items like --config way, instead of using cat >> $HGRCPATH