Summary:
We've been seeing some issues where non-write processes (like clone) are taking
the wlock/lock without taking the sqllock. This causes processes which do want
the sqllock (like pushes) to hang waiting. Streaming clones are one known case
of this, but this extra logging will help us catch other cases.
Once we're confident that nothing takes the wrong lock ordering, we can change
this to throw an exception instead of logging the error.
This required a minor refactor the synclimit code, so we can detect if we're in
a sync lock (which is a case where it's ok to take the wlock/lock without taking
the sqlock). And it configures the tests to use streaming clones when possible,
which trigger the current known bug.
Test Plan: Added a test
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: quark
Differential Revision: https://phabricator.intern.facebook.com/D4287384
Signature: t1:4287384:1481065929:dfac78e3537c3cdf44c810bc99c5f07ed5b12f4c
Summary:
Internally we are now logging the sqllock times to our metrics infrastructure.
It's easier if the numbers are integers, so let's turn the elapsed sqllock time
into milliseconds. We leave the user readable message alone though.
Test Plan: doitlive
Reviewers: #mercurial, quark
Reviewed By: quark
Differential Revision: https://phabricator.intern.facebook.com/D4287378
Signature: t1:4287378:1481065691:5ae3b259598692a41927c793dab1bc2c847acae0
Summary:
The manifest has been replaced with a manifestlog in upstream, so we need to
update our usage accordingly.
Test Plan: Ran the tests
Reviewers: #mercurial, quark
Reviewed By: quark
Differential Revision: https://phabricator.intern.facebook.com/D4198744
Signature: t1:4198744:1479408675:94fd5a5de367fcde1fc1ff16c0a88f052ef60a8d
Summary:
In the past we've introduced code into hgsql that tries to load bookmarks into
memory very early in the process, so there's no chance of the bookmarks being
out of sync with the changelog. When we added the synclimiter short circuit code
path, we lost the ordered loading which has caused the master bookmark to seem
to disappear when users were doing a pull and the bookmark file changed after
they had already loaded the changelog.
A real fix should go upstream, but since hgsql wraps the repo object so tightly
(before anything else gets a chance to operate on it), putting a work around
here is a simple, high impact fix.
Test Plan: Ran the tests, but really the tests won't catch anything here.
Reviewers: #mercurial, quark
Reviewed By: quark
Subscribers: quark
Differential Revision: https://phabricator.intern.facebook.com/D4121849
Signature: t1:4121849:1478133009:3cad7decdc866e68872c93b884467dfbc17f850c
Summary:
We had tried to be smart about incrementally updating only the heads and
bookmarks that actually changed, but the old logic was broken. It compared
binary and hex nodes against each other, and therefore didn't actually do
anything incrementally.
This patch makes us convert the Mercurial data to hex at the very beginning, so
the incremental check uses hex as well.
Test Plan:
Inserted print statements around the insert/delets and verified they
changed before and after
Reviewers: #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D4012317
Summary:
We sometimes "succeeded" in syncing to the database if we could not get local write locks, because we assumed that the only time we can't get those locks is if we're not waiting for them. However, if another process holds the local write locks and is not releasing them, we hit a timeout.
If you're doing a write operation, this is harmless - the MySQL lock ensures that only one client at a time can be syncing anyway. However, for read operations run with `--forcesync`, this can cause us to succeed without syncing, breaking our contract with our users.
Test Plan:
Run the test suite and confirm no change.
Manually take the local write locks, run against the test DB `hg log -r . --forcesync`, and confirm that we fail correctly with this change but not without it.
Confirm that even with the local locks taken, `hg log -r .` returns success
Reviewers: durham
Reviewed By: durham
Subscribers: tja, #mercurial
Differential Revision: https://phabricator.intern.facebook.com/D3976452
Tasks: 13484857
Signature: t1:3976452:1475757832:e8370819bc345ce8e1bde0b057755803adb9896c
Summary:
Upstream a099aba68382 changed the words about backup bundles.
Let's update our tests.
Test Plan: Run the changed tests.
Reviewers: #sourcecontrol, stash
Reviewed By: stash
Differential Revision: https://phabricator.intern.facebook.com/D3949507
Signature: t1:3949507:1475221527:712dc13005a2860aac64293968472ab8de074f8c
Summary:
Previously, if we needed to disable hgsql for some reason (like testing
corruption, etc) we would just disable the extension. Since we now add a
requires flag to repos that are created with hgsql, this no longer works.
Let's add a config option that can disable hgsql instead and update the tests to
use it. I'm not sure how these test failures slipped through the original
commit.
Test Plan: Ran the tests
Reviewers: #mercurial, ttung, simonfar
Reviewed By: simonfar
Subscribers: simonfar
Differential Revision: https://phabricator.intern.facebook.com/D3289119
Signature: t1:3289119:1462999047:f33337610a1777c379adb70ec307971823f7ae59
Summary:
Upstream mercurial has deleted bookmarks.write() since all bookmark writes
happen via transactions now. So we can delete our write wrapper for that. We
already wrap transaction closes, so nothing new needs to be added.
Test Plan: Ran the tests
Reviewers: #mercurial, ttung, rmcelroy
Reviewed By: rmcelroy
Differential Revision: https://phabricator.intern.facebook.com/D3289117
Signature: t1:3289117:1463000038:aaf3454059514d92a017bd579b6de0691b334795
Summary:
We don't want to allow commands to run in the repo if hgsql is not
enabled, so let's add a requirement.
This will not affect existing repositories, since they don't have the flag in
their requires, but the flag can be added manually once this is deployed.
Test Plan: Added a test
Reviewers: #sourcecontrol, ttung, simonfar
Reviewed By: simonfar
Subscribers: simonfar
Differential Revision: https://phabricator.intern.facebook.com/D3239263
Signature: t1:3239263:1461923217:b8fb59498292e1f9d317d98d5de7ce8513685ba6
Summary:
This adds a config option that can be enabled to profile what is
happening when the sqlock is taken (since that is the time spent that limits how
fast we can accept commits). It writes the profile out to an external file for
reading.
Test Plan:
Added a test. Also manually copied this to an hg server and did a
push.
Reviewers: #sourcecontrol, ttung, simonfar
Reviewed By: simonfar
Subscribers: simonfar
Differential Revision: https://phabricator.intern.facebook.com/D3239204
Signature: t1:3239204:1461922938:45298ada2955df8e5e145805f137ccc1974becea
Summary:
This fixes a bug where the initial sync of a generaldelta repo could
cause the changelog.i to have a bad header byte that indicated the revlog was
generaldelta when it really wasn't. This fixes by applying the changelog
hardcoded value to the lightweight hgsql revlog class.
Test Plan: Added a test. It failed before, and passes now.
Reviewers: #sourcecontrol, ttung, quark
Reviewed By: quark
Subscribers: quark
Differential Revision: https://phabricator.fb.com/D3194429
Signature: t1:3194429:1461030817:6cc836e2dcd8b1feeeec16ee46548eac443695bf
Test Plan: this is fixing some tests
Reviewers: #sourcecontrol, quark, ttung
Reviewed By: quark
Differential Revision: https://phabricator.fb.com/D3117092
Signature: t1:3117092:1459371915:231c133209a06f3e0e207482858775802d5820e7
Summary:
We've been seeing bugs where pulls result in bookmarks missing for clients. We
suspect it to be a race condition on the server side. One possible race is that
we load the changelog data before the bookmarks, and therefore may receive
bookmarks that point to a node that isn't in memory.
A simple fix to decrease the chances of this is to force loading bookmarks
first. Since hgsql is the first bit of code that access the repo, we can just
change the order here to ensure bookmarks are loaded first.
Test Plan:
It's not really a correctness change, so I wasn't able to test it.
What I did do however was insert exceptions into the bookmark loading and
changelog loading logic on one of our servers. This showed me that the changelog
was indeed being loaded first before because of this code.
Reviewers: #sourcecontrol, ttung
Reviewed By: ttung
Differential Revision: https://phabricator.fb.com/D3063659
Signature: t1:3063659:1458244130:7448377065fa0f0f6d98e43eaa00bdfc380948b6
Summary:
We are overriding the _afterlock to mind one more lock before executing
callbacks. We need that so our post commit hooks are executed out of the lock.
Test Plan: run tests
Reviewers: #sourcecontrol, ttung, durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D2809038
Signature: t1:2809038:1452124328:3b1d2da7711cdf1c97fe06d5dcf07be253582fcd
Summary:
Upstream has moved almost all bookmark writes to use bookmark.recordchange()
instead of write, which works inside a transaction. For hgsql, we need to
perform our bookmark writes inside a transaction now.
Additionally, we need to check self.disablesync inside committodb. disablesync
is used to allow us to write bookmarks during a syncdb without triggering the
bookmark writes to the db (since that would be a infinite loop). Previously we
only needed the syncdb check for bookmark writes, but since that has moved to
transactions, we need to do the check there too (committodb fires on
pretxnclose).
Lastly, in theory we could remove our wrapping of bookmark.write now, since
nothing should use it. But since upstream still has code for bookmark.write to
write bookmarks when outside a transaction, we can't remove our code just yet,
in case some extension still uses bookmark.write.
Test Plan: Ran the tests
Reviewers: ttung, lcharignon, #sourcecontrol, mitrandir
Reviewed By: mitrandir
Differential Revision: https://phabricator.fb.com/D2808601
Signature: t1:2808601:1452116992:ed34107a97ac633281ad8934266b35e3375cb247
Summary:
We've seen load issues on the mysql server in certain situations. This is caused
by large revision_reference query load from every read connection checking the
database.
To fix it, let's use Mercurial locks to enforce that only one reader is checking
and syncing at any given time. All other will just serve the existing on disk
data immediately.
Writer clients are unaffected. They do not obey the rate limiting lock and do
exactly what they did before.
Test Plan: I'll try to add a test
Reviewers: #sourcecontrol, ttung
Differential Revision: https://phabricator.fb.com/D2756217
Summary:
hgsql failed when run against bundle repos before. We've fixed it so that no
sync is performed when reading from a bundle repo, but writes will be rejected.
Test Plan: Added a test and ran the tests
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2665347
Summary:
The old MySQLdb library did not support IPv6 so we've switched to using
mysql.connector, which is more recent and does support IPv6.
The most notable change in functionality is that it returns byte arrays instead
of strings, so I had to add a custom type converter that returns strings.
Test Plan: Ran the tests (after installing the new rpm)
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2665344
Summary:
We're going to be refactoring which functions actually sync with sql soon, and
putting this logic in a function will help us stay clean.
Test Plan: Ran the tests
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2665341
Summary:
In order to increase commit throughput on the server side, we need to be able to
run hooks and do caching outside of the lock. Upstream Mercurial and the
pushrebase extension are doing this by making the unbundle lock possible to
acquire lazily.
For hgsql, this means we can't just wrap the entire unbundle command in a lock.
Instead we need to take the lock only when it's requested.
Test Plan:
Wrote a test, and changed the existing bundle2 test to use the lazy
locking
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2525414
Summary:
In a future patch we need to be able to manipulate the sqlsync/lock logic more
freely than the current executewithsql allows. This patch refactors the sql
sync/lock logic into a context class and changes executewithsql to use it.
Test Plan: Ran the tests
Reviewers: pyd, sid0, #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2521341
Summary:
Previously, any time we needed to edit the heads or bookmarks in the
revision_references table, we would delete all entries for that reponame and
insert them again. This ends up being very expensive and creates a large series
of deletes and inserts that need to be replicated to slave dbs.
This patch does the minimal number of deletions and inserts to update the table.
Test Plan: Ran the tests
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2534693
Summary:
The manifest cache is important for push performance, so let's keep it even
after we sync with the db. Previously repo.manifest was just invalidated and
thrown away.
This is safe because the manifest cache is key'd on the node, which will never
result in a stale cache.
Test Plan:
Ran the tests. And manually applied it to a mercurial server with a
bunch of printf's to make sure the exact cache instance survived the sync.
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2491693
Summary:
Since all the unbundle logic has been moved down into exchange.unbundle, we only
need to wrap that. In addition, not wrapping wireproto.unbundle allows us to use
that function in the pushrebase extension to prefetch the manifest contents
outside of the sql lock.
Test Plan:
Ran the tests. Also manually applied it on one server and verified
that pushrebase's prefetch occurred outside of the sql lock.
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2491688
e0a0f9ad3e4c in upstream Mercurial caused the filecache entries to have their
stat times updated when a transaction closed. This meant that our changelog
wasn't actually up to date, but the cache thought it was. And it turns out
self.invalidate() doesn't actually invalidate those time stamps.
The fix is to manually destroy the file caches that we know we've invalidated.
Summary:
The lock held timer was also including how long the lock had been waited on,
which was wrong. Let's break it into two timers.
Also fix an annoying indentation error.
Test Plan: Ran the tests
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2474908
Summary:
We were using the revlog data length to know if we needed to commit to the
database. This is wrong since empty revlogs are allowed. This would only
trigger when a commit was made that *only* merged files and didn't change any
history at all.
This bug was introduced by the recent batching improvements.
Test Plan:
Ran the tests. We should probably write a test covering this, but we
need to get this deployed asap.
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2439674
Summary:
Previously we only used locktimeout for configuring the connection's innodb lock
timeout. Let's also use it for the actual GET_LOCK() call.
When combined with the pushrebase extension, there can be situations where
someone else is doing a large push, and we don't want to fail the local client's
push just because it's taking a while. So this config gives us the ability to
configure how long the push will wait before giving up.
Test Plan: Ran the tests
Reviewers: #sourcecontrol, ericsumner
Reviewed By: #sourcecontrol, ericsumner
Differential Revision: https://phabricator.fb.com/D2419968
Summary:
We've been seeing the occasional spurious connection issues. Let's do some basic
retrying when connecting to mysql.
Test Plan: It's probably fine!
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2419942
Summary:
Previously we did a mysql insert for every revision that was being inserted.
This involves a lot of round trips and slows the push down dramatically. Let's
batch up inserts instead.
Test Plan:
Ran the tests and already deployed it to a hg server and pushed to
it.
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2396633
Summary:
I was seeing errors where mysql timed about because it was fetching too much
data during the rev validation stage. So let's change it to verify the revs in
batches.
Test Plan:
Tests pass, I also deployed it to a hg server and pushed changes to
it.
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2396632
OperationalError exceptions were being hidden by "MySQL server has gone away"
exceptions caused by the unlock. We already accounted for certain kinds of
exceptions, so we just need to add the other type to the list.
Tested it by hotfixing a server.
Summary:
We've seen some hg processes hang on the Mercurial servers because
mysql throws an exception on the background thread and the foreground thread
just waits forever.
This patch catches background exceptions and sends them to the foreground for
throwing.
Test Plan:
I'm not sure how to cause an exception via test, but I manually
verified it works by adding a 'raise Exception(...)' to the background thread
and running the tests and verifying the exception showed up.
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2337535
Summary:
We suspect that these delete statements sometimes cause full table
scans which lock up the table and cause lock wait timeout errors from other
repositories. By forcing the index, we should prevent the scan and prevent
these errors from happening, increasing hgsql throughput.
Test Plan: ./run-tests.py
Reviewers: #sourcecontrol, durham, ebergen
Reviewed By: ebergen
Subscribers: durham
Differential Revision: https://phabricator.fb.com/D2235417
Tasks: 7462484
Signature: t1:2235417:1436579460:b0d0193db4c2d799c8263fb2225b70b8762eb06f
Summary:
pretxnclose hooks were incorrectly running during the sql sync. Since the
commits have already been committed (on another machine), there's no need to run
the hooks, so let's disable them.
Test Plan: Added a test
Reviewers: pyd, lcharignon, akushner, rmcelroy
Differential Revision: https://phabricator.fb.com/D2173886
Summary:
The hg-ssh wrapper prevents all transactions from running, including the one we
use to sync our repositories from hgsql. This patch updates hgsql to override
that hook temporarily during syncs.
This wasn't an issue before because upstream mercurial used to hook on
prechangegroup which hgsql did not fire.
Test Plan: Added a test. Verified it failed before, and passes now.
Reviewers: pyd, lcharignon, rmcelroy, sid0
Differential Revision: https://phabricator.fb.com/D2148580
Summary:
This can't hurt anything, and it makes sure we have a fresh
transaction for the references which have been timing out, strangely.
Test Plan: run_tests.py
Reviewers: durham
Reviewed By: durham
Subscribers: durham
Differential Revision: https://phabricator.fb.com/D2137954
Signature: t1:2137954:1433811692:1234cde23efd2fa8a95ad87453390aa8edeb335d