Commit Graph

105 Commits

Author SHA1 Message Date
Durham Goode
a6491a0b9c locks: add logging for invalid lock order taking
Summary:
We've been seeing some issues where non-write processes (like clone) are taking
the wlock/lock without taking the sqllock. This causes processes which do want
the sqllock (like pushes) to hang waiting. Streaming clones are one known case
of this, but this extra logging will help us catch other cases.

Once we're confident that nothing takes the wrong lock ordering, we can change
this to throw an exception instead of logging the error.

This required a minor refactor the synclimit code, so we can detect if we're in
a sync lock (which is a case where it's ok to take the wlock/lock without taking
the sqlock). And it configures the tests to use streaming clones when possible,
which trigger the current known bug.

Test Plan: Added a test

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark

Differential Revision: https://phabricator.intern.facebook.com/D4287384

Signature: t1:4287384:1481065929:dfac78e3537c3cdf44c810bc99c5f07ed5b12f4c
2016-12-06 16:04:20 -08:00
Durham Goode
593af04245 metrics: change elapsed time to ms for sqllock metric
Summary:
Internally we are now logging the sqllock times to our metrics infrastructure.
It's easier if the numbers are integers, so let's turn the elapsed sqllock time
into milliseconds. We leave the user readable message alone though.

Test Plan: doitlive

Reviewers: #mercurial, quark

Reviewed By: quark

Differential Revision: https://phabricator.intern.facebook.com/D4287378

Signature: t1:4287378:1481065691:5ae3b259598692a41927c793dab1bc2c847acae0
2016-12-06 16:04:09 -08:00
Durham Goode
ccbf0a3566 manifest: update to work with upstream changes
Summary:
The manifest has been replaced with a manifestlog in upstream, so we need to
update our usage accordingly.

Test Plan: Ran the tests

Reviewers: #mercurial, quark

Reviewed By: quark

Differential Revision: https://phabricator.intern.facebook.com/D4198744

Signature: t1:4198744:1479408675:94fd5a5de367fcde1fc1ff16c0a88f052ef60a8d
2016-11-17 10:51:40 -08:00
Durham Goode
d61aae9385 races: load bookmarks earlier in no-sync cases
Summary:
In the past we've introduced code into hgsql that tries to load bookmarks into
memory very early in the process, so there's no chance of the bookmarks being
out of sync with the changelog. When we added the synclimiter short circuit code
path, we lost the ordered loading which has caused the master bookmark to seem
to disappear when users were doing a pull and the bookmark file changed after
they had already loaded the changelog.

A real fix should go upstream, but since hgsql wraps the repo object so tightly
(before anything else gets a chance to operate on it), putting a work around
here is a simple, high impact fix.

Test Plan: Ran the tests, but really the tests won't catch anything here.

Reviewers: #mercurial, quark

Reviewed By: quark

Subscribers: quark

Differential Revision: https://phabricator.intern.facebook.com/D4121849

Signature: t1:4121849:1478133009:3cad7decdc866e68872c93b884467dfbc17f850c
2016-11-03 10:22:46 -07:00
Durham Goode
27925e5fef Fix incremental updating of heads and bookmarks
Summary:
We had tried to be smart about incrementally updating only the heads and
bookmarks that actually changed, but the old logic was broken. It compared
binary and hex nodes against each other, and therefore didn't actually do
anything incrementally.

This patch makes us convert the Mercurial data to hex at the very beginning, so
the incremental check uses hex as well.

Test Plan:
Inserted print statements around the insert/delets and verified they
changed before and after

Reviewers: #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D4012317
2016-10-13 09:47:40 -07:00
Simon Farnsworth
145b3a59b8 hgsql: Do not succeed in syncing if we failed to take locks
Summary:
We sometimes "succeeded" in syncing to the database if we could not get local write locks, because we assumed that the only time we can't get those locks is if we're not waiting for them. However, if another process holds the local write locks and is not releasing them, we hit a timeout.

If you're doing a write operation, this is harmless - the MySQL lock ensures that only one client at a time can be syncing anyway. However, for read operations run with `--forcesync`, this can cause us to succeed without syncing, breaking our contract with our users.

Test Plan:
Run the test suite and confirm no change.

Manually take the local write locks, run against the test DB `hg log -r . --forcesync`, and confirm that we fail correctly with this change but not without it.

Confirm that even with the local locks taken, `hg log -r .` returns success

Reviewers: durham

Reviewed By: durham

Subscribers: tja, #mercurial

Differential Revision: https://phabricator.intern.facebook.com/D3976452

Tasks: 13484857

Signature: t1:3976452:1475757832:e8370819bc345ce8e1bde0b057755803adb9896c
2016-10-07 05:25:21 -07:00
Jun Wu
f42f6a17ec hgsql: update tests to fix upstream compatibility
Summary:
Upstream a099aba68382 changed the words about backup bundles.
Let's update our tests.

Test Plan: Run the changed tests.

Reviewers: #sourcecontrol, stash

Reviewed By: stash

Differential Revision: https://phabricator.intern.facebook.com/D3949507

Signature: t1:3949507:1475221527:712dc13005a2860aac64293968472ab8de074f8c
2016-09-29 19:56:45 -07:00
Durham Goode
3f97d92346 bypass: add a hgsql.bypass config option for disabling hgsql
Summary:
Previously, if we needed to disable hgsql for some reason (like testing
corruption, etc) we would just disable the extension. Since we now add a
requires flag to repos that are created with hgsql, this no longer works.

Let's add a config option that can disable hgsql instead and update the tests to
use it. I'm not sure how these test failures slipped through the original
commit.

Test Plan: Ran the tests

Reviewers: #mercurial, ttung, simonfar

Reviewed By: simonfar

Subscribers: simonfar

Differential Revision: https://phabricator.intern.facebook.com/D3289119

Signature: t1:3289119:1462999047:f33337610a1777c379adb70ec307971823f7ae59
2016-05-11 13:54:24 -07:00
Durham Goode
95858fbd8a bookmarks: stop wrapping bookmarks.write
Summary:
Upstream mercurial has deleted bookmarks.write() since all bookmark writes
happen via transactions now. So we can delete our write wrapper for that. We
already wrap transaction closes, so nothing new needs to be added.

Test Plan: Ran the tests

Reviewers: #mercurial, ttung, rmcelroy

Reviewed By: rmcelroy

Differential Revision: https://phabricator.intern.facebook.com/D3289117

Signature: t1:3289117:1463000038:aaf3454059514d92a017bd579b6de0691b334795
2016-05-11 13:54:21 -07:00
Durham Goode
c58a61d5cd init: add hgsql requirement during repo init and clone
Summary:
We don't want to allow commands to run in the repo if hgsql is not
enabled, so let's add a requirement.

This will not affect existing repositories, since they don't have the flag in
their requires, but the flag can be added manually once this is deployed.

Test Plan: Added a test

Reviewers: #sourcecontrol, ttung, simonfar

Reviewed By: simonfar

Subscribers: simonfar

Differential Revision: https://phabricator.intern.facebook.com/D3239263

Signature: t1:3239263:1461923217:b8fb59498292e1f9d317d98d5de7ce8513685ba6
2016-04-29 09:55:23 -07:00
Durham Goode
031d404a34 profiling: add the ability to profile the sqllock time
Summary:
This adds a config option that can be enabled to profile what is
happening when the sqlock is taken (since that is the time spent that limits how
fast we can accept commits). It writes the profile out to an external file for
reading.

Test Plan:
Added a test. Also manually copied this to an hg server and did a
push.

Reviewers: #sourcecontrol, ttung, simonfar

Reviewed By: simonfar

Subscribers: simonfar

Differential Revision: https://phabricator.intern.facebook.com/D3239204

Signature: t1:3239204:1461922938:45298ada2955df8e5e145805f137ccc1974becea
2016-04-29 09:53:17 -07:00
Durham Goode
83c63cedf7 Fix corruption when syncing initial data for generaldelta repos
Summary:
This fixes a bug where the initial sync of a generaldelta repo could
cause the changelog.i to have a bad header byte that indicated the revlog was
generaldelta when it really wasn't. This fixes by applying the changelog
hardcoded value to the lightweight hgsql revlog class.

Test Plan: Added a test. It failed before, and passes now.

Reviewers: #sourcecontrol, ttung, quark

Reviewed By: quark

Subscribers: quark

Differential Revision: https://phabricator.fb.com/D3194429

Signature: t1:3194429:1461030817:6cc836e2dcd8b1feeeec16ee46548eac443695bf
2016-04-18 22:07:07 -07:00
Kostia Balytskyi
00ae3a7007 hgsql: add keyword-argument-based logging 2016-04-01 03:11:34 -07:00
Mateusz Kwapich
173231304d hgsql: norepo is a parameter now
Test Plan: this is fixing some tests

Reviewers: #sourcecontrol, quark, ttung

Reviewed By: quark

Differential Revision: https://phabricator.fb.com/D3117092

Signature: t1:3117092:1459371915:231c133209a06f3e0e207482858775802d5820e7
2016-03-30 14:10:48 -07:00
Durham Goode
3aba9cef0c Force bookmark-then-changelog load order
Summary:
We've been seeing bugs where pulls result in bookmarks missing for clients. We
suspect it to be a race condition on the server side. One possible race is that
we load the changelog data before the bookmarks, and therefore may receive
bookmarks that point to a node that isn't in memory.

A simple fix to decrease the chances of this is to force loading bookmarks
first. Since hgsql is the first bit of code that access the repo, we can just
change the order here to ensure bookmarks are loaded first.

Test Plan:
It's not really a correctness change, so I wasn't able to test it.
What I did do however was insert exceptions into the bookmark loading and
changelog loading logic on one of our servers. This showed me that the changelog
was indeed being loaded first before because of this code.

Reviewers: #sourcecontrol, ttung

Reviewed By: ttung

Differential Revision: https://phabricator.fb.com/D3063659

Signature: t1:3063659:1458244130:7448377065fa0f0f6d98e43eaa00bdfc380948b6
2016-03-18 17:02:48 -07:00
Mateusz Kwapich
bd3030045b hgsql: move repo._afterlock callbacks execution after the sqllock release
Summary:
We are overriding the _afterlock to mind one more lock before executing
callbacks. We need that so our post commit hooks are executed out of the lock.

Test Plan: run tests

Reviewers: #sourcecontrol, ttung, durham

Reviewed By: durham

Differential Revision: https://phabricator.fb.com/D2809038

Signature: t1:2809038:1452124328:3b1d2da7711cdf1c97fe06d5dcf07be253582fcd
2016-01-06 16:37:46 -08:00
Mateusz Kwapich
45bc7261ee hgsql: lint nits
Summary: Just unused imports and variables - nothing serius.

Test Plan: Ran tests.

Reviewers: #sourcecontrol, ttung, durham

Reviewed By: durham

Differential Revision: https://phabricator.fb.com/D2808903

Signature: t1:2808903:1452120578:705abfbcd653eecd150d41c443edc855ef334ea3
2016-01-06 15:01:10 -08:00
Durham Goode
9bf82a7d92 bookmarks: remove use of bookmark.write()
Summary:
Upstream has moved almost all bookmark writes to use bookmark.recordchange()
instead of write, which works inside a transaction. For hgsql, we need to
perform our bookmark writes inside a transaction now.

Additionally, we need to check self.disablesync inside committodb.  disablesync
is used to allow us to write bookmarks during a syncdb without triggering the
bookmark writes to the db (since that would be a infinite loop). Previously we
only needed the syncdb check for bookmark writes, but since that has moved to
transactions, we need to do the check there too (committodb fires on
pretxnclose).

Lastly, in theory we could remove our wrapping of bookmark.write now, since
nothing should use it. But since upstream still has code for bookmark.write to
write bookmarks when outside a transaction, we can't remove our code just yet,
in case some extension still uses bookmark.write.

Test Plan: Ran the tests

Reviewers: ttung, lcharignon, #sourcecontrol, mitrandir

Reviewed By: mitrandir

Differential Revision: https://phabricator.fb.com/D2808601

Signature: t1:2808601:1452116992:ed34107a97ac633281ad8934266b35e3375cb247
2016-01-06 14:52:23 -08:00
Durham Goode
2ef0e253f8 Add rate limiting to read-only database syncs
Summary:
We've seen load issues on the mysql server in certain situations. This is caused
by large revision_reference query load from every read connection checking the
database.

To fix it, let's use Mercurial locks to enforce that only one reader is checking
and syncing at any given time.  All other will just serve the existing on disk
data immediately.

Writer clients are unaffected. They do not obey the rate limiting lock and do
exactly what they did before.

Test Plan: I'll try to add a test

Reviewers: #sourcecontrol, ttung

Differential Revision: https://phabricator.fb.com/D2756217
2015-12-14 12:34:06 -08:00
Durham Goode
ab07fcd0a7 Add support for running on top of bundle repos
Summary:
hgsql failed when run against bundle repos before. We've fixed it so that no
sync is performed when reading from a bundle repo, but writes will be rejected.

Test Plan: Added a test and ran the tests

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D2665347
2015-11-17 13:28:56 -08:00
Durham Goode
883b577953 Switch to mysql.connector
Summary:
The old MySQLdb library did not support IPv6 so we've switched to using
mysql.connector, which is more recent and does support IPv6.

The most notable change in functionality is that it returns byte arrays instead
of strings, so I had to add a custom type converter that returns strings.

Test Plan: Ran the tests (after installing the new rpm)

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D2665344
2015-11-17 13:34:00 -08:00
Durham Goode
0d62bc2a7a Refactor issqlrepo check into a function
Summary:
We're going to be refactoring which functions actually sync with sql soon, and
putting this logic in a function will help us stay clean.

Test Plan: Ran the tests

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D2665341
2015-11-17 13:28:15 -08:00
Durham Goode
fae660bb46 One more spot with a+ opener 2015-11-05 12:57:17 -08:00
Durham Goode
6b151bffeb Open data files with read and write mode
Upstream Mercurial now reuses file handles for reads, so we need to open the dfh
handle with a+ so it can be read from.
2015-11-03 13:13:39 -08:00
Durham Goode
c21db7cde6 Fix wlock/lock wrapping order issues
These were caught by the devel warnings in the tests
2015-10-26 15:48:14 -07:00
Durham Goode
2d8cf7b54e Update wrapping to work with Mercurial 3.6 2015-10-26 15:47:53 -07:00
Durham Goode
a6e7769d60 Add a mandatory 5 second sleep before sqlstrip runs
I almost deleted some very important history.  We need a grace period for people
to cancel this command.
2015-10-15 16:21:31 -07:00
Durham Goode
d8a2f1ef92 Allow lazily taking the lock in bundle2 pushes
Summary:
In order to increase commit throughput on the server side, we need to be able to
run hooks and do caching outside of the lock. Upstream Mercurial and the
pushrebase extension are doing this by making the unbundle lock possible to
acquire lazily.

For hgsql, this means we can't just wrap the entire unbundle command in a lock.
Instead we need to take the lock only when it's requested.

Test Plan:
Wrote a test, and changed the existing bundle2 test to use the lazy
locking

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D2525414
2015-10-09 13:13:18 -07:00
Durham Goode
085535d0f1 Refactor sql locking into a enter/exit context class
Summary:
In a future patch we need to be able to manipulate the sqlsync/lock logic more
freely than the current executewithsql allows. This patch refactors the sql
sync/lock logic into a context class and changes executewithsql to use it.

Test Plan: Ran the tests

Reviewers: pyd, sid0, #sourcecontrol

Differential Revision: https://phabricator.fb.com/D2521341
2015-10-07 16:42:52 -07:00
Durham Goode
a42c4eefda Do minimal heads and bookmarks edits to revision_references
Summary:
Previously, any time we needed to edit the heads or bookmarks in the
revision_references table, we would delete all entries for that reponame and
insert them again. This ends up being very expensive and creates a large series
of deletes and inserts that need to be replicated to slave dbs.

This patch does the minimal number of deletions and inserts to update the table.

Test Plan: Ran the tests

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D2534693
2015-10-12 17:41:58 -07:00
Durham Goode
ee8b7120d5 Save manifest cache across db syncs
Summary:
The manifest cache is important for push performance, so let's keep it even
after we sync with the db. Previously repo.manifest was just invalidated and
thrown away.

This is safe because the manifest cache is key'd on the node, which will never
result in a stale cache.

Test Plan:
Ran the tests. And manually applied it to a mercurial server with a
bunch of printf's to make sure the exact cache instance survived the sync.

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D2491693
2015-09-29 15:41:30 -07:00
Durham Goode
0a2a6f3554 Stop wrapping wireproto.unbundle
Summary:
Since all the unbundle logic has been moved down into exchange.unbundle, we only
need to wrap that. In addition, not wrapping wireproto.unbundle allows us to use
that function in the pushrebase extension to prefetch the manifest contents
outside of the sql lock.

Test Plan:
Ran the tests. Also manually applied it on one server and verified
that pushrebase's prefetch occurred outside of the sql lock.

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D2491688
2015-09-29 15:39:28 -07:00
Durham Goode
6700d7c441 Fix lock timeout error message
Previously we'd throw an ugly stack trace when the mysql lock timed out. Now we
throw a meaningful error message, without a stack trace.
2015-09-24 22:37:52 -07:00
Durham Goode
4bd5c787d8 Fix changelog cache invalidation
e0a0f9ad3e4c in upstream Mercurial caused the filecache entries to have their
stat times updated when a transaction closed. This meant that our changelog
wasn't actually up to date, but the cache thought it was. And it turns out
self.invalidate() doesn't actually invalidate those time stamps.

The fix is to manually destroy the file caches that we know we've invalidated.
2015-09-24 21:23:45 -07:00
Durham Goode
9d22da7132 Fix lock wait timer
Summary:
The lock held timer was also including how long the lock had been waited on,
which was wrong. Let's break it into two timers.

Also fix an annoying indentation error.

Test Plan: Ran the tests

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D2474908
2015-09-23 19:32:34 -07:00
Durham Goode
7a1430d607 Simplify even further by just always doing a commit 2015-09-14 11:19:13 -07:00
Durham Goode
08c12f8dc4 Fix not writing empty revlog entries to database
Summary:
We were using the revlog data length to know if we needed to commit to the
database. This is wrong since empty revlogs are allowed.  This would only
trigger when a commit was made that *only* merged files and didn't change any
history at all.

This bug was introduced by the recent batching improvements.

Test Plan:
Ran the tests. We should probably write a test covering this, but we
need to get this deployed asap.

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D2439674
2015-09-14 11:06:25 -07:00
Durham Goode
696882251f Use hgsql.locktimeout config for GET_LOCK
Summary:
Previously we only used locktimeout for configuring the connection's innodb lock
timeout. Let's also use it for the actual GET_LOCK() call.

When combined with the pushrebase extension, there can be situations where
someone else is doing a large push, and we don't want to fail the local client's
push just because it's taking a while. So this config gives us the ability to
configure how long the push will wait before giving up.

Test Plan: Ran the tests

Reviewers: #sourcecontrol, ericsumner

Reviewed By: #sourcecontrol, ericsumner

Differential Revision: https://phabricator.fb.com/D2419968
2015-09-07 18:49:31 -07:00
Durham Goode
853592d228 Add retry to mysql connection
Summary:
We've been seeing the occasional spurious connection issues. Let's do some basic
retrying when connecting to mysql.

Test Plan: It's probably fine!

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D2419942
2015-09-07 18:18:37 -07:00
Durham Goode
e26a7089e1 Log sql lock hold times to blackbox
To help us debug push perf issues, let's record how long the sql lock was held
for each write operation on the server.
2015-09-07 17:39:01 -07:00
Durham Goode
679ef89c06 Enable batch inserts
Summary:
Previously we did a mysql insert for every revision that was being inserted.
This involves a lot of round trips and slows the push down dramatically. Let's
batch up inserts instead.

Test Plan:
Ran the tests and already deployed it to a hg server and pushed to
it.

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D2396633
2015-08-28 20:10:02 -07:00
Durham Goode
ff04a6d5e2 Fetch verification data in batches
Summary:
I was seeing errors where mysql timed about because it was fetching too much
data during the rev validation stage. So let's change it to verify the revs in
batches.

Test Plan:
Tests pass, I also deployed it to a hg server and pushed changes to
it.

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D2396632
2015-08-28 18:14:19 -07:00
Durham Goode
3a211d0cce Don't eat exceptions during sqlunlock
OperationalError exceptions were being hidden by "MySQL server has gone away"
exceptions caused by the unlock. We already accounted for certain kinds of
exceptions, so we just need to add the other type to the list.

Tested it by hotfixing a server.
2015-08-27 21:27:53 -07:00
Durham Goode
41ea00a0cc Handle exceptions on the background fetch thread
Summary:
We've seen some hg processes hang on the Mercurial servers because
mysql throws an exception on the background thread and the foreground thread
just waits forever.

This patch catches background exceptions and sends them to the foreground for
throwing.

Test Plan:
I'm not sure how to cause an exception via test, but I manually
verified it works by adding a 'raise Exception(...)' to the background thread
and running the tests and verifying the exception showed up.

Reviewers: #sourcecontrol

Differential Revision: https://phabricator.fb.com/D2337535
2015-08-12 11:10:25 -07:00
Durham Goode
1782ee1a10 Update revlog.addgroup to match upstream
Upstream added a new addrevisioncb parameter to addgroup. This just updates our
code to use it.
2015-07-22 15:50:55 -07:00
Ryan McElroy
cf0e8435a3 Use FORCE INDEX to prevent full table scans
Summary:
We suspect that these delete statements sometimes cause full table
scans which lock up the table and cause lock wait timeout errors from other
repositories. By forcing the index, we should prevent the scan and prevent
these errors from happening, increasing hgsql throughput.

Test Plan: ./run-tests.py

Reviewers: #sourcecontrol, durham, ebergen

Reviewed By: ebergen

Subscribers: durham

Differential Revision: https://phabricator.fb.com/D2235417

Tasks: 7462484

Signature: t1:2235417:1436579460:b0d0193db4c2d799c8263fb2225b70b8762eb06f
2015-07-10 18:39:01 -07:00
Durham Goode
98db3044fd Remove use of sopener
Upstream has removed repo.sopener, so switch it.  Also fix a flakey test.
2015-07-08 21:23:54 -07:00
Durham Goode
235a2d4a1a hooks: disable all pretxnclose hooks during sync
Summary:
pretxnclose hooks were incorrectly running during the sql sync. Since the
commits have already been committed (on another machine), there's no need to run
the hooks, so let's disable them.

Test Plan: Added a test

Reviewers: pyd, lcharignon, akushner, rmcelroy

Differential Revision: https://phabricator.fb.com/D2173886
2015-06-19 12:45:16 -07:00
Durham Goode
e12f55e798 hgssh: override readonly hook in hgssh
Summary:
The hg-ssh wrapper prevents all transactions from running, including the one we
use to sync our repositories from hgsql. This patch updates hgsql to override
that hook temporarily during syncs.

This wasn't an issue before because upstream mercurial used to hook on
prechangegroup which hgsql did not fire.

Test Plan: Added a test. Verified it failed before, and passes now.

Reviewers: pyd, lcharignon, rmcelroy, sid0

Differential Revision: https://phabricator.fb.com/D2148580
2015-06-11 14:37:03 -07:00
Ryan McElroy
ef88578b6b hgsql: always commit before refreshing references
Summary:
This can't hurt anything, and it makes sure we have a fresh
transaction for the references which have been timing out, strangely.

Test Plan: run_tests.py

Reviewers: durham

Reviewed By: durham

Subscribers: durham

Differential Revision: https://phabricator.fb.com/D2137954

Signature: t1:2137954:1433811692:1234cde23efd2fa8a95ad87453390aa8edeb335d
2015-06-08 17:35:09 -07:00