Summary:
Pull Request resolved: https://github.com/facebook/Haxl/pull/129
Some datasources will throw exceptions that do not indicate a problem with the datasource itself. This can make the statistics difficult to use for tracking actual datasource problems versus problems with the way the datasource is used.
Here we allow the datasource to classify some failures to be ignored by the stats collection. They are not simply dropped, however: they are stored in a new field `fetchIgnoredFailures`.
Reviewed By: josefs
Differential Revision: D23475953
fbshipit-source-id: a35ee0fc44ae98db86ae56573f5e7462e0355709
Summary:
This unit test demonstrates 2 things:
1) It is possible for a Haxl computation to be interrupted in ways that block the scheduler indefinitely
2) Calling `sanitizeEnv` on the env from such a computation allows us to reuse it for future computations. If we don't do that, future computations can still block even without any exception thrown during the 2nd run.
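A minimal sketch of the reuse pattern described above (the computations `comp1`/`comp2` and the exact `sanitizeEnv` signature are assumptions; `sanitizeEnv` itself is the API this diff exercises):

```haskell
import Control.Exception (SomeException, try)
import Haxl.Core

-- If the first run was interrupted (e.g. by an async exception delivered
-- to the thread running runHaxl), the Env may still reference blocked
-- work.  sanitizeEnv clears that stale state so the same Env can be
-- reused; without it, the second runHaxl can block indefinitely.
rerun :: Env () () -> GenHaxl () () () -> GenHaxl () () b -> IO b
rerun env comp1 comp2 = do
  _ <- try (runHaxl env comp1) :: IO (Either SomeException ())
  env' <- sanitizeEnv env
  runHaxl env' comp2
```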
Reviewed By: DylanZA
Differential Revision: D22397981
fbshipit-source-id: 48dfca49ab3485693bc772ff346945779809d9e8
Summary:
Add profiling information about which datasources the scheduler was blocked on when waiting for completions that were scheduled with BackgroundFetch.
This will not give any information on SyncFetch/AsyncFetch fetches, but it is easier to reason about those using the datasource stats.
Reviewed By: watashi
Differential Revision: D21903376
fbshipit-source-id: 42df4c567619b7e2dd6ac6acc36bcdafa85dcbe7
Summary:
Pull Request resolved: https://github.com/facebook/Haxl/pull/120
This adds tracking of memos/fetches per label, by a unique id for each. Using this we can track exactly where time was spent, and where it was shared.
Reviewed By: simonmar
Differential Revision: D20792435
fbshipit-source-id: 55c1e778d313d103a910c6dd5be512f95125acce
Summary:
Pull Request resolved: https://github.com/facebook/Haxl/pull/119
Currently profiling with labels does not track full stacks.
This is problematic in the case where all top level methods end up calling a common method with a label, as we have no way of attributing to the top level methods.
E.g., if A/B/C all call X when running work, then we cannot tell which work is expensive, as we have no connection from the work through X back to A/B/C.
Reviewed By: simonmar
Differential Revision: D20384255
fbshipit-source-id: f9aa0462904c17dee32d37a659b491e8d252d6db
Summary:
Pull Request resolved: https://github.com/facebook/Haxl/pull/118
* add ability to set reporting flag for benchmarking
* add argument parser
* add a test for using lots of labels
Reviewed By: simonmar
Differential Revision: D20792436
fbshipit-source-id: c87bd8e996397cb2ef229cf59927530a4dac20df
Summary:
The standard semantics test only tests the easy case when there is no blocking involved.
Also, change `sync_test` so that the data fetches are not cached. That confused me quite a lot during a debugging session.
Reviewed By: simonmar
Differential Revision: D19142099
fbshipit-source-id: 89697dbb896a1696aa916e3fcf659bf6a031f076
Summary:
Pull Request resolved: https://github.com/facebook/Haxl/pull/116
Add backgroundFetch methods that run a batch in the background. The Seq method runs the batch sequentially; the Par method runs each request in the batch in parallel.
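The seq/par distinction can be sketched in plain `forkIO` terms (these helpers are illustrative stand-ins for the new wrappers, not the actual Haxl API):

```haskell
import Control.Concurrent (forkIO)
import Control.Monad (forM_, void)

-- Seq: one background thread walks the batch in order; the caller
-- regains control immediately, but requests run one after another.
backgroundSeq :: (r -> IO ()) -> [r] -> IO ()
backgroundSeq run batch = void $ forkIO $ forM_ batch run

-- Par: one background thread per request, so the requests overlap
-- with each other as well as with the caller.
backgroundPar :: (r -> IO ()) -> [r] -> IO ()
backgroundPar run batch = forM_ batch (void . forkIO . run)
```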
Reviewed By: simonmar
Differential Revision: D20029453
fbshipit-source-id: d66a7959dbe09468ff67981fc3adf51704925165
Summary:
Pull Request resolved: https://github.com/facebook/Haxl/pull/114
When processing batches it would be quite easy to double-count the allocations from child threads. This diff fixes it by setting the counter to zero after taking the current allocation count.
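The fix pattern can be illustrated with GHC's per-thread allocation counter from base (`takeAllocs` is an illustrative helper name, not Haxl's internal one):

```haskell
import Data.Int (Int64)
import GHC.Conc (getAllocationCounter, setAllocationCounter)

-- Take the thread's current allocation reading and zero the counter,
-- so that a later reading does not count the same allocations twice.
-- (The counter counts *down* from its set value as the thread allocates.)
takeAllocs :: IO Int64
takeAllocs = do
  n <- getAllocationCounter
  setAllocationCounter 0
  return n
```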
Reviewed By: malmerey
Differential Revision: D19580472
fbshipit-source-id: 4b9a97f75e82052f4c5d94e1a6762a862a907ffb
Summary:
Pull Request resolved: https://github.com/facebook/Haxl/pull/115
Add a new API that makes it convenient to call backgroundFetchAcquireRelease from Haskell without going through the C APIs.
Reviewed By: simonmar
Differential Revision: D19580473
fbshipit-source-id: 408f2a8c50381ddf98b35f946fbace2cd1194e55
Summary:
Pull Request resolved: https://github.com/facebook/Haxl/pull/112
Include which datasource and which types of fetches are in each fetch round, for tracing.
Reviewed By: simonmar
Differential Revision: D19554548
fbshipit-source-id: 747ea86ab355c9208bad1dcd938a0eec5b08dd72
Summary:
Pull Request resolved: https://github.com/facebook/Haxl/pull/111
Right now BackgroundFetch produces multiple FetchStats for the same batch, but it is not possible to link these together to get an idea of how big the batch was.
This introduces a field in FetchStats that can be used to link batches together, as well as a utility method to do this in a default manner.
Reviewed By: watashi
Differential Revision: D19469048
fbshipit-source-id: fce687c49ac4cbdc7cbd6804f37b6f120d7efad3
Summary:
Pull Request resolved: https://github.com/facebook/Haxl/pull/110
The test would fail under large concurrency as sometimes the result got processed before the injected exception.
Reviewed By: ahaym
Differential Revision: D19454121
fbshipit-source-id: eb2953c14c75c0233248a152aea027266f3e0e69
Summary:
Pull Request resolved: https://github.com/facebook/Haxl/pull/109
FutureFetch is unused (except for one test) and overall has not proven itself to be a useful fetch type. It adds a new waiting point (the others being BackgroundFetch and Async/Sync fetches), which can add latency: for example, if all three are dispatched in one round, how would the scheduler know ahead of time which one to wait on in order to make forward progress?
Reviewed By: simonmar
Differential Revision: D19410093
fbshipit-source-id: 40c900fbff9e06098acb2a21fc59b49adefadc5b
Summary:
Pull Request resolved: https://github.com/facebook/Haxl/pull/107
Add a simple mechanism (similar to asyncFetchAcquireRelease) to allow straightforward conversion of data sources from AsyncFetch to BackgroundFetch.
Reviewed By: simonmar
Differential Revision: D19272624
fbshipit-source-id: 3aec107de26fb59a4be3b2818b4f769f3404b15f
Summary: Switch to use `BasicHashTable` from `hashtables` package instead of `Data.HashMap.Strict` for `DataCache`
Reviewed By: simonmar
Differential Revision: D19219048
fbshipit-source-id: a2a8df0e715c202ca79a1f3a1c0dad133788e1eb
Summary: Bump the minimum GHC version required by Haxl to 8.2, which at this point is 2.5 years old but, more importantly, has many features that are really helpful in Haxl (such as the hs_try_put_mvar API function, which is really useful for BackgroundFetch)
Reviewed By: josefs
Differential Revision: D19327952
fbshipit-source-id: f635068fe9fb8f1d1f0d83ccbf9c3c04947183a0
Summary:
Pull Request resolved: https://github.com/facebook/Haxl/pull/105
`GenHaxl` doesn't implement the `MonadFail` class. This causes the ParallelTests to fail on GHC 8.6 and above.
This diff shuffles the code so we don't need the `MonadFail` class.
Reviewed By: awalterschulze
Differential Revision: D18749419
fbshipit-source-id: 98398dae9cb687076c7aaec62260dd21ca83ef3e
Summary:
Make `pAnd`/`pOr` symmetric in their shortcutting behavior. Before this diff they could only return early if the first argument returned `False`/`True` respectively. They can now also return early on the second argument.
The combinator `biselect` from selective applicative functors is also introduced and has a symmetric shortcutting behavior just like `pAnd` and `pOr`.
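A sketch of how `pOr` falls out of `biselect` (the `biselect` signature below is my reading of the description, not checked against the source; `pOr'` is a hypothetical name to avoid clashing with the real combinator):

```haskell
-- Assumed: biselect :: GenHaxl u w (Either a b)
--                   -> GenHaxl u w (Either a c)
--                   -> GenHaxl u w (Either a (b, c))
-- It returns Left a as soon as EITHER side yields a Left, whichever
-- completes first; only when both yield Rights does it pair them up.
-- Encoding True as Left () makes either True short-circuit:
pOr' :: GenHaxl u w Bool -> GenHaxl u w Bool -> GenHaxl u w Bool
pOr' x y = either (const True) (const False) <$>
    biselect (toE <$> x) (toE <$> y)
  where
    toE True  = Left ()
    toE False = Right ()
```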
Reviewed By: simonmar
Differential Revision: D17313992
fbshipit-source-id: edf4f0000d2e6146195107f362486197f78cc4df
Summary:
This diff provides a way to emit logs from a Haxl computation without having them memoized. This is a better alternative to doing arbitrary IO from Haxl.
Reviewed By: xich
Differential Revision: D17809398
fbshipit-source-id: 1cfe6fe24df09f12d352a18a1a78486b0a9976f8
Summary:
This test is flaky and failed in a recent Travis CI run:
https://travis-ci.org/facebook/Haxl/jobs/560169278
2ms is too small: a context switch or GC can affect the order of events.
Let's enlarge this to 100ms, which is similar to what we have in FullyAsyncTest.
Reviewed By: anubhav94N
Differential Revision: D16522892
fbshipit-source-id: d27dd0b185115fd2ab0df9496b0811066e731305
Summary:
We have defined this in tests and other libraries. Let's simply expose this functionality from `Haxl.Prelude`, as it can be useful in a few cases.
Reviewed By: anubhav94N
Differential Revision: D16522569
fbshipit-source-id: b35726e9ad172a36d76b755498fbb53d9a9db188
Summary:
Pull Request resolved: https://github.com/facebook/Haxl/pull/99
This adds unit tests to Haxl to make sure we are tracking the outgoing fetches correctly.
Reviewed By: simonmar
Differential Revision: D14683672
fbshipit-source-id: 49a318f0b8aa38c2af154fcbe0946122e70b9565
Summary:
Add benchmarks for sequential and parallel writes.
Ran the benchmark for 10^6 writes
parallel - 0.19s
seq - 0.07s
Running in parallel probably has more contention for the IORef
Reviewed By: simonmar
Differential Revision: D14386951
fbshipit-source-id: 164972e714eac14406fc106df073474d141e9ca0
Summary:
The most important thing I want to test is that when a memoized computation happens again, the writes are indeed duplicated in the environment. This diff adds tests for the different ways a memoized computation can happen in conjunction with non-memoized computation.
Reviewed By: simonmar
Differential Revision: D14386667
fbshipit-source-id: a03a9a41697def968bf6e11ad66b9dd9f3a9a7f1
Summary:
Expose a convenience wrapper `runHaxlWithWrites` which returns the writes along with the result of the `Haxl` computation.
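A usage sketch (the computation `myComp` is an assumption, and `tellWrite` and the exact return shape of `runHaxlWithWrites` are as I recall the writes API; `initEnv`/`stateEmpty` are the usual environment setup):

```haskell
import Haxl.Core

-- hypothetical computation that emits writes alongside its result
myComp :: GenHaxl () String Int
myComp = do
  tellWrite "starting"   -- collected as a write, not cached
  return 42

main :: IO ()
main = do
  env <- initEnv stateEmpty ()            -- no datasource state needed
  -- runHaxlWithWrites returns the result together with the writes
  -- accumulated while running the computation
  (r, writes) <- runHaxlWithWrites env myComp
  print r
  mapM_ putStrLn writes
```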
Reviewed By: simonmar
Differential Revision: D14386668
fbshipit-source-id: 95757916691f7b9b1291c7dceae7eafe8738cfca
Summary:
Here I try to populate the writes done as part of a Haxl computation in an IORef inside the Environment.
The `IVar`, which is the synchronisation point, also acts as the point where we store intermediate writes for Haxl computations, so they can be memoized and reused whenever a memoized computation is run again. This is done inside the `getIVarWithWrites` function.
This works, because we create new IVars when running a memo computation
or a data fetch, and it is only at these places where we need to create a new
environment with empty writes to run the computation in. So I run every memoized
computation in a new environment (with empty writes) and populate the writes in this new
environment. At the end of the memoized computation, I look up these writes from the `IVar`
and also add them to the original environment. This way ultimately all writes are correctly
propagated upwards to the top level environment user passes to `runHaxl`.
This logic lives inside `execMemoNow`.
Reviewed By: simonmar
Differential Revision: D14342181
fbshipit-source-id: a410dae1a477f27b480804b67b2212e7500997ab
Summary:
This diff removes the scuba field as described in the task, as well as removing numRounds from Stats. This involved removing the numRounds assertion from the expectRounds* functions, which I chose to rename to expectResult* (let me know if you prefer something different there).
Within Stats, I simply deleted the numRounds function. I didn't go looking for anything deeper to clean up, because it looked like `rs` was used in other functions.
Reviewed By: zilberstein
Differential Revision: D8963298
fbshipit-source-id: d367b53007be03bd290222c676539680acd9f929
Summary: I noticed this test was broken in `cabal test` recently.
Reviewed By: mic47
Differential Revision: D6857296
fbshipit-source-id: ca7d15ba841f1dc79acccf1cd4999e8fcea994c8
Summary:
This isn't pretty, but it's the least intrusive and most efficient way
I could find to do it.
The tricky part is that when doing multiple putResults in the same
child thread, we have to ensure the *last* one (and only the last one)
is putResultFromChildThread.
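A sketch of that rule (`MyReq` and `runOne` are hypothetical; `BlockedFetch`, `putResult`, and `putResultFromChildThread` are the real primitives):

```haskell
import Control.Concurrent (forkIO)
import Control.Exception (try)
import Control.Monad (forM_, void)
import Haxl.Core

-- hypothetical worker for a single request
runOne :: MyReq a -> IO a
runOne = undefined

fetchAll :: [BlockedFetch MyReq] -> IO ()
fetchAll []  = return ()
fetchAll bfs = void $ forkIO $ do
  -- every result except the last: ordinary putResult
  forM_ (init bfs) $ \(BlockedFetch req rvar) ->
    putResult rvar =<< try (runOne req)
  -- the *last* putResult from this child thread (and only the last)
  -- must be putResultFromChildThread, so the scheduler attributes the
  -- thread's allocations exactly once, when the thread is finished
  case last bfs of
    BlockedFetch req rvar ->
      putResultFromChildThread rvar =<< try (runOne req)
```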
Reviewed By: xich
Differential Revision: D6519631
fbshipit-source-id: 1c3c40f311031ac4cc8ed82daefcb7740b91541e
Summary:
Mainly aimed at
* lowering the barrier to entry by enabling some simple use cases
* providing more example code
* providing a basis for some examples in a future blog post
Reviewed By: zilberstein
Differential Revision: D6172870
fbshipit-source-id: 9493cec7ccd78c32b54cb24923f9c574e877c529
Summary:
Now that the Haxl 2 diff has landed, I wanted to take the opportunity to reorganise the codebase.
I split parts of `Haxl.Core.Types` out into
* `Haxl.Core.Flags`, the `Flags` type and functions
* `Haxl.Core.DataSource`: the `DataSource` class and related stuff
and I split the massive `Haxl.Core.Monad` module into smaller modules:
* the base `Haxl.Core.Monad` with the types and instances
* `Haxl.Core.Fetch`: data-fetching functionality
* `Haxl.Core.Run`: the scheduler, and `runHaxl`
* `Haxl.Core.Profile`: profiling
* `Haxl.Core.Parallel`: `pAnd` and `pOr`
* I've also moved all the memoization support into `Haxl.Core.Memo`.
This commit also fixes the build on GHC 7.8.x, 7.10.x, and 8.2.x; all the Travis builds are green again.
Closes https://github.com/facebook/Haxl/pull/79
Reviewed By: zilberstein
Differential Revision: D6030246
Pulled By: simonmar
fbshipit-source-id: 5a0dc708cf72f8ed0906f1e99000976dbfbc89e2
Summary:
The problem was that we could lose the correct Env if a continuation
got blocked and restarted.
Reviewed By: niteria
Differential Revision: D5985280
fbshipit-source-id: f8afdb9d4db38781b33a8bddde46c031a133dec1
Summary:
This is a complete reworking of the way that Haxl schedules I/O. The
main benefits are:
* Data fetches are no longer organised into rounds, but can be
arbitrarily overlapped with each other and with computation. The
scheduler supports an arbitrary queue of work items which it can
evaluate while data-fetching is taking place in the background. To
take advantage of this, data sources must implement a new form of
`PerformFetch`, namely `BackgroundFetch`. The old forms of
`PerformFetch` are still supported, but won't benefit from any
additional concurrency.
* It is now possible to specify on a per-data-source basis whether
fetching should be optimised for batching or for latency. A request
to a data source that doesn't benefit from batching can be submitted
immediately. This is done with the new `schedulerHint` method of
`DataSource`.
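A minimal shape for a data source opting into the new machinery (`MyReq` and `performMyReq` are hypothetical; `BackgroundFetch`, `schedulerHint`, and `SubmitImmediately` are the additions described above):

```haskell
import Control.Concurrent (forkIO)
import Control.Exception (try)
import Control.Monad (forM_, void)
import Haxl.Core

-- hypothetical worker for a single request
performMyReq :: MyReq a -> IO a
performMyReq = undefined

instance DataSource u MyReq where
  -- Hand the batch to background threads and return control to the
  -- scheduler immediately; each result is delivered as it completes.
  fetch _state _flags _env = BackgroundFetch $ \bfs ->
    forM_ bfs $ \(BlockedFetch req rvar) ->
      void $ forkIO $
        putResultFromChildThread rvar =<< try (performMyReq req)

  -- This source gains nothing from batching, so dispatch eagerly.
  schedulerHint _ = SubmitImmediately
```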
Reviewed By: niteria
Differential Revision: D4938005
fbshipit-source-id: 96f12ad05ee62d62474ee4cc1215f19d0a6fcdf3
Summary: This provides a way to override test file locations for other build systems.
Reviewed By: yfeldblum
Differential Revision: D5218507
fbshipit-source-id: 7087ca13afb105b727ecf3f7dfdaecd26d27ea81
Summary:
These two operators are subtly non-deterministic, but can potentially
improve performance significantly in cases where
* We don't want to use .|| because it's too sequential
* We don't want to use || because it forces us to do wasted work
(and equivalently for &&).
The implementation is a bit subtle, see Note [tricky pOr/pAnd]
Reviewed By: xich
Differential Revision: D4611809
fbshipit-source-id: 832ace29dfc44e48c14cc5d4f52a0114ee326c92
Summary:
I didn't look into this too deeply but I'm guessing it was because the
constant expression had been lifted out, so I made it not a constant
expression.
Reviewed By: JonCoens
Differential Revision: D4521430
fbshipit-source-id: 687075d8486b38743b3bd8b9a9f26aa198b2d258
Summary:
Labels which throw Haxl exceptions are recorded, but pure exceptions bubble up
and labels are lost. This test demonstrates this.
Reviewed By: simonmar
Differential Revision: D3672479
fbshipit-source-id: fab10878e7eb067e0c65bcf401d75604c333007f
Summary:
This revision generalizes the existing memoization framework to 1-ary and 2-ary
functions (namely functions of type (a -> GenHaxl u b) and (a -> b -> GenHaxl u c)).
For every supported arity (currently 0, 1, and 2), a family of functions {
newMemoWithX, prepareMemoX, and runMemoX } is provided. newMemo itself is
generic across all arities.
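A rough usage sketch under that naming scheme (`Id`, `FriendsOf`, and `friendsOf` are hypothetical, and the exact signatures of the arity-1 family are assumed from the description, not checked against the source):

```haskell
-- hypothetical 1-ary fetch; FriendsOf is a hypothetical request type
friendsOf :: Id -> GenHaxl u [Id]
friendsOf uid = dataFetch (FriendsOf uid)

-- Assumed shape: newMemoWith1 wraps a 1-ary function in a memo
-- reference, and runMemo1 applies it with per-argument memoization.
bothFriendLists :: Id -> Id -> GenHaxl u ([Id], [Id])
bothFriendLists a b = do
  memo <- newMemoWith1 friendsOf
  fa <- runMemo1 memo a
  fb <- runMemo1 memo b   -- a repeated argument would hit the memo
  return (fa, fb)
```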
Reviewed By: simonmar
Differential Revision: D3555791
fbshipit-source-id: 010a9889d42327607c8b03a5f7a609ee0c70de49
Summary:
This revision refactors cachedComputation to only contain logic relevant to
where the request-scope memo lives; memo creation and running logic is delegated
to newMemo(with) and runMemo.
Comments in cachedComputation have been moved over to newMemo/runMemo, and a
benchmark for cachedComputation has been added to monadbench. Surprisingly,
performance might have improved, albeit very slightly.
Reviewed By: simonmar
Differential Revision: D3514791
fbshipit-source-id: b2f0627824adc79b766e4f4e28c4af957ff00a00
Summary:
This diff adds the createMemo and updateMemo helper functions, which abstract
the memoization reference management logic of cachedComputation. This separates
the work of *how* a memoized computation is created/updated, from *where* the
memo reference lives, allowing the same code to be used to manage request-scope
and feature-scope memos simultaneously.
A refactor of cachedComputation to use this abstraction is forthcoming.
Reviewed By: simonmar
Differential Revision: D3492803
fbshipit-source-id: 9dadd3860d5bec3bf776eef7c1bd610c25283729