A Haskell library that simplifies access to remote data, such as databases or web-based services.
Optimize Haxl Monad (Andrew Farmer, commit 9d5db2ae63, 2016-06-04)
Summary:
This diff does two things:

1. Claws back performance lost to lightweight profiling, and then some.
The Haxl monad with lightweight profiling is now faster than it was
before lightweight profiling was added.

par1 and tree are ~20% faster.
seqr is ~10% faster.
par2 and seql are unchanged.

2. Eliminate redundant constraints on some exported functions.
Wherever the types of exported functions changed, they became less
constrained, with no loss of functionality. Notably, the *WithShow
functions no longer require pointless Show constraints.

Now the gory details:

Monadbench on master (before lightweight profiling):

  par1
  10000 reqs: 0.01s
  100000 reqs: 0.11s
  1000000 reqs: 1.10s
  par2
  10000 reqs: 0.02s
  100000 reqs: 0.41s
  500000 reqs: 2.02s
  seql
  10000 reqs: 0.04s
  100000 reqs: 0.50s
  500000 reqs: 2.65s
  seqr
  200000 reqs: 0.02s
  2000000 reqs: 0.19s
  20000000 reqs: 1.92s
  tree
  17 reqs: 0.48s
  18 reqs: 0.99s
  19 reqs: 2.04s

After D3316018, par1 and tree got faster (surprise win), but par2 got worse, and seql got much worse:

  par1
  10000 reqs: 0.01s
  100000 reqs: 0.08s
  1000000 reqs: 0.91s
  par2
  10000 reqs: 0.03s
  100000 reqs: 0.42s
  500000 reqs: 2.29s
  seql
  10000 reqs: 0.04s
  100000 reqs: 0.61s
  500000 reqs: 3.89s
  seqr
  200000 reqs: 0.02s
  2000000 reqs: 0.19s
  20000000 reqs: 1.83s
  tree
  17 reqs: 0.39s
  18 reqs: 0.77s
  19 reqs: 1.58s

I looked at the Core (-ddump-prep) for the Monad module.
The main observation is that GHC is really bad at optimizing the 'Request r a' constraint, because it is a constraint tuple.

To see why:

  f :: Request r a => ...
  f = ... g ... h ...

  g :: Show (r a) => ...
  h :: Request r a => ...

GHC will end up with something like:

  f $dRequest =
    let $dShow = case $dRequest of ... in
    let $dEq = case $dRequest of ... in
    ... etc for Typeable, Hashable, and the other Show ...
    let g' = g $dShow ... in
    let req_tup = ($dShow, $dEq, ... etc ...) in
    h req_tup ...

That is, it unboxes each of the underlying dictionaries lazily, even though it only needs the single Show dictionary.
It then reboxes them all in order to call 'h', meaning none of the unboxed ones are dead code.
I couldn't figure out how to get it to do the sane thing (unbox the one it needs and pass the original dictionary onwards).
We should investigate improving the optimizer.
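To make this concrete, here is a standalone sketch with the same constraint-tuple shape (hypothetical module, not the Haxl source); compiling it with -O -ddump-prep should show the same unbox-then-rebox pattern:

  {-# LANGUAGE ConstraintKinds, FlexibleContexts #-}
  module Repro where

  import Data.Hashable (Hashable, hash)
  import Data.Typeable (Typeable)

  -- A constraint tuple with the same shape as Haxl's 'Request r a'.
  type Request r a =
    (Eq (r a), Hashable (r a), Typeable (r a), Show (r a), Show a)

  g :: Show (r a) => r a -> String
  g = show

  h :: Request r a => r a -> Int
  h req = hash req + length (show req)

  -- 'f' only needs the Show piece for 'g', but has to pass the whole
  -- tuple dictionary on to 'h': GHC selects the superclass dictionaries
  -- it needs and then rebuilds the tuple for the call to 'h'.
  f :: Request r a => r a -> (String, Int)
  f req = (g req, h req)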

To avoid the problem, I tightened up the constraints in several places to be only what is necessary (instead of all of Request).

Notably:

  • Removed the Request constraint from ShowReq, as it was completely unnecessary.
  • The *WithShow variants no longer take Show constraints at all; taking them seemed to violate their purpose.
  • The crucial *WithInsert functions take only the bare constraints they need, avoiding the reboxing.
  • Since the *WithInsert functions are used by the *WithShow ones, I had to explicitly pass a show function in places. See Note [showFn] for an explanation, and the sketch below.
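For reference, the shape of the change as a sketch with stand-in names (these are not the actual Haxl.Core definitions):

  {-# LANGUAGE ConstraintKinds #-}
  module ShowFn where

  import Data.Hashable (Hashable, hash)

  type Request r a = (Eq (r a), Hashable (r a), Show (r a), Show a)

  -- Worker in the style of *WithInsert: only the constraint it actually
  -- uses, plus an explicit show function, so the Request tuple never has
  -- to be reboxed on the hot path.
  fetchWithInsert :: Hashable (r a)
                  => (r a -> String)     -- showFn, passed explicitly
                  -> r a
                  -> String
  fetchWithInsert showFn req = showFn req ++ " #" ++ show (hash req)

  -- Fully constrained entry point: supplies 'show' as the showFn itself.
  fetch :: Request r a => r a -> String
  fetch = fetchWithInsert show

  -- *WithShow-style variant: no Show constraint at all; the caller
  -- provides the show function.
  fetchWithShow :: Hashable (r a) => (r a -> String) -> r a -> String
  fetchWithShow = fetchWithInsert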

This gave us back quite a bit on seql, and a bit on seqr:

  par1
  10000 reqs: 0.01s
  100000 reqs: 0.08s
  1000000 reqs: 0.90s
  par2
  10000 reqs: 0.02s
  100000 reqs: 0.36s
  500000 reqs: 2.18s
  seql
  10000 reqs: 0.04s
  100000 reqs: 0.55s
  500000 reqs: 3.00s
  seqr
  200000 reqs: 0.02s
  2000000 reqs: 0.18s
  20000000 reqs: 1.73s
  tree
  17 reqs: 0.39s
  18 reqs: 0.79s
  19 reqs: 1.54s

Finally, addProfileFetch was getting inlined into dataFetchWithInsert.
This caused some let-bound values to be floated out and allocated before the flag test.
Adding a NOINLINE pragma prevented this, giving about a 10% speedup on par2 and seql.
The constraint work above is what made this possible: without it, the call to
addProfileFetch would have created the reboxing issue where it didn't exist before.
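Roughly, the pattern is the one sketched below (the names and the flag test are illustrative, not the actual Haxl code):

  import Data.IORef (IORef, modifyIORef')

  -- If 'recordFetch' is inlined into its caller, GHC may float its
  -- let-bound bookkeeping out and allocate it before the flag test,
  -- paying for profiling even when it is switched off.
  {-# NOINLINE recordFetch #-}
  recordFetch :: IORef Int -> String -> IO ()
  recordFetch counter label = do
    let cost = length label      -- stands in for the profiling bookkeeping
    modifyIORef' counter (+ cost)

  fetchStep :: Bool -> IORef Int -> String -> IO ()
  fetchStep profilingOn counter label
    | profilingOn = recordFetch counter label   -- only pay when profiling
    | otherwise   = return ()

With all of the above, the final numbers: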

  par1
  10000 reqs: 0.01s
  100000 reqs: 0.08s
  1000000 reqs: 0.89s
  par2
  10000 reqs: 0.02s
  100000 reqs: 0.35s
  500000 reqs: 1.98s
  seql
  10000 reqs: 0.04s
  100000 reqs: 0.53s
  500000 reqs: 2.72s
  seqr
  200000 reqs: 0.02s
  2000000 reqs: 0.17s
  20000000 reqs: 1.67s
  tree
  17 reqs: 0.39s
  18 reqs: 0.82s
  19 reqs: 1.65s

Reviewed By: simonmar

Differential Revision: D3378141

fbshipit-source-id: 4b9dbe0c347f924805a7ed4c526c4e7c9aeef077

[Haxl logo]

Haxl

Haxl is a Haskell library that simplifies access to remote data, such as databases or web-based services. Haxl can automatically

  • batch multiple requests to the same data source,
  • request data from multiple data sources concurrently,
  • cache previous requests.

Having all this handled for you behind the scenes means that your data-fetching code can be much cleaner and clearer than it would otherwise be if it had to worry about optimizing data-fetching. We'll give some examples of how this works in the pages linked below.
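For example, with hypothetical wrapper functions getFriendIds and getName (thin layers over dataFetch for a data source defined elsewhere), Haxl can batch all the name lookups into as few fetches as possible and cache any repeated requests:

  import Haxl.Core (GenHaxl)

  type UserId = Int
  type Name   = String

  -- Assumed wrappers around 'dataFetch' for some user-defined data source.
  getFriendIds :: UserId -> GenHaxl u [UserId]
  getFriendIds = error "wrapper over dataFetch for a real data source"

  getName :: UserId -> GenHaxl u Name
  getName = error "wrapper over dataFetch for a real data source"

  -- The name lookups are independent, so Haxl can batch them together
  -- and serve duplicates from its cache.
  friendNames :: UserId -> GenHaxl u [Name]
  friendNames uid = do
    ids <- getFriendIds uid
    mapM getName ids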

There are two Haskell packages here:

  • haxl: The core Haxl framework
  • haxl-facebook (in example/facebook): An (incomplete) example data source for accessing the Facebook Graph API

To use Haxl in your own application, you will likely need to build one or more data sources: the thin layer between Haxl and the data that you want to fetch, be it a database, a web API, a cloud service, or whatever. The haxl-facebook package shows how we might build a Haxl data source based on the existing fb package for talking to the Facebook Graph API.
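As a very rough sketch of the first step, here is a hypothetical request type together with its dataFetch wrappers; the DataSource instance (and related boilerplate) that does the actual fetching is omitted, and the haxl-facebook package shows what that looks like in full:

  {-# LANGUAGE GADTs, StandaloneDeriving, FlexibleContexts #-}

  import Data.Hashable (Hashable (hashWithSalt))
  import Haxl.Core (DataSource, GenHaxl, dataFetch)

  -- A hypothetical request type for a small user-info service.
  data UserReq a where
    GetUserName :: Int -> UserReq String
    GetUserAge  :: Int -> UserReq Int

  deriving instance Eq   (UserReq a)
  deriving instance Show (UserReq a)

  instance Hashable (UserReq a) where
    hashWithSalt s (GetUserName uid) = hashWithSalt s (0 :: Int, uid)
    hashWithSalt s (GetUserAge  uid) = hashWithSalt s (1 :: Int, uid)

  -- User-facing wrappers; they become runnable once a DataSource
  -- instance for UserReq (the part that talks to the backend) exists.
  getUserName :: DataSource u UserReq => Int -> GenHaxl u String
  getUserName = dataFetch . GetUserName

  getUserAge :: DataSource u UserReq => Int -> GenHaxl u Int
  getUserAge = dataFetch . GetUserAge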

Where to go next?