sapling/lib/indexedlog
Jun Wu 40a88364be indexedlog: replace div with shr to make checksum faster
Summary:
Spot `div` slowness using Linux's `perf` tool.

        |    Disassembly of section .text:
        |
        |    0000000000018990 <indexedlog::checksum_table::ChecksumTable::check_range>:
        |    _ZN10indexedlog14checksum_table13ChecksumTable11check_range17h2303c96b1e035e20E():
   1.36 |      push   %rax
   0.18 |      mov    %rdx,%r8
   0.54 |      mov    $0x1,%cl
        |      test   %r8,%r8
        |      je     60
   0.54 |      add    %rsi,%r8
   0.72 |      cmp    0x30(%rdi),%r8
        |      ja     64
   0.27 |      mov    0x28(%rdi),%r9
   0.27 |      test   %r9,%r9
        |      je     6a
   0.36 |      add    $0xffffffffffffffff,%r8
   0.18 |      xor    %edx,%edx
   0.45 |      mov    %rsi,%rax
   0.36 |      div    %r9
  43.72 |      mov    %rax,%rsi
        |      xor    %edx,%edx
        |      mov    %r8,%rax
   0.18 |      div    %r9
  42.82 |      add    $0x1,%rax
   0.09 |      cmp    %rax,%rsi
        |      jae    60
   2.17 |      cmpq   $0x0,0x60(%rdi)
        |      je     78
        |      mov    0x50(%rdi),%rcx
        |      cmpb   $0x0,(%rcx)
   1.63 |      sete   %cl
   0.18 |      xchg   %ax,%ax
        |50:   test   $0x1,%cl
        |      je     64
   0.45 |      add    $0x1,%rsi
   0.81 |      mov    $0x1,%cl
   0.09 |      cmp    %rax,%rsi
        |      jb     50
        |60:   mov    %ecx,%eax
        |      pop    %rcx
   2.62 |      retq
        |64:   xor    %ecx,%ecx
        |      mov    %ecx,%eax
        |      pop    %rcx
        |      retq
        |6a:   lea    panic_loc.a.llvm.9800112514578621117,%rdi
        |      callq  core::panicking::panic
        |      ud2
        |78:   lea    panic_bounds_check_loc.7.llvm.9800112514578621117,%rdi
        |      xor    %esi,%esi
        |      xor    %edx,%edx
        |      callq  core::panicking::panic_bounds_check
        |      ud2

Change `chunk_size` to `chunk_size_log`. Replace `div` with `shr` to make it
significantly faster:

Before:

  index lookup (memory)           1.118 ms
  index lookup (disk, no verify)  2.078 ms
  index lookup (disk, verified)   7.687 ms

After:

  index lookup (memory)           1.066 ms
  index lookup (disk, no verify)  1.992 ms
  index lookup (disk, verified)   3.591 ms

Reviewed By: DurhamG, markbt

Differential Revision: D7554992

fbshipit-source-id: c24189ced722d880af6ca0d64967eb762363d9e3
2018-04-17 18:54:39 -07:00
..
benches indexedlog: use OpenOptions 2018-04-13 21:51:46 -07:00
src indexedlog: replace div with shr to make checksum faster 2018-04-17 18:54:39 -07:00
Cargo.toml indexedlog: separate benchmarks 2018-04-13 21:51:42 -07:00