sapling/.hgignore
Jun Wu f1be725a71 traceprof: new extension to perform accurate profiling
Summary:
Instead of sampling periodically like the default profiler `statprof`, trace
every function call so the report is accurate. The output is designed to be
similar to `statprof`.

Written in C++ and inline assembly to minimize overhead. Cython is used as a
glue layer.

Comparing to `lsprof`, the overhead is similarly small. The major difference is
this profiler cares about "frame object" and records `frame->f_back` (parent
frame) so it could draw a statprof-like tree of the callgraph. `lsprof` records
"code object" instead and could have difficulty to deal with recursive calls.

The following code could demostrate the difference:

```
def fib(x):
    if x < 2:
        return x
    return fib(x - 1) + fib(x - 2)

# lsprof - recorded 1 code object for fib and no way to figure out callgraph
import _lsprof
p = _lsprof.Profiler()
p.enable(subcalls=True)
fib(5)
p.disable()
for s in p.getstats():
    print(s)

'''output:
_lsprof.profiler_entry(code="<method 'disable' of '_lsprof.Profiler' objects>", callcount=1, reccallcount=0, totaltime=0.0, inlinetime=0.0, calls=None)
_lsprof.profiler_entry(code=<code object fib at 0x7f27ca587530, file "a.py", line 1>, callcount=15, reccallcount=14, totaltime=8e-06, inlinetime=8e-06, calls=[_lsprof.profiler_subentry(code=<code object fib at 0x7f27ca587530, file "a.py", line 1>, callcount=14, reccallcount=12, totaltime=6e-06, inlinetime=6e-06)])
'''

# traceprof (with timethreshold set to 0) - callgraph drawn correctly
import sys
from hgext3rd import traceprof
with traceprof.profile(None, sys.stderr):
    fib(5)

'''output:
    | <module>                                  a.py:1
  0  \ fib                                      a.py:1
  0    \ fib                                    a.py:1
  0      \ fib                                  a.py:1
  0        \ fib                                a.py:1
  0         | fib                               a.py:1
  0        \ fib                                a.py:1
  0      \ fib                                  a.py:1
  0       | fib                                 a.py:1
  0    \ fib                                    a.py:1
  0     | fib                                   a.py:1
     \ __exit__                                 contextlib.py:21
'''
```

There are some performance / correctness tradeoffs. The current implementation
takes `frame` address and several (but not theoretically complete) fields to
hash a frame.  That could in theory lead to inaccurate frame information being
recorded. But hashing all fields recursively incurs significant overhead. The
related logic is the `hashframe` function, which could be tweaked later.

This is an initial version. It probably does not cover all corner cases like
exception handling well. And it may duplicate a same frame multiple times
because of the way it hashes a frame. The duplicated frame issue could be
solved by adding post-profiling filtering though.

Here is a real-world example of `hg log -r .` on `hg-committed`:

```
      | <module>                                hg:10
      | run                                     dispatch.py:81
      | dispatch                                dispatch.py:132
      | _runcatch                               dispatch.py:202
      | _callcatch                              dispatch.py:309
      | callcatch                               scmutil.py:133
      | _runcatchfunc                           dispatch.py:215
      | _dispatch                               dispatch.py:744
    2  \ repository                             hg.py:166
    2   | _peerorrepo                           hg.py:151
    2   | instance                              localrepo.py:2025
    2   | __init__                              localrepo.py:275
  863  \ runcommand                             dispatch.py:656
  863   | _runcommand                           dispatch.py:926
  863   | <lambda>                              dispatch.py:918
  863   | check                                 util.py:1059
  863   | log                                   commands.py:3262
  480    \ getlogrevs                           cmdutil.py:2236
  479     | _logrevs                            cmdutil.py:2190
  479     | revrange                            scmutil.py:429
  479     | anyrevs                             localrepo.py:622
  478     | mfunc                               revset.py:2275
  478     | __init__ (4 times)                  smartset.py:1033
  478     | __init__ (4 times)                  smartset.py:919
  478     | __len__ (4 times)                   localrepo.py:587
  478     | changelog                           repoview.py:309
  477     | filterrevs (11 times)               repoview.py:260
  477     | computehidden                       repoview.py:171
  455      \ hideablerevs                       repoview.py:24
  455       | getrevs                           obsolete.py:1114
  455       | _computeobsoleteset               obsolete.py:1143
    3        \ __get__ (2 times)                localrepo.py:76
    7        \ node (5760 times)                changelog.py:359
    4         | node (8938 times)               revlog.py:482
    9        \ __get__ (5760 times)             localrepo.py:76
  428        \ __get__                          util.py:798
  428         | successors                      obsolete.py:672
  225          \ __get__                        util.py:798
  224           | _all                          obsolete.py:662
    6            \ tryread                      vfs.py:32
    6             | read                        vfs.py:60
  200            \ _readmarkers                 obsolete.py:442
* 199             | _fm1readmarkers             obsolete.py:430
   15            \ _checkinvalidmarkers         obsolete.py:523
* 203          \ _addsuccessors                 obsolete.py:504
    3      \ tryreadcache                       repoview.py:166
    3       | cachehash                         repoview.py:98
    3       | heads                             localrepo.py:1884
   17      \ <genexpr> (5225 times)             repoview.py:191
   15       | __contains__ (5226 times)         ancestor.py:334
    5       | parentrevs (4010 times)           changelog.py:371
    2       | parentrevs (4332 times)           revlog.py:479
    9    \ pager                                ui.py:843
    9     | _runpager                           ui.py:906
    8     | __init__                            subprocess.py:330
    8     | _execute_child                      subprocess.py:880
    5     | _eintr_retry_call                   subprocess.py:118
  374    \ show                                 cmdutil.py:1308
  374     | _show                               cmdutil.py:1316
  365      \ _changesetlabels                   cmdutil.py:1266
  365       | troubled                          context.py:225
   12        \ unstable (2 times)               context.py:207
   12         | getrevs                         obsolete.py:1114
   12         | _computeunstableset             obsolete.py:1154
    3          \ set (322 times)                localrepo.py:610
    3            \ revs (3 times)               localrepo.py:593
    8            \ __getitem__ (963 times)      localrepo.py:566
    4          \ parents (321 times)            context.py:246
    4           | __get__ (321 times)           util.py:798
    4           | _parents (321 times)          context.py:562
    3          \ <genexpr> (642 times)          obsolete.py:1164
    3           | obsolete (321 times)          context.py:199
    2           | getrevs (321 times)           obsolete.py:1114
  319        \ bumped (2 times)                 context.py:211
  319         | getrevs                         obsolete.py:1114
  319         | _computebumpedset               obsolete.py:1181
    4          \ set (322 times)                localrepo.py:610
  312          \ allprecursors (2021 times)     obsolete.py:850
  308           | __get__                       util.py:798
  308           | precursors                    obsolete.py:678
* 308           | _addprecursors                obsolete.py:509
   34        \ divergent (2 times)              context.py:218
   34         | getrevs                         obsolete.py:1114
   33         | _computedivergentset            obsolete.py:1204
    4          \ set (322 times)                localrepo.py:610
   24          \ successorssets (1368 times)    obsolete.py:899
    3            \ __get__ (1368 times)         localrepo.py:76
    4            \ __contains__ (200 times)     localrepo.py:575
    9      \ names (2 times)                    namespaces.py:184
    9       | <lambda>                          namespaces.py:43
    9       | nodetags                          localrepo.py:741
    8       | __get__                           util.py:798
    8       | _tagscache                        localrepo.py:646
    8       | _findtags                         localrepo.py:685
    7       | findglobaltags                    tags.py:170
    7       | _readtagcache                     tags.py:370
    6       | filteredhash                      scmutil.py:307
       \ __exit__                               contextlib.py:21
        | maybeprofile                          profiling.py:199
        | __exit__                              contextlib.py:21
        | profile                               profiling.py:148
        | __exit__                              contextlib.py:21
Total time: 867 ms
```

Some example conclusions from reading the above output include:

  1. `_fm1readmarkers`, `_addsuccessors`, `_addprecursors` are slow
  2. `changelog.node` and `changelog.parentrevs` are called thousands of times,
     but they are just a few milliseconds.
  3. Reading the `obsstore` file (`vfs.read`) takes 5ms exactly

While `statprof` may get 1 right, `lsprof` may get 2 right, neither of them is
helpful for 3. `statprof` is not accurate enough to be confident about 5ms,
`lsprof` will include other `vfs.read` calls so it's impossible to see just the
read for obsstore.


Test Plan:
With the upstream [patch](https://patchwork.mercurial-scm.org/patch/20822/), run:

```
 HGPROF=traceprof hg id --config extensions.traceprof= --profile
```

Reviewers: #mercurial, rmcelroy

Reviewed By: rmcelroy

Subscribers: rmcelroy, mjpieters, jiangyuan

Differential Revision: https://phabricator.intern.facebook.com/D5104851

Signature: t1:5104851:1495480386:666847ea08bb6a94658bc10d7f0e91c225d56836
2017-05-22 15:49:45 -07:00

20 lines
271 B
Plaintext

syntax: regexp
\.pyc$
\.o$
\.so$
\.err$
^build/
^.*\.egg-info/
^packages/
^dist/
^tests/getdb\.sh
^tests/report\.json
.idea
.testtimes*
^hgext3rd/.*\.c$
^hgext3rd/traceprof\.c.*$
^cdatapack/cdatapack_dump$
subinclude:cfastmanifest/.hgignore
subinclude:linelog/.hgignore