mirror of
https://github.com/jlfwong/speedscope.git
synced 2024-12-01 16:16:44 +03:00
741fdeb427
This attempts to improve the quality of the on-CPU profiles stackprof provides. Rather than weighing samples by their timestamp deltas, which, in our opinion, are only valid in wall-clock mode, this weighs callchains by:

```
S = number of samples
P = sample period in nanoseconds
W = S * P
```

The difference after this change is quite substantial, especially in profiles that previously showed heavy IO frames:

* Total profile weight is down by almost 90%, which makes sense for an on-CPU profile if the app is relatively idle
* Certain callchains that blocked in syscalls / IO now carry much lower weight. This was what I was expecting to find.

Here is an example of the latter point. In delta mode, we see an IO select taking a long time, and it is a significant portion of the profile:

<img width="1100" alt="236936508-709bee01-d616-4246-ba74-ab004331dcd3" src="https://github.com/dalehamel/speedscope/assets/4398256/39140f1e-50a9-4f33-8a61-ec98b6273fd4">

But in period scaling mode, it ultimately amounts to only a couple of sample periods:

<img width="206" alt="236936693-9d44304e-a1c2-4906-b3c8-50e19e6f9f27" src="https://github.com/dalehamel/speedscope/assets/4398256/7d19077f-ef25-4d79-980b-cfa1775d928d">

||
---|---|---|
.. | ||
cpp | ||
go | ||
javascript | ||
ruby |
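
The period-scaled weighting described in the commit message (`W = S * P`) can be illustrated with a minimal sketch. This is not the importer code from the commit: the `Callchain` type, the `weighCallchains` function, the `;`-joined key, and the 1 ms sample period are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical shape: one entry per sample, holding the frame names of its
// callchain (root first). This is NOT stackprof's real output format.
type Callchain = string[];

// Weigh each distinct callchain by W = S * P, where S is the number of
// samples that hit the callchain and P is the sample period in nanoseconds.
function weighCallchains(
  samples: Callchain[],
  periodNs: number
): Map<string, number> {
  // Count how many samples landed on each distinct callchain (S).
  const counts = new Map<string, number>();
  samples.forEach((chain) => {
    const key = chain.join(";");
    counts.set(key, (counts.get(key) || 0) + 1);
  });

  // Scale each count by the sample period (P) to get the weight (W).
  const weights = new Map<string, number>();
  counts.forEach((s, key) => {
    weights.set(key, s * periodNs);
  });
  return weights;
}

// Assumed example data: three samples taken at a 1 ms (1,000,000 ns) period.
const samples: Callchain[] = [
  ["main", "work"],
  ["main", "work"],
  ["main", "IO.select"],
];
const weights = weighCallchains(samples, 1000000);
```

Under this scheme a callchain that blocks in IO gets weight only for the samples actually taken while it was on the stack, regardless of how much wall-clock time passed between those samples, which is consistent with the drop in weight for IO-heavy frames described above.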