5 Commits

Author SHA1 Message Date
Jun Wu
616306543b codemod: use explicit versions in Cargo.toml
Summary:
This is done by running `fix-code.py`. Note that those strings are semver
requirements, so they do not pin down the exact version. An API-compatible
upgrade is still possible.
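
As an illustration of the point above, a bare version string in Cargo.toml is
treated by Cargo as a caret requirement. The crate name and version below are
hypothetical, not taken from this change:

```toml
[dependencies]
# Hypothetical crate, for illustration only.
# "1.2.3" means ">=1.2.3, <2.0.0", so a compatible upgrade can still be picked.
example-crate = "1.2.3"
# Prefixing the version with "=" would pin it exactly instead:
# example-crate = "=1.2.3"
```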

Reviewed By: ikostia

Differential Revision: D10213073

fbshipit-source-id: 82f90766fb7e02cdeb6615ae3cb7212d928ed48d
2018-11-15 18:54:06 -08:00
Jun Wu
3adc813687 codemod: add copyright headers
Summary: This is just the result of running `./contrib/fix-code.py $(hg files .)`

Reviewed By: ikostia

Differential Revision: D10213075

fbshipit-source-id: 88577c9b9588a5b44fcf1fe6f0082815dfeb363a
2018-10-26 15:09:12 -07:00
Jun Wu
7d346e6bc2 ignore: support global gitignore configs
Summary:
Change the Rust ignore matcher to accept an extra list of gitignore files.
Parse "git:" entries of "ui.ignore" as gitignore files.
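
A minimal sketch of how such entries might be separated, assuming the config
values arrive as plain strings. The function name `split_ignore_entries` and
the example paths are hypothetical and not taken from the actual change:

```rust
/// Splits hypothetical `ui.ignore` values into gitignore files (entries
/// prefixed with "git:") and all remaining entries.
fn split_ignore_entries<'a>(values: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
    let mut git_files = Vec::new();
    let mut other = Vec::new();
    for &value in values {
        match value.strip_prefix("git:") {
            // "git:<path>" names a file to be parsed with gitignore syntax.
            Some(path) => git_files.push(path),
            // Anything else is left for the existing ignore handling.
            None => other.push(value),
        }
    }
    (git_files, other)
}
```

The first returned list would be the "extra list of gitignore files" handed to
the matcher.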

Reviewed By: DurhamG

Differential Revision: D8863905

fbshipit-source-id: 0cd5e29e01f01496ff61c81b89f7876202f18a98
2018-08-02 20:22:47 -07:00
Jun Wu
3ffa0f28e2 gitignore: avoid quadratic behavior
Summary:
The correct gitignore matcher needs O(N^2) gitignore tests to check a path that
is N directories deep. For example, to check "a/b/c/d", it needs to check:

  - Whether .gitignore matches a/b/c/d
  - Whether a/.gitignore matches b/c/d
  - Whether a/b/.gitignore matches c/d
  - Whether a/b/c/.gitignore matches d

  - Whether .gitignore matches a/b/c
  - Whether a/.gitignore matches b/c
  - Whether a/b/.gitignore matches c

  - Whether .gitignore matches a/b
  - Whether a/.gitignore matches b

  - Whether .gitignore matches a

It might not look that bad because N=4 for the above example. But when N is
larger (ex. node_modules/../node_modules/../node_modules/..), things get much
worse.
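
The list above can be reproduced with a small sketch. This is only an
illustration of the counting argument; `naive_checks` is a hypothetical helper,
not code from this patch:

```rust
/// Lists every (gitignore file, relative path) pair that the uncached
/// approach described above would test for `path`.
fn naive_checks(path: &str) -> Vec<(String, String)> {
    let parts: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    let mut checks = Vec::new();
    // Test the full path first, then each of its ancestor directories.
    for end in (1..=parts.len()).rev() {
        // Every directory above the target may carry its own .gitignore.
        for start in 0..end {
            let gitignore = if start == 0 {
                ".gitignore".to_string()
            } else {
                format!("{}/.gitignore", parts[..start].join("/"))
            };
            checks.push((gitignore, parts[start..end].join("/")));
        }
    }
    checks
}
```

For a path with N components this produces N + (N-1) + ... + 1 = N(N+1)/2
pairs, which is 10 for "a/b/c/d".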

This patch adds "caching" of whether a directory is ignored or not. For
example, if "a/b/" is ignored, the new code skips checking subdirectories
(ex. "a/b/c/"). The time complexity is now roughly O(N) gitignore tests instead
of O(N^2), since each parent directory of the path being tested is checked
against gitignore only once, and the result is then cached as a boolean
value.

To be clear, the first time a path that is not ignored is checked, it still
needs O(N^2) to initialize the trees. But once they are initialized, the next
check of a file in the same directory will be O(N).
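
A rough sketch of the per-directory boolean cache described above. The type
`CachedMatcher` and the closure standing in for the real gitignore test are
assumptions made for illustration, not the structure used by the patch:

```rust
use std::collections::HashMap;

/// Hypothetical matcher that remembers, per directory, whether it is ignored.
struct CachedMatcher<F: Fn(&str) -> bool> {
    /// Stand-in for the real gitignore test of a single directory (assumed).
    dir_rule: F,
    /// Directory path -> ignored or not.
    cache: HashMap<String, bool>,
    /// Number of times `dir_rule` was actually invoked.
    rule_calls: usize,
}

impl<F: Fn(&str) -> bool> CachedMatcher<F> {
    fn new(dir_rule: F) -> Self {
        CachedMatcher { dir_rule, cache: HashMap::new(), rule_calls: 0 }
    }

    /// Returns true if `dir` or any of its ancestors is ignored.
    fn is_dir_ignored(&mut self, dir: &str) -> bool {
        if let Some(&cached) = self.cache.get(dir) {
            return cached;
        }
        // An ignored parent makes everything below it ignored, so the rule
        // for `dir` itself is only consulted when the parent is not ignored.
        let parent_ignored = match dir.rfind('/') {
            Some(idx) => self.is_dir_ignored(&dir[..idx]),
            None => false,
        };
        let ignored = parent_ignored || {
            self.rule_calls += 1;
            (self.dir_rule)(dir)
        };
        self.cache.insert(dir.to_string(), ignored);
        ignored
    }
}
```

With a rule that ignores only "a/b", asking about "a/b/c" runs the rule for "a"
and "a/b" and never for "a/b/c"; a later query for "a/b/c/d" is answered from
the cached parent result without running the rule again.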

`LruCache` is replaced by `HashMap` since `LruCache` does not support `.get`
and the code needs that to work.

The perf issue was previously documented as a "PERF" comment.
This diff removes it.

Reviewed By: DurhamG

Differential Revision: D7496058

fbshipit-source-id: f10895b8f0d7dcdde6faf9daeec5cd78a1f15a2b
2018-04-13 21:51:48 -07:00
Jun Wu
283b8d130d pathmatcher: initial Rust matcher that handles gitignore lazily
Summary:
The "pathmatcher" crate is intended to eventually cover more "matcher"
abilities so all Python "matcher" related logic can be handled by Rust.
For now, it only contains a gitignore matcher.

The gitignore matcher is designed to work in a repo (no need to create
multiple gitignore matchers for a repo from a higher layer), and to be lazy,
i.e. tree-aware: it does not parse ".gitignore" unless necessary.
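
The laziness could look roughly like the following sketch, in which a
directory's ".gitignore" is parsed only the first time that directory is
queried. `LazyRules` and its loader closure are hypothetical and do not
reflect the crate's actual types:

```rust
use std::collections::HashMap;

/// Hypothetical lazy store of per-directory gitignore rules.
struct LazyRules<L: Fn(&str) -> Vec<String>> {
    /// Stand-in for reading and parsing "<dir>/.gitignore" (assumed).
    load: L,
    /// Rules already parsed, keyed by directory.
    parsed: HashMap<String, Vec<String>>,
    /// Number of times a ".gitignore" was actually loaded.
    loads: usize,
}

impl<L: Fn(&str) -> Vec<String>> LazyRules<L> {
    fn new(load: L) -> Self {
        LazyRules { load, parsed: HashMap::new(), loads: 0 }
    }

    /// Returns the rules for `dir`, loading them on first use only.
    fn rules_for(&mut self, dir: &str) -> &[String] {
        if !self.parsed.contains_key(dir) {
            self.loads += 1;
            let rules = (self.load)(dir);
            self.parsed.insert(dir.to_string(), rules);
        }
        &self.parsed[dir]
    }
}
```

Directories that are never visited never trigger a load, which is the property
the description above asks for.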

It is worth mentioning that the gitignore logic provided by the "ignore" crate
seems decent in time complexity: it uses regular expressions, which use state
machines to achieve "testing against multiple patterns at once", instead of
testing patterns one by one like git currently does.

Note: The "ignore" crate provides a nice "Walker" interface but that does
not fit very well with the required laziness here. So the walker interface
is not used.

Reviewed By: markbt

Differential Revision: D7319609

fbshipit-source-id: ebd131adf45a38f83acdf653f5e49d0624012152
2018-04-13 21:51:40 -07:00