Basic aliases (e.g. `admin?`/`be_admin`) can be represented easily with simple
wildcards, but more complex transformations require a different mechanism.
Instead of using `%s` to represent strings that can be replaced 1:1, this
introduces a syntax inspired by https://github.com/tpope/vim-projectionist, as
such:

    - name: Rails
      aliases:
        - from: "*Validator"
          to: "{snakecase}"

Given `AbsoluteUriValidator`, this also searches for `absolute_uri`, which
appears wherever the validation is in use.
This currently supports the `camelcase` and `snakecase` transformations,
as well as no transformation.
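
A minimal sketch of how such a transformation could work, written in Python
for illustration rather than taken from the Haskell implementation. The names
`apply_alias`, `to_snakecase`, and `to_camelcase` are hypothetical, and the
sketch assumes that `{}` denotes "no transformation" and that `camelcase`
produces upper camel case:

```python
import re


def to_snakecase(value):
    # Insert "_" at every lowercase/digit -> uppercase boundary, then lowercase.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", value).lower()


def to_camelcase(value):
    # Capitalize the first letter of each underscore-separated part.
    return "".join(part[:1].upper() + part[1:] for part in value.split("_"))


# Assumption: only these two named transformations exist.
TRANSFORMS = {"snakecase": to_snakecase, "camelcase": to_camelcase}


def apply_alias(term, from_pattern, to_template):
    """Return the aliased term, or None when `term` does not match."""
    prefix, star, suffix = from_pattern.partition("*")
    if not star or len(term) < len(prefix) + len(suffix):
        return None
    if not (term.startswith(prefix) and term.endswith(suffix)):
        return None
    captured = term[len(prefix):len(term) - len(suffix)]

    def expand(match):
        name = match.group(1)
        # "{}" keeps the captured text as-is; an unknown name raises KeyError.
        return TRANSFORMS[name](captured) if name else captured

    return re.sub(r"\{(\w*)\}", expand, to_template)
```

With the Rails alias above, `apply_alias("AbsoluteUriValidator", "*Validator",
"{snakecase}")` yields `absolute_uri`.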
Closes #18
This searches for an available program to calculate digests across
directories. OS X ships with md5, while other *nix-based systems ship
md5sum. The output is largely the same, apart from the final digest
calculation, where md5sum appends a "file path":

    da52a1a5d5a3c9672371746e4d32708a -

This strips the trailing whitespace and dash:

    da52a1a5d5a3c9672371746e4d32708a
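
As a rough illustration (Python, not the project's Haskell code), program
lookup and output normalization might look like the following. The function
names are hypothetical; only `md5`, `md5sum`, and the trailing ` -` come from
the description above:

```python
import shutil


def find_digest_program(candidates=("md5", "md5sum")):
    # Return the first candidate found on PATH, or None if neither exists.
    for name in candidates:
        if shutil.which(name):
            return name
    return None


def normalize_digest(output):
    # md5sum prints "<digest>  -" when reading stdin; md5 prints only the
    # digest. A hex digest never ends in "-", so stripping it is safe.
    return output.strip().rstrip("-").rstrip()
```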
Closes #49
This creates a new "list" output format that includes a certain number
of git SHAs per token. This allows for perusal of the most recent
changes for a given token to understand what changed.
This introduces a monad transformer stack to cover our reader (options
from the CLI) and except (for handling failure cases, initially missing
tags or invalid config).
This ensures errors are bubbled up appropriately (and halt program
execution) and the Options are available in the correct locations within
the app.
This also separates options parsing (which remains in app/Main.) from
translating those options into the correctly executed runner and
generated output.
This enables per-user and per-project configs, located in:
* ~/.unused.yml
* APP_ROOT/.unused.yml
Configurations stack on top of each other rather than replacing one
another; unused provides a very basic config, but additional
configurations can be defined.
Per-user configs are best used to suit common types of projects at a
generic level. For example, a developer commonly working in Rails
applications might have a config at ~/.unused.yml for patterns like
Policy objects from Pundit, ActiveModel::Serializers, etc.
Per-project config would be less-generic patterns, ones where re-use
isn't likely or applicable.
See unused's global config:
https://github.com/joshuaclayton/unused/blob/master/data/config.yml
The structure is as follows:

    - name: Rails
      autoLowLikelihood:
        - name: Pundit
          pathStartsWith: app/policies
          pathEndsWith: .rb
          termEndsWith: Policy
          classOrModule: true
        - name: Pundit Helpers
          pathStartsWith: app/policies
          allowedTerms:
            - Scope
            - index?
            - new?
            - create?
            - show?
            - edit?
            - destroy?
            - resolve
    - name: Other Language
      autoLowLikelihood:
        - name: Thing
          pathEndsWith: .ex
          classOrModule: true

Name each item, and include an autoLowLikelihood key with multiple named
matchers. Each matcher can look for various formatting aspects,
including termStartsWith, termEndsWith, pathStartsWith, pathEndsWith,
classOrModule, and allowedTerms.
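
One way to read a matcher is "every key that is present must hold". The
Python sketch below illustrates that reading; it is an assumption, not the
actual implementation. In particular, `matches` is a hypothetical name, and
treating `classOrModule` as "the term starts with an uppercase letter" is a
guess at its meaning:

```python
def matches(matcher, term, path):
    """True when every condition present in `matcher` holds for term/path."""
    checks = {
        "termStartsWith": lambda v: term.startswith(v),
        "termEndsWith": lambda v: term.endswith(v),
        "pathStartsWith": lambda v: path.startswith(v),
        "pathEndsWith": lambda v: path.endswith(v),
        # Assumption: classes and modules are capitalized terms.
        "classOrModule": lambda v: term[:1].isupper() == v,
        "allowedTerms": lambda v: term in v,
    }
    # Keys such as "name" are ignored because they are not in `checks`.
    return all(check(matcher[key]) for key, check in checks.items() if key in matcher)


pundit = {
    "name": "Pundit",
    "pathStartsWith": "app/policies",
    "pathEndsWith": ".rb",
    "termEndsWith": "Policy",
    "classOrModule": True,
}
```

Here `matches(pundit, "UserPolicy", "app/policies/user_policy.rb")` holds,
while the same term found in `app/models/user.rb` does not match.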
Why?
====
Dynamic languages, and Rails in particular, support some fun method
creation. One common pattern is, within RSpec, to create matchers
dynamically based on predicate methods. Two common examples are:
* `#admin?` gets converted to the matcher `#be_admin`
* `#has_active_todos?` gets converted to the matcher `#have_active_todos`
This especially comes into play when writing page objects with predicate
methods.
This change introduces the concept of aliases, a way to describe the
before/after for these transformations. This introduces a direct swap
with a wildcard value (%s), although this may change in the future to
support other transformations for pluralization, camel-casing, etc.
Externally, aliases are not grouped together by term; however, the
underlying counts are summed together, increasing the total occurrences
and likely pushing the individual method out of "high" likelihood into
"medium" or "low" likelihood.
Closes #19.
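
The `%s` swap and the summed counts can be sketched as follows (Python, for
illustration only). `ALIASES`, `expand`, `aliases_for`, and
`total_occurrences` are hypothetical names, and the two patterns are simply
the RSpec examples from this message:

```python
# (source pattern, target pattern) pairs; "%s" marks the wildcard.
ALIASES = [("has_%s?", "have_%s"), ("%s?", "be_%s")]


def expand(term, source, target):
    # Split the source pattern around the wildcard and capture what it covers.
    prefix, _, suffix = source.partition("%s")
    body_len = len(term) - len(prefix) - len(suffix)
    if body_len < 1 or not term.startswith(prefix) or not term.endswith(suffix):
        return None
    return target.replace("%s", term[len(prefix):len(prefix) + body_len])


def aliases_for(term):
    # A term may match more than one pattern; keep every expansion.
    expanded = (expand(term, source, target) for source, target in ALIASES)
    return [alias for alias in expanded if alias is not None]


def total_occurrences(term, counts):
    # Sum the term's own count with the counts of all of its aliases.
    return sum(counts.get(name, 0) for name in [term] + aliases_for(term))
```

For example, one definition of `admin?` plus three uses of `be_admin` gives a
total of four occurrences for `admin?`.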
Why?
====
Parsec is overkill when all that's really needed is splitting on
semicolons and converting a string to a non-negative Int.
One side-effect of this is to convert the caching mechanism from flat
text to CSV, with cassava handling (de-)serialization.
Additional
==========
Introduce ReaderT to calculate sha once per cache interaction
Previously, we were calculating the fingerprint (SHA) for match results
potentially twice, once when reading from the cache, and a second time
if no cache was found. This introduces a ReaderT to manage cache
interaction with a single fingerprint calculation.
This also abstracts what's being cached to only care about the fact that
the data can be converted to/from csv.
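
A small Python sketch of the same idea, offered as an illustration rather
than a port of the cassava/ReaderT code: the fingerprint is computed once by
the caller and handed to a single cache function, and the cached value only
needs to round-trip through CSV. The row shape `(path, term, count)` and all
function names are assumptions:

```python
import csv
import io


def to_csv(rows):
    # Serialize (path, term, count) rows; the csv module handles quoting.
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    rows = []
    for path, term, count in csv.reader(io.StringIO(text)):
        number = int(count)  # raises ValueError on non-numeric input
        if number < 0:
            raise ValueError("count must be non-negative")
        rows.append((path, term, number))
    return rows


def cached(store, fingerprint, compute):
    # `fingerprint` is calculated once by the caller and reused for both
    # the lookup and the write.
    if fingerprint in store:
        return from_csv(store[fingerprint])
    rows = compute()
    store[fingerprint] = to_csv(rows)
    return rows
```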
Why?
====
Because a .gitignore file captures a fair number of project-specific
directories and files to ignore, we can use this list to reduce the
number of files to look at when determining a fingerprint for a project.
Because the fingerprint should be based on files we care about changing,
the project-specific .gitignore is a great place to start.
This drastically reduces fingerprint timing - for larger projects, or
projects with a massive number of files (e.g. anything doing anything
significant with NPM and a front-end framework), this will help make
caching usable. For normal projects, this cuts fingerprint
calculation to 10%-20% of what it was previously.
Closes #38
Why?
====
How often a tool gets used depends on how easy it is to use. Requiring
users to pipe in a ctags file every time, without any guidance, makes
this program little more than a toy: it's hard to get right, and harder
to explore.
This modifies the default behavior to look for a ctags file in a few
common locations, and lets the user specify a custom location instead.
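
The lookup could be sketched like this in Python. The candidate locations
listed here are assumptions chosen for illustration (this message does not
name them), as is the `find_tags_file` helper itself:

```python
from pathlib import Path

# Hypothetical defaults; the real list of locations may differ.
DEFAULT_LOCATIONS = (".git/tags", "tags", "tmp/tags")


def find_tags_file(root, custom=None, candidates=DEFAULT_LOCATIONS):
    """Return the first existing tags file, or None when nothing is found."""
    if custom is not None:
        # A user-provided location takes precedence and is never second-guessed.
        path = Path(custom)
        return path if path.is_file() else None
    for candidate in candidates:
        path = Path(root) / candidate
        if path.is_file():
            return path
    return None
```

Returning `None` rather than raising leaves room for the caller to print
guidance on how to generate a tags file.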
Resolves #35
Why?
====
Handling low likelihood configuration was previously a huge pain,
because the syntax in Haskell was fairly terse. This introduces a yaml
format internally that ships with the app covering basic cases for
Rails, Phoenix, and Haskell. I could imagine adding baselines here for
other languages and frameworks (especially ones I've used and am
comfortable with).
This also paves the way for searching for user-provided additions and
loading those configurations in addition to what we have here.
At some point, this also needs to md5 the tags list itself and factor
that in (since if the tagging algorithm changes, and new tokens get
uncovered, it'd invalidate the cache).
Why?
====
ag supports using regular expressions for searches; however, the -Q
flag, which was previously always used, forces literal matching.
Literal matching can return too many results. For example, searching
for a `me` method in a controller would also match words like `awesome`
or `method`.
This introduces a check where, if the token being searched is only
composed of word characters (`[A-Za-z0-9_]`), it'll switch over to use
regular expressions with ag and surround the token with non-word matches
on either end. The goal here is to reduce false-positives in matches.
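
To make the check concrete, here is a Python sketch. The helper name
`search_arguments` is hypothetical and the exact pattern is an assumption:
it treats "non-word matches on either end" as `(\W|^)` and `(\W|$)`. Only
the `-Q` flag is taken from the description above:

```python
import re

WORD_ONLY = re.compile(r"[A-Za-z0-9_]+")


def search_arguments(token):
    """Build the pattern arguments passed to ag for a given token."""
    if WORD_ONLY.fullmatch(token):
        # Regex mode: require a non-word character (or a line boundary)
        # on both sides of the token.
        return [r"(\W|^)" + token + r"(\W|$)"]
    # Tokens with other characters (e.g. "admin?") stay literal.
    return ["-Q", token]
```

With this pattern, `me` no longer matches inside `awesome` or `method`, but
still matches `def me`.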
Why?
====
Grouping results can be helpful to view information differently, e.g. to
see highest-offending files or to remove grouping entirely.
This introduces a flag to allow overriding the default grouping (two
levels of directory).
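
A sketch of how a grouping key might be derived from a result path, in
Python for illustration. The grouping names `"directory"`, `"file"`, and
`"none"` are hypothetical stand-ins for whatever values the flag accepts:

```python
from pathlib import PurePosixPath


def group_key(path, grouping="directory"):
    if grouping == "none":
        return ""    # every result lands in one group
    if grouping == "file":
        return path  # one group per file
    # Default: the first two directory levels, e.g. "app/models".
    directories = PurePosixPath(path).parent.parts
    return "/".join(directories[:2])
```

Grouping by file is what surfaces the highest-offending files.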
Why?
====
Searching hundreds or thousands of tokens with ag can be slow; this
introduces parallel processing of search so results are returned more
quickly.
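
The general shape of the change, sketched in Python with a thread pool (the
actual implementation is Haskell and may use a different concurrency
mechanism). `search_all` is a hypothetical name, and `search` stands in for
whatever function runs ag for a single token:

```python
from concurrent.futures import ThreadPoolExecutor


def search_all(tokens, search, max_workers=8):
    """Run `search` for each token concurrently; map tokens to results."""
    tokens = list(tokens)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs tokens correctly.
        return dict(zip(tokens, pool.map(search, tokens)))
```

Threads fit here because each search spends its time waiting on an external
process rather than doing CPU-bound work.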
Why?
====
Parsing lines of results was somewhat unreliable, and terms with odd
characters were causing problems. This:
* extracts parsing into an Unused.Parser.Internal module for ease of
testing
* fixes cases where certain tokens weren't matching
Why?
====
A simple calculation ("yes, this should be removed" or "no, this is
probably fine") is frankly not enough information for someone evaluating
their codebase to understand why we made the decision.
This introduces a removal reason, so a user understands why we ranked it
the way we did, and adds additional logic around a method and its tests
to determine if a method exists and is only being used in the tests (if
so, it should probably be deleted).
This is done with an Occurrances record, which is created for total
files, test code, and non-test code. The test code logic is somewhat
naive but works in most cases. It doesn't require a particular
directory, since tests may live alongside source code (e.g. Go), and it
captures RSpec cases as well.
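
A Python sketch of the bookkeeping, under stated assumptions: the path
heuristic (`test`/`spec` as a path segment or filename part), the
`OccurrenceCounts` name, and the "one app occurrence is just the definition"
rule are all illustrative guesses, not the actual logic:

```python
import re
from dataclasses import dataclass

# Naive heuristic: "test(s)" or "spec(s)" delimited by "/", "_", or ".".
TEST_PATH = re.compile(r"(^|[/_.])(tests?|specs?)([/_.]|$)", re.IGNORECASE)


@dataclass
class OccurrenceCounts:
    files: int = 0
    occurrences: int = 0


def summarize(matches):
    """Split (path, count) pairs into total, test, and non-test counts."""
    total, test, app = OccurrenceCounts(), OccurrenceCounts(), OccurrenceCounts()
    for path, count in matches:
        bucket = test if TEST_PATH.search(path) else app
        for counts in (total, bucket):
            counts.files += 1
            counts.occurrences += count
    return total, test, app


def only_used_in_tests(app, test):
    # Assumption: a single non-test occurrence is the definition itself.
    return app.occurrences <= 1 and test.occurrences > 0
```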
Why?
====
Formatting each column requires context on the column, as well as
information on alignment. This extracts the column formatting logic to a
specific formatter.
ColumnFormatter is coupled to the order of columns/data displayed to the
user.
Why?
====
Unused hides the cursor and potentially does other things to the window that
may leave it in an odd state. This introduces a hook to run any state
cleanup, including re-enabling the cursor, when a user sends a SIGINT to
the program.
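
For illustration, the equivalent hook in Python might look like this. The
function names are hypothetical; the escape sequence is the standard ANSI
code for showing the cursor:

```python
import signal
import sys

SHOW_CURSOR = "\x1b[?25h"  # ANSI escape sequence that re-enables the cursor


def restore_terminal(stream=sys.stdout):
    stream.write(SHOW_CURSOR)
    stream.flush()


def install_interrupt_handler(stream=sys.stdout):
    """Restore terminal state, then exit, when the user sends SIGINT."""
    def handler(signum, frame):
        restore_terminal(stream)
        sys.exit(130)  # conventional exit status for termination by SIGINT

    # Returns the previous handler so callers can restore it if needed.
    return signal.signal(signal.SIGINT, handler)
```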