Commit Graph

119 Commits

Author SHA1 Message Date
Kyle Caverly
2f44055079
Semantic index eval (#2988)
v0 of the Semantic Index evaluate test suite

Release Notes:

- Added eval.rs as an example to the semantic-index crates
- Generates test metrics for two small projects, as a starting point to
systematically evaluate retrieval quality
2023-09-19 19:17:06 -04:00
KCaverly
11b3bfdc99 fix warnings 2023-09-19 19:05:26 -04:00
KCaverly
25cb79e475 remove git2 dependency for repository cloning in semantic_index eval 2023-09-19 18:55:15 -04:00
KCaverly
d85acceeec move git2 to workspace dependency globally 2023-09-19 16:13:47 -04:00
KCaverly
4f1a59ebf5 formatting 2023-09-19 12:27:33 -04:00
KCaverly
fc8dd8433c remove release channel flags in semantic_index 2023-09-19 12:20:59 -04:00
KCaverly
183758a7c5 fix Cargo.lock for merge 2023-09-19 11:44:51 -04:00
KCaverly
25bd357426 add recall and precision to semantic index 2023-09-18 18:25:02 -04:00
KCaverly
566bb9f71b add map to evaluation suite for semantic_index 2023-09-18 09:57:52 -04:00
KCaverly
1433160a08 enable include based filtering for search inside open and modified buffers 2023-09-15 15:16:20 -04:00
KCaverly
04bd107ada add ndcg@k to evaluate metrics 2023-09-15 10:36:21 -04:00
KCaverly
3a661c5977 catchup with main 2023-09-15 09:31:33 -04:00
Antonio Scandurra
ae85a520f2 Refactor semantic searching of modified buffers 2023-09-15 12:12:20 +02:00
KCaverly
796bdd3da7 update searching in modified buffers to accommodate for excluded paths 2023-09-14 19:42:06 -04:00
KCaverly
c19c8899fe add initial search inside modified buffers 2023-09-14 14:58:34 -04:00
Antonio Scandurra
f86e5a987f WIP 2023-09-14 17:42:30 +02:00
Antonio Scandurra
6a271617b4 Make path optional when parsing file
Co-Authored-By: Kyle Caverly <kyle@zed.dev>
2023-09-14 17:09:08 +02:00
KCaverly
137dda3ee6 wip eval framework for semantic index 2023-09-14 09:30:19 -04:00
KCaverly
0c1b2e5aa6 cleaned up warnings 2023-09-13 20:04:53 -04:00
KCaverly
eff44f9aa4 semantic index eval, indexing appropriately 2023-09-13 20:02:15 -04:00
KCaverly
6f29582fb0 progress on eval 2023-09-13 10:32:36 -04:00
KCaverly
d4fbe99052 add eval for gpt-engineer 2023-09-12 21:27:35 -04:00
KCaverly
0d14bbbf5b add eval values for tree-sitter 2023-09-12 20:36:06 -04:00
KCaverly
66c967da88 start work on eval script for semantic_index 2023-09-12 16:25:31 -04:00
KCaverly
e678c7d9ee swap SystemTime for Instant throughout rate_limit_expiry tracking
2023-09-11 10:26:14 -04:00
KCaverly
7df21f86dd move cx notify observe for rate_limit_expiry into ProjectState in the semantic index
Co-authored-by: Antonio <antonio@zed.dev>
2023-09-11 10:11:40 -04:00
KCaverly
37915ec4f2 updated notify to accommodate for updated countdown 2023-09-08 16:53:16 -04:00
KCaverly
bf43f93197 updated semantic_index reset status to leverage target reset system time as opposed to duration 2023-09-08 15:04:50 -04:00
KCaverly
a5ee8fc805 initial outline for rate limiting status updates 2023-09-08 12:35:15 -04:00
KCaverly
cf5d1d91a4 update semantic search to go to no results if search query is blank 2023-09-07 14:43:41 -04:00
Antonio Scandurra
eda7e00645 Implement SemanticIndex::status and use it in project search
Co-Authored-By: Kyle Caverly <kyle@zed.dev>
2023-09-07 19:39:30 +02:00
Antonio Scandurra
65e17e212d Eagerly index project on workspace creation if it was indexed before
Co-Authored-By: Kyle Caverly <kyle@zed.dev>
2023-09-07 18:51:55 +02:00
Antonio Scandurra
a45c8c380f 💄 2023-09-07 15:25:23 +02:00
Antonio Scandurra
757a285852 Keep dropping the documents table if it exists
This is because we renamed `documents` to `spans`.

Co-Authored-By: Kyle Caverly <kyle@zed.dev>
2023-09-07 15:15:16 +02:00
Antonio Scandurra
93b889a93b Merge remote-tracking branch 'origin/main' into semantic-search-watch-worktrees 2023-09-07 15:07:46 +02:00
Antonio Scandurra
3ad1befb11 Remove unneeded logging
Co-Authored-By: Kyle Caverly <kyle@zed.dev>
2023-09-07 15:07:21 +02:00
KCaverly
265d02a583 update request timeout for open ai embeddings 2023-09-06 15:09:46 -04:00
KCaverly
17237f748c update token_count for OpenAIEmbeddings to accommodate for truncation 2023-09-06 15:09:15 -04:00
Antonio Scandurra
ce62173534 Rename Document to Span 2023-09-06 17:03:08 +02:00
Antonio Scandurra
de0f53b39f Ensure SemanticIndex::search waits for indexing to complete 2023-09-06 11:40:59 +02:00
Antonio Scandurra
c802680084 Clip ranges returned by SemanticIndex::search
The files may have changed since the last time they were parsed, so the
ranges returned by `SemanticIndex::search` may be out of bounds.
2023-09-06 09:41:51 +02:00
Antonio Scandurra
95b72a73ad Re-index project when a worktree is registered
Co-Authored-By: Kyle Caverly <kyle@zed.dev>
2023-09-05 17:17:58 +02:00
Antonio Scandurra
3c70b127bd Simplify SemanticIndex::index_project
Co-Authored-By: Kyle Caverly <kyle@zed.dev>
2023-09-05 16:54:48 +02:00
Antonio Scandurra
6b1dc63fc0 Retrieve embeddings based on pending files
Co-Authored-By: Kyle Caverly <kyle@zed.dev>
2023-09-05 16:16:12 +02:00
Antonio Scandurra
7b5a41dda2 Move retrieval of embeddings from the db into reindex_changed_files
Co-Authored-By: Kyle Caverly <kyle@zed.dev>
2023-09-05 16:09:24 +02:00
Antonio Scandurra
d4cff68475 🎨 2023-09-05 15:52:36 +02:00
KCaverly
8dbc0fe033 update pragma settings for improved database performance 2023-09-01 17:07:20 -04:00
KCaverly
54235f4fb1 updated embeddings background delay to 5 minutes
Co-authored-by: Max <max@zed.dev>
2023-09-01 13:04:09 -04:00
KCaverly
e86964eb5d optimize insert file in vector database
Co-authored-by: Max <max@zed.dev>
2023-09-01 13:01:37 -04:00
KCaverly
524533cfb2 flush embeddings queue when no files are parsed for 250 milliseconds
Co-authored-by: Antonio <antonio@zed.dev>
2023-09-01 11:24:08 -04:00