Use fuzzy-matching to locate symbols when resolving edit steps (#15447)

Release Notes:

- N/A

---------

Co-authored-by: Nathan <nathan@zed.dev>
This commit is contained in:
Antonio Scandurra 2024-07-29 20:21:19 +02:00 committed by GitHub
parent 5e1aa898d4
commit 2b871a631a
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
5 changed files with 330 additions and 131 deletions

9
Cargo.lock generated
View File

@ -442,6 +442,7 @@ dependencies = [
"settings",
"similar",
"smol",
"strsim 0.11.1",
"telemetry_events",
"terminal",
"terminal_view",
@ -2232,7 +2233,7 @@ dependencies = [
"anstream 0.5.0",
"anstyle",
"clap_lex",
"strsim",
"strsim 0.10.0",
]
[[package]]
@ -10355,6 +10356,12 @@ version = "0.10.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "73473c0e59e6d5812c5dfe2a064a6444949f089e20eec9a2e5506596494e4623"
[[package]]
name = "strsim"
version = "0.11.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7da8b5736845d9f2fcb837ea5d9e2628564b3b043a70948a3f0b778838c5fb4f"
[[package]]
name = "strum"
version = "0.25.0"

View File

@ -401,6 +401,7 @@ similar = "1.3"
simplelog = "0.12.2"
smallvec = { version = "1.6", features = ["union"] }
smol = "1.2"
strsim = "0.11"
strum = { version = "0.25.0", features = ["derive"] }
subtle = "2.5.0"
sys-locale = "0.3.1"

View File

@ -6,7 +6,7 @@ Guidelines:
- Don't create and then update a file.
- We'll create it in one shot.
- Prefer updating symbols lower in the syntax tree if possible.
- Never include operations on a parent symbol and one of its children in the same <operations> block.
- Never include operations on a parent symbol and one of its children in the same operations block.
- Never nest an operation with another operation or include CDATA or other content. All operations are leaf nodes.
- Include a description attribute for each operation with a brief, one-line description of the change to perform.
- Descriptions are required for all operations except delete.
@ -14,66 +14,73 @@ Guidelines:
- Avoid referring to the location in the description. Focus on the change to be made, not the location where it's made. That's implicit with the symbol you provide.
- Don't generate multiple operations at the same location. Instead, combine them together in a single operation with a succinct combined description.
The available operation types are:
1. <update>: Modify an existing symbol in a file.
2. <create_file>: Create a new file.
3. <insert_sibling_after>: Add a new symbol as sibling after an existing symbol in a file.
4. <append_child>: Add a new symbol as the last child of an existing symbol in a file.
5. <prepend_child>: Add a new symbol as the first child of an existing symbol in a file.
6. <delete>: Remove an existing symbol from a file. The `description` attribute is invalid for delete, but required for other ops.
All operations *require* a path.
Operations that *require* a symbol: <update>, <insert_sibling_after>, <delete>
Operations that don't allow a symbol: <create>
Operations that have an *optional* symbol: <prepend_child>, <append_child>
Example 1:
User:
```rs src/rectangle.rs
struct Rectangle {
width: f64,
height: f64,
}
```rs src/rectangle.rs
struct Rectangle {
width: f64,
height: f64,
}
impl Rectangle {
fn new(width: f64, height: f64) -> Self {
Rectangle { width, height }
}
}
```
impl Rectangle {
fn new(width: f64, height: f64) -> Self {
Rectangle { width, height }
}
}
```
Symbols for src/rectangle.rs:
- struct Rectangle
- impl Rectangle
- impl Rectangle fn new
<step>Add new methods 'calculate_area' and 'calculate_perimeter' to the Rectangle struct</step>
<step>Implement the 'Display' trait for the Rectangle struct</step>
<step>Add new methods 'calculate_area' and 'calculate_perimeter' to the Rectangle struct</step>
<step>Implement the 'Display' trait for the Rectangle struct</step>
What are the operations for the step: <step>Add a new method 'calculate_area' to the Rectangle struct</step>
What are the operations for the step: <step>Add a new method 'calculate_area' to the Rectangle struct</step>
Assistant (wrong):
<operations>
<append_child path="src/shapes.rs" symbol="impl Rectangle" description="Add calculate_area method" />
<append_child path="src/shapes.rs" symbol="impl Rectangle" description="Add calculate_perimeter method" />
</operations>
A (wrong):
{
"operations": [
{
"kind": "AppendChild",
"path": "src/shapes.rs",
"symbol": "impl Rectangle",
"description": "Add calculate_area method"
},
{
"kind": "AppendChild",
"path": "src/shapes.rs",
"symbol": "impl Rectangle",
"description": "Add calculate_perimeter method"
}
]
}
This demonstrates what NOT to do. NEVER append multiple children at the same location.
Assistant (corrected):
<operations>
<append_child path="src/shapes.rs" symbol="impl Rectangle" description="Add calculate area and perimeter methods" />
</operations>
A (corrected):
{
"operations": [
{
"kind": "AppendChild",
"path": "src/shapes.rs",
"symbol": "impl Rectangle",
"description": "Add calculate area and perimeter methods"
}
]
}
User:
What are the operations for the step: <step>Implement the 'Display' trait for the Rectangle struct</step>
Assistant:
<operations>
<insert_sibling_after path="src/shapes.rs" symbol="impl Rectangle" description="Implement Display trait for Rectangle"/>
</operations>
A:
{
"operations": [
{
"kind": "InsertSiblingAfter",
"path": "src/shapes.rs",
"symbol": "impl Rectangle",
"description": "Implement Display trait for Rectangle"
}
]
}
Example 2:
@ -96,32 +103,36 @@ impl User {
}
```
Symbols for src/user.rs:
- struct User
- struct User pub name
- struct User age
- struct User email
- impl User
- impl User fn new
- impl User pub fn print_info
<step>Update the 'print_info' method to use formatted output</step>
<step>Remove the 'email' field from the User struct</step>
What are the operations for the step: <step>Update the 'print_info' method to use formatted output</step>
Assistant:
<operations>
<update path="src/user.rs" symbol="impl User fn print_info" description="Use formatted output" />
</operations>
A:
{
"operations": [
{
"kind": "Update",
"path": "src/user.rs",
"symbol": "impl User pub fn print_info",
"description": "Use formatted output"
}
]
}
User:
What are the operations for the step: <step>Remove the 'email' field from the User struct</step>
Assistant:
<operations>
<delete path="src/user.rs" symbol="struct User email" description="Remove the email field" />
</operations>
A:
{
"operations": [
{
"kind": "Delete",
"path": "src/user.rs",
"symbol": "struct User email"
}
]
}
Example 3:
@ -144,32 +155,36 @@ impl Vehicle {
}
```
Symbols for src/vehicle.rs:
- struct Vehicle
- struct Vehicle make
- struct Vehicle model
- struct Vehicle year
- impl Vehicle
- impl Vehicle fn new
- impl Vehicle fn print_year
<step>Add a 'use std::fmt;' statement at the beginning of the file</step>
<step>Add a new method 'start_engine' in the Vehicle impl block</step>
What are the operations for the step: <step>Add a 'use std::fmt;' statement at the beginning of the file</step>
Assistant:
<operations>
<prepend_child path="src/vehicle.rs" description="Add 'use std::fmt' statement" />
</operations>
A:
{
"operations": [
{
"kind": "PrependChild",
"path": "src/vehicle.rs",
"description": "Add 'use std::fmt' statement"
}
]
}
User:
What are the operations for the step: <step>Add a new method 'start_engine' in the Vehicle impl block</step>
Assistant:
<operations>
<insert_sibling_after path="src/vehicle.rs" symbol="impl Vehicle fn new" description="Add start_engine method"/>
</operations>
A:
{
"operations": [
{
"kind": "InsertSiblingAfter",
"path": "src/vehicle.rs",
"symbol": "impl Vehicle fn new",
"description": "Add start_engine method"
}
]
}
Example 4:
@ -198,44 +213,188 @@ impl Employee {
}
```
Symbols for src/employee.rs:
- struct Employee
- struct Employee name
- struct Employee position
- struct Employee salary
- struct Employee department
- impl Employee
- impl Employee fn new
- impl Employee fn print_details
- impl Employee fn give_raise
<step>Make salary an f32</step>
What are the operations for the step: <step>Make salary an f32</step>
A (wrong):
<operations>
<update path="src/employee.rs" symbol="struct Employee" description="Change the type of salary to an f32" />
<update path="src/employee.rs" symbol="struct Employee salary" description="Change the type to an f32" />
</operations>
{
"operations": [
{
"kind": "Update",
"path": "src/employee.rs",
"symbol": "struct Employee",
"description": "Change the type of salary to an f32"
},
{
"kind": "Update",
"path": "src/employee.rs",
"symbol": "struct Employee salary",
"description": "Change the type to an f32"
}
]
}
This example demonstrates what not to do. `struct Employee salary` is a child of `struct Employee`.
A (corrected):
<operations>
<update path="src/employee.rs" symbol="struct Employee salary" description="Change the type to an f32" />
</operations>
{
"operations": [
{
"kind": "Update",
"path": "src/employee.rs",
"symbol": "struct Employee salary",
"description": "Change the type to an f32"
}
]
}
User:
What are the correct operations for the step: <step>Remove the 'department' field and update the 'print_details' method</step>
What are the correct operations for the step: <step>Remove the 'department' field and update the 'print_details' method</step>
A:
<operations>
<delete path="src/employee.rs" symbol="struct Employee department" />
<update path="src/employee.rs" symbol="impl Employee fn print_details" description="Don't print the 'department' field" />
</operations>
{
"operations": [
{
"kind": "Delete",
"path": "src/employee.rs",
"symbol": "struct Employee department"
},
{
"kind": "Update",
"path": "src/employee.rs",
"symbol": "impl Employee fn print_details",
"description": "Don't print the 'department' field"
}
]
}
Now generate the operations for the following step.
Output only valid XML containing valid operations with their required attributes.
NEVER output code or any other text inside <operation> tags. If you do, you will replaced with another model.
Your response *MUST* begin with <operations> and end with </operations>:
Example 5:
User:
```rs src/game.rs
struct Player {
name: String,
health: i32,
pub score: u32,
}
impl Player {
pub fn new(name: String) -> Self {
Player { name, health: 100, score: 0 }
}
}
struct Game {
players: Vec<Player>,
}
impl Game {
fn new() -> Self {
Game { players: Vec::new() }
}
}
```
<step>Add a 'level' field to Player and update the 'new' method</step>
A:
{
"operations": [
{
"kind": "InsertSiblingAfter",
"path": "src/game.rs",
"symbol": "struct Player pub score",
"description": "Add level field to Player"
},
{
"kind": "Update",
"path": "src/game.rs",
"symbol": "impl Player pub fn new",
"description": "Initialize level in new method"
}
]
}
Example 6:
User:
```rs src/config.rs
use std::collections::HashMap;
struct Config {
settings: HashMap<String, String>,
}
impl Config {
fn new() -> Self {
Config { settings: HashMap::new() }
}
}
```
<step>Add a 'load_from_file' method to Config and import necessary modules</step>
A:
{
"operations": [
{
"kind": "PrependChild",
"path": "src/config.rs",
"description": "Import std::fs and std::io modules"
},
{
"kind": "AppendChild",
"path": "src/config.rs",
"symbol": "impl Config",
"description": "Add load_from_file method"
}
]
}
Example 7:
User:
```rs src/database.rs
pub(crate) struct Database {
connection: Connection,
}
impl Database {
fn new(url: &str) -> Result<Self, Error> {
let connection = Connection::connect(url)?;
Ok(Database { connection })
}
async fn query(&self, sql: &str) -> Result<Vec<Row>, Error> {
self.connection.query(sql, &[])
}
}
```
<step>Add error handling to the 'query' method and create a custom error type</step>
A:
{
"operations": [
{
"kind": "PrependChild",
"path": "src/database.rs",
"description": "Import necessary error handling modules"
},
{
"kind": "InsertSiblingBefore",
"path": "src/database.rs",
"symbol": "pub(crate) struct Database",
"description": "Define custom DatabaseError enum"
},
{
"kind": "Update",
"path": "src/database.rs",
"symbol": "impl Database async fn query",
"description": "Implement error handling in query method"
}
]
}
Now generate the operations for the following step:

View File

@ -64,6 +64,7 @@ serde_json.workspace = true
settings.workspace = true
similar.workspace = true
smol.workspace = true
strsim.workspace = true
telemetry_events.workspace = true
terminal.workspace = true
terminal_view.workspace = true

View File

@ -488,6 +488,23 @@ impl Debug for EditStepOperations {
}
/// A description of an operation to apply to one location in the codebase.
///
/// This object represents a single edit operation that can be performed on a specific file
/// in the codebase. It encapsulates both the location (file path) and the nature of the
/// edit to be made.
///
/// # Fields
///
/// * `path`: A string representing the file path where the edit operation should be applied.
/// This path is relative to the root of the project or repository.
///
/// * `kind`: An enum representing the specific type of edit operation to be performed.
///
/// # Usage
///
/// `EditOperation` is used within a code editor to represent and apply
/// programmatic changes to source code. It provides a structured way to describe
/// edits for features like refactoring tools or AI-assisted coding suggestions.
#[derive(Clone, Debug, PartialEq, Eq, Deserialize, JsonSchema)]
pub struct EditOperation {
/// The path to the file containing the relevant operation
@ -527,7 +544,10 @@ impl EditOperation {
let candidate = outline
.path_candidates
.iter()
.find(|item| item.string == symbol)
.max_by(|a, b| {
strsim::jaro_winkler(&a.string, symbol)
.total_cmp(&strsim::jaro_winkler(&b.string, symbol))
})
.with_context(|| {
format!(
"symbol {:?} not found in path {:?}.\ncandidates: {:?}.\nparse status: {:?}. text:\n{}",
@ -607,51 +627,62 @@ impl EditOperation {
#[derive(Clone, Debug, PartialEq, Eq, Deserialize, JsonSchema)]
#[serde(tag = "kind")]
pub enum EditOperationKind {
/// Rewrite the specified symbol in its entirely based on the given description.
/// Rewrites the specified symbol entirely based on the given description.
/// This operation completely replaces the existing symbol with new content.
Update {
/// A full path to the symbol to be rewritten from the provided list.
/// A fully-qualified reference to the symbol, e.g. `mod foo impl Bar pub fn baz` instead of just `fn baz`.
/// The path should uniquely identify the symbol within the containing file.
symbol: String,
/// A brief one-line description of the change that should be applied.
/// A brief description of the transformation to apply to the symbol.
description: String,
},
/// Create a new file with the given path based on the given description.
/// Creates a new file with the given path based on the provided description.
/// This operation adds a new file to the codebase.
Create {
/// A brief one-line description of the change that should be applied.
/// A brief description of the file to be created.
description: String,
},
/// Insert a new symbol based on the given description before the specified symbol.
/// Inserts a new symbol based on the given description before the specified symbol.
/// This operation adds new content immediately preceding an existing symbol.
InsertSiblingBefore {
/// A full path to the symbol to be rewritten from the provided list.
/// A fully-qualified reference to the symbol, e.g. `mod foo impl Bar pub fn baz` instead of just `fn baz`.
/// The new content will be inserted immediately before this symbol.
symbol: String,
/// A brief one-line description of the change that should be applied.
/// A brief description of the new symbol to be inserted.
description: String,
},
/// Insert a new symbol based on the given description after the specified symbol.
/// Inserts a new symbol based on the given description after the specified symbol.
/// This operation adds new content immediately following an existing symbol.
InsertSiblingAfter {
/// A full path to the symbol to be rewritten from the provided list.
/// A fully-qualified reference to the symbol, e.g. `mod foo impl Bar pub fn baz` instead of just `fn baz`.
/// The new content will be inserted immediately after this symbol.
symbol: String,
/// A brief one-line description of the change that should be applied.
/// A brief description of the new symbol to be inserted.
description: String,
},
/// Insert a new symbol as a child of the specified symbol at the start.
/// Inserts a new symbol as a child of the specified symbol at the start.
/// This operation adds new content as the first child of an existing symbol (or file if no symbol is provided).
PrependChild {
/// An optional full path to the symbol to be rewritten from the provided list.
/// If not provided, the edit should be applied at the top of the file.
/// An optional fully-qualified reference to the symbol after the code you want to insert, e.g. `mod foo impl Bar pub fn baz` instead of just `fn baz`.
/// If provided, the new content will be inserted as the first child of this symbol.
/// If not provided, the new content will be inserted at the top of the file.
symbol: Option<String>,
/// A brief one-line description of the change that should be applied.
/// A brief description of the new symbol to be inserted.
description: String,
},
/// Insert a new symbol as a child of the specified symbol at the end.
/// Inserts a new symbol as a child of the specified symbol at the end.
/// This operation adds new content as the last child of an existing symbol (or file if no symbol is provided).
AppendChild {
/// An optional full path to the symbol to be rewritten from the provided list.
/// If not provided, the edit should be applied at the top of the file.
/// An optional fully-qualified reference to the symbol before the code you want to insert, e.g. `mod foo impl Bar pub fn baz` instead of just `fn baz`.
/// If provided, the new content will be inserted as the last child of this symbol.
/// If not provided, the new content will be applied at the bottom of the file.
symbol: Option<String>,
/// A brief one-line description of the change that should be applied.
/// A brief description of the new symbol to be inserted.
description: String,
},
/// Delete the specified symbol.
/// Deletes the specified symbol from the containing file.
Delete {
/// A full path to the symbol to be rewritten from the provided list.
/// An fully-qualified reference to the symbol to be deleted, e.g. `mod foo impl Bar pub fn baz` instead of just `fn baz`.
symbol: String,
},
}