/** @babel */
import TokenizedBufferIterator from '../src/tokenized-buffer-iterator'
import { Point } from 'text-buffer'
describe('TokenizedBufferIterator', () => {
  describe('seek(position)', function () {
    it('seeks to the leftmost tag boundary greater than or equal to the given position and returns the containing tags', function () {
      const tokenizedBuffer = {
        tokenizedLineForRow (row) {
          if (row === 0) {
            return {
              // Negative integers are scope tags (resolved via
              // `grammar.scopeForId`); non-negative integers are text lengths.
              tags: [-1, -2, -3, -4, -5, 3, -3, -4, -6, -5, 4, -6, -3, -4],
              text: 'foo bar',
              openScopes: []
            }
          } else {
            return null
          }
        },
        grammar: {
          scopeForId (id) {
            return {
              '-1': 'foo', '-2': 'foo',
              '-3': 'bar', '-4': 'bar',
              '-5': 'baz', '-6': 'baz'
            }[id]
          }
        }
      }

      const iterator = new TokenizedBufferIterator(tokenizedBuffer)

      expect(iterator.seek(Point(0, 0))).toEqual([])
      expect(iterator.getPosition()).toEqual(Point(0, 0))
      expect(iterator.getCloseTags()).toEqual([])
      expect(iterator.getOpenTags()).toEqual(['foo'])

      iterator.moveToSuccessor()
      expect(iterator.getCloseTags()).toEqual(['foo'])
      expect(iterator.getOpenTags()).toEqual(['bar'])

      expect(iterator.seek(Point(0, 1))).toEqual(['baz'])
      expect(iterator.getPosition()).toEqual(Point(0, 3))
      expect(iterator.getCloseTags()).toEqual([])
      expect(iterator.getOpenTags()).toEqual(['bar'])

      iterator.moveToSuccessor()
      expect(iterator.getPosition()).toEqual(Point(0, 3))
      expect(iterator.getCloseTags()).toEqual(['bar', 'baz'])
      expect(iterator.getOpenTags()).toEqual(['baz'])

      expect(iterator.seek(Point(0, 3))).toEqual(['baz'])
      expect(iterator.getPosition()).toEqual(Point(0, 3))
      expect(iterator.getCloseTags()).toEqual([])
      expect(iterator.getOpenTags()).toEqual(['bar'])

      iterator.moveToSuccessor()
      expect(iterator.getPosition()).toEqual(Point(0, 3))
      expect(iterator.getCloseTags()).toEqual(['bar', 'baz'])
      expect(iterator.getOpenTags()).toEqual(['baz'])

      iterator.moveToSuccessor()
      expect(iterator.getPosition()).toEqual(Point(0, 7))
      expect(iterator.getCloseTags()).toEqual(['baz'])
      expect(iterator.getOpenTags()).toEqual(['bar'])

      iterator.moveToSuccessor()
      expect(iterator.getPosition()).toEqual(Point(0, 7))
      expect(iterator.getCloseTags()).toEqual(['bar'])
      expect(iterator.getOpenTags()).toEqual([])
      iterator.moveToSuccessor()
      expect(iterator.getPosition()).toEqual(Point(1, 0))
      expect(iterator.getCloseTags()).toEqual([])
      expect(iterator.getOpenTags()).toEqual([])

      expect(iterator.seek(Point(0, 5))).toEqual(['baz'])
      expect(iterator.getPosition()).toEqual(Point(0, 7))
      expect(iterator.getCloseTags()).toEqual(['baz'])
      expect(iterator.getOpenTags()).toEqual(['bar'])

      iterator.moveToSuccessor()
      expect(iterator.getPosition()).toEqual(Point(0, 7))
      expect(iterator.getCloseTags()).toEqual(['bar'])
      expect(iterator.getOpenTags()).toEqual([])
    })
  })
  describe('moveToSuccessor()', function () {
    it('reports two boundaries at the same position when tags close, open, then close again without a non-negative integer separating them (regression)', () => {
      const tokenizedBuffer = {
        tokenizedLineForRow () {
          return {
            tags: [-1, -2, -1, -2],
            text: '',
            openScopes: []
          }
        },
        grammar: {
          scopeForId () {
            return 'foo'
          }
        }
      }

      const iterator = new TokenizedBufferIterator(tokenizedBuffer)

      iterator.seek(Point(0, 0))
      expect(iterator.getPosition()).toEqual(Point(0, 0))
      expect(iterator.getCloseTags()).toEqual([])
      expect(iterator.getOpenTags()).toEqual(['foo'])

      iterator.moveToSuccessor()
      expect(iterator.getPosition()).toEqual(Point(0, 0))
      expect(iterator.getCloseTags()).toEqual(['foo'])
      expect(iterator.getOpenTags()).toEqual(['foo'])

      iterator.moveToSuccessor()
      expect(iterator.getCloseTags()).toEqual(['foo'])
      expect(iterator.getOpenTags()).toEqual([])
    })
    it("reports a boundary at line end if the next line's open scopes don't match the containing tags for the current line", () => {
      const tokenizedBuffer = {
        tokenizedLineForRow (row) {
          if (row === 0) {
            return {
              tags: [-1, 3, -2, -3],
              text: 'bar',
              openScopes: []
            }
          } else if (row === 1) {
            return {
              tags: [3],
              text: 'baz',
              openScopes: [-1]
            }
          } else if (row === 2) {
            return {
              tags: [-2],
              text: '',
              openScopes: [-1]
            }
          }
        },
        grammar: {
          scopeForId (id) {
            if (id === -2 || id === -1) {
              return 'foo'
            } else if (id === -3) {
              return 'qux'
            }
          }
}
}
const iterator = new TokenizedBufferIterator(tokenizedBuffer)
iterator.seek(Point(0, 0))
expect(iterator.getPosition()).toEqual(Point(0, 0))
expect(iterator.getCloseTags()).toEqual([])
expect(iterator.getOpenTags()).toEqual(['foo'])

iterator.moveToSuccessor()
expect(iterator.getPosition()).toEqual(Point(0, 3))
expect(iterator.getCloseTags()).toEqual(['foo'])
expect(iterator.getOpenTags()).toEqual(['qux'])

iterator.moveToSuccessor()
expect(iterator.getPosition()).toEqual(Point(0, 3))
expect(iterator.getCloseTags()).toEqual(['qux'])
expect(iterator.getOpenTags()).toEqual([])

iterator.moveToSuccessor()
expect(iterator.getPosition()).toEqual(Point(1, 0))
expect(iterator.getCloseTags()).toEqual([])
expect(iterator.getOpenTags()).toEqual(['foo'])

iterator.moveToSuccessor()
expect(iterator.getPosition()).toEqual(Point(2, 0))
expect(iterator.getCloseTags()).toEqual(['foo'])
expect(iterator.getOpenTags()).toEqual([])
})
Report boundary when next line's `openScopes` don't match containingTags

Sometimes, when performing an edit, a change on one row can affect
another row's tokenization: the classic example is opening a multi-line
comment on a line, thereby causing subsequent lines to become commented
out without changing the buffer's contents at those locations. We call
this phenomenon a "spill", and detecting it "spill detection".

Since the number of affected lines can grow quite large, Atom tokenizes
synchronously only those lines where the edit occurred, triggering
background (i.e. `setInterval`-based) tokenization for all the other
lines that need to be refreshed because of a spill.

Predictably, this approach causes a temporary inconsistency in the
stored tokenized lines. In particular, suppose we had two tokenized
lines, and that a tag opened in the middle of the first one closes on
the second one. If we perform an edit that causes that tag to be
deleted, then when reading the second tokenized line we now have a
dangling close tag.

This didn't matter much in the `DisplayBuffer` version, because for each
line we reopened all the tags found in the stored `openScopes` property
and closed all the tags starting on that line right at the end of it.
In the `DisplayLayer` world, however, we don't read tags from each
tokenized line; instead, we let `TokenizedBufferIterator` report tag
boundaries and their respective locations. Since this is an
iterator-based approach, we were not reading `openScopes` for each
`TokenizedLine`, making the dangling close tag problem shown above
evident (e.g. close and open tags didn't match anymore, and exceptions
were being thrown all over the place).

To solve this issue I have considered several approaches:

1. Recompute all the lines where a spill occurs synchronously when the
   buffer changes. For large files this can be pretty onerous, and we
   don't want to regress in terms of performance.
2. Let `TokenizedBuffer.tokenizedLineForRow(bufferRow)` recompute
   potentially invalid lines lazily (starting from the first invalid
   line, down to the requested buffer row). When editing the first lines
   of a long file and causing a spill to occur, Atom (or any other
   package, for that matter) could request a line far down in the file,
   causing this method to recompute lots and lots of lines.
3. Let `DisplayLayer` deal with closing an unopened tag. This is nice
   because we already keep track of containing tags there. However, it
   also feels like the wrong place to put this logic, as display layers
   shouldn't deal with grammar-related concerns.
4. Keep track of containing tags in `TokenizedBufferIterator`, and
   report a boundary at the end of the line when the subsequent line's
   `openScopes` property doesn't match the `containingTags` that the
   iterator has been keeping track of.

Of all these solutions I've chosen 4), because it's the most performant
and the cleanest in terms of code.
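The mismatch check described in approach 4) can be sketched roughly as follows. This is a hypothetical illustration, not the actual Atom implementation: the function name `boundaryForLineEnd` and its return shape are made up for this example; only the idea (diffing the iterator's `containingTags` against the next line's `openScopes` at a line boundary) comes from the commit message above.

```javascript
// Hypothetical sketch: at each line boundary, compare the tags the
// iterator believes are still open (`containingTags`) with the next
// line's stored `openScopes`. Any divergence is reported as a boundary
// that closes the stale tags and opens the new ones, so consumers always
// see balanced close/open tags even when a spill left stale lines behind.
function boundaryForLineEnd (containingTags, nextLineOpenScopes) {
  // Find the longest common prefix of the two scope stacks.
  let i = 0
  while (
    i < containingTags.length &&
    i < nextLineOpenScopes.length &&
    containingTags[i] === nextLineOpenScopes[i]
  ) {
    i++
  }
  return {
    // Close the tags the next line no longer considers open, innermost first...
    closeTags: containingTags.slice(i).reverse(),
    // ...and open the scopes the next line expects that aren't open yet.
    openTags: nextLineOpenScopes.slice(i)
  }
}

// Example: an edit deleted the open tag for 'bar' on a previous line, so
// the next line's openScopes no longer mention it; 'bar' must be closed
// at the line boundary to keep tags balanced.
// boundaryForLineEnd(['foo', 'bar'], ['foo']) → { closeTags: ['bar'], openTags: [] }
```

When the two stacks already agree, both returned arrays are empty and no boundary needs to be reported, which matches the happy path where no spill occurred.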
})
})