* Added support for for-from loop, see #3832
* for-from: remove extra newline and add support for ranges
* for-from: tidy up the lexer
* for-from: add support for patterns
* for-from: fix bad alignment
* for-from: add two more tests
* for-from: fix test "for-from loops over generators"
See explanation here: https://github.com/jashkenas/coffeescript/pull/4306#issuecomment-257066877
* for-from: delete leftover console.log
* Refactor the big `if` block in the lexer to be as minimal a change from `master` as we can get away with
* Cleanup to make more idiomatic, remove trailing whitespace, minor performance improvements
* for-from: move code from one file to another
* for-from: clean up whitespace
* for-from: lexer bikeshedding
* Move "own is not supported in for-from loops" test into error_messages.coffee; improve error message so that "own" is underlined
* Revert unnecessary changes, to minimize the lines of code modified by this PR
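For context, here is a rough sketch of the ES2015 `for...of` behavior that `for-from` is meant to expose. My understanding is that `for x from iterable` compiles to `for (x of iterable)`; the generator and variable names below are made up for illustration and do not come from this PR.

```javascript
// Plain ES2015 illustration; none of these names come from the PR.
function* countTo(limit) {
  for (let i = 1; i <= limit; i++) yield i;
}

// Roughly what `for n from countTo 3` iterates over.
const numbers = [];
for (const n of countTo(3)) {
  numbers.push(n);
}

// Roughly what a pattern target such as `for [key, value] from pairs` does.
const pairs = new Map([['a', 1], ['b', 2]]);
const flattened = [];
for (const [key, value] of pairs) {
  flattened.push(`${key}=${value}`);
}

console.log(numbers, flattened);
```

Since `for...of` walks an iterator rather than an object's keys, there is nothing for `own` to filter, which is why it is rejected in `for-from` loops.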
This is an upstream port of https://github.com/decaffeinate/coffeescript/pull/10
See that PR for links to the issues that this fixes.
Just like OUTDENT and CALL_END tokens, close-curly-brace tokens can be generated
without having a real location, and if that position overlaps with a later
token, it can cause the AST to have bad location data. Just like the other two
token types, we now give `}` tokens the position of the previous real token,
which makes all AST nodes have reasonable locations.
This is an upstream port of https://github.com/decaffeinate/coffeescript/pull/9
The existing logic for computing the end location of a string was to take the
end of the string contents, then add the delimiter length to last_column. For
example, `"""abc"""` would have an end position three characters after the `c`.
However, if a string ended in a newline, then the end location for the string
contents would be one line above the end location for the string, so the proper
fix is to move the end location to the next line, not just to shift it to the
right.
This avoids a bug where the location data would sometimes reference a
non-existent location (one past the end of its line). It fixes the AST location
data, although as far as I know, it has never caused correctness issues in the
CoffeeScript output.
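To illustrate the string end-location problem described above, here is a minimal sketch that derives the end position from the full token text instead of adding the delimiter length to `last_column`. The helper `endOfToken` is hypothetical and is not the lexer's actual code; it assumes 0-based, inclusive line and column numbers.

```javascript
// Hypothetical helper, not the actual lexer code.
// Returns the inclusive, 0-based position of the last character of `text`,
// given where the token starts.
function endOfToken(text, firstLine = 0, firstColumn = 0) {
  const lines = text.split('\n');
  const lastText = lines[lines.length - 1];
  return {
    last_line: firstLine + lines.length - 1,
    // Only a single-line token is offset by its starting column.
    last_column: lines.length === 1
      ? firstColumn + lastText.length - 1
      : lastText.length - 1,
  };
}

const singleLine = endOfToken('"""abc"""');     // ends three characters after the `c`
const trailingNewline = endOfToken('"""abc\n"""'); // ends on the next line
console.log(singleLine, trailingNewline);
```

With the trailing newline, the closing delimiter sits on its own line, so the end position moves down a line rather than three columns to the right.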
This is an upstream port for the patch https://github.com/decaffeinate/coffeescript/pull/8
See https://github.com/decaffeinate/decaffeinate/issues/291 for the bug that this fixed.
For the most part, CoffeeScript and JavaScript have the same precedence rules,
but in some cases, the intermediate AST format didn't represent the actual
evaluation order. For example, in the expression `a or b and c`, the `and` is
evaluated first, but the parser treated the two operators with equal precedence.
This was still correct end-to-end because CoffeeScript simply emitted the result
without parens, but any intermediate tools using the CoffeeScript parser could
become confused.
Here are the JS operator precedence rules:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Operator_Precedence
For the most part, CoffeeScript already follows these. `COMPARE` operators
already behave differently due to chained comparisons, so I think we don't need
to worry about following JS precedence for those. So I think the only case where
it was behaving differently in an important way was for the binary/bitwise
operators that are being changed here.
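As a quick check of the JS rules referenced above, the following sketch compares the unparenthesized forms against both possible groupings. It uses only plain JavaScript operators; the variable names are mine.

```javascript
// Logical operators: `&&` binds tighter than `||`.
const implicit = true || false && false;
const andFirst = true || (false && false);
const orFirst = (true || false) && false;

// Bitwise operators: `&` binds tighter than `|`.
const bitImplicit = 1 | 2 & 4;
const bitAndFirst = 1 | (2 & 4);
const bitOrFirst = (1 | 2) & 4;

console.log(implicit === andFirst, implicit === orFirst);          // true false
console.log(bitImplicit === bitAndFirst, bitImplicit === bitOrFirst); // true false
```

A tool that reads an AST where both operators share one precedence level would effectively see the `orFirst` grouping, which is the confusion this change avoids.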
As part of this change, I also introduced a new token tag, `BIN?`, for the
binary form of the `?` operator.
Fixes https://github.com/decaffeinate/decaffeinate/issues/446
In addition to OUTDENT tokens, CALL_END tokens can also be virtual tokens
without a real location, and sometimes they end up with a location that's
incorrect.
This commit adds another post-processing step after normal lexing that sets the
locationData on all OUTDENT tokens to be at the last character of the previous
token. This does feel like a little bit of a hack. Ideally the location data
would be set correctly in the first place and not in a post-processing step, but
I tried that and some temporary intermediate tokens were causing problems, so I
decided to set the location data once those intermediate tokens were removed.
Also, having this as a separate processing step makes it more robust and
isolated.
This fixes the problem in https://github.com/decaffeinate/decaffeinate/issues/371 .
In that issue, the CoffeeScript tokens had three OUTDENT tokens in a row, and
the last two overlapped with the `]`. Since at least one of those OUTDENT tokens
was considered part of the function body, the function expression had an ending
position just after the end of the `]`.
OUTDENT tokens are sort of a weird case in the lexer anyway, since they often
don't correspond to an actual location in the source code. It seems like the
code in `lexer.coffee` makes an attempt at finding a good place for them, but in
some cases, it has a bad result. This seems hard to avoid in the general case.
For example, in this code:
```coffee
[->
  a]
```
There must be an OUTDENT between the `a` and the `]`, but CoffeeScript tokens
have an inclusive start and end, so they must always be at least one character
wide (I think). In this case, the lexer was choosing the `]` as the location,
and the parser ended up generating correct location data, I believe because
it ignores the outermost INDENT and OUTDENT tokens. However, with multiple
OUTDENT tokens in a row, the parser ends up producing location data that is
wrong.
It seems to me like there isn't a solid answer to "what location do OUTDENT
tokens have", since it hasn't mattered much, but for this commit, I'm defining
it: they always have the location of the last character of the previous token.
This should hopefully be fairly safe because tokens are still in the same order
relative to each other. Also, it's worth noting that this makes the start
location for OUTDENT tokens awkward. However, OUTDENT tokens are always used to
mark the end of something, so their `last_line` and `last_column` values are
always what matter when determining AST node bounds, so it is most important for
those to be correct.
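The rule defined above can be sketched as a small pass over the token stream. This is only an illustration under assumptions: it takes tokens shaped like `[tag, value, locationData]`, and the function name `pinOutdentsToPreviousToken` and the sample tokens are hypothetical, not the code in this commit.

```javascript
// Hypothetical sketch of the post-processing step, not the actual patch.
// Each OUTDENT gets the location of the last character of the token before it.
function pinOutdentsToPreviousToken(tokens) {
  const result = [];
  for (const [tag, value, loc] of tokens) {
    const prev = result[result.length - 1];
    if (tag === 'OUTDENT' && prev) {
      // Read from the already-processed token so runs of OUTDENTs chain correctly.
      const { last_line, last_column } = prev[2];
      result.push([tag, value, {
        first_line: last_line,
        first_column: last_column,
        last_line,
        last_column,
      }]);
    } else {
      result.push([tag, value, loc]);
    }
  }
  return result;
}

// Made-up token stream: two OUTDENTs that overlap the closing `]`.
const at = (line, column) => ({
  first_line: line, first_column: column, last_line: line, last_column: column,
});
const sample = [
  ['IDENTIFIER', 'a', at(1, 2)],
  ['OUTDENT', 2, at(1, 3)],
  ['OUTDENT', 2, at(1, 3)],
  [']', ']', at(1, 3)],
];
const pinned = pinOutdentsToPreviousToken(sample);
console.log(pinned.map(([tag, , loc]) => `${tag}@${loc.last_line}:${loc.last_column}`));
```

Token order is unchanged, and only the location data of the OUTDENT tokens differs from the input.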
`"""` (and `"`) strings are lexed into an array of tokens, consisting of
strings and interpolations. Previously, the minimum indentation
inside `"""` strings was stripped from the beginning of _all_ of those
string tokens. Usually, the indentation is longer than any other
sequence of spaces in a `"""` string, so the problem didn't occur in
most cases. This commit makes sure to only strip indentation after
newlines.
Fixes #4314.
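A minimal sketch of the difference for the `"""` indentation fix, assuming a string split into parts around one interpolation. Both helpers are hypothetical approximations of the old and new behavior, not the lexer's code.

```javascript
// Hypothetical approximation of the new behavior:
// remove `indent` only where it directly follows a newline.
function stripIndentAfterNewlines(part, indent) {
  return part.split(`\n${indent}`).join('\n');
}

// Hypothetical approximation of the old behavior:
// additionally remove `indent` from the very start of every part.
function stripIndentEverywhere(part, indent) {
  const stripped = stripIndentAfterNewlines(part, indent);
  return stripped.startsWith(indent) ? stripped.slice(indent.length) : stripped;
}

// The second part starts with two spaces that follow an interpolation,
// so they are content, not indentation.
const secondPart = '  b\n  c';
const fixedResult = stripIndentAfterNewlines(secondPart, '  ');
const oldResult = stripIndentEverywhere(secondPart, '  ');
console.log(JSON.stringify(fixedResult), JSON.stringify(oldResult));
```

The old approach eats the two spaces before `b`, while stripping only after newlines leaves them intact.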
Very large decimal number literals, binary number literals and octal
literals are lexed into an INFINITY token (instead of a NUMBER token)
and compiled into `2e308`. That is supposed to be the case for very
large hexadecimal number literals as well, but previously wasn't.
Before:
$ node -p 'require("./").tokens(`0x${Array(256 + 1).join("f")}`)[0][0]'
NUMBER
After:
$ node -p 'require("./").tokens(`0x${Array(256 + 1).join("f")}`)[0][0]'
INFINITY
This commit also cleans up `numberToken` in lexer.coffee a bit.
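For reference, a small sketch of why these literals overflow. It relies only on JavaScript's `Number` conversion; `tagForNumberLiteral` is a hypothetical stand-in and not the real `numberToken`.

```javascript
// Hypothetical stand-in for the tagging decision, not the real numberToken.
function tagForNumberLiteral(literal) {
  return Number(literal) === Infinity ? 'INFINITY' : 'NUMBER';
}

// Same literal as in the commands above: `0x` followed by 256 `f`s.
const hugeHex = `0x${'f'.repeat(256)}`;
const hugeHexTag = tagForNumberLiteral(hugeHex);
const smallHexTag = tagForNumberLiteral('0xff');

console.log(hugeHexTag, smallHexTag, 2e308 === Infinity);
```

The literal `2e308` is used in the output because it exceeds the largest finite double and therefore evaluates to `Infinity`.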
`isLiteralArguments` mistakenly looked at `Literal`s instead of
`IdentifierLiteral`s.
This also gets rid of the ugly `.asKey` hack in nodes.coffee.
Fixes #4320.
Before:
```
$ cat tmp.coffee.md
test

    a
$ ./bin/coffee tmp.coffee.md
ReferenceError: a is not defined
    at Object.<anonymous> (/src/coffee-script/tmp.coffee.md:2:3)
...
```
Note how the line and column numbers (2 and 3, respectively) are not
correct.
After:
```
$ ./bin/coffee tmp.coffee.md
ReferenceError: a is not defined
    at Object.<anonymous> (/home/lydell/forks/coffee-script/tmp.coffee.md:3:5)
...
```
Line 3, column 5 is the actual position of the `a` in tmp.coffee.md.
Supersedes and fixes #4204.