As evidenced in #3804, commit 8ab15d7 broke the CoffeeScript API. The REPL uses
those APIs, but wasn't updated in that commit. Still, that shouldn't have
_broken_ the REPL. The reason it broke is because the added _option_
'referencedVars' wasn't actually _optional;_ if it was omitted code that relies
on it being set broke. This commit defaults that option to an empty array, which
makes things behave exactly like before when the 'referencedVars' option is
omitted.
Supersedes #3805. Here is a comparison of master, #3805 and this commit:
# master
$ bin/coffee
coffee> 1 %% 2
TypeError: Array.prototype.indexOf called on null or undefined
# #3805
$ bin/coffee
coffee> 1 %% 2
1
coffee> (_results = null; i) for i in [1, 2, 3]
TypeError: Cannot call method 'push' of null
# this commit
$ bin/coffee
coffee> 1 %% 2
1
coffee> (_results = null; i) for i in [1, 2, 3]
[ 1, 2, 3 ]
Instead of compiling to `"" + + (+"-");`, `"#{+}-"'` now gives an appropriate
error message:
[stdin]:1:5: error: unexpected end of interpolation
"#{+}-"
^
This is done by _always_ (instead of just sometimes) wrapping the interpolations
in parentheses in the lexer. Unnecessary parentheses won't be output anyway.
I got tired of updating the tests in test/location.coffee (which I had enough of
in #3770), which relies on implementation details (the exact amount of tokens
generated for a given string of code) to do their testing, so I refactored them
to be less fragile.
Since zaach/jison commit 3548861b, `parser.lexer` is never modified anymore (a
copy of it is made, and that copy is modified instead). CoffeeScript itself
modifies `parser.lexer` and then accesses those modifications in the custom
`parser.yy.parseError` function, but that of course does not work anymore. This
commit puts the data that `parser.yy.parseError` needs directly on the `parser`
so that it is not lost.
Supersedes #3603. Fixes#3608 and zaach/jison#243.
Using the static property `Scope.root` for the top-level scope of a file is a
hack, which makes it impossible to have several independent `Scope` instances
at the same time (should we ever need that).
This commit makes every instance have a reference to its root instead.
Any variables generated by CoffeeScript are now made sure to be named to
something not present in the source code being compiled. This way you can no
longer interfere with them, either on purpose or by mistake. (#1500, #1574)
For example, `({a}, _arg) ->` now compiles correctly. (#1574)
As opposed to the somewhat complex implementations discussed in #1500, this
commit takes a very simple approach by saving all used variables names using a
single pass over the token stream. Any generated variables are then made sure
not to exist in that list.
`(@a) -> a` used to be equivalent to `(@a) -> @a`, but now throws a runtime
`ReferenceError` instead (unless `a` exists in an upper scope of course). (#3318)
`(@a) ->` used to compile to `(function(a) { this.a = a; })`. Now it compiles to
`(function(_at_a) { this.a = _at_a; })`. (But you cannot access `_at_a` either,
of course.)
Because of the above, `(@a, a) ->` is now valid; `@a` and `a` are not duplicate
parameters.
Duplicate this-parameters with a reserved word, such as `(@case, @case) ->`,
used to compile but now throws, just like regular duplicate parameters.
Allow the `by c` part in `for [a..b] by c then`.
Continue disallowing a `when d` part, since it makes no sense having a guard
that isn't given access to anything that changes on every iteration.
A regex may not follow a specific set of tokens. These were already known before
in the `NOT_REGEX` and `NOT_SPACED_REGEX` arrays. (However, I've refactored them
to be more correct and to add a few missing tokens). In all other cases (except
after a spaced callable) a slash is the start of a regex, and may now start with
a space or an equals sign. It’s really that simple!
A slash after a spaced callable is the only ambigous case. We cannot know if
that's division or function application with a regex as the argument. The
spacing determines which is which:
Space on both sides:
- `a / b/i` -> `a / b / i`
- `a /= b/i` -> `a /= b / i`
No spaces:
- `a/b/i` -> `a / b / i`
- `a/=b/i` -> `a /= b / i`
Space on the right side:
- `a/ b/i` -> `a / b / i`
- `a/= b/i` -> `a /= b / i`
Space on the left side:
- `a /b/i` -> `a(/b/i)`
- `a /=b/i` -> `a(/=b/i)`
The last case used to compile to `a /= b / i`, but that has been changed to be
consistent with the `/` operator. The last case really looks like a regex, so it
should be parsed as one.
Moreover, you may now also space the `/` and `/=` operators with other
whitespace characters than a space (such as tabs and non-breaking spaces) for
consistency.
Lastly, unclosed regexes are now reported as such, instead of generating some
other confusing error message.
It should perhaps also be noted that apart from escaping (such as `a /\ b/`) you
may now also use parentheses to disambiguate division and regex: `a (/ b/)`. See
https://github.com/jashkenas/coffeescript/issues/3182#issuecomment-26688427.
Before commit c056c93e `Op::isComplex()` used to return true always. As far as I
understand, that commit attempts to exclude code such as `+1` and `-2` from
being marked as complex (and thus getting cached into `_ref` variables
sometimes). CoffeeScript is supposed to generate readable output so that choice
is understandable. However, it also excludes code such as `+a` (by mistake I
believe), which can cause `a` to be coerced multiple times. This commit fixes
this by only excluding unary + and - ops followed by a number.
It is possible to match only valid JavaScript identifiers with a really long
regex (like coco and CoffeeScriptRedux does), but CoffeeScript uses a much
simpler one, which allows a bit too much.
Quoting jashkenas/coffeescript#1718 #issuecomment-2152464 @jashkenas:
> But it still seems very much across the "worth it" line. You'll get the
> SyntaxError as soon as it hits JS, and performance aside -- even the increase
> in filesize for our browser coffee-script.js lib seems too much, considering
> this is something no one ever does, apart from experimentation.
In short, CoffeeScript treats any non-ASCII character as part of an identifier.
However, unicode spaces should be excluded since having blank characters as part
of a _word_ is very confusing. This commit does so, while still keeping the
regex really simple.
Previously such errors pointed at the end of the input, which wasn't very
helpful. This is also consistent with unclosed strings, where the errors point
at the opening quote.
Note that this includes unclosed #{ (interpolations).
- Fix#3394: Unclosed single-quoted strings (both regular ones and heredocs)
used to pass through the lexer, causing a parsing error later, while
double-quoted strings caused an error already in the lexing phase. Now both
single and double-quoted unclosed strings error out in the lexer (which is the
more logical option) with consistent error messages. This also fixes the last
comment by @satyr in #3301.
- Similar to the above, unclosed heregexes also used to pass through the lexer
and not error until in the parsing phase, which resulted in confusing error
messages. This has been fixed, too.
- Fix#3348, by adding passing tests.
- Fix#3529: If a string starts with an interpolation, an empty string is no
longer emitted before the interpolation (unless it is needed to coerce the
interpolation into a string).
- Block comments cannot contain `*/`. Now the error message also shows exactly
where the offending `*/`. This improvement might seem unrelated, but I had to
touch that code anyway to refactor string and regex related code, and the
change was very trivial. Moreover, it's consistent with the next two points.
- Regexes cannot start with `*`. Now the error message also shows exactly where
the offending `*` is. (It might actually not be exatly at the start in
heregexes.) It is a very minor improvement, but it was trivial to add.
- Octal escapes in strings are forbidden in CoffeeScript (just like in
JavaScript strict mode). However, this used to be the case only for regular
strings. Now they are also forbidden in heredocs. Moreover, the errors now
point at the offending octal escape.
- Invalid regex flags are no longer allowed. This includes repeated modifiers
and unknown ones. Moreover, invalid modifiers do not stop a heregex from
being matched, which results in better error messages.
- Fix#3621: `///a#{1}///` compiles to `RegExp("a" + 1)`. So does
`RegExp("a#{1}")`. Still, those two code snippets used to generate different
tokens, which is a bit weird, but more importantly causes problems for
coffeelint (see clutchski/coffeelint#340). This required lots of tests in
test/location.coffee to be updated. Note that some updates to those tests are
unrelated to this point; some have been updated to be more consistent (I
discovered this because the refactored code happened to be seemingly more
correct).
- Regular regex literals used to erraneously allow newlines to be escaped,
causing invalid JavaScript output. This has been fixed.
- Heregexes may now be completely empty (`//////`), instead of erroring out with
a confusing message.
- Fix#2388: Heredocs and heregexes used to be lexed simply, which meant that
you couldn't nest a heredoc within a heredoc (double-quoted, that is) or a
heregex inside a heregex.
- Fix#2321: If you used division inside interpolation and then a slash later in
the string containing that interpolation, the division slash and the latter
slash was erraneously matched as a regex. This has been fixed.
- Indentation inside interpolations in heredocs no longer affect how much
indentation is removed from each line of the heredoc (which is more
intuitive).
- Whitespace is now correctly trimmed from the start and end of strings in a few
edge cases.
- Last but not least, the lexing of interpolated strings now seems to be more
efficient. For a regular double-quoted string, we used to use a custom
function to find the end of it (taking interpolations and interpolations
within interpolations etc. into account). Then we used to re-find the
interpolations and recursively lex their contents. In effect, the same string
was processed twice, or even more in the case of deeper nesting of
interpolations. Now the same string is processed just once.
- Code duplication between regular strings, heredocs, regular regexes and
heregexes has been reduced.
- The above two points should result in more easily read code, too.