# html-tools
A lightweight HTML tokenizer and parser that outputs the HTMLjs
object representation. Special hooks allow the syntax to be extended
to parse an HTML-like template language such as Spacebars.
```
HTMLTools.parseFragment("<div class=greeting>Hello<br>World</div>")
=> HTML.DIV({'class':'greeting'}, "Hello", HTML.BR(), "World")
```
This package is used by the Spacebars compiler, which normally only
runs at bundle time but can also be used at runtime on the client or
server.
## Invoking the Parser
`HTMLTools.parseFragment(input, options)` - Takes an input string or Scanner object and returns HTMLjs.
In the basic case, where no options are passed, `parseFragment` will consume the entire input (the full string or the rest of the Scanner).
The options are as follows:
#### getTemplateTag
This option extends the HTML parser to parse template tags such as `{{foo}}`.
`getTemplateTag: function (scanner, templateTagPosition) { ... }` - A function for the parser to call after every HTML token and at various positions within tags. If the function returns an instance of `HTMLTools.TemplateTag`, it is inserted into the HTMLjs tree at the appropriate location. The constructor is `HTMLTools.TemplateTag(props)`, where `props` is an object whose properties are copied to the `TemplateTag` instance. You can also call the constructor with no arguments and assign whatever properties you want, or you can subclass `TemplateTag`.
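The three ways of building a tag described above can be sketched as follows. This is a minimal illustration that runs outside Meteor: the `TemplateTag` class here is a stand-in assumed to behave like `HTMLTools.TemplateTag` (copying `props` onto the instance), and the property names `type`, `path`, and `value` as well as the `MyTag` subclass are hypothetical.

```javascript
// Stand-in for HTMLTools.TemplateTag so the sketch runs outside Meteor.
// Assumption: like the real constructor, it copies `props` onto the instance.
class TemplateTag {
  constructor(props) {
    if (props) Object.assign(this, props);
  }
}

// 1. Pass properties to the constructor (property names are illustrative).
const a = new TemplateTag({ type: 'DOUBLE', path: ['foo'] });

// 2. Construct with no arguments, then assign properties afterwards.
const b = new TemplateTag();
b.type = 'COMMENT';
b.value = 'a note';

// 3. Subclass and add behavior; instances still pass the instanceof check
//    that the parser relies on.
class MyTag extends TemplateTag {
  describe() {
    return this.type + ':' + this.path.join('.');
  }
}
const c = new MyTag({ type: 'DOUBLE', path: ['user', 'name'] });
```

Whichever form is used, what matters to the parser is only that the returned object is an instance of `TemplateTag`; the properties themselves are for the consumer of the HTMLjs tree.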
There are four possible outcomes when `getTemplateTag` is called:
* Not a template tag - Leave the scanner as is, and return `null`. A quick peek at the next character should bail to this case if the start of a template tag is not seen.
* Bad template tag - Call `scanner.fatal`, which aborts parsing completely. Once the beginning of a template tag is seen, `getTemplateTag` will generally want to commit, and either succeed or fail trying.
* Good template tag - Advance the scanner to the end of the template tag and return an `HTMLTools.TemplateTag` object.
* Comment tag - Advance the scanner and return `null`. For example, a Spacebars comment is `{{! foo}}`.
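The four outcomes can be sketched with a toy `getTemplateTag` that recognizes only `{{name}}` and `{{! comment}}`. This is an illustration under stated assumptions, not the Spacebars implementation: `StubScanner` is a hypothetical minimal scanner (only `fatal` is named in the text above; `input`, `pos`, and `rest` are assumed here), `TemplateTag` is a stand-in for `HTMLTools.TemplateTag`, and the `name` property is made up for the example.

```javascript
// Stand-in for HTMLTools.TemplateTag (assumption: props are copied onto the instance).
class TemplateTag {
  constructor(props) {
    if (props) Object.assign(this, props);
  }
}

// Hypothetical minimal scanner: an input string plus a current position.
class StubScanner {
  constructor(input) {
    this.input = input;
    this.pos = 0;
  }
  rest() {
    return this.input.slice(this.pos);
  }
  fatal(msg) {
    // Abort parsing completely by throwing.
    throw new Error(msg + ' (at offset ' + this.pos + ')');
  }
}

function getTemplateTag(scanner, templateTagPosition) {
  const rest = scanner.rest();

  // Outcome 1: not a template tag. A quick peek, scanner left untouched.
  if (rest.slice(0, 2) !== '{{') return null;

  // From here on we are committed: succeed or fail trying.
  const end = rest.indexOf('}}');
  if (end === -1) {
    // Outcome 2: bad template tag.
    scanner.fatal('Unclosed template tag');
  }
  const body = rest.slice(2, end).trim();

  // Outcome 4: comment tag. Advance the scanner, emit nothing.
  if (body.charAt(0) === '!') {
    scanner.pos += end + 2;
    return null;
  }

  // Outcome 2 again: opened and closed, but the contents are not a valid name.
  if (!/^[A-Za-z_][\w.]*$/.test(body)) {
    scanner.fatal('Invalid template tag: ' + body);
  }

  // Outcome 3: good template tag. Advance past `}}` and return the tag.
  scanner.pos += end + 2;
  return new TemplateTag({ name: body });
}
```

Note that both the "not a template tag" and "comment tag" cases return `null`; the difference is whether the scanner position has moved.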
The `templateTagPosition` argument to `getTemplateTag` is one of:
* `HTMLTools.TEMPLATE_TAG_POSITION.ELEMENT` - At "element level," meaning somewhere an HTML tag could be.
* `HTMLTools.TEMPLATE_TAG_POSITION.IN_START_TAG` - Inside a start tag, as in `<div {{foo}}>`, where you might otherwise find `name=value`.
* `HTMLTools.TEMPLATE_TAG_POSITION.IN_ATTRIBUTE` - Inside the value of an HTML attribute, as in `<div class={{foo}}>`.
* `HTMLTools.TEMPLATE_TAG_POSITION.IN_RCDATA` - Inside a TEXTAREA or a block helper inside an attribute, where character references are allowed ("replaced character data") but not tags.
* `HTMLTools.TEMPLATE_TAG_POSITION.IN_RAWTEXT` - In a context where character references are not parsed, such as a script tag, style tag, or markdown helper.
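A `getTemplateTag` implementation might use the position to reject tags that make no sense in a given context and to record where each tag was found. The sketch below assumes a hypothetical policy (no block tags inside a start tag) and hypothetical tag properties `name`, `isBlock`, and `position`; the `TEMPLATE_TAG_POSITION` object is a stand-in for `HTMLTools.TEMPLATE_TAG_POSITION`, whose real values may not be strings.

```javascript
// Stand-in for HTMLTools.TEMPLATE_TAG_POSITION. Only identity comparison
// against these values is relied on; the real values may not be strings.
const TEMPLATE_TAG_POSITION = Object.freeze({
  ELEMENT: 'ELEMENT',
  IN_START_TAG: 'IN_START_TAG',
  IN_ATTRIBUTE: 'IN_ATTRIBUTE',
  IN_RCDATA: 'IN_RCDATA',
  IN_RAWTEXT: 'IN_RAWTEXT'
});

// Hypothetical policy: a block tag cannot stand in for a `name=value` pair,
// so reject it inside a start tag. Otherwise remember the position on the
// tag so later stages can choose how to treat its output.
function checkTagPosition(tag, position) {
  if (position === TEMPLATE_TAG_POSITION.IN_START_TAG && tag.isBlock) {
    throw new Error(
      'Block tag "' + tag.name + '" is not allowed inside a start tag');
  }
  tag.position = position;
  return tag;
}
```

In a real implementation the thrown error would more likely go through `scanner.fatal`, so that the message carries the location in the input.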
It's completely normal for `getTemplateTag` to invoke `HTMLTools.parseFragment` recursively on the same scanner (see `shouldStop`). If it does so, the same value of `getTemplateTag` must be passed to the second invocation.
At the moment, template tags must begin with `{`. The parser does not try calling `getTemplateTag` for every character of an HTML document, only at token boundaries, and it knows to always end a token at `{`.
#### textMode
The `textMode` option, if present, causes the parser to parse text (such as the contents of a `