# html-tools

A lightweight HTML tokenizer and parser that outputs the HTMLjs object representation. Special hooks allow the syntax to be extended to parse an HTML-like template language such as Spacebars.

```
HTMLTools.parseFragment("<div class=greeting>Hello<br>World</div>")

=> HTML.DIV({'class':'greeting'}, "Hello", HTML.BR(), "World")
```

This package is used by the Spacebars compiler, which normally runs only at bundle time but can also be used at runtime on the client or server.

## Invoking the Parser

`HTMLTools.parseFragment(input, options)` - Takes an input string or Scanner object and returns HTMLjs.

In the basic case, where no options are passed, `parseFragment` will consume the entire input (the full string or the rest of the Scanner).

The options are as follows:

#### getTemplateTag

This option extends the HTML parser to parse template tags such as `{{foo}}`.

`getTemplateTag: function (scanner, templateTagPosition) { ... }` - A function for the parser to call after every HTML token and at various positions within tags. If the function returns an instance of `HTMLTools.TemplateTag`, it is inserted into the HTMLjs tree at the appropriate location. The constructor is `HTMLTools.TemplateTag(props)`, where `props` is an object whose properties are copied to the `TemplateTag` instance. You can also call the constructor with no arguments and assign whatever properties you want, or you can subclass `TemplateTag`.

There are four possible outcomes when `getTemplateTag` is called:

* Not a template tag - Leave the scanner as is, and return `null`. A quick peek at the next character should bail to this case if the start of a template tag is not seen.
* Bad template tag - Call `scanner.fatal`, which aborts parsing completely. Once the beginning of a template tag is seen, `getTemplateTag` will generally want to commit, and either succeed or fail trying.
* Good template tag - Advance the scanner to the end of the template tag and return an `HTMLTools.TemplateTag` object.
* Comment tag - Advance the scanner and return `null`. For example, a Spacebars comment is `{{! foo}}`.

The `templateTagPosition` argument to `getTemplateTag` is one of:

* `HTMLTools.TEMPLATE_TAG_POSITION.ELEMENT` - At "element level," meaning somewhere an HTML tag could be.
* `HTMLTools.TEMPLATE_TAG_POSITION.IN_START_TAG` - Inside a start tag, as in `<div {{foo}}>`, where you might otherwise find `name=value`.
* `HTMLTools.TEMPLATE_TAG_POSITION.IN_ATTRIBUTE` - Inside the value of an HTML attribute, as in `<div class={{foo}}>`.
* `HTMLTools.TEMPLATE_TAG_POSITION.IN_RCDATA` - Inside a TEXTAREA or a block helper inside an attribute, where character references are allowed ("replaced character data") but not tags.
* `HTMLTools.TEMPLATE_TAG_POSITION.IN_RAWTEXT` - In a context where character references are not parsed, such as a script tag, style tag, or markdown helper.

It's completely normal for `getTemplateTag` to invoke `HTMLTools.parseFragment` recursively on the same scanner (see `shouldStop`). If it does so, the same value of `getTemplateTag` must be passed to the second invocation.

At the moment, template tags must begin with `{`. The parser does not try calling `getTemplateTag` for every character of an HTML document, only at token boundaries, and it knows to always end a token at `{`.

#### textMode

The `textMode` option, if present, causes the parser to parse text (such as the contents of a `