Since parsing is done in a separate thread, and it works with pointers, we need to ensure that the grammar is retained for at least as long as the thread lives.
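
A minimal sketch of the lifetime rule described above, assuming the grammar is held in a std::shared_ptr. The grammar_t struct, the grammar_ptr alias, and parse_async() are hypothetical stand-ins, not the actual types or functions. The worker captures the pointer by value, so it keeps its own reference until it finishes:

```cpp
#include <cassert>
#include <future>
#include <memory>
#include <string>

// Hypothetical stand-in for the real grammar type.
struct grammar_t
{
    std::string scope;
};

using grammar_ptr = std::shared_ptr<grammar_t>;

// Hypothetical helper: starts "parsing" on another thread.
// The lambda captures the shared_ptr by value, so the grammar stays alive
// until the worker is done, even if the caller drops its own reference.
std::future<std::string> parse_async (grammar_ptr grammar, std::string line)
{
    assert(grammar);
    return std::async(std::launch::async, [grammar, line]{
        return grammar->scope + ": " + line;
    });
}
```

The caller can release its reference right after starting the work. The grammar is destroyed only when the last holder, here the worker, lets go of it.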
It seems to be a general trend with network file systems to return wrong errors when they do not support a certain feature (like extended attributes and atomic swap).
A grammar_t instance now deep-copies potential grammars it includes and each call to parse_grammar() returns a new unique instance.
The latter allows mutating the grammar (by the parser) and the former ensures that grammars are not left with expired pointers (to other grammars) when bundle items are updated.
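
The two properties can be illustrated with a small sketch. The layout of grammar_t below and the make_instance() helper are assumptions made for illustration and do not reflect the real class or the signature of parse_grammar(). The copy constructor clones every included grammar, and each call to the helper hands out a fresh instance:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified grammar: a scope name plus owned included grammars.
struct grammar_t
{
    std::string scope;
    std::vector<std::unique_ptr<grammar_t>> includes;

    explicit grammar_t (std::string s) : scope(std::move(s)) { }

    // Deep copy: every included grammar is cloned, so the copy holds no
    // pointers into grammars owned by anyone else.
    grammar_t (grammar_t const& rhs) : scope(rhs.scope)
    {
        for(auto const& included : rhs.includes)
            includes.push_back(std::make_unique<grammar_t>(*included));
    }

    grammar_t& operator= (grammar_t const&) = delete;
};

// Hypothetical helper: each call returns a new, independent instance.
std::unique_ptr<grammar_t> make_instance (grammar_t const& prototype)
{
    return std::make_unique<grammar_t>(prototype);
}
```

Because instances share nothing, a parser may mutate its own copy, and replacing the prototype leaves existing instances valid.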
Using GCD actually makes the code slower. This might have to do with locking overhead from std::shared_ptr and onig_region_new/region_free.
Worth trying again once use of std::shared_ptr has been removed from the parser, and oniguruma regions are preallocated.
Previously we had to test if the patterns contained \A, \G, or \z, and if so, rewrite those anchors based on whether or not the current line/match position could match them.
This is instead of keeping a std::set with rule identifiers. Keeping the information in the grammar is a lot faster (about 25%) as we can update the status in O(1) without any memory allocation.
The downside is that the grammar is now being mutated by the parser. This is currently safe because only a single thread is used for parsing. When we switch to allowing multiple threads to perform parsing, we should make a copy of the grammar for each instance.
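
To make the trade-off concrete, here is a rough sketch that contrasts the two bookkeeping strategies. The rule_t struct and the function names are hypothetical, and the real parser tracks more state than this. The set-based variant may allocate a node on every insert, while the flag-based variant only toggles a bool stored in the rule:

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical rule: an identifier plus a flag stored in the grammar itself.
struct rule_t
{
    std::size_t id;
    bool visited = false;
};

// Set-based bookkeeping: O(log n) per insert, and each new identifier
// typically costs a heap allocation for the tree node.
bool enter_with_set (std::set<std::size_t>& seen, rule_t const& rule)
{
    return seen.insert(rule.id).second;
}

// Flag-based bookkeeping: O(1) and no allocation, but it mutates the rule,
// so concurrent parsers would each need their own copy of the grammar.
bool enter_with_flag (rule_t& rule)
{
    if(rule.visited)
        return false;
    rule.visited = true;
    return true;
}

// Clears the flag again when the parser is done with the rule.
void leave (rule_t& rule)
{
    rule.visited = false;
}
```

Both variants answer the same question, whether the rule was already entered, and return false on a repeated visit.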
Another downside is that we only tag rules that have begin/match patterns. Rules that are wrappers for a set of rules, or that include another rule, are never rejected, even if already visited, although the target rules they resolve to will be. However, if an include (indirectly) includes itself, we will no longer break the cycle. This is clearly a bug in the grammar if it happens, and we could preprocess the grammar to catch it.
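
One possible shape for such a preprocessing step is a depth-first search over the include graph. This is only a sketch: the include_graph_t representation and the function names are hypothetical, and it assumes the graph contains only the wrapper/include rules (those without begin/match patterns), since cycles that pass through tagged rules are already broken by the flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical representation: rule i includes the rules listed in graph[i].
using include_graph_t = std::vector<std::vector<std::size_t>>;

enum class mark_t { unvisited, in_progress, done };

// Returns true if a cycle is reachable from `rule`.
static bool visit (include_graph_t const& graph, std::size_t rule, std::vector<mark_t>& marks)
{
    if(marks[rule] == mark_t::in_progress)
        return true;  // reached a rule that is still on the current path
    if(marks[rule] == mark_t::done)
        return false; // already fully explored, no cycle through here

    marks[rule] = mark_t::in_progress;
    for(std::size_t target : graph[rule])
    {
        assert(target < graph.size());
        if(visit(graph, target, marks))
            return true;
    }
    marks[rule] = mark_t::done;
    return false;
}

// Returns true if any include (directly or indirectly) includes itself.
bool has_include_cycle (include_graph_t const& graph)
{
    std::vector<mark_t> marks(graph.size(), mark_t::unvisited);
    for(std::size_t i = 0; i < graph.size(); ++i)
    {
        if(visit(graph, i, marks))
            return true;
    }
    return false;
}
```

Running this once when the grammar is loaded would allow the cycle to be reported as a grammar bug up front, rather than detected during parsing.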
This makes loading time roughly twice as fast, although some of the speed gain is because we no longer need to convert CFPropertyListRef → plist::any_t.
Previously this had to be done via global constructor functions but it would seem the execution of these may happen before initialization of global data.
Building with clang 3.2 (binary distribution) results in “illegal instruction” when adding 3 seconds to the steady clock, so I just replaced the code.
The reason the code was using std::chrono was to be able to provide better debug output, as dispatch_time_t is an abstracted time representation (according to the documentation).
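
For reference, a small sketch of the kind of std::chrono code this refers to. The helper names are hypothetical and this is not the code that was replaced. A steady_clock deadline can be subtracted from the current time and printed in concrete units, which an opaque dispatch_time_t does not directly offer:

```cpp
#include <chrono>
#include <string>

using steady = std::chrono::steady_clock;

// Hypothetical helper: computes a deadline a given number of seconds from `now`.
steady::time_point deadline_after (steady::time_point now, std::chrono::seconds delay)
{
    return now + delay;
}

// Hypothetical helper: formats the time left until the deadline for debug output.
std::string describe_remaining (steady::time_point now, steady::time_point deadline)
{
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(deadline - now).count();
    return std::to_string(ms) + " ms remaining";
}
```

With GCD, a comparable deadline would be obtained from dispatch_time(DISPATCH_TIME_NOW, 3 * NSEC_PER_SEC), at the cost of losing the readable duration arithmetic.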