Commit

Option tmGrammar -> overrides.allowOrphanBackrefs
slevithan committed Nov 22, 2024
1 parent 0b7ebb4 commit dc7f9e5
Showing 7 changed files with 36 additions and 20 deletions.
18 changes: 10 additions & 8 deletions README.md
@@ -77,8 +77,10 @@ type OnigurumaToEsOptions = {
global?: boolean;
hasIndices?: boolean;
maxRecursionDepth?: number | null;
+ overrides?: {
+ allowOrphanBackrefs?: boolean;
+ };
target?: 'auto' | 'ES2025' | 'ES2024' | 'ES2018';
- tmGrammar?: boolean;
verbose?: boolean;
};
```
@@ -215,6 +217,12 @@ Since recursion isn't infinite-depth like in Oniguruma, use of recursion also re
Using a high limit has a small impact on performance. Generally, this is only a problem if the regex has an existing issue with runaway backtracking that recursion exacerbates. Higher limits have no effect on regexes that don't use recursion, so you should feel free to increase this if helpful.
</details>

+ ### `overrides`
+
+ Advanced options that take precedence over standard error checking and flags.
+
+ - `allowOrphanBackrefs`: Useful with TextMate grammar processors that merge backreferences across `begin` and `end` patterns.
+
### `target`

One of `'auto'` *(default)*, `'ES2025'`, `'ES2024'`, or `'ES2018'`.
@@ -235,12 +243,6 @@ JavaScript version used for generated regexes. Using `auto` detects the best val
- Generated regexes might use features that require Node.js 23 or a 2024-era browser (except Safari, which lacks support for flag groups).
</details>

- ### `tmGrammar`
-
- *Default: `false`.*
-
- Leave disabled unless the regex will be used in a TextMate grammar processor that merges backreferences across `begin` and `end` patterns.
-
### `verbose`

*Default: `false`.*
@@ -940,7 +942,7 @@ The following features don't yet have any support, and throw errors. They're all
- Grapheme boundaries: <code>\y</code>, <code>\Y</code>.
- Grapheme boundary options (flags <code>y{g}</code>, <code>y{w}</code>).
- Whole-pattern options: don't capture <code>(?C)</code>, ignore-case is ASCII <code>(?I)</code>, find longest <code>(?L)</code>.
- - Absent repeater <code>(?~…)</code>, expression <code>(?~|…|…)</code>, and range cutter <code>(?~|…)</code>.
+ - Absent repeater <code>(?\~…)</code>, expression <code>(?\~|…|…)</code>, and range cutter <code>(?\~|…)</code>.
- Conditionals: <code>(?(…)…)</code>, <code>(?(…)…|…)</code>.
- Code point sequences: <code>\x{H H …H}</code>, <code>\o{O O …O}</code>.
- Additional, extremely rare ways to specify characters.
4 changes: 2 additions & 2 deletions demo/demo.css
@@ -111,8 +111,8 @@ label .tip-lg {
}

label .tip-xl {
- width: 320px;
- margin-left: -160px;
+ width: 300px;
+ margin-left: -150px;
}

label .tip :is(code, kbd) {
8 changes: 8 additions & 0 deletions demo/demo.js
@@ -20,6 +20,9 @@ const state = {
global: getValue('option-global'),
hasIndices: getValue('option-hasIndices'),
maxRecursionDepth: getValue('option-maxRecursionDepth'),
+ overrides: {
+ allowOrphanBackrefs: getValue('option-allowOrphanBackrefs'),
+ },
target: getValue('option-target'),
verbose: getValue('option-verbose'),
},
@@ -218,3 +221,8 @@ function setOption(option, value) {
state.opts[option] = value;
showTranspiled();
}
+
+ function setOverride(option, value) {
+ state.opts.overrides[option] = value;
+ showTranspiled();
+ }
6 changes: 3 additions & 3 deletions demo/index.html
@@ -118,9 +118,9 @@ <h2>Try it</h2>
</p>
<p>
<label>
- <input type="checkbox" id="option-tmGrammar" onchange="setOption('tmGrammar', this.checked)">
- <code>tmGrammar</code>
- <span class="tip tip-xl">Leave disabled unless the regex will be used in a TextMate grammar processor that merges backrefs across <code>begin</code> and <code>end</code> patterns</span>
+ <input type="checkbox" id="option-allowOrphanBackrefs" onchange="setOverride('allowOrphanBackrefs', this.checked)">
+ <code>allowOrphanBackrefs</code>
+ <span class="tip tip-xl">Useful with TextMate grammar processors that merge backrefs across <code>begin</code> and <code>end</code> patterns</span>
</label>
</p>
</div>
6 changes: 4 additions & 2 deletions src/index.js
@@ -28,8 +28,10 @@ import {recursion} from 'regex-recursion';
global?: boolean;
hasIndices?: boolean;
maxRecursionDepth?: number | null;
+ overrides?: {
+ allowOrphanBackrefs?: boolean;
+ };
target?: keyof Target;
- tmGrammar?: boolean;
verbose?: boolean;
}} OnigurumaToEsOptions
*/
@@ -48,7 +50,7 @@ function toDetails(pattern, options) {
const opts = getOptions(options);
const tokenized = tokenize(pattern, opts.flags);
const onigurumaAst = parse(tokenized, {
- skipBackrefValidation: opts.tmGrammar,
+ skipBackrefValidation: opts.overrides.allowOrphanBackrefs,
verbose: opts.verbose,
});
const regexAst = transform(onigurumaAst, {
10 changes: 7 additions & 3 deletions src/options.js
@@ -49,12 +49,16 @@ function getOptions(options) {
// your environment. Later targets allow faster processing, simpler generated source, and
// support for additional features.
target: 'auto',
- // Leave disabled unless the regex will be used in a TextMate grammar processor that merges
- // backreferences across `begin` and `end` patterns.
- tmGrammar: false,
// Disables optimizations that simplify the pattern when it doesn't change the meaning.
verbose: false,
...options,
+ // Advanced options that take precedence over standard error checking and flags.
+ overrides: {
+ // Useful with TextMate grammar processors that merge backreferences across `begin` and `end`
+ // patterns.
+ allowOrphanBackrefs: false,
+ ...(options?.overrides),
+ },
};
if (opts.target === 'auto') {
opts.target = (envSupportsDuplicateNames && envSupportsFlagGroups) ?
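Note on the `src/options.js` hunk: the nested `overrides` object is rebuilt after `...options`, so a caller who passes a partial or empty `overrides` still gets the nested defaults. A minimal sketch of that merge pattern follows. The function here is a simplified, hypothetical stand-in that models only a few options; it is not the library's full `getOptions`.

```javascript
// Simplified, hypothetical stand-in for `getOptions`; only a few options are modeled.
function getOptions(options) {
  return {
    // Top-level defaults
    target: 'auto',
    verbose: false,
    // Caller-supplied top-level values replace the defaults above
    ...options,
    // Rebuilt after the spread so a partial `overrides` object can't drop nested defaults
    overrides: {
      allowOrphanBackrefs: false,
      ...(options?.overrides),
    },
  };
}

const noArgs = getOptions();
const emptyOverrides = getOptions({verbose: true, overrides: {}});
const enabled = getOptions({overrides: {allowOrphanBackrefs: true}});

console.log(noArgs.overrides.allowOrphanBackrefs); // false
console.log(emptyOverrides.overrides.allowOrphanBackrefs); // false
console.log(enabled.overrides.allowOrphanBackrefs); // true
```

If `overrides` were left to the top-level spread alone, passing `overrides: {}` would replace the default object wholesale and `allowOrphanBackrefs` would come back `undefined` rather than `false`.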
4 changes: 2 additions & 2 deletions src/transform.js
@@ -505,7 +505,7 @@ const ThirdPassVisitor = {
Backreference({node, replaceWith}, state) {
if (node.orphan) {
state.highestOrphanBackref = Math.max(state.highestOrphanBackref, node.ref);
- // Don't renumber; used with option `tmGrammar`
+ // Don't renumber; used with `allowOrphanBackrefs`
return;
}
const reffedNodes = state.reffedNodesByReferencer.get(node);
Expand Down Expand Up @@ -566,7 +566,7 @@ const ThirdPassVisitor = {
// exist within `end`. This presents a dilemma since both Oniguruma and JS (with flag u/v)
// error for backrefs to undefined captures. So adding captures to the end is a solution that
// doesn't change what the regex matches, and lets invalid numbered backrefs through. Note:
- // Orphan backrefs are only allowed if the `tmGrammar` option is used
+ // Orphan backrefs are only allowed if `allowOrphanBackrefs` is enabled
const numCapsNeeded = Math.max(state.highestOrphanBackref - state.numCapturesToLeft, 0);
for (let i = 0; i < numCapsNeeded; i++) {
const emptyCapture = createCapturingGroup();
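Note on the `src/transform.js` comment above: the reasoning about appending empty captures can be illustrated at the string level. The sketch below is an assumption-laden illustration only. `padWithEmptyCaptures` and `compiles` are hypothetical helpers, and the library itself works on an AST rather than on pattern source text.

```javascript
// Hypothetical string-level helper; the library edits an AST, not source text.
function padWithEmptyCaptures(pattern, numCaptures, highestOrphanBackref) {
  // Same arithmetic as `numCapsNeeded` in the hunk above
  const numCapsNeeded = Math.max(highestOrphanBackref - numCaptures, 0);
  return pattern + '()'.repeat(numCapsNeeded);
}

// Reports whether a pattern is accepted by the JS engine with flag u
function compiles(pattern) {
  try {
    new RegExp(pattern, 'u');
    return true;
  } catch {
    return false;
  }
}

// An `end`-style pattern whose `\1` refers to a capture defined elsewhere
const endPattern = '</\\1>';
const padded = padWithEmptyCaptures(endPattern, 0, 1);

console.log(compiles(endPattern)); // false: flag u rejects a backref to an undefined capture
console.log(padded); // </\1>()
console.log(compiles(padded)); // true: the appended empty capture makes `\1` valid
```

The appended group is empty and sits at the very end, so it consumes nothing; it only raises the capture count enough for the numbered backreference to be accepted.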