feature #4487 Support underscores in number literals (smnandre)
This PR was squashed before being merged into the 3.x branch.

Discussion
----------

Support underscores in number literals

```twig
{{ 1000 == 1_000 ? 'yes' : 'no' }}
# now: syntax error
# this PR: "yes"
```

> As of PHP 7.4.0, integer literals may contain underscores (_) between digits, for better readability of literals. These underscores are removed by PHP's scanner.

https://www.php.net/manual/en/language.types.integer.php
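
For reference, the PHP behaviour quoted above can be checked with a short snippet. It is not part of the PR, the variable names are only illustrative, and it assumes PHP 7.4 or later:

```php
<?php
// Underscores between digits are dropped by PHP's scanner (PHP >= 7.4).
$int = 1_000;           // same literal as 1000
$float = 3_141.592_65;  // same literal as 3141.59265

var_dump($int === 1000);          // bool(true)
var_dump($float === 3141.59265);  // bool(true)
```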

This PR replicates that behaviour: a regexp matches the number literals, and the "_" characters are then removed from the matched text.

I'm targeting **Twig 4**, but maybe 3.x would be OK?

I cannot think of a real case that both
- does not trigger an error currently, and
- would work differently after this PR.

Commits
-------

da4d966 Support underscores in number literals
fabpot committed Dec 2, 2024
2 parents 6cc64f2 + da4d966 commit d894f92
Showing 6 changed files with 52 additions and 9 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG
@@ -1,6 +1,6 @@
# 3.17.0 (2024-XX-XX)

-* n/a
+* Support underscores in number literals

# 3.16.0 (2024-11-29)

5 changes: 3 additions & 2 deletions doc/templates.rst
@@ -612,7 +612,8 @@ exist:

* ``42`` / ``42.23``: Integers and floating point numbers are created by
writing the number down. If a dot is present the number is a float,
-otherwise an integer.
+otherwise an integer. Underscores can be used as digits separator to
+improve readability (``-3_141.592_65`` is equivalent to ``-3141.59265``).

* ``["first_name", "last_name"]``: Sequences are defined by a sequence of expressions
separated by a comma (``,``) and wrapped with squared brackets (``[]``).
@@ -1144,4 +1145,4 @@ Twig can be extended. If you want to create your own extensions, read the
.. _`Modern Twig`: https://marketplace.visualstudio.com/items?itemName=Stanislav.vscode-twig
.. _`Twig Language Server`: https://github.com/kaermorchen/twig-language-server/tree/master/packages/language-server
.. _`Twiggy`: https://marketplace.visualstudio.com/items?itemName=moetelo.twiggy
-.. _`PHP spaceship operator documentation`: https://www.php.net/manual/en/language.operators.comparison.php
+.. _`PHP spaceship operator documentation`: https://www.php.net/manual/en/language.operators.comparison.php
6 changes: 6 additions & 0 deletions phpstan-baseline.neon
@@ -23,3 +23,9 @@ parameters:
identifier: parameter.phpDocType
count: 5
path: src/Node/Node.php

- # Adding 0 to the string representation of a number is valid and what we want here
message: '#^Binary operation "\+" between 0 and string results in an error\.$#'
identifier: binaryOp.invalid
count: 1
path: src/Lexer.php
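
The baseline comment relies on how PHP converts numeric strings in arithmetic. A minimal sketch of that behaviour follows; the variable names are illustrative and not taken from the PR:

```php
<?php
// Adding 0 to a numeric string lets PHP pick int or float from the string's content.
$int = 0 + '123';      // integer-looking string gives an int
$float = 0 + '12.34';  // a decimal point gives a float
$exp = 0 + '1e3';      // an exponent gives a float

// An integer-looking string larger than PHP_INT_MAX falls back to a float.
$big = 0 + '9223372036854775808';

var_dump(is_int($int), is_float($float), is_float($exp), is_float($big));
```

This appears to be why the single `0 + ...` expression can stand in for the explicit `(float)` / `(int)` casts that the lexer used before: the string decides the resulting type.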
16 changes: 10 additions & 6 deletions src/Lexer.php
@@ -44,8 +44,16 @@ class Lexer
public const STATE_INTERPOLATION = 4;

public const REGEX_NAME = '/[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*/A';
-public const REGEX_NUMBER = '/[0-9]+(?:\.[0-9]+)?([Ee][\+\-][0-9]+)?/A';
public const REGEX_STRING = '/"([^#"\\\\]*(?:\\\\.[^#"\\\\]*)*)"|\'([^\'\\\\]*(?:\\\\.[^\'\\\\]*)*)\'/As';
+
+public const REGEX_NUMBER = '/(?(DEFINE)
+(?<LNUM>[0-9]+(_[0-9]+)*) # Integers (with underscores) 123_456
+(?<FRAC>\.(?&LNUM)) # Fractional part .456
+(?<EXPONENT>[eE][+-]?(?&LNUM)) # Exponent part E+10
+(?<DNUM>(?&LNUM)(?:(?&FRAC))?) # Decimal number 123_456.456
+)(?:(?&DNUM)(?:(?&EXPONENT))?) # 123_456.456E+10
+/Ax';
+
public const REGEX_DQ_STRING_DELIM = '/"/A';
public const REGEX_DQ_STRING_PART = '/[^#"\\\\]*(?:(?:\\\\.|#(?!\{))[^#"\\\\]*)*/As';
public const REGEX_INLINE_COMMENT = '/#[^\n]*/A';
@@ -346,11 +354,7 @@ private function lexExpression(): void
}
// numbers
elseif (preg_match(self::REGEX_NUMBER, $this->code, $match, 0, $this->cursor)) {
-$number = (float) $match[0]; // floats
-if (ctype_digit($match[0]) && $number <= \PHP_INT_MAX) {
-$number = (int) $match[0]; // integers lower than the maximum
-}
-$this->pushToken(Token::NUMBER_TYPE, $number);
+$this->pushToken(Token::NUMBER_TYPE, 0 + str_replace('_', '', $match[0]));
$this->moveCursor($match[0]);
}
// punctuation
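
The lexer change above can be tried outside Twig with a small standalone sketch. The pattern is copied from the new `REGEX_NUMBER` with its inline comments dropped. `NUMBER_PATTERN` and `lexNumber()` are hypothetical names used only for this illustration and are not part of Twig:

```php
<?php
// Same pattern as the new Lexer::REGEX_NUMBER, repeated so the sketch is self-contained.
// "A" anchors the match at the given offset; "x" allows the multi-line layout.
const NUMBER_PATTERN = '/(?(DEFINE)
    (?<LNUM>[0-9]+(_[0-9]+)*)
    (?<FRAC>\.(?&LNUM))
    (?<EXPONENT>[eE][+-]?(?&LNUM))
    (?<DNUM>(?&LNUM)(?:(?&FRAC))?)
)(?:(?&DNUM)(?:(?&EXPONENT))?)
/Ax';

// Hypothetical helper: returns [value, matched length], or null if no number starts at $cursor.
function lexNumber(string $code, int $cursor): ?array
{
    if (!preg_match(NUMBER_PATTERN, $code, $match, 0, $cursor)) {
        return null;
    }

    // Strip the separators, then let PHP decide between int and float.
    return [0 + str_replace('_', '', $match[0]), strlen($match[0])];
}

var_dump(lexNumber('{{ 1_000 }}', 3));   // value int 1000, length 5
var_dump(lexNumber('1_2.3_4e+1_0', 0));  // a float; all 12 characters are consumed
var_dump(lexNumber('_1', 0));            // NULL: a literal cannot start with "_"
```

Returning the matched length mirrors the way the lexer advances its cursor by the raw match, underscores included, rather than by the cleaned-up value.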
24 changes: 24 additions & 0 deletions tests/Fixtures/expressions/underscored_numbers.test
@@ -0,0 +1,24 @@
--TEST--
Twig compile numbers literals with underscores correctly
--TEMPLATE--
{{ 0_0 is same as 0 ? 'ok' : 'ko' }}
{{ 1_23 is same as 123 ? 'ok' : 'ko' }}
{{ 12_3 is same as 123 ? 'ok' : 'ko' }}
{{ 1_2_3 is same as 123 ? 'ok' : 'ko' }}
{{ -1_2 is same as -12 ? 'ok' : 'ko' }}
{{ 1_2.3_4 is same as 12.34 ? 'ok' : 'ko' }}
{{ -1_2.3_4 is same as -12.34 ? 'ok' : 'ko' }}
{{ 1.2_3e-4 is same as 1.23e-4 ? 'ok' : 'ko' }}
{{ -1.2_3e+4 is same as -1.23e+4 ? 'ok' : 'ko' }}
--DATA--
return []
--EXPECT--
ok
ok
ok
ok
ok
ok
ok
ok
ok
8 changes: 8 additions & 0 deletions tests/Fixtures/expressions/underscored_numbers_error.test
@@ -0,0 +1,8 @@
--TEST--
Twig does not allow to use 2 underscored between digits in numbers
--TEMPLATE--
{{ 1__2 }}
--DATA--
return []
--EXCEPTION--
Twig\Error\SyntaxError: Unexpected token "name" of value "__2" ("end of print statement" expected) in "index.twig" at line 2.
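
The error message mentions a "name" token with value `__2` because the number pattern stops after `1`, and the remaining text then matches the name pattern. A rough sketch of that two-step tokenization, assuming a simplified integer-only number pattern (`$numberPattern` is a reduced stand-in for illustration, while `$namePattern` repeats `Lexer::REGEX_NAME`):

```php
<?php
// Simplified stand-in for the integer part of REGEX_NUMBER (an assumption for brevity).
$numberPattern = '/[0-9]+(?:_[0-9]+)*/A';
// Same pattern as Lexer::REGEX_NAME.
$namePattern = '/[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*/A';

$code = '1__2';

// Step 1: the number match ends at the first "_" that is not followed by a digit.
preg_match($numberPattern, $code, $match, 0, 0);
$number = $match[0];

// Step 2: what remains starts with "_", so it is read as a name.
preg_match($namePattern, $code, $match, 0, strlen($number));
$name = $match[0];

var_dump($number, $name);  // string(1) "1" and string(3) "__2"
```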
