Closed
Description
Clang 14 rejects some Unicode characters in identifiers, such as ₊ (U+208A) and other subscripts. These characters fall within the ranges allowed for identifiers by the [lex.name] section of the C++ Standard.
Recent versions of GCC and older versions of Clang accept them without raising any errors.
For example:
double foo(double xₖ, double xₖ₊₁) {
  return xₖ₊₁ - xₖ;
}
$ clang++-14 -c unicode.cpp -std=c++20
unicode.cpp:1:36: error: character <U+208A> not allowed in an identifier
double foo(double xₖ, double xₖ₊₁) {
                               ^
unicode.cpp:1:39: error: character <U+2081> not allowed in an identifier
double foo(double xₖ, double xₖ₊₁) {
                                ^
unicode.cpp:2:14: error: character <U+208A> not allowed in an identifier
  return xₖ₊₁ - xₖ;
           ^
unicode.cpp:2:17: error: character <U+2081> not allowed in an identifier
  return xₖ₊₁ - xₖ;
            ^
4 errors generated.
$ clang++-14 --version
Ubuntu clang version 14.0.1-++20220402053234+23d08271a4b2-1~exp1~20220402053315.111
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
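To make the [lex.name] claim concrete, here is a minimal sketch that checks the two rejected code points against a single range, U+2070 to U+218F. It assumes that range is one of those listed in the C++20 table of characters allowed in identifiers; the helper name `in_range_2070_218F` is hypothetical and exists only for this illustration. It does not reproduce the full table or model what Clang 14 actually checks.

```cpp
// Sketch only: tests membership in ONE range (U+2070..U+218F) that is assumed
// to appear in the C++20 [lex.name] table of allowed identifier characters.
// The function name is hypothetical and not part of any compiler or library.
constexpr bool in_range_2070_218F(char32_t c) {
    return c >= 0x2070 && c <= 0x218F;
}

// The two code points named in the Clang 14 diagnostics.
static_assert(in_range_2070_218F(0x208A), "U+208A SUBSCRIPT PLUS SIGN");
static_assert(in_range_2070_218F(0x2081), "U+2081 SUBSCRIPT ONE");
```

Both assertions hold under that assumption, which is why the diagnostics look inconsistent with the C++20 wording.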
Is this a deliberate change, or a regression between Clang 13 and Clang 14?