From bc952b1a4bbe92e1091e98ba76b9c5f69429b3ac Mon Sep 17 00:00:00 2001
From: Richard Smith
Date: Thu, 23 Jan 2025 17:57:42 -0800
Subject: [PATCH] Document numeric type literals in lexical conventions. (#4842)

Also improve the precision of some other nearby documentation.
---
 docs/design/expressions/literals.md      |  4 +--
 docs/design/lexical_conventions/words.md | 45 ++++++++++++++++++++----
 2 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/docs/design/expressions/literals.md b/docs/design/expressions/literals.md
index 0e58234f07594..009a02060c7d6 100644
--- a/docs/design/expressions/literals.md
+++ b/docs/design/expressions/literals.md
@@ -14,7 +14,7 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 
 - [Numeric literals](#numeric-literals)
     - [Numeric literal syntax](#numeric-literal-syntax)
-    - [Defined Types](#defined-types)
+    - [Defined types](#defined-types)
     - [Implicit conversions](#implicit-conversions)
     - [Examples](#examples)
     - [Alternatives Considered](#alternatives-considered)
@@ -70,7 +70,7 @@ Numeric literal syntax is covered in the
 doc. Both Integer and Real-Number syntax is defined, with decimal, hexadecimal
 and binary integer literals, and decimal and hexadecimal real number literals.
 
-### Defined Types
+### Defined types
 
 The following types are defined in the Carbon prelude:
 
diff --git a/docs/design/lexical_conventions/words.md b/docs/design/lexical_conventions/words.md
index 854e59be2ce60..2b5dae3a08316 100644
--- a/docs/design/lexical_conventions/words.md
+++ b/docs/design/lexical_conventions/words.md
@@ -12,6 +12,8 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 
 - [Overview](#overview)
     - [Keywords](#keywords)
+    - [Type literals](#type-literals)
+    - [Identifiers](#identifiers)
     - [Raw identifiers](#raw-identifiers)
 - [Alternatives considered](#alternatives-considered)
 - [References](#references)
@@ -21,7 +23,7 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 ## Overview
 
 A _word_ is a lexical element formed from a sequence of letters or letter-like
-characters, such as `fn` or `Foo` or `Int`.
+characters, such as `fn` or `Foo` or `Int`, optionally preceded by `r#`.
 
 The exact lexical form of words has not yet been settled. However, Carbon will
 follow lexical conventions for identifiers based on
@@ -98,20 +100,47 @@ The following words are interpreted as keywords:
 - `where`
 - `while`
 
+### Type literals
+
+A word starting with `i`, `u`, or `f`, followed by a decimal integer, is a
+[_numeric type literal_](/docs/design/expressions/literals.md#numeric-type-literals).
+
+### Identifiers
+
+A word is interpreted as an _identifier_ if it is neither a keyword nor a type
+literal.
+
 ### Raw identifiers
 
-A raw identifier looks like `r#`. This can be used for identifiers
-which have the same spelling as keywords; for example, `r#impl`. It can help
-when using C++ code with identifiers that are keywords in Carbon.
+A _raw identifier_ is a word starting with `r#`. A raw identifier is equivalent
+to the word following the `r#` prefix, except that it is always interpreted as
+an identifier, even if it would otherwise be a keyword or type literal.
+
+Raw identifiers can be used to specify identifiers which have the same spelling
+as keywords; for example, `r#impl`. This can be useful when interoperating with
+C++ code that uses identifiers that are keywords in Carbon, and when migrating
+between versions of Carbon.
 
-The identifier doesn't need to be a keyword, in order to support forwards
-compatibility when a keyword is planned to be added. When `` is not
-a keyword, it will refer to the same entity as `r#`.
+The word doesn't need to be a keyword, in order to support forwards
+compatibility when a keyword is planned to be added. If `word` is an identifier,
+then `word` and `r#word` have the same meaning.
 
 ## Alternatives considered
 
+Overview:
+
 - [Character encoding: We could restrict words to ASCII.](/proposals/p0142.md#character-encoding-1)
 - [Normalization form alternatives considered](/proposals/p0142.md#normalization-forms)
+
+Type literals:
+
+- [Use C++ type keywords with LP64 convention](/proposals/p2015.md#c-lp64-convention)
+- [Use full type name with length suffix](/proposals/p2015.md#type-name-with-length-suffix)
+- [Use uppercase for type names](/proposals/p2015.md#uppercase-suffixes)
+- [Support additional bit widths](/proposals/p2015.md#additional-bit-sizes)
+
+Raw identifiers:
+
 - [Other raw identifier syntaxes](/proposals/p3797.md#other-raw-identifier-syntaxes)
 - [Restrict raw identifier syntax to current and future keywords](/proposals/p3797.md#restrict-raw-identifier-syntax-to-current-and-future-keywords)
 - [Don't require syntax for references to raw identifiers](/proposals/p3797.md#dont-require-syntax-for-references-to-raw-identifiers)
@@ -121,5 +150,7 @@ a keyword, it will refer to the same entity as `r#`.
 
 - Proposal
   [#142: Unicode source files](https://github.com/carbon-language/carbon-lang/pull/142)
+- Proposal
+  [#2015: Numeric type literal syntax](https://github.com/carbon-language/carbon-lang/pull/2015)
 - Proposal
   [#3797: Raw identifier syntax](https://github.com/carbon-language/carbon-lang/pull/3797)