Differentiate raw string / string literals. Enforce single line literal. Update string docs.
fubark committed Jan 13, 2024
1 parent 5765cde commit d9f355d
Showing 21 changed files with 274 additions and 182 deletions.
122 changes: 75 additions & 47 deletions docs/docs.md
@@ -33,7 +33,7 @@ import math
var worlds = ['World', '世界', 'दुनिया', 'mundo']
worlds.append(math.random())
for worlds -> w:
print 'Hello, $(w)!'
print "Hello, $(w)!"
```

# Syntax.
@@ -407,8 +407,14 @@ CYON or the Cyber object notation is similar to JSON. The format uses the same l
* [Floats.](#floats)
* [Big Numbers.](#big-numbers)
* [Strings.](#strings)
* [Raw string literal.](#raw-string-literal)
* [String literal.](#string-literal)
* [Escape sequences.](#escape-sequences)
* [String operations.](#string-operations)
* [String interpolation.](#string-interpolation)
* [String formatting.](#string-formatting)
* [Line-join literal.](#line-join-literal)
* [Mutable strings.](#mutable-strings)
* [Arrays.](#arrays)
* [Bracket literals.](#bracket-literals)
* [Lists.](#lists)
@@ -508,59 +514,55 @@ var b = float(a)
> _Planned Feature_

## Strings.
The `string` type represents a sequence of UTF-8 codepoints, also known as `runes`. Each rune is stored internally as 1-4 bytes and can be represented as an `int`. See [`type string`](#type-string).
The `string` type represents a sequence of validated UTF-8 codepoints, also known as `runes`. Each rune is stored internally as 1-4 bytes and can be represented as an `int`. See [`type string`](#type-string).
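
As an illustration outside this diff, the "1-4 bytes per rune" statement can be checked in any language with UTF-8 support. The following is a minimal Python sketch, not Cyber code; the sample characters are arbitrary choices, and `widths` and `codepoints` are names introduced here for the example only.

```python
# Each Unicode codepoint ("rune") occupies between 1 and 4 bytes in UTF-8.
# The sample characters are arbitrary: ASCII, Latin-1, CJK, and an emoji.
samples = ["a", "\u00e9", "世", "🦊"]

# Number of bytes each rune needs once encoded as UTF-8.
widths = [len(ch.encode("utf-8")) for ch in samples]

# Each rune can also be represented as an integer codepoint.
codepoints = [ord(ch) for ch in samples]

print(widths)
print([hex(cp) for cp in codepoints])
```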

Under the hood, there are multiple string implementations to make operations faster by default.
Strings are **immutable**, so operations that do string manipulation return a new string. By default, short strings are interned to reduce memory footprint.

Strings are **immutable**, so operations that do string manipulation return a new string. By default, small strings are interned to reduce memory footprint.
Under the hood, there are multiple string implementations to make operations faster by default using SIMD.

To mutate an existing string, use [MutString](#mutstring). *Planned Feature*
### Raw string literal.
A raw string doesn't allow any escape sequences or string interpolation.

A string is always UTF-8 validated. Using an [Array](#arrays) to represent raw bytes of a string is faster but you'll have to validate them and take care of indexing.

A single line string literal is surrounded in single quotes.
Single quotes are used to delimit a single line literal:
```cy
var apple = 'a fruit'
var fruit = 'apple'
var str = 'abc🦊xyz🐶'
```

You can escape the single quote inside the literal or use double quotes.
Since a raw string interprets its sequence of characters as is, a single quote character cannot be escaped:
```cy
var apple = 'Bob\'s fruit'
apple = "Bob's fruit"
var fruit = 'Miso's apple' -- ParseError.
```

Concatenate two strings together with the `+` operator or the method `concat`.
Triple single quotes are used to delimit a multi-line literal. It also allows single quotes and double single quotes in the string:
```cy
var res = 'abc' + 'xyz'
res = res.concat('end')
var fruit = '''Miso's apple'''
var greet = '''Hello
World'''
```

Strings are UTF-8 encoded.
```cy
var str = 'abc🦊xyz🐶'
```
### String literal.
A string literal allows escape sequences and string interpolation.

Use double quotes to surround a multi-line string.
Double quotes are used to delimit a single line literal:
```cy
var str = "line a
line b
line c"
var fruit = "apple"
var sentence = "The $(fruit) is tasty."
var doc = "A double quote can be escaped: \""
```

You can escape double quotes inside the literal or use triple quotes.
Triple double quotes are used to delimit a multi-line literal:
```cy
var str = "line a
line \"b\"
line c"

-- Using triple quotes.
str = '''line a
var title = "last"
var doc = """A double quote " doesn't need to be escaped."""
str = """line a
line "b"
line c
'''
line $(title)
"""
```

The following escape sequences are supported:
### Escape sequences.
The following escape sequences are supported in string literals:

| Escape Sequence | Code | Description |
| --- | --- | --- |
@@ -571,12 +573,13 @@ The following escape sequences are supported:
| \r | 0x0d | Carriage return character. |
| \t | 0x09 | Horizontal tab character. |

The boundary of each line can be set with a vertical line character. This makes it easier to see the whitespace.
*Planned Feature*
### String operations.
See [`type string`](#type-string) for all available methods.

Concatenate two strings together with the `+` operator or the method `concat`.
```cy
var poem = "line a
| two spaces from the left
| indented further"
var res = 'abc' + 'xyz'
res = res.concat('end')
```

Using the index operator will return the UTF-8 rune at the given index as a slice. This is equivalent to calling the method `sliceAt()`.
@@ -601,12 +604,12 @@ i += 1
print(i + str[i..].findRune(`c`)) -- "5"
```

### String Interpolation.
### String interpolation.
Expressions can be embedded into string templates with `$()`:
```cy
var name = 'Bob'
var points = 123
var str = 'Scoreboard: $(name) $(points)'
var str = "Scoreboard: $(name) $(points)"
```
String templates can not contain nested string templates.

@@ -617,11 +620,36 @@ var file = os.openFile('data.bin', .read)
var bytes = file.readToEnd()

-- Dump contents in hex.
print '$(bytes.fmt(.x))'
print "$(bytes.fmt(.x))"
```

### Line-join literal.
The line-join literal joins string literals with the new line character `\n`. *Planned Feature*

This has several properties:
* Ensures the use of a consistent line separator: `\n`
* Allows lines to have a mix of raw string and string literals.
* Single quotes and double quotes do not need to be escaped.
* Allows each line to be indented along with the surrounding syntax.
* The starting whitespace for each line is made explicit.

```cy
var paragraph = [
\'the line-join literal
\'hello\nworld
\"hello $(name)
\'last line
\'
]
```
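
Since the line-join literal is a planned feature, its exact behavior is not confirmed by this commit. The Python sketch below, which is not part of docs.md, shows one plausible reading of the example above: raw lines are kept verbatim, string-literal lines are interpolated, and all lines are joined with `\n`. The value of `name` is hypothetical.

```python
# Hypothetical value for the interpolated line.
name = "Bob"

lines = [
    "the line-join literal",
    r"hello\nworld",   # raw line: backslash + n stay as two characters
    f"hello {name}",   # string-literal line: the expression is interpolated
    "last line",
    "",                # final empty line yields a trailing newline
]

# Every line is joined with the same separator, regardless of platform.
paragraph = "\n".join(lines)

print(repr(paragraph))
```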

### Mutable strings.
To mutate an existing string, use [type MutString](#mutstring). *Planned Feature*

## Arrays.
An `array` is an immutable sequence of bytes. It can be used to represent strings but it won't automatically validate their encoding and indexing returns the n'th byte rather than a UTF-8 rune. See [`type array`](#type-array).
An `array` is an immutable sequence of bytes.
It can be a more performant way to represent strings but it won't automatically validate their encoding and indexing returns the n'th byte rather than a UTF-8 rune.
See [`type array`](#type-array).
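
The trade-off described above (byte indexing and no automatic validation) can be illustrated outside this diff with Python's `str` and `bytes` types, which behave analogously. This is a sketch of the general idea, not of Cyber's `array` API; `is_valid_utf8` is a helper defined here for the example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte sequence decodes as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "a🦊b"
raw = text.encode("utf-8")

# Indexing the string yields a whole rune; indexing the bytes yields one byte.
rune = text[1]
byte = raw[1]

print(len(text), len(raw))
print(rune, hex(byte))

# Raw bytes carry no validity guarantee, so validation is the caller's job.
print(is_valid_utf8(raw), is_valid_utf8(b"\xff"))
```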

```cy
var a = array('abcd')
@@ -740,7 +768,7 @@ map.remove 123

-- Iterating a map.
for map -> [val, key]:
print '$(key) -> $(val)'
print "$(key) -> $(val)"
```

### Map block.
@@ -1015,7 +1043,7 @@ for map -> entry:
Use the destructure syntax to extract the key and value into two separate variables:
```cy
for map -> [ key, val ]:
print 'key $(key) -> value $(val)'
print "key $(key) -> value $(val)"
```

### `for` each with index.
@@ -1024,7 +1052,7 @@ A counting index can be declared after the each variable. The count starts at 0
var list = [1, 2, 3, 4, 5]

for list -> val, i:
print 'index $(i), value $(val)'
print "index $(i), value $(val)"
```

### Exit loop.
@@ -1059,7 +1087,7 @@ case 200:
case 300, 400:
print 'combined case'
else:
print 'val is $(val)'
print "val is $(val)"
```
Note that the `switch` block must be empty and requires at least one `case` block or an `else` block to come after it.

@@ -1472,7 +1500,7 @@ import os

var map = os.getEnvAll()
for map -> [k, v]:
print '$(k) -> $(v)'
print "$(k) -> $(v)"
```

<!-- os.start -->
44 changes: 22 additions & 22 deletions docs/gen-docs.cy
@@ -18,7 +18,7 @@ var args = os.parseArgs([
genDocsModules()

var curDir = os.dirName(#modUri)
var src = os.readFile('$(curDir)/docs-modules.md')
var src = os.readFile("$(curDir)/docs-modules.md")
var csrc = os.cstr(src)
var csrcLen = array(src).len()

@@ -41,23 +41,23 @@ parser.set(56, .voidPtr, nullptr)

var res = md.md_parse(csrc, csrcLen, parser, none)
if res != 0:
print 'parse error: $(res)'
print "parse error: $(res)"
os.exit(1)

var tocLinksHtml = []
for tocLinks -> link:
tocLinksHtml.append('<li><a href="$(link.href)">$(link.text)</a></li>')
tocLinksHtml.append("""<li><a href="$(link.href)">$(link.text)</a></li>""")

var simpleCSS = os.readFile('$(curDir)/simple.css')
var hljsCSS = os.readFile('$(curDir)/hljs.min.css')
var hljsJS = os.readFile('$(curDir)/highlight.min.js')
var simpleCSS = os.readFile("$(curDir)/simple.css")
var hljsCSS = os.readFile("$(curDir)/hljs.min.css")
var hljsJS = os.readFile("$(curDir)/highlight.min.js")

var stylePart = '<link rel="stylesheet" href="./style.css">'
if !args['import-style']:
var styleCSS = os.readFile('$(curDir)/style.css')
stylePart = '<style>$(styleCSS)</style>'
var styleCSS = os.readFile("$(curDir)/style.css")
stylePart = "<style>$(styleCSS)</style>"

var html = '''<html lang="en">
var html = """<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
@@ -122,10 +122,10 @@ hljs.registerLanguage('cy', function() {
hljs.highlightAll();
</script>
</body>
</html>'''
</html>"""
-- print out
print 'Done.'
os.writeFile('$(curDir)/docs.html', html)
os.writeFile("$(curDir)/docs.html", html)

var .out = ''
var .htmlContent = ''
@@ -388,10 +388,10 @@ func genDocsModules():

var curDir = os.dirName(#modUri)
-- var md = os.readFile('$(curDir)/../modules.md')
var md = os.readFile('$(curDir)/docs.md')
var md = os.readFile("$(curDir)/docs.md")

for modules -> mod:
var src = os.readFile('$(curDir)/$(mod.path)')
var src = os.readFile("$(curDir)/$(mod.path)")
var decls = parseCyber(src)['decls']
var gen = '\n'
for decls -> decl:
@@ -401,29 +401,29 @@ func genDocsModules():
var params = []
for decl.header.params -> param:
var typeSpec = (param.typeSpec != '') ? param.typeSpec else 'any'
params.append('$(param.name) $(typeSpec)')
params.append("$(param.name) $(typeSpec)")
var paramsStr = params.join(', ')
gen = gen + '> `func $(decl.header.name)($(paramsStr)) $(decl.header.ret)`\n>\n>$(docLine)\n\n'
gen = gen + "> `func $(decl.header.name)($(paramsStr)) $(decl.header.ret)`\n>\n>$(docLine)\n\n"
case 'variable':
var docLine = decl.docs ? decl.docs else ''
var typeSpec = (decl.typeSpec != '') ? decl.typeSpec else 'any'
gen = gen + '> `var $(decl.name) $(typeSpec)`\n>\n>$(docLine)\n\n'
gen = gen + "> `var $(decl.name) $(typeSpec)`\n>\n>$(docLine)\n\n"
case 'object':
gen = gen + '### `type $(decl.name)`\n\n'
gen = gen + "### `type $(decl.name)`\n\n"
for decl.children -> child:
if child.type == 'funcInit':
var docLine = child.docs ? child.docs else ''
var params = []
for child.header.params -> param:
var typeSpec = (param.typeSpec != '') ? param.typeSpec else 'any'
params.append('$(param.name) $(typeSpec)')
params.append("$(param.name) $(typeSpec)")
var paramsStr = params.join(', ')
gen = gen + '> `func $(child.header.name)($(paramsStr)) $(child.header.ret)`\n>\n>$(docLine)\n\n'
gen = gen + "> `func $(child.header.name)($(paramsStr)) $(child.header.ret)`\n>\n>$(docLine)\n\n"

-- Replace section in modules.md.
var needle = '<!-- $(mod.section).start -->'
var needle = "<!-- $(mod.section).start -->"
var startIdx = (md.find(needle) as int) + needle.len()
var endIdx = md.find('<!-- $(mod.section).end -->')
var endIdx = md.find("<!-- $(mod.section).end -->")
md = md[0..startIdx] + gen + md[endIdx..]

os.writeFile('$(curDir)/docs-modules.md', md)
os.writeFile("$(curDir)/docs-modules.md", md)
2 changes: 1 addition & 1 deletion examples/account.cy
@@ -12,7 +12,7 @@ type Account:
balance -= amt

func show(title):
print '$(title or ''), $(name), $(balance)'
print "$(title or ''), $(name), $(balance)"

func Account.new(name) Account:
return [Account name: name, balance: 0.0]
21 changes: 8 additions & 13 deletions src/cyon.zig
@@ -746,21 +746,16 @@ test "decodeMap" {
\\[
\\ name: 'project',
\\ list: [
\\ [
\\ field: 1
\\ ],
\\ [
\\ field: 2
\\ ]
\\ [ field: 1 ],
\\ [ field: 2 ],
\\ ],
\\ map: [
\\ 1: 'foo',
\\ 2: 'bar',
\\ 3: 'ba\'r',
\\ 4: "bar
\\bar",
\\ 5: "bar `bar`
\\bar"
\\ 3: "ba\"r",
\\ 4: """ba"r""",
\\ 5: """bar `bar`
\\bar"""
\\ ]
\\]
);
@@ -782,9 +777,9 @@ test "decodeMap" {
try t.eq(root.map[1].id, 2);
try t.eqStr(root.map[1].val, "bar");
try t.eq(root.map[2].id, 3);
try t.eqStr(root.map[2].val, "ba'r");
try t.eqStr(root.map[2].val, "ba\"r");
try t.eq(root.map[3].id, 4);
try t.eqStr(root.map[3].val, "bar\nbar");
try t.eqStr(root.map[3].val, "ba\"r");
try t.eq(root.map[4].id, 5);
try t.eqStr(root.map[4].val, "bar `bar`\nbar");
}