first cut at ZSON to Super JSON rename in docs (#5376)
This commit renames ZSON to Super JSON throughout the docs.
The code and tests have not been updated, e.g., `-f zson` has not
yet been changed to `-f jsup`.
mccanne authored Oct 26, 2024
1 parent ac7a06c commit 3deb4c9
Showing 26 changed files with 303 additions and 305 deletions.
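
Most hunks in this commit swap a Markdown link target from `zson.md` to `jsup.md` while leaving link text and CLI flags such as `-i zson` alone. A minimal sketch of that kind of rewrite is shown here; the helper name `rename_link_targets` and the regular expression are hypothetical illustrations, not tooling used by the commit, and the sketch does not cover the site URL change (`formats/zson` to `formats/next/jsup`) or the prose renames.

```python
import re

# Hypothetical helper, not part of the commit: rewrite only Markdown link
# targets whose file name is exactly zson.md, keeping any "#anchor" suffix.
LINK_TARGET = re.compile(r"(\]\((?:[^)\s]*/)?)zson\.md((?:#[^)\s]*)?\))")


def rename_link_targets(markdown: str) -> str:
    """Return markdown with zson.md link targets pointing at jsup.md."""
    return LINK_TARGET.sub(r"\1jsup.md\2", markdown)


if __name__ == "__main__":
    line = "alternate check frequency in [duration format](../formats/zson.md#23-primitive-values)."
    print(rename_link_targets(line))
```

Restricting the match to the text between `](` and `)` is a deliberate choice in this sketch: it keeps flags, table cells such as `zson`, and link labels like `[ZSON]` untouched, which matches the commit message's note that code and tests are not yet renamed.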
6 changes: 3 additions & 3 deletions CHANGELOG.md
@@ -118,7 +118,7 @@
* The [Zed Language Overview docs](docs/language/overview.md) have been split into multiple sections (#4576)
* Add support for [user-defined operators](docs/language/statements.md#operator-statements) (#4417, #4635, #4646, #4644, #4663, #4674, #4698, #4702, #4716)
* Add experimental support to the [`get` operator](docs/language/operators/get.md) for customized methods, headers, and body (#4572)
* Allow float decorators on integers in [ZSON](docs/formats/zson.md) (#4654)
* Allow float decorators on integers in [ZSON](docs/formats/jsup.md) (#4654)
* The [shaping docs](docs/language/shaping.md) have been expanded with a new section on [error handling](docs/language/shaping.md#error-handling) (#4686)
* `zq` no longer attaches positional command line file inputs directly to [`join`](docs/language/operators/join.md) inputs (use [`file`](docs/language/operators/file.md) within a Zed program instead) (#4689)
* [Zeek](https://zeek.org/)-related docs have been moved to the Integrations area of the [Zed docs site](https://zed.brimdata.io/docs) (#4694, #4696)
@@ -246,7 +246,7 @@
* Revamped [`zed` command](docs/commands/zed.md)
* New Zed lake format (see #3634 for a migration script)
* New version of the [ZNG format](docs/formats/zng.md) (with read-only support for the previous version)
* New version of the [ZSON format](docs/formats/zson.md)
* New version of the [ZSON format](docs/formats/jsup.md)

## v0.33.0

@@ -587,7 +587,7 @@ questions.
* zq: Fix an issue where returned errors could cause a panic due to type mismatches (#1720, #1727, #1728, #1740, #1773)
* python: Fix an issue where the [Python client](https://medium.com/brim-securitys-knowledge-funnel/visualizing-ip-traffic-with-brim-zeek-and-networkx-3844a4c25a2f) did not generate an error when `zqd` was absent (#1711)
* zql: Allow the `len()` function to work on `ip` and `net` types (#1725)
* ZSON: Add a [draft specification](docs/formats/zson.md) of the new ZSON format (#1715, #1735, #1741, #1765)
* ZSON: Add a [draft specification](docs/formats/jsup.md) of the new ZSON format (#1715, #1735, #1741, #1765)
* zng: Add support for marshaling of `time` values (#1743)
* zar: Fix an issue where a `couldn't read trailer` failure was observed during a `zar zq` query (#1748)
* zar: Fix an issue where `zar import` of a 14 GB data set triggered a SEGV (#1766)
18 changes: 9 additions & 9 deletions README.md
@@ -1,6 +1,6 @@
# SuperDB [![Tests][tests-img]][tests] [![GoPkg][gopkg-img]][gopkg]

SuperDB is a new analytics database that supports relational tables and JSON
SuperDB is a new analytics database that supports relational tables and JSON
on an equal footing. It shines when it comes to data wrangling where
you need to explore or process large eclectic data sets. It's also pretty
decent at analytics and
@@ -23,7 +23,7 @@ system for semi-structured data,
all data handled by SuperDB (e.g., JSON, CSV, Parquet files, Arrow streams, relational tables, etc) is automatically massaged into
[super-structured data](https://zed.brimdata.io/docs/formats/#2-zed-a-super-structured-pattern)
form. This super-structured data is then processed by a runtime that simultaneously
supports the statically-typed relational model and the dynamically-typed
supports the statically-typed relational model and the dynamically-typed
JSON data model in a unified compute engine.

## SuperSQL
@@ -39,7 +39,7 @@ FROM 'https://data.gharchive.org/2015-01-01-15.json.gz'
GROUP BY user
ORDER BY len(repo) DESC LIMIT 5
|> FORK (
=> FROM f"https://api.github.com/users/${user}"
=> FROM f"https://api.github.com/users/${user}"
|> SELECT VALUE {user:login,created_at:time(created_at)}
=> PASS
)
@@ -48,10 +48,10 @@ FROM 'https://data.gharchive.org/2015-01-01-15.json.gz'

## Super JSON

Super-structured data is strongly typed and "polymorphic": any value can take on any type
Super-structured data is strongly typed and "polymorphic": any value can take on any type
and sequences of data need not all conform to a predefined schema. To this end,
SuperDB extends the JSON format to support super-structured data in a format called
[Super JSON](https://zed.brimdata.io/docs/formats/zson) where all JSON values
[Super JSON](https://zed.brimdata.io/docs/formats/next/jsup) where all JSON values
are also Super JSON values. Similarly,
the [Super Binary](https://zed.brimdata.io/docs/formats/zng) format is an efficient
binary representation of Super JSON (a bit like Avro) and the
@@ -78,23 +78,23 @@ using the `super db` sub-commands.

## Piped Query Syntax

The long-term goal for SuperDB's SQL syntax (SuperSQL) is to be Postgres-compatible and interoperate
The long-term goal for SuperDB's SQL syntax (SuperSQL) is to be Postgres-compatible and interoperate
with BI tools though this is currently a roadmap item. At the same time, the project
seeks to forge new ground on the usability of SQL for data exploration. To this end,
SuperSQL supports the
[pipe query syntax](https://github.com/google/zetasql/blob/master/docs/pipe-syntax.md)
of GoogleSQL, recently described in their
[VLDB 2024 paper](https://research.google/pubs/sql-has-problems-we-can-fix-them-pipe-syntax-in-sql/).

In addition to the GoogleSQL syntax, SuperSQL includes additional pipeline
operators to enhance usability, e.g., for search, for traversing
In addition to the GoogleSQL syntax, SuperSQL includes additional pipeline
operators to enhance usability, e.g., for search, for traversing
highly nested JSON, for data shaping, etc.

To facilitate real-time, data exploration use cases,
SuperDB supports an abbreviated form of SuperSQL called
[SuperPipe](https://zed.brimdata.io/docs/language).

SuperPipe provides a large number of shortcuts when typing interactive
SuperPipe provides a large number of shortcuts when typing interactive
queries, e.g., implied group-by clauses, dropping keywords,
implied keyword searches, and so forth. Even though SuperPipe is simply
a short-hand form SuperSQL, it sort of looks like the pipeline-style
6 changes: 3 additions & 3 deletions compiler/ztests/load.yaml
@@ -2,13 +2,13 @@ script: |
export SUPER_DB_LAKE=test
super db init -q
super db create -q samples
super db load -q -use samples schools.zson
super db load -q -use samples schools.jsup
super db create -q Orange
super db query -z 'from samples | County=="Orange" | load Orange@main author "Diane"' | sed -E 's/[0-9a-zA-Z]{42}/xxx/'
inputs:
- name: schools.zson
source: ../../testdata/edu/schools.zson
- name: schools.jsup
source: ../../testdata/edu/schools.jsup
outputs:
- name: stdout
data: |
8 changes: 4 additions & 4 deletions docs/README.md
@@ -26,7 +26,7 @@ For a non-technical user, SuperDB is as easy to use as web search
while for a technical user, SuperDB exposes its technical underpinnings
in a gradual slope, providing as much detail as desired,
packaged up in the easy-to-understand
[Super JSON data format](formats/zson.md) and
[Super JSON data format](formats/jsup.md) and
[SuperPipe language](language/README.md).

While `super` and its accompanying data formats are production quality, the project's
@@ -39,9 +39,9 @@ a number of different elements of the system:
* The [super data model](formats/zed.md) is the abstract definition of the data types and semantics
that underlie the super-structured data formats.
* The [super data formats](formats/README.md) are a family of
[human-readable (Super JSON, SUP)](formats/zson.md),
[sequential (Binary Super JSON, SUPZ)](formats/zng.md), and
[columnar (Super Parquet, SPAR)](formats/vng.md) formats that all adhere to the
[human-readable (Super JSON, JSUP)](formats/jsup.md),
[sequential (Super Binary, BSUP)](formats/zng.md), and
[columnar (Super Columnar, CSUP)](formats/vng.md) formats that all adhere to the
same abstract super data model.
* The [SuperPipe language](language/README.md) is the system's pipeline language for performing
queries, searches, analytics, transformations, or any of the above combined together.
4 changes: 2 additions & 2 deletions docs/commands/zed.md
@@ -529,7 +529,7 @@ The `date` field here is used by the Zed lake system to do time travel
through the branch and pool history, allowing you to see the state of
branches at any time in their commit history.

Arbitrary metadata expressed as any [ZSON value](../formats/zson.md)
Arbitrary metadata expressed as any [ZSON value](../formats/jsup.md)
may be attached to a commit via the `-meta` flag. This allows an application
or user to transactionally commit metadata alongside committed data for any
purpose. This approach allows external applications to implement arbitrary
@@ -601,7 +601,7 @@ If the `-monitor` option is specified and the lake is [located](#locating-the-la
via network connection, `zed manage` will run continuously and perform updates
as needed. By default a check is performed once per minute to determine if
updates are necessary. The `-interval` option may be used to specify an
alternate check frequency in [duration format](../formats/zson.md#23-primitive-values).
alternate check frequency in [duration format](../formats/jsup.md#23-primitive-values).

If `-monitor` is not specified, a single maintenance pass is performed on the
lake.
8 changes: 4 additions & 4 deletions docs/commands/zq.md
@@ -40,7 +40,7 @@ tends to be the most space-efficient and most performant. ZNG has efficiency si
and [Protocol Buffers](https://developers.google.com/protocol-buffers)
but its comprehensive [Zed type system](../formats/zed.md) obviates
the need for schema specification or registries.
Also, the [ZSON](../formats/zson.md) format is human-readable and entirely one-to-one with ZNG
Also, the [ZSON](../formats/jsup.md) format is human-readable and entirely one-to-one with ZNG
so there is no need to represent non-readable formats like Avro or Protocol Buffers
in a clunky JSON encapsulated form.

@@ -104,7 +104,7 @@ Note here that the query `1+1` [implies](../language/pipeline-model.md#implied-o
| `zeek` | yes | [Zeek Logs](https://docs.zeek.org/en/master/logs/index.html) |
| `zjson` | yes | [ZJSON - Zed over JSON](../formats/zjson.md) |
| `zng` | yes | [ZNG - Binary Row Format](../formats/zng.md) |
| `zson` | yes | [ZSON - Human-readable Format](../formats/zson.md) |
| `zson` | yes | [ZSON - Human-readable Format](../formats/jsup.md) |

The input format is typically [detected automatically](#auto-detection) and the formats for which
"Auto" is "yes" in the table above support _auto-detection_.
@@ -146,7 +146,7 @@ would produce this output in the default ZSON format

### ZSON-JSON Auto-detection

Since [ZSON](../formats/zson.md) is a superset of JSON, `zq` must be careful in whether it
Since [ZSON](../formats/jsup.md) is a superset of JSON, `zq` must be careful in whether it
interprets input as ZSON as JSON. While you can always clarify your intent
with the `-i zson` or `-i json`, `zq` attempts to "just do the right thing"
when you run it with JSON vs. ZSON.
@@ -190,7 +190,7 @@ typically omit quotes around field names.
| `zeek` | [Zeek Logs](https://docs.zeek.org/en/master/logs/index.html) |
| `zjson` | [ZJSON - Zed over JSON](../formats/zjson.md) |
| `zng` | [ZNG - Binary Row Format](../formats/zng.md) |
| `zson` | [ZSON - Human-readable Format](../formats/zson.md) |
| `zson` | [ZSON - Human-readable Format](../formats/jsup.md) |

The output format defaults to either ZSON or ZNG and may be specified
with the `-f` option.
4 changes: 2 additions & 2 deletions docs/formats/README.md
@@ -8,7 +8,7 @@
> providing a unified approach to row, columnar, and human-readable formats. Together these
> represent a superset of both the dataframe/table model of relational systems and the
> semi-structured model that is used ubiquitously in development as JSON and by NoSQL
> data stores. The Super JSON spec has [a few examples](zson.md#3-examples).
> data stores. The Super JSON spec has [a few examples](jsup.md#3-examples).
## 1. Background

@@ -266,7 +266,7 @@ A set of companion documents define a family of tightly integrated
serialization formats that all adhere to the same super data model,
providing a unified approach to row, columnar, and human-readable formats:

* [Super JSON](zson.md) is a human-readable format for super-structured data. All JSON
* [Super JSON](jsup.md) is a human-readable format for super-structured data. All JSON
documents are Super JSON values as the Super JSON format is a strict superset of the JSON syntax.
* [Super Binary](zng.md) is a row-based, binary representation somewhat like
Avro but leveraging the super data model to represent a sequence of arbitrarily-typed
