Super-fy the Super Columnar format doc (#5399)
philrz authored Oct 31, 2024
1 parent 22fed24 commit 210836e
Showing 10 changed files with 151 additions and 135 deletions.
10 changes: 5 additions & 5 deletions CHANGELOG.md
@@ -165,7 +165,7 @@
> [specific guidance for users of the Zed CLI tools](https://github.com/brimdata/zed-lake-migration#zed-cli-tools).
* Zed lake storage format is now at version 3 (#4386, #4415)
-* Allow loading and responses in [VNG](docs/formats/vng.md) format over the lake API (#4345)
+* Allow loading and responses in [VNG](docs/formats/csup.md) format over the lake API (#4345)
* Fix an issue where [record spread expressions](docs/language/expressions.md#record-expressions) could cause a crash (#4359)
* Fix an issue where the Zed service `/version` endpoint returned "unknown" if it had been built via `go install` (#4371)
* Branch-level [meta-queries](docs/commands/zed.md#meta-queries) on the `main` branch no longer require an explicit `@main` reference (#4377, #4394)
@@ -177,7 +177,7 @@

## v1.5.0
* Add `float16` primitive type (#4301)
-* Add segment compression to the [VNG](docs/formats/vng.md) format (#4299)
+* Add segment compression to the [VNG](docs/formats/csup.md) format (#4299)
* Add `-unbuffered` flag to `zed` and `zq` (#4320)
* Add `-csv.delim` flag to `zed` and `zq` for reading CSV with non-comma delimiter (#4325)
* Add `csv.delim` query parameter to lake API for reading CSV with non-comma delimiter (#4333)
@@ -186,7 +186,7 @@
* Fix an issue where type decorators of union values were leaking into CSV output (#4338)

## v1.4.0
-* The ZST format is now called [VNG](docs/formats/vng.md) (#4256)
+* The ZST format is now called [VNG](docs/formats/csup.md) (#4256)
* Allow loading of "line" format over the lake API (#4229)
* Allow loading of Parquet format over the lake API (#4235)
* Allow loading of Zeek TSV format over the lake API (#4246)
@@ -629,7 +629,7 @@ questions.
## v0.23.0
* zql: Add `week` as a unit for [time grouping with `every`](docs/language/functions/every.md) (#1374)
* zq: Fix an issue where a `null` value in a [JSON type definition](docs/integrations/zeek/README.md) caused a failure without an error message (#1377)
-* zq: Add [`zst` format](docs/formats/vng.md) to `-i` and `-f` command-line help (#1384)
+* zq: Add [`zst` format](docs/formats/csup.md) to `-i` and `-f` command-line help (#1384)
* zq: ZNG spec and `zq` updates to introduce the beta ZNG storage format (#1375, #1415, #1394, #1457, #1512, #1523, #1529), also addressing the following:
* New data type `bytes` for storing sequences of bytes encoded as base64 (#1315)
* Improvements to the `enum` data type (#1314)
@@ -693,7 +693,7 @@ questions.
* zqd: Fix an issue where starting `zqd listen` created excess error messages when subdirectories were present (#1303)
* zql: Add the [`fuse` operator](docs/language/operators/fuse.md) for unifying records under a single schema (#1310, #1319, #1324)
* zql: Fix broken links in documentation (#1321, #1339)
-* zst: Introduce the [ZST format](docs/formats/vng.md) for columnar data based on ZNG (#1268, #1338)
+* zst: Introduce the [ZST format](docs/formats/csup.md) for columnar data based on ZNG (#1268, #1338)
* pcap: Fix an issue where certain pcapng files could fail import with a `bad option length` error (#1341)
* zql: [Document the `**` operator](docs/language/README.md#search-syntax) for type-specific searches that look within nested records (#1337)
* zar: Change the archive data file layout to prepare for handing chunk files with overlapping ranges and improved S3 support (#1330)
2 changes: 1 addition & 1 deletion docs/README.md
@@ -41,7 +41,7 @@ that underlie the super-structured data formats.
* The [super data formats](formats/README.md) are a family of
[human-readable (Super JSON, JSUP)](formats/jsup.md),
[sequential (Super Binary, BSUP)](formats/bsup.md), and
-[columnar (Super Columnar, CSUP)](formats/vng.md) formats that all adhere to the
+[columnar (Super Columnar, CSUP)](formats/csup.md) formats that all adhere to the
same abstract super data model.
* The [SuperPipe language](language/README.md) is the system's pipeline language for performing
queries, searches, analytics, transformations, or any of the above combined together.
2 changes: 1 addition & 1 deletion docs/commands/zed.md
@@ -118,7 +118,7 @@ replication easy to support and deploy.
The cloud objects that comprise a lake, e.g., data objects,
commit history, transaction journals, partial aggregations, etc.,
are stored as Zed data, i.e., either as [row-based Super Binary](../formats/bsup.md)
-or [columnar VNG](../formats/vng.md).
+or [Super Columnar](../formats/csup.md).
This makes introspection of the lake structure straightforward as many key
lake data structures can be queried with metadata queries and presented
to a client as Zed data for further processing by downstream tooling.
6 changes: 3 additions & 3 deletions docs/commands/zq.md
@@ -100,7 +100,7 @@ Note here that the query `1+1` [implies](../language/pipeline-model.md#implied-o
| `line` | no | One string value per input line |
| `parquet` | yes | [Apache Parquet](https://github.com/apache/parquet-format) |
| `tsv` | yes | [TSV - Tab-Separated Values](https://en.wikipedia.org/wiki/Tab-separated_values) |
-| `vng` | yes | [VNG - Binary Columnar Format](../formats/vng.md) |
+| `csup` | yes | [Super Columnar](../formats/csup.md) |
| `zeek` | yes | [Zeek Logs](https://docs.zeek.org/en/master/logs/index.html) |
| `zjson` | yes | [ZJSON - Zed over JSON](../formats/zjson.md) |
| `bsup` | yes | [Super Binary](../formats/bsup.md) |
@@ -158,7 +158,7 @@ JSON any number that appears without a decimal point as an integer type.

:::tip note
The reason `zq` is not particularly performant for ZSON is that the ZNG or
-[VNG](../formats/vng.md) formats are semantically equivalent to ZSON but much more efficient and
+[Super Columnar](../formats/csup.md) formats are semantically equivalent to ZSON but much more efficient and
the design intent is that these efficient binary formats should be used in
use cases where performance matters. ZSON is typically used only when
data needs to be human-readable in interactive settings or in automated tests.
@@ -186,7 +186,7 @@ typically omit quotes around field names.
| `table` | (described [below](#simplified-text-outputs)) |
| `text` | (described [below](#simplified-text-outputs)) |
| `tsv` | [TSV - Tab-Separated Values](https://en.wikipedia.org/wiki/Tab-separated_values) |
-| `vng` | [VNG - Binary Columnar Format](../formats/vng.md) |
+| `csup` | [Super Columnar](../formats/csup.md) |
| `zeek` | [Zeek Logs](https://docs.zeek.org/en/master/logs/index.html) |
| `zjson` | [ZJSON - Zed over JSON](../formats/zjson.md) |
| `bsup` | [Super Binary](../formats/bsup.md) |
2 changes: 1 addition & 1 deletion docs/formats/README.md
@@ -271,7 +271,7 @@ documents are Super JSON values as the Super JSON format is a strict superset of
* [Super Binary](bsup.md) is a row-based, binary representation somewhat like
Avro but leveraging the super data model to represent a sequence of arbitrarily-typed
values.
-* [Super Columnar](vng.md) is columnar like Parquet or ORC but also
+* [Super Columnar](csup.md) is columnar like Parquet or ORC but also
embodies the super data model for heterogeneous and self-describing schemas.
* [Super JSON over JSON](zjson.md) defines a format for encapsulating Super JSON
inside plain JSON for easy decoding by JSON-based clients, e.g.,