Super-fy the Super Columnar format doc (#5399)

brimdata · Oct 31, 2024 · 210836e
1 parent 22fed24
commit 210836e

Showing 10 changed files with 151 additions and 135 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -165,7 +165,7 @@
 > [specific guidance for users of the Zed CLI tools](https://github.com/brimdata/zed-lake-migration#zed-cli-tools).
 
 * Zed lake storage format is now at version 3 (#4386, #4415)
-* Allow loading and responses in [VNG](docs/formats/vng.md) format over the lake API (#4345)
+* Allow loading and responses in [VNG](docs/formats/csup.md) format over the lake API (#4345)
 * Fix an issue where [record spread expressions](docs/language/expressions.md#record-expressions) could cause a crash (#4359)
 * Fix an issue where the Zed service `/version` endpoint returned "unknown" if it had been built via `go install` (#4371)
 * Branch-level [meta-queries](docs/commands/zed.md#meta-queries) on the `main` branch no longer require an explicit `@main` reference (#4377, #4394)
@@ -177,7 +177,7 @@
 
 ## v1.5.0
 * Add `float16` primitive type (#4301)
-* Add segment compression to the [VNG](docs/formats/vng.md) format (#4299)
+* Add segment compression to the [VNG](docs/formats/csup.md) format (#4299)
 * Add `-unbuffered` flag to `zed` and `zq` (#4320)
 * Add `-csv.delim` flag to `zed` and `zq` for reading CSV with non-comma delimiter (#4325)
 * Add `csv.delim` query parameter to lake API for reading CSV with non-comma delimiter (#4333)
@@ -186,7 +186,7 @@
 * Fix an issue where type decorators of union values were leaking into CSV output (#4338)
 
 ## v1.4.0
-* The ZST format is now called [VNG](docs/formats/vng.md) (#4256)
+* The ZST format is now called [VNG](docs/formats/csup.md) (#4256)
 * Allow loading of "line" format over the lake API (#4229)
 * Allow loading of Parquet format over the lake API (#4235)
 * Allow loading of Zeek TSV format over the lake API (#4246)
@@ -629,7 +629,7 @@ questions.
 ## v0.23.0
 * zql: Add `week` as a unit for [time grouping with `every`](docs/language/functions/every.md) (#1374)
 * zq: Fix an issue where a `null` value in a [JSON type definition](docs/integrations/zeek/README.md) caused a failure without an error message (#1377)
-* zq: Add [`zst` format](docs/formats/vng.md) to `-i` and `-f` command-line help (#1384)
+* zq: Add [`zst` format](docs/formats/csup.md) to `-i` and `-f` command-line help (#1384)
 * zq: ZNG spec and `zq` updates to introduce the beta ZNG storage format (#1375, #1415, #1394, #1457, #1512, #1523, #1529), also addressing the following:
    * New data type `bytes` for storing sequences of bytes encoded as base64 (#1315)
    * Improvements to the `enum` data type (#1314)
@@ -693,7 +693,7 @@ questions.
 * zqd: Fix an issue where starting `zqd listen` created excess error messages when subdirectories were present (#1303)
 * zql: Add the [`fuse` operator](docs/language/operators/fuse.md) for unifying records under a single schema (#1310, #1319, #1324)
 * zql: Fix broken links in documentation (#1321, #1339)
-* zst: Introduce the [ZST format](docs/formats/vng.md) for columnar data based on ZNG (#1268, #1338)
+* zst: Introduce the [ZST format](docs/formats/csup.md) for columnar data based on ZNG (#1268, #1338)
 * pcap: Fix an issue where certain pcapng files could fail import with a `bad option length` error (#1341)
 * zql: [Document the `**` operator](docs/language/README.md#search-syntax) for type-specific searches that look within nested records (#1337)
 * zar: Change the archive data file layout to prepare for handing chunk files with overlapping ranges and improved S3 support (#1330)

diff --git a/docs/README.md b/docs/README.md
@@ -41,7 +41,7 @@ that underlie the super-structured data formats.
 * The [super data formats](formats/README.md) are a family of
 [human-readable (Super JSON, JSUP)](formats/jsup.md),
 [sequential (Super Binary, BSUP)](formats/bsup.md), and
-[columnar (Super Columnar, CSUP)](formats/vng.md) formats that all adhere to the
+[columnar (Super Columnar, CSUP)](formats/csup.md) formats that all adhere to the
 same abstract super data model.
 * The [SuperPipe language](language/README.md) is the system's pipeline language for performing
 queries, searches, analytics, transformations, or any of the above combined together.

diff --git a/docs/commands/zed.md b/docs/commands/zed.md
@@ -118,7 +118,7 @@ replication easy to support and deploy.
 The cloud objects that comprise a lake, e.g., data objects,
 commit history, transaction journals, partial aggregations, etc.,
 are stored as Zed data, i.e., either as [row-based Super Binary](../formats/bsup.md)
-or [columnar VNG](../formats/vng.md).
+or [Super Columnar](../formats/csup.md).
 This makes introspection of the lake structure straightforward as many key
 lake data structures can be queried with metadata queries and presented
 to a client as Zed data for further processing by downstream tooling.

diff --git a/docs/commands/zq.md b/docs/commands/zq.md
@@ -100,7 +100,7 @@ Note here that the query `1+1` [implies](../language/pipeline-model.md#implied-o
 | `line`    |  no  | One string value per input line |
 | `parquet` |  yes | [Apache Parquet](https://github.com/apache/parquet-format) |
 | `tsv`     |  yes | [TSV - Tab-Separated Values](https://en.wikipedia.org/wiki/Tab-separated_values) |
-| `vng`     |  yes | [VNG - Binary Columnar Format](../formats/vng.md) |
+| `csup`    |  yes | [Super Columnar](../formats/csup.md) |
 | `zeek`    |  yes | [Zeek Logs](https://docs.zeek.org/en/master/logs/index.html) |
 | `zjson`   |  yes | [ZJSON - Zed over JSON](../formats/zjson.md) |
 | `bsup`    |  yes | [Super Binary](../formats/bsup.md) |
@@ -158,7 +158,7 @@ JSON any number that appears without a decimal point as an integer type.
 
 :::tip note
 The reason `zq` is not particularly performant for ZSON is that the ZNG or
-[VNG](../formats/vng.md) formats are semantically equivalent to ZSON but much more efficient and
+[Super Columnar](../formats/csup.md) formats are semantically equivalent to ZSON but much more efficient and
 the design intent is that these efficient binary formats should be used in
 use cases where performance matters.  ZSON is typically used only when
 data needs to be human-readable in interactive settings or in automated tests.
@@ -186,7 +186,7 @@ typically omit quotes around field names.
 | `table`   | (described [below](#simplified-text-outputs)) |
 | `text`    | (described [below](#simplified-text-outputs)) |
 | `tsv`     | [TSV - Tab-Separated Values](https://en.wikipedia.org/wiki/Tab-separated_values) |
-| `vng`     | [VNG - Binary Columnar Format](../formats/vng.md) |
+| `csup`    | [Super Columnar](../formats/csup.md) |
 | `zeek`    | [Zeek Logs](https://docs.zeek.org/en/master/logs/index.html) |
 | `zjson`   | [ZJSON - Zed over JSON](../formats/zjson.md) |
 | `bsup`    | [Super Binary](../formats/bsup.md) |

diff --git a/docs/formats/README.md b/docs/formats/README.md
@@ -271,7 +271,7 @@ documents are Super JSON values as the Super JSON format is a strict superset of
 * [Super Binary](bsup.md) is a row-based, binary representation somewhat like
 Avro but leveraging the super data model to represent a sequence of arbitrarily-typed
 values.
-* [Super Columnar](vng.md) is columnar like Parquet or ORC but also
+* [Super Columnar](csup.md) is columnar like Parquet or ORC but also
 embodies the super data model for heterogeneous and self-describing schemas.
 * [Super JSON over JSON](zjson.md) defines a format for encapsulating Super JSON
 inside plain JSON for easy decoding by JSON-based clients, e.g.,