Can we implement a video codec with Zarr? #157
As discussed at the community meeting, you can already use a video codec with Zarr v2: just set the chunk size along the time dimension to be greater than 1 (presumably some multiple of the number of frames per keyframe). (Zarr v3 doesn't currently differ from v2 in a way that would make this easier or harder.) In principle it seems pretty straightforward, though you might discover some issues if you actually test it out. For example, one possible issue is that, depending on the format, the overhead of the header, or of initializing the codec, may be high if every chunk is encoded as a separate video "file" with just a single keyframe. That could be mitigated either by making the chunk size larger (easy) or by some other approach that is more work.
@alxmrs Are there more open questions? If not, I'd close this issue; please feel free to open another issue about a concrete video codec implementation as an extension.
Sounds good to me! Thank you.
Here's a thought experiment: Given the new version of the Zarr specification (#149) along with its extension system, could we implement a video codec?
Why is this a useful question? It tests how flexible the new Zarr spec is. Video is a ubiquitous type of data that overlaps with many of Zarr's core concerns. A huge amount of engineering effort and thought goes into making video codecs (and that is still quite an understatement). If Zarr aims to be a "meta format" for the cloud, it will eventually have to handle difficult cases like this.
Video use cases have an interesting overlap with representing scientific data: video is multi-band (images plus sound), bands include metadata, compression is required to store or work with the data, and chunks of data are often streamed in as they are used.
My initial estimate (especially given the last Zarr open discussion) is that the core nuance video formats introduce is a totally different chunking strategy: data is typically compressed across time, where keyframes contain full images for specific intervals of time, and the remaining frames are represented as diffs.
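To make the keyframe-plus-diffs idea concrete, here is a toy sketch in NumPy. The function names `encode_gop` and `decode_gop` are hypothetical, and this deliberately omits everything that makes real codecs effective (motion compensation, transforms, entropy coding); it only shows the chunk-level structure:

```python
import numpy as np

def encode_gop(frames: np.ndarray):
    """Toy encoding of a group of pictures: one full keyframe plus
    signed frame-to-frame diffs. Not a real codec, just the shape
    of the idea described above."""
    keyframe = frames[0]
    diffs = np.diff(frames.astype(np.int16), axis=0)  # signed deltas
    return keyframe, diffs

def decode_gop(keyframe: np.ndarray, diffs: np.ndarray) -> np.ndarray:
    """Rebuild the frames by accumulating diffs onto the keyframe."""
    frames = np.concatenate(
        [keyframe[None].astype(np.int16), diffs]
    ).cumsum(axis=0)
    return frames.astype(keyframe.dtype)

# Round-trip a small synthetic GOP of 8 frames of 4x4 pixels.
frames = np.random.randint(0, 256, size=(8, 4, 4), dtype=np.uint8)
key, diffs = encode_gop(frames)
assert np.array_equal(decode_gop(key, diffs), frames)
```

In a Zarr setting, each chunk along the time axis would hold one such encoded GOP, which is why the chunk depth needs to be a multiple of the keyframe interval.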
If Zarr could be extended to mimic some of the approaches used in video, it could be used to efficiently store time-series data, no matter the scientific domain. It could become a https://mpeg-g.org/ for non-bioinformatics workloads.
To further hypothesize, the infrastructure that makes video possible includes hardware acceleration. In this approach, could Zarr implementations make use of such optimizations?