Streaming uncompressed data (with direct access) #773

rgaudin · 2024-05-23T18:50:01Z

What we do on Android for videos doesn't seem directly possible on iOS. The Android approach combines tricks that are missing on iOS (at least I did not find them easily).
First, AssetFileDescriptor lets us mimic an asset (a clearly bounded piece of data) from a file path, an offset and a size.
Then, there is the integration between the Android MediaPlayer and the WebChromeClient renderer (with some hacks to transition smoothly).

I found other people asking about AVPlayer and WKWebView integration, but without answers. Maybe the wording or the keywords were wrong…

Despite this, I had some success with an easy tweak to the code: streaming the data to the client. Currently, we request the content from libzim, store it in a variable and put it on the response. On the renderer side, this content is read (sometimes not completely) and used.

This is OK for relatively small content, or maybe even large content that is consumed entirely by the client (RAM will be required), but in the video use case we know that the client won't even try to read the whole thing and will display it piece by piece.

Doing so is as easy as repeatedly calling urlSchemeTask.didReceive(additionalData)…

let response = HTTPURLResponse(
    url: url,
    statusCode: statusCode,
    httpVersion: "HTTP/1.1",
    headerFields: headers)
urlSchemeTask.didReceive(response!)

...

// Send each full-size chunk as soon as it is read.
// 0..<nbStreams is safe when nbStreams is 0; 1...nbStreams would trap.
for _ in 0..<nbStreams {
    partEnd = partStart + streamThreshold
    content = ZimFileService.shared.getURLContent(url: url, start: partStart, end: partEnd)
    urlSchemeTask.didReceive(content!.data)
    partStart = partEnd
}
// Send the remainder that does not fill a whole chunk.
if finalBytes > 0 {
    content = ZimFileService.shared.getURLContent(url: url, start: partStart, end: partStart + finalBytes)
    urlSchemeTask.didReceive(content!.data)
}
urlSchemeTask.didFinish()

This is very efficient in keeping RAM usage under control on very large videos.

I don't know exactly how it works internally but simply looping on writing 2MB chunks does the trick so I suppose renderer-reading is synced somehow.
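
To make the chunk arithmetic concrete, here is a minimal sketch of how the byte ranges could be computed from a total size and a 2MB threshold. chunkRanges is a hypothetical helper, not something from the app, and it assumes the end offset is exclusive, which the snippet above suggests but which I have not checked against getURLContent.

```swift
import Foundation

/// Hypothetical helper: splits `totalSize` bytes into consecutive
/// half-open ranges of at most `chunkSize` bytes each.
func chunkRanges(totalSize: Int, chunkSize: Int) -> [Range<Int>] {
    precondition(chunkSize > 0, "chunkSize must be positive")
    guard totalSize > 0 else { return [] }
    return stride(from: 0, to: totalSize, by: chunkSize).map { start in
        // The last range is shortened so it never runs past the content.
        start..<min(start + chunkSize, totalSize)
    }
}

let twoMB = 2 * 1024 * 1024
let ranges = chunkRanges(totalSize: 5 * 1024 * 1024, chunkSize: twoMB)
for range in ranges {
    print("chunk \(range.lowerBound)..<\(range.upperBound) (\(range.count) bytes)")
}
```

With a 5MB entry this yields two full 2MB chunks and one 1MB remainder, which matches the nbStreams / finalBytes split used in the loop.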

Another improvement, independent from this one, is reading video files directly from the filesystem. Leveraging item.getDirectAccessInformation(), which returns the ZIM path on the filesystem and the offset at which the content starts, we can easily read the video data from it (we already know its size).

WARN ⚠️: We can't pass the file handle directly to the webview because FileHandle has no size parameter, so it would not stop reading at the end of the content. The streaming experiment above shows we might not need this, but we could still reimplement a FileHandle that stops after a defined size.
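
A reader that stops after a defined size could look roughly like the sketch below. BoundedFileReader is a hypothetical name and nothing here comes from the app; the path, offset and size are assumed to be supplied by the caller (for example from the direct access information plus the known entry size). It relies on FileHandle's seek(toOffset:) and read(upToCount:), which need iOS 13.4 / macOS 10.15.4 or later.

```swift
import Foundation

/// Hypothetical wrapper: reads at most `size` bytes starting at `offset`,
/// so it never runs past the end of the entry into the rest of the file.
final class BoundedFileReader {
    private let handle: FileHandle
    private var remaining: Int

    init(path: String, offset: UInt64, size: Int) throws {
        handle = try FileHandle(forReadingFrom: URL(fileURLWithPath: path))
        remaining = size
        try handle.seek(toOffset: offset)
    }

    /// Returns the next chunk, or nil once `size` bytes were delivered
    /// (or the file ended early).
    func readChunk(maxLength: Int) throws -> Data? {
        guard remaining > 0 else { return nil }
        let wanted = min(maxLength, remaining)
        guard let data = try handle.read(upToCount: wanted), !data.isEmpty else {
            return nil
        }
        remaining -= data.count
        return data
    }

    deinit { try? handle.close() }
}

// Demo: a temp file stands in for the ZIM, bytes 4..<10 stand in for the entry.
let demoURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("bounded-reader-demo.bin")
try Data("0123456789ABCDEF".utf8).write(to: demoURL)

let reader = try BoundedFileReader(path: demoURL.path, offset: 4, size: 6)
var collected = Data()
while let chunk = try reader.readChunk(maxLength: 4) {
    collected.append(chunk)
}
print(String(decoding: collected, as: UTF8.self))
```

Each chunk returned by readChunk could be handed to urlSchemeTask.didReceive in the same loop as the streaming experiment.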

WARN ⚠️: getDirectAccessInformation only works on raw (uncompressed) entries which is something that's decided at ZIM-write time.

In my experiment, I used it on non-text/ mimetypes because I know that libzim currently only compresses text/ types. Downloads (un-handled formats as you call them) would similarly benefit from it, I suppose.

In a real implementation, we might check whether the entry is compressed (is libzim telling us this?) or use a fallback in case the function returns empty data (it doesn't fail…).
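
The fallback idea could be sketched as follows. Everything in it is an assumption for illustration: DirectAccessInfo is a hypothetical shape for what the direct access call returns, and treating an empty path as "not available" is my guess at what "returns empty data" looks like on the Swift side.

```swift
import Foundation

/// Hypothetical shape of the direct access result (not the real binding).
struct DirectAccessInfo {
    let path: String
    let offset: UInt64
}

enum ReadStrategy: Equatable {
    case directFile(path: String, offset: UInt64)
    case viaLibzim
}

/// Picks direct file reading when usable information is present,
/// otherwise falls back to reading through libzim.
func chooseStrategy(directAccess info: DirectAccessInfo?) -> ReadStrategy {
    // Assumption: compressed entries yield nil or an empty path.
    guard let info = info, !info.path.isEmpty else { return .viaLibzim }
    return .directFile(path: info.path, offset: info.offset)
}

let compressed = chooseStrategy(directAccess: DirectAccessInfo(path: "", offset: 0))
let raw = chooseStrategy(directAccess: DirectAccessInfo(path: "/tmp/demo.zim", offset: 1024))
print(compressed, raw)
```

Deciding per entry this way would avoid hard-coding the mimetype check from my experiment.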

On whether we should use it or not, I don't know.

@mgautierfr, what do you think of using getDirectAccessInformation() and reading from the filesystem instead of reading through libzim? Is it worth the separate implementation code? What about other non-compressed content like PDF?


kelson42 · 2024-05-23T19:18:52Z

@BPerlakiH Any chance you can implement this for video files and get a chance to fix #744?

BPerlakiH · 2024-05-24T07:30:40Z

@kelson42 @rgaudin I think this is a very good direction. I also had a look at:

urlSchemeTask.didReceive(data)

to be used on partial chunks. I just need to wrap that into some nicer error handling (as theoretically reading any given chunk can fail).
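
As a rough idea of what that error handling could look like, here is a sketch that stops at the first chunk that fails to read and reports the failure instead of finishing normally. ChunkSink is a hypothetical protocol that only mirrors the didReceive / didFinish / didFailWithError calls of WKURLSchemeTask so the example runs without WebKit; ChunkReadError and stream are likewise made up for illustration.

```swift
import Foundation

/// Hypothetical stand-in for the WKURLSchemeTask calls used when streaming.
protocol ChunkSink {
    func didReceive(_ data: Data)
    func didFinish()
    func didFailWithError(_ error: Error)
}

struct ChunkReadError: Error {
    let range: Range<Int>
}

/// Streams every range to the sink; stops and reports on the first failed read.
func stream(ranges: [Range<Int>], to sink: ChunkSink, read: (Range<Int>) -> Data?) {
    for range in ranges {
        guard let data = read(range) else {
            sink.didFailWithError(ChunkReadError(range: range))
            return // never call didFinish after a failure
        }
        sink.didReceive(data)
    }
    sink.didFinish()
}

/// Records calls so the behaviour can be inspected.
final class RecordingSink: ChunkSink {
    var events: [String] = []
    func didReceive(_ data: Data) { events.append("data(\(data.count))") }
    func didFinish() { events.append("finish") }
    func didFailWithError(_ error: Error) { events.append("fail") }
}

let sink = RecordingSink()
stream(ranges: [0..<4, 4..<8, 8..<10], to: sink) { range in
    // Simulate a failing read for the second chunk.
    range.lowerBound == 4 ? nil : Data(count: range.count)
}
print(sink.events)
```

The first chunk is delivered, the second one fails, and nothing is sent after that.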

I also had a look at AVPlayer earlier, which can be started with AVAsset/AVPlayerItem. Unfortunately it does not support webm directly at this stage.
It's also possible to have our own AVAssetReader, but it doesn't get close enough to file reading, so I couldn't find a way to inject our ZIM file reading mechanism somewhere "in between".

I am setting up a PR for this reading optimisation as a standalone improvement for video files (without the HTTP range requests).

mgautierfr · 2024-05-24T11:42:49Z

> @mgautierfr, what do you think of using getDirectAccessInformation() and reading from filesystem instead of reading from the libzim? Is it worth the separate implementation code?

I can't really answer about the technical difficulties of implementing that with "Apple technologies". But getDirectAccessInformation is here to allow user code to bypass libzim and read the content directly by reopening the file, seeking and reading (mmap is also a solution).
So I would say yes.
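
For the mmap variant, a minimal Swift sketch might look like this. It is only an illustration under assumptions: mappedSlice and SliceOutOfBounds are hypothetical names, and it uses Foundation's Data(contentsOf:options:) with .mappedIfSafe, which asks for a memory mapping when the system considers it safe, then copies out only the requested range.

```swift
import Foundation

struct SliceOutOfBounds: Error {}

/// Hypothetical helper: maps the file and copies out only `size` bytes
/// starting at `offset`, instead of loading the whole file.
func mappedSlice(path: String, offset: Int, size: Int) throws -> Data {
    let mapped = try Data(contentsOf: URL(fileURLWithPath: path), options: .mappedIfSafe)
    // Written to avoid overflow in `offset + size` before the bounds check.
    guard offset >= 0, size >= 0, size <= mapped.count, offset <= mapped.count - size else {
        throw SliceOutOfBounds()
    }
    return mapped.subdata(in: offset..<(offset + size))
}

// Demo: a temp file stands in for the ZIM, bytes 4..<10 stand in for the entry.
let demoURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("mapped-slice-demo.bin")
try Data("0123456789ABCDEF".utf8).write(to: demoURL)

let entry = try mappedSlice(path: demoURL.path, offset: 4, size: 6)
print(String(decoding: entry, as: UTF8.self))
```

For large videos the slice would be requested chunk by chunk rather than for the whole entry at once.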

> What about other non-compressed content like PDF?

getDirectAccessInformation works equally for any non-compressed content (as long as the content is not split between two file parts). I'm not sure it's worth it, as PDF content is pretty small compared to video, but it would work too.

BPerlakiH · 2024-05-27T20:18:16Z

@mgautierfr I have found an issue related to this in libzim 9.2.0, please have a look if you can re-create it:
openzim/libzim#886

kelson42 · 2024-05-28T04:47:07Z

@BPerlakiH I thought we decided to do the read operation directly, without using libzim?!

BPerlakiH · 2024-05-29T00:38:06Z

I've created a PR for this. It is currently in draft but can be tested and reviewed to see if it makes sense:
#778

BPerlakiH · 2024-06-01T09:18:33Z

As discussed, I am narrowing this issue down to uncompressed data (with direct access). The follow-up ticket for compressed data is here:
#784

rgaudin added the question label May 23, 2024

rgaudin assigned kelson42 and BPerlakiH May 23, 2024

kelson42 added this to the 3.4.0 milestone May 23, 2024

kelson42 removed their assignment May 27, 2024

kelson42 added the enhancement label May 27, 2024

BPerlakiH linked a pull request May 29, 2024 that will close this issue

773 streaming uncompressed data #778

Merged

BPerlakiH mentioned this issue May 29, 2024

773 streaming uncompressed data #778

Merged

BPerlakiH mentioned this issue Jun 1, 2024

Stream data "in chunks" for compressed data via libzim #784

Closed

BPerlakiH changed the title ~~Streaming data~~ Streaming uncompressed data (with direct access) Jun 1, 2024

kelson42 closed this as completed in #778 Jun 2, 2024
