Backblaze B2's Live Read feature allows clients to read multipart uploads before they are complete, combining the flexibility of uploading a stream of data as multiple files with the manageability of keeping the stream in a single file. This is particularly useful in working with live video streams using formats such as Fragmented MP4.
Backblaze B2 Live Read is currently in private preview; see the announcement blog post for details and for how to join the preview.
This webinar explains how Live Read works and shows it in action, using OBS Studio to generate a live video stream.
This short video shows a simpler version of the demo, using FFmpeg to capture video from a webcam.
A producer client starts a Live Read upload by sending a CreateMultipartUpload request with one or two custom HTTP headers: `x-backblaze-live-read-enabled` and, optionally, `x-backblaze-live-read-part-size`. `x-backblaze-live-read-enabled` must be set to `true` for a Live Read upload to be initiated, while `x-backblaze-live-read-part-size` may be set to the part size that will be used. If `x-backblaze-live-read-enabled` is set to `true` and `x-backblaze-live-read-part-size` is not present, then the size of the first part is used. All parts except the last one must have the same size.
The producer client then uploads a series of parts, via `UploadPart`, as normal. As noted above, all parts except the last one must have the same size. Once the producer client has uploaded all of its data, it calls `CompleteMultipartUpload`, again as it usually would.
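Putting the producer flow together, here is a minimal sketch using boto3 and its event system (described further below). The bucket and key names are placeholders, and the part-size header value is assumed to be a byte count; `writer.py`, described later in this README, is the full implementation:

```python
import sys

import boto3

PART_SIZE = 5 * 1024 * 1024  # the minimum part size

def add_live_read_headers(params, **kwargs):
    # Enable Live Read on the CreateMultipartUpload call. The part-size
    # header is optional; without it, the size of the first part is used.
    params['headers']['x-backblaze-live-read-enabled'] = 'true'
    params['headers']['x-backblaze-live-read-part-size'] = str(PART_SIZE)

s3 = boto3.client('s3')
s3.meta.events.register('before-call.s3.CreateMultipartUpload', add_live_read_headers)

upload_id = s3.create_multipart_upload(Bucket='my-bucket', Key='myfile.mp4')['UploadId']

# Upload stdin as a series of equal-sized parts, then complete the upload
parts = []
part_number = 1
while chunk := sys.stdin.buffer.read(PART_SIZE):
    response = s3.upload_part(Bucket='my-bucket', Key='myfile.mp4',
                              UploadId=upload_id, PartNumber=part_number, Body=chunk)
    parts.append({'PartNumber': part_number, 'ETag': response['ETag']})
    part_number += 1

s3.complete_multipart_upload(Bucket='my-bucket', Key='myfile.mp4', UploadId=upload_id,
                             MultipartUpload={'Parts': parts})
```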
Under the standard S3 API semantics, consumer clients must wait for the upload to be completed before they may download any data from the file. With Live Read, in contrast, consumer clients may attempt to download data from the file at any time after the upload is created, using `GetObject` with the custom HTTP header `x-backblaze-live-read-enabled` set to `true`. Consumer clients MUST include either `Range` or `PartNumber` in the `GetObject` call to specify the required portion of the file. If the client requests a range or part that does not exist, then Backblaze B2 responds with a `416 Range Not Satisfiable` error. On receiving this error, a consumer client might repeatedly retry the request, waiting for a short interval after each unsuccessful request.
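A sketch of that consumer pattern with boto3, again with placeholder bucket and key names and an arbitrary retry interval:

```python
import sys
import time

import boto3
from botocore.exceptions import ClientError

def add_live_read_header(params, **kwargs):
    # Mark the GetObject call as a Live Read download.
    params['headers']['x-backblaze-live-read-enabled'] = 'true'

s3 = boto3.client('s3')
s3.meta.events.register('before-call.s3.GetObject', add_live_read_header)

part_number = 1
while True:
    try:
        part = s3.get_object(Bucket='my-bucket', Key='myfile.mp4', PartNumber=part_number)
        sys.stdout.buffer.write(part['Body'].read())
        part_number += 1
    except ClientError as error:
        if error.response['ResponseMetadata']['HTTPStatusCode'] != 416:
            raise
        # The part doesn't exist yet - wait a moment, then retry.
        # A real client also needs an exit condition; see reader.py below.
        time.sleep(1)
```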
After the upload is completed, clients can retrieve the file using standard S3 API calls.
This repository contains a pair of simple Python apps that use boto3, the AWS SDK for Python, to write and read Live Read uploads:

- `writer.py` creates a Live Read upload, then reads its standard input in chunks corresponding to the desired part size, which defaults to the minimum part size, 5 MB. Each chunk is uploaded as a part. When the app receives end-of-file from `stdin`, it completes the upload. A signal handler ensures that pending data is uploaded if the app receives `SIGINT` (Ctrl+C) or `SIGTERM` (the default signal sent by the `kill` command).
- `reader.py` reads a Live Read upload. The app attempts to download the file part by part. If the file does not yet exist, the app retries until it does. If a part is not available, the app uses `ListMultipartUploads` to check whether the upload is still in progress. If it is, then the app retries getting the part; otherwise, the app terminates, since the upload has been completed.
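That in-progress check might look something like the following sketch (reader.py's actual logic may differ):

```python
def upload_in_progress(s3, bucket, key):
    # If the key no longer appears in the list of in-progress multipart
    # uploads, the upload has been completed (or canceled).
    uploads = s3.list_multipart_uploads(Bucket=bucket, Prefix=key).get('Uploads', [])
    return any(upload['Key'] == key for upload in uploads)
```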
The apps use boto3's Event System to inject the custom headers into the relevant SDK calls. For example, in the writer:
self.b2_client = boto3.client('s3')
logger.debug("Created boto3 client")
self.b2_client.meta.events.register('before-call.s3.CreateMultipartUpload', add_custom_header)
...
def add_custom_header(params, **_kwargs):
"""
Add the Live Read custom headers to the outgoing request.
See https://boto3.amazonaws.com/v1/documentation/api/latest/guide/events.html
"""
params['headers']['x-backblaze-live-read-enabled'] = 'true'
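The reader presumably registers a similar handler for its download calls, along the lines of:

```python
self.b2_client.meta.events.register('before-call.s3.GetObject', add_custom_header)
```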
MP4 video files typically begin or end with metadata describing the video data - its duration, resolution, codec, etc. This metadata is known as the `moov` atom. The default placement is at the end of the file, as the metadata is not available until the video data has been rendered. Video files intended for progressive download use an optimization known as "fast start", where the rendering app leaves space for the metadata at the beginning of the file, writes the video data, then overwrites the placeholder with the actual metadata. This optimization allows media viewers to start playing the video while it is being downloaded.
MP4 video streams, in contrast, typically comprise a metadata header containing information such as track and sample descriptions, followed by a series of 'fragments' of video data, each containing its own metadata, known as media segments. This format is termed Fragmented MPEG-4, abbreviated as fMP4. Video stream creators choose an appropriate fragment size in the region of two to six seconds. Shorter fragments allow lower latency and faster encoding, while longer fragments allow better encoding efficiency.
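You can see this structure for yourself by walking the top-level boxes ('atoms') of an MP4 file: each box starts with a 4-byte big-endian size (which includes the 8-byte header) and a 4-byte type. A fragmented file shows a single `moov` followed by repeating `moof`/`mdat` pairs. A minimal sketch, which ignores the rare zero and 64-bit size encodings:

```python
import struct
import sys

# Walk the top-level MP4 boxes: 4-byte big-endian size, then 4-byte type.
with open(sys.argv[1], 'rb') as f:
    while header := f.read(8):
        size, box_type = struct.unpack('>I4s', header)
        print(box_type.decode('ascii', 'replace'), size)
        f.seek(size - 8, 1)  # skip the box body
```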
Historically, fMP4 fragments were written to storage as individual files, with a Playlist file listing the media segment files comprising a stream. During a live stream, the Playlist file would be updated as each media segment file was written.
As an example, a one-hour live stream of 1920x1080 video at 30 frames/second, with a fragment length of two seconds would comprise the Playlist file and 1,800 media segment files of around 900 kB each. A video player must read the Playlist, then make an HTTPS request for each media segment file.
This demo shows how Live Read allows an fMP4 stream to be written to a single multipart file. A video player can read already-uploaded parts while the file is still being written. The main constraint is that the S3 API imposes a minimum part size of 5 MB. For our 1080p 30 fps example, this means that there is a minimum latency of about six seconds between the video data being written and it being available for download.
The demo instructions below include:

- Using FFmpeg to capture video from a webcam or RTMP stream.
- Piping raw video to `ffplay` for monitoring.
- Piping fMP4 video to `writer.py` to be written to a Live Read file.
- Using `reader.py` to play back video from a Live Read file.
- Piping fMP4 video to FFmpeg for conversion to HTTP Live Streaming (HLS) format.
This demo was created on a MacBook Pro with an Apple M1 Pro CPU and macOS Sonoma 14.4.1 running Python 3.11.5 and FFmpeg version 7.0. It should also run on Linux if you change the input device and URL appropriately.
Follow these instructions, as necessary:
- Create a Backblaze B2 Account.
- Create a Backblaze B2 Bucket.
- Create an Application Key with access to the bucket you wish to use.
Be sure to copy the application key as soon as you create it, as you will not be able to retrieve it later!
git clone [email protected]:backblaze-b2-samples/live-read-demo.git
cd live-read-demo
Virtual environments allow you to encapsulate a project's dependencies; we recommend that you create a virtual environment thus:
python3 -m venv .venv
You must then activate the virtual environment before installing dependencies:
source .venv/bin/activate
You will need to reactivate the virtual environment, with the same command, if you close your Terminal window and return to the demo later. Both the producer and consumer apps use the same dependencies and can share the same virtual environment. If you are running the two apps in separate Terminal windows, remember to activate the virtual environment in the second window before you run the app!
pip install -r requirements.txt
If you do not already have FFmpeg installed on your system, you can download it from one of the links at https://ffmpeg.org/download.html or use a package manager to install it. For example, using Homebrew on a Mac:
brew install ffmpeg
Since we want FFmpeg to write two streams, we create a named pipe ('fifo') for it to send the raw video stream to `ffplay`:
mkfifo raw_video_fifo
The demo apps read their configuration from a `.env` file. Copy the included `.env.template` to `.env`:
cp .env.template .env
Now edit `.env`, pasting in your application key, its ID, your bucket name, and the bucket endpoint:
# Copy this file to .env then edit the following values
AWS_ACCESS_KEY_ID='<Your Backblaze application key ID>'
AWS_SECRET_ACCESS_KEY='<Your Backblaze application key>'
AWS_ENDPOINT_URL='<Your bucket endpoint, prefixed with https://, for example, https://s3.us-west-004.backblazeb2.com>'
BUCKET_NAME='<Your Backblaze B2 bucket name>'
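boto3 picks up `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and (in recent versions) `AWS_ENDPOINT_URL` from the environment, so loading the `.env` file is all the configuration the client needs. A sketch, assuming the `python-dotenv` package (the demo apps' actual loading code may differ):

```python
import os

import boto3
from dotenv import load_dotenv

load_dotenv()  # populate os.environ from the .env file
s3 = boto3.client('s3')  # credentials and endpoint come from the environment
bucket = os.environ['BUCKET_NAME']
```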
Now we can start FFplay in the background, reading the fifo, and FFmpeg in the foreground, writing raw video to the fifo and fMP4 to its standard output. We pipe the standard output into `writer.py`, giving it a B2 object key. The `--debug` flag provides some useful insight into its operation.
ffplay -vf "drawtext=text='%{pts\:hms}':fontsize=72:box=1:x=(w-tw)/2:y=h-(2*lh)" \
-f rawvideo -video_size 1920x1080 -framerate 30 -pixel_format uyvy422 raw_video_fifo &
ffmpeg -f avfoundation -video_size 1920x1080 -r 30 -pix_fmt uyvy422 -probesize 10000000 -i "0:0" \
-f rawvideo raw_video_fifo -y \
-f mp4 -vcodec libx264 -g 60 -movflags empty_moov+frag_keyframe - | \
python writer.py myfile.mp4 --debug
Picking apart the FFmpeg command line:
ffmpeg -f avfoundation -video_size 1920x1080 -r 30 -pix_fmt uyvy422 -probesize 10000000 -i "0:0" \

- FFmpeg reads video and audio data from AVFoundation input devices `0` and `0` respectively. On my MacBook Pro, these are the built-in FaceTime HD camera and MacBook Pro microphone. Video data is captured at 1920x1080 resolution (1080p) and 30 fps, using the `uyvy422` pixel format. FFmpeg will analyze the first 10 MB of data to get stream information (omitting this option results in the warning `not enough frames to estimate rate; consider increasing probesize`).

  On a Mac, you can list the available video and audio devices with `ffmpeg -f avfoundation -list_devices true -i ""`.
-f rawvideo raw_video_fifo -y \
- The raw video stream is sent, without any processing, to the fifo. Using the raw video stream minimizes latency in monitoring the webcam video.
-f mp4 -vcodec libx264 -g 60 -movflags empty_moov+frag_keyframe - | \

- A second stream is encoded as MP4 using the H.264 codec and sent to `stdout`. FFmpeg writes an empty `moov` atom at the start of the stream, then sends fragments of up to 60 frames (two seconds), with a key frame at the start of each fragment.
ffplay -vf "drawtext=text='%{pts\:hms}':fontsize=72:box=1:x=(w-tw)/2:y=h-(2*lh)" \
-f rawvideo -video_size 1920x1080 -framerate 30 -pixel_format uyvy422 raw_video_fifo &
The `ffplay` command line shows a timestamp on the display and specifies the video format, resolution, frame rate, and pixel format, since none of this information is in the raw video stream.

After a few seconds, the `ffplay` window appears, showing the live camera feed.
`writer.py` creates a Live Read upload then, every few seconds, uploads a part to Backblaze B2:
DEBUG:writer.py:Created multipart upload. UploadId is 4_zf1f51fb913357c4f74ed0c1b_f2008fdd4c7c9303e_d20240515_m185832_c004_v0402005_t0032_u01715799512833
...
DEBUG:writer.py:Uploading part number 1 with size 52428800
DEBUG:writer.py:Uploaded part number 1; ETag is "7c223b579b7da8dd1b433d6eb2d0f141"
In practice, the debug output from FFmpeg and `writer.py` is interleaved. I've removed FFmpeg's debug output for clarity.
Once the first part has been uploaded, you can use the included `watch_upload.sh` script in a second Terminal window to monitor the total size of the uploaded parts:
./watch_upload.sh my-bucket myfile.mp4
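The same check is straightforward in Python with boto3. This sketch (not the script's actual implementation) finds the upload ID for the key and sums the sizes of the parts uploaded so far, ignoring pagination:

```python
import boto3

def uploaded_bytes(s3, bucket, key):
    # Find the in-progress upload for the key, then total the part sizes.
    # Raises StopIteration if there is no such upload.
    uploads = s3.list_multipart_uploads(Bucket=bucket, Prefix=key).get('Uploads', [])
    upload_id = next(u['UploadId'] for u in uploads if u['Key'] == key)
    parts = s3.list_parts(Bucket=bucket, Key=key, UploadId=upload_id).get('Parts', [])
    return sum(part['Size'] for part in parts)
```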
As an alternative to using FFmpeg to capture video directly from the webcam, you can use OBS Studio to generate a Real-Time Messaging Protocol (RTMP) stream.
Start FFmpeg, listening as an RTMP server, receiving Flash Video-formatted (FLV) data, and piping its output to the writer app:
ffmpeg -listen 1 -f flv -i rtmp://localhost/app/streamkey \
-f mp4 -g 60 -movflags empty_moov+frag_keyframe - | \
python writer.py myfile.mp4 --debug
Start OBS Studio, navigate to the Settings page, and click Stream on the left. Set Service to 'Custom', Server to `rtmp://localhost/app/`, and Stream Key to `streamkey`. (You can change `app` and `streamkey` in the FFmpeg command and OBS Studio configuration, but both values must be present.)
Start streaming in OBS Studio. As above, `writer.py` creates a Live Read upload then, every few seconds, uploads a part to Backblaze B2.
As another alternative, you can have the writer read data from a local file, in the manner of `tail -f`, and upload it to B2 via Live Read:
python writer.py myfile.mp4 /path/to/local/file.mp4 --debug
You can use OBS Studio to write a suitable file - navigate to the Settings page, and click Output on the left. Set Recording Path to an appropriate location, and set Recording Format to 'Fragmented MP4 (.mp4)'.
Start recording in OBS Studio. As above, `writer.py` creates a Live Read upload then, every few seconds, uploads a part to Backblaze B2.
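Following a growing file in the manner of `tail -f` amounts to reading until end-of-file, then polling for more data. A minimal sketch of the idea (writer.py's actual implementation may differ):

```python
import time

def follow(path, chunk_size=64 * 1024):
    # Yield chunks of the file as it grows, polling at end-of-file.
    # A real writer would buffer these into exact part sizes and would
    # need some way of knowing when the stream has finished.
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if chunk:
                yield chunk
            else:
                time.sleep(1)
```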
Open another Terminal window in the same directory and activate the virtual environment:
source .venv/bin/activate
Start `reader.py`, piping its output to `ffplay`. Note the `-` argument at the end - this tells `ffplay` to read `stdin`:
python reader.py myfile.mp4 --debug \
| ffplay -vf "drawtext=text='%{pts\:hms}':fontsize=72:box=1:x=(w-tw)/2:y=h-(2*lh)" -
`reader.py` starts reading parts as soon as they are available:
DEBUG:reader.py:Getting part number 1
WARNING:reader.py:Cannot get part number 1. Will retry in 1 second(s)
...
DEBUG:reader.py:Getting part number 1
DEBUG:reader.py:Got part number 1 with size 5242880
After a few seconds, a second `ffplay` window appears, showing the video data that was read from the Live Read file.
You can leave the demo running as long as you like. The writer will continue uploading parts, and the reader will continue downloading them.
You can use `reader.py` with FFmpeg to create HLS-formatted video data:
python reader.py stream.mp4 --debug |
ffmpeg -i - \
-flags +cgop -hls_time 4 -hls_playlist_type event hls/stream.m3u8
In this example, FFmpeg writes the HLS manifest to a file named `stream.m3u8` in the `hls` directory, writing video data in four-second segments (`-hls_time 4`) to the same directory.
You can use `rclone mount` to write the HLS data to a Backblaze B2 Bucket (you can use the same bucket as the Live Read file, or a different bucket altogether):
rclone mount b2://my-bucket/hls ./hls --vfs-cache-mode writes --vfs-write-back 1s
Upload `index.html` to the same location as the HLS data. Open `https://<your-bucket-name>.<your-bucket-endpoint>/index.html` (for example, `https://my-bucket.s3.us-west-004.backblazeb2.com/index.html`) in a browser. You should see the live stream:
When you terminate FFmpeg and `writer.py` via Ctrl+C, the writer uploads any remaining data in its `stdin` buffer and completes the multipart upload:
INFO:__main__:Caught signal SIGINT. Processing remaining data.
INFO:__main__:Press Control-C again to terminate immediately.
DEBUG:uploader:Uploading part number 3 with size 4120064
DEBUG:uploader:Uploaded part number 3; ETag is "9453bc700d885233d1b9f43efcf14f4d"
DEBUG:uploader:Completing multipart upload with 3 parts
INFO:uploader:Finished multipart upload. Uploaded 14605824 bytes
DEBUG:__main__:Exiting Normally.
If necessary, you can press Ctrl+C again to terminate the writer immediately:
INFO:__main__:Caught signal SIGINT. Processing remaining data.
INFO:__main__:Press Control-C again to terminate immediately.
DEBUG:uploader:Uploading part number 4 with size 4466922
^CINFO:__main__:Caught signal SIGINT while processing remaining data. Terminating immediately.
Traceback (most recent call last):
File "/Users/ppatterson/src/live_read_demo/writer.py", line 112, in <module>
main()
...
Note that terminating the app before the upload is completed will result in an unfinished large file remaining in your bucket. You can use the B2 command line to list and remove unfinished large files. Check the filename in the listing to ensure that you cancel the correct unfinished large file.
% b2 file large unfinished list b2://my-bucket
4_z51951f8973a51c7f940d0c1b_f2288a3150459f5a3_d20240703_m192256_c004_v0402019_t0001_u01720034576896 stream.mp4 binary/octet-stream
4_z51951f8973a51c7f940d0c1b_f2021de9b7fe4d115_d20240708_m200520_c004_v0402023_t0005_u01720469120500 another_file.mp4 binary/octet-stream
% b2 file large unfinished cancel b2id://4_z51951f8973a51c7f940d0c1b_f2288a3150459f5a3_d20240703_m192256_c004_v0402019_t0001_u01720034576896
4_z51951f8973a51c7f940d0c1b_f2288a3150459f5a3_d20240703_m192256_c004_v0402019_t0001_u01720034576896 canceled
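You can also list and cancel unfinished uploads through the S3 API with boto3 (the bucket name is a placeholder; double-check the key before aborting):

```python
import boto3

s3 = boto3.client('s3')
for upload in s3.list_multipart_uploads(Bucket='my-bucket').get('Uploads', []):
    print(upload['Key'], upload['UploadId'])
    # Uncomment to cancel the unfinished large file:
    # s3.abort_multipart_upload(Bucket='my-bucket', Key=upload['Key'],
    #                           UploadId=upload['UploadId'])
```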
The reader detects that it has reached the end of the data, and exits:
DEBUG:downloader:Got range bytes=5242880-10485759 with size 4100572
DEBUG:downloader:Getting range bytes=9343452-14586331
DEBUG:downloader:Download is complete. Downloaded 9343452 bytes
INFO:downloader:Finished multipart download
DEBUG:__main__:Exiting Normally.
You can terminate the reader with Ctrl+C if you wish.
The uploaded video data is stored as a single file, and can be accessed in the usual way:
% aws s3 ls --human-readable s3://my-bucket/myfile.mp4
2024-05-15 12:15:50 14.9 MiB myfile.mp4
% ffprobe -hide_banner -i https://s3.us-west-004.backblazeb2.com/my-bucket/myfile.mp4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'https://s3.us-west-004.backblazeb2.com/my-bucket/myfile.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso6iso2avc1mp41
encoder : Lavf61.1.100
Duration: 00:00:35.34, start: 0.000000, bitrate: 3539 kb/s
Stream #0:0[0x1](und): Video: h264 (High 4:2:2) (avc1 / 0x31637661), yuv422p(progressive), 1920x1080, 3692 kb/s, 30 fps, 30 tbr, 15360 tbn (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
encoder : Lavc61.3.100 libx264
Stream #0:1[0x2](und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 62 kb/s (default)
Metadata:
handler_name : SoundHandler
vendor_id : [0][0][0][0]
If you ran the FFmpeg command to create HLS data then, after you close the applications, you can edit the HLS manifest file to change the stream from a live event to video on demand (VOD):

- Download `stream.m3u8`.
- Open it in an editor.
- Change the line `#EXT-X-PLAYLIST-TYPE:EVENT` to `#EXT-X-PLAYLIST-TYPE:VOD`.
- Append the following line to the file (or script the whole edit, as shown below): `#EXT-X-ENDLIST`
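If you'd rather script the edit, a one-off sketch (file name as above):

```python
from pathlib import Path

path = Path('stream.m3u8')
# Change the playlist type from EVENT to VOD and mark the end of the stream
manifest = path.read_text().replace('#EXT-X-PLAYLIST-TYPE:EVENT',
                                    '#EXT-X-PLAYLIST-TYPE:VOD')
if not manifest.endswith('\n'):
    manifest += '\n'
path.write_text(manifest + '#EXT-X-ENDLIST\n')
```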
Now, when you open the stream in a browser, you will be able to view the recording from its start.
Nothing in `reader.py` or `writer.py` is specific to fMP4 or the streaming video use case. Feel free to experiment with Live Read and your own use cases. Fork this repository and use it as the basis for your own implementation. Let us know at [email protected] if you come up with something interesting!