alternative architecture for low latency audio streaming #1

tgarc · 2017-02-19T20:49:38Z

I just thought it might be relevant to discuss an alternative implementation of low latency audio streaming I've been working towards.

The general idea can be explained in a few points:

1) The audio callback is implemented in GIL-less C and does nothing but read and write data from and to a ring buffer (no processing).
2) The Python user communicates data to the audio callback via a cffi-wrapped PortAudio ring buffer.
3) Audio processing is done using reader and writer threads. If low latency is not the highest priority, these can be implemented using Python threads. Otherwise, they can be implemented in C or, better yet, in Cython using "with nogil".
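
As a rough illustration of points 1 and 2, here is a minimal pure-Python sketch of a single-writer, single-reader ring buffer, plus a callback that does nothing but copy data out of it. The names RingBuffer and audio_callback are hypothetical stand-ins chosen for this example; the actual design uses PortAudio's C ring buffer wrapped with cffi, and a real callback would run in C without the GIL.

```python
class RingBuffer:
    """Fixed-capacity byte ring buffer for one writer and one reader."""

    def __init__(self, capacity):
        # A power-of-two capacity lets the indices wrap with a bit mask.
        if capacity <= 0 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self._buf = bytearray(capacity)
        self._mask = capacity - 1
        self._write_index = 0  # only ever advanced by the writer
        self._read_index = 0   # only ever advanced by the reader

    @property
    def read_available(self):
        return self._write_index - self._read_index

    @property
    def write_available(self):
        return len(self._buf) - self.read_available

    def write(self, data):
        """Copy as much of data as fits; return the number of bytes written."""
        n = min(len(data), self.write_available)
        for i in range(n):
            self._buf[(self._write_index + i) & self._mask] = data[i]
        self._write_index += n
        return n

    def read(self, size):
        """Return up to size bytes; may return fewer if the buffer runs dry."""
        n = min(size, self.read_available)
        out = bytes(self._buf[(self._read_index + i) & self._mask]
                    for i in range(n))
        self._read_index += n
        return out


def audio_callback(ring, nbytes):
    """Hypothetical output callback: copy from the ring, pad with silence."""
    data = ring.read(nbytes)
    return data + bytes(nbytes - len(data))  # zero-fill on underflow
```

The callback never blocks and never processes anything; all real work would happen in the thread that fills the ring buffer, which is the division of labour described in point 3.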

My intention is to make a flexible architecture whose feature set can be extended relatively easily.

This is what I've been trying to achieve with this Python module I've been working on: https://github.com/tgarc/pastream/tree/ringbuffer. Currently, I'm still using a few Python threading synchronization primitives to synchronize with the audio callback since it made coding a bit simpler, but my intention is to move towards a scheme that doesn't require acquiring the GIL in the callback.

Anyway, I just wanted to bounce these ideas off you and see if anything sticks :)

mgeier · 2017-02-20T09:37:27Z

Thanks for your suggestions, but I don't really see what's "alternative" about them ...

Point 1 is already done. Data is read from a ringbuffer in the action PLAY_RINGBUFFER and written to a ringbuffer with RECORD_RINGBUFFER, see src/rtmixer.c.

Point 2 is also done. I've wrapped the PortAudio ringbuffer in rtmixer.RingBuffer, and the functions play_ringbuffer() and record_ringbuffer() allow such ringbuffers to be passed to the audio callback. See src/rtmixer.py.

I consider Point 3 out of scope for rtmixer, but users can do with their end of the ringbuffers whatever they want.

The main target for me is low-latency input and output, with an option for low jitter, too (by utilizing the time argument of the play/record functions). If low latencies are not necessary, users might as well use the sounddevice module directly and do everything in Python.

When starting work on the rtmixer module, I didn't really plan to make a module. I just wanted to create a little example of how a callback implemented in C could be used together with the sounddevice module. This turned out to be not quite as easy (or little) as I expected, therefore I decided to create a module that others can directly use without having to implement their own callback in C.

I'm only focusing on audio input and output, not on realtime processing. For that, people will probably have to write their own callbacks in C, but the rtmixer module can still serve as an example for how to set things up.
I still would like to make it simpler to start a new project based on this, without having to copy everything. I guess a first step would be to include the ringbuffer code in all the distributed PortAudio packages, then my wrapper could be included directly in the sounddevice module. I've asked the PortAudio mailing list, but haven't got an answer yet: https://lists.columbia.edu/pipermail/portaudio/2017-February/001070.html.

I like your idea of using Cython, but I have no experience with it. Do you think the whole callback could be implemented in Cython (without ever acquiring the GIL)?

BTW, if you are interested in realtime processing with Python, you should also have a look at PYO: https://github.com/belangeo/pyo.

tgarc · 2017-02-20T16:37:59Z

> I guess a first step would be to include the ringbuffer code in all the distributed PortAudio packages, then my wrapper could be included directly in the sounddevice module.

It'd be great to include RingBuffer as part of sounddevice. It's indeed unfortunate it's not part of the API. However, couldn't you do what you did with rtmixer and convert sounddevice to a 'compiled' module including the pa_ringbuffer source code?

> Do you think the whole callback could be implemented in Cython (without ever acquiring the GIL)?

If it can be written in C then there's no reason I can see that would block you from implementing it with Cython without the GIL. Functions can be defined with nogil.

Thanks for the reference to pyo, it looks like a great project but it's a bit overkill for my needs. I'm actually more interested in audio analysis, not synthesis.

mgeier · 2017-02-20T21:08:03Z

> couldn't you do what you did with rtmixer and convert sounddevice to a 'compiled' module including the pa_ringbuffer source code?

I guess I could. But I think this would mean more maintenance work and it would probably make the installation harder in some cases.
I would have to provide manylinux1 wheels and many more wheels (for different Python versions) for macOS and Windows.

But if you find a way to make this both easily maintainable (e.g. via automatic artifact generation on a free CI server) and not harder to install than it is now, please tell me!

It would probably also be harder for potential contributors, but I don't have that many of those anyway, so this is probably not a strong point.

On the plus side, this would probably reduce the module load time and it would be very simple to add the ringbuffer. And the CFFI people seem to like API mode much more than ABI mode.

Originally, I wanted to be able to choose freely between API and ABI modes, using the same code base.
I've asked about this on the CFFI mailing list: https://groups.google.com/forum/#!topic/python-cffi/oBMFw7R1sFI. I guess this isn't possible and it probably doesn't even make any sense.
But an answer there led me to the current design of rtmixer, which I think works quite well, except the re-use of the RingBuffer should be simpler IMHO.

> I'm actually more interested in audio analysis, not synthesis.

There are also several libraries for that. Again, I don't have experience with any of them, but you can try them if you like:

https://github.com/librosa/librosa
https://github.com/tyiannak/pyAudioAnalysis
etc.

mgeier · 2017-02-20T21:47:42Z

PS:
Originally I thought that a great advantage of using ABI mode was that users can install any version of PortAudio they want on their system and CFFI would find it. But the reality is that I got multiple GitHub issues where the sounddevice module couldn't be loaded because some unwanted version of PortAudio was found instead of the bundled one (mainly on Windows).
When using API mode (and ensuring static linking of PortAudio!), these problems would hopefully go away.

mgeier · 2017-02-21T11:44:15Z

PPS:
Binary wheels won't work for Raspberry Pi, and PyPy support might nearly double the number of binary wheels to be provided.

tgarc · 2017-02-22T04:48:57Z

Unfortunately, I can't really comment on how much harder it would make building/releasing distributions since I don't have any experience with wheels. I'm not sure I even understand why they're required (why can't they be built on the user machine?). I will have to familiarize myself with this in the coming days.

mgeier · 2017-02-22T08:32:40Z

> I'm not sure I even understand why they're required (why can't they be built on the user machine?).

Did you ever ask an average Windows or macOS user to install a C compiler?
Most likely, they don't even know what a compiler is.

And I would like even a below-average Windows/macOS user to be able to install the sounddevice module.

The rtmixer module is different, it is targeted at advanced users.

For the same reason, the RingBuffer probably doesn't belong in the sounddevice module anyway, since it is only meant for advanced use cases?

Binary wheels really simplify (and speed up!) the installation with pip, that's why NumPy et al. also provide them:
https://pypi.python.org/pypi/numpy
https://pypi.python.org/pypi/scipy
https://pypi.python.org/pypi/matplotlib

But, as you can see, there are no wheels for Raspberry Pi, or in fact any ARM or more exotic processor. And there are no wheels for PyPy.

OTOH, the sounddevice library can currently be installed on (a sufficiently recent version of) Raspbian and also for PyPy (probably even the combination of both?) without compiling anything!

tgarc · 2017-03-07T04:44:51Z

I see. Not having to compile anything is certainly advantageous. The RingBuffer module I find is quite useful for being able to seamlessly communicate data to portaudio and still be simple and clean to use in python. However, my needs go beyond just being able to play/record sounds which is the core of what sounddevice provides, and would most certainly fall outside of the common use case. I agree with you that it's best to keep sounddevice installation/distribution simple to allow more flexibility between platforms (and easier maintenance).

Frankly, what I may end up doing, if I distribute the module I'm working on, is to "vendor" the ringbuffer code inside my package. The ringbuffer codebase is so small that it will likely make more sense. Besides, I haven't found setuptools/pip to be too friendly with VCS dependencies.

mgeier · 2017-03-07T14:15:44Z

I think vendoring the ring buffer code makes sense, but how do you combine it with your own code?
It's quite easy to use a pure Python module locally as a sub-module, but how do you handle all the CFFI stuff?

Are you talking about copying the code manually to your project or using git submodule?

What can I do to make vendoring easier?

tgarc · 2017-03-07T17:44:01Z

I might just copy the Python module and manually add the cffi code to my cffi code so it all gets built as one module. If I make any improvements/fixes it should be easy enough to go back and patch those into the pa-ringbuffer repository. I also briefly looked at using git submodule, but it didn't seem entirely straightforward to add a local (non-PyPI) package to the setuptools dependencies either. Anyway, I don't think there's much that can be done to make vendoring easier. I really appreciate all the feedback though, your input has been invaluable.

mgeier · 2017-03-08T18:37:22Z

OK. If you make any new discoveries, please let me know!
And I'm very interested in how you'll solve the ring buffer re-use problem in the end!

tgarc · 2017-05-11T16:32:18Z

> I would have to provide manylinux1 wheels and many more wheels (for different Python versions) for macOS and Windows.
> But if you find a way to make this both easily maintainable (e.g. via automatic artifact generation on a free CI server) and not harder to install than it is now, please tell me!

FYI, I'm working on adding automatic wheel generation/deployment to my project and I've just successfully done it for manylinux1 (the only one supported by PyPI for Linux):

https://github.com/tgarc/pastream/tree/add_wheels

With this config, travis-ci automatically generates the wheels for all Python versions (2.6-3.6) using the manylinux Docker image (see https://github.com/pypa/manylinux). Once the wheels are built, travis-ci automatically pushes them up to PyPI as well.

Hopefully I can find another Docker image for macOS + Python that I can add as an additional step to the build. As for Windows, I'm really not sure right now.

Another thing to note is that I had to manually add a few portaudio source files to the MANIFEST so that they are included in the source build (sdist):

portaudio/include/portaudio.h
portaudio/src/common/pa_ringbuffer.h
portaudio/src/common/pa_memorybarrier.h

This avoids having to include the entire portaudio release with every source distribution.
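
One way to confirm that the MANIFEST entries took effect is to inspect the generated sdist. The sketch below assumes a standard sdist layout (a tarball with a single top-level "name-version/" directory); missing_from_sdist is a hypothetical helper written for this example, not part of setuptools or pastream.

```python
import tarfile

# The three header files listed above.
REQUIRED = (
    "portaudio/include/portaudio.h",
    "portaudio/src/common/pa_ringbuffer.h",
    "portaudio/src/common/pa_memorybarrier.h",
)


def missing_from_sdist(sdist_path, required=REQUIRED):
    """Return the required paths that are absent from the sdist tarball."""
    with tarfile.open(sdist_path) as tar:
        # Strip the leading "<name>-<version>/" directory from each member.
        members = {name.split("/", 1)[1]
                   for name in tar.getnames() if "/" in name}
    return [path for path in required if path not in members]
```

An empty list means all three files made it into the source distribution.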

mgeier · 2017-05-15T14:55:41Z

@tgarc Thanks for the update!

> Once the wheels are built, travis-ci automatically pushes them up to pypi as well.

Does this upload to PyPI on every commit?
Including all PRs?

I would expect only the releases to be uploaded there ... though I think some kind of "latest" wheels would also be interesting, but is PyPI the right place for this?

About the MANIFEST ... way ahead of you: edae8f3.

tgarc · 2017-05-15T18:37:40Z

> Does this upload to PyPI on every commit? Including all PRs?

Deployment can be controlled by using an 'on' condition; see the travis-ci documentation: https://docs.travis-ci.com/user/deployment/pypi/

> ... I think some kind of "latest" wheels would also be interesting, but is PyPI the right place for this?

I'm not sure what you mean by latest. If you're talking about alpha/beta releases, I believe PyPI could still be used for this; in these cases I think the user would have to specify the version they want explicitly with pip. If you mean HEAD or otherwise experimental builds, PyPI would probably not be the right place for these IMO. For those kinds of builds you could potentially deploy them to GitHub releases, which is also supported by travis-ci: https://docs.travis-ci.com/user/deployment/releases/

BTW, there are a couple of projects I've seen that try to provide a generic way to deploy Python builds across OS X/Windows/manylinux using a combination of travis-ci and AppVeyor that (supposedly) requires minimal project configuration. I couldn't find the original project I had seen, but this project has the same intent: https://github.com/joerick/cibuildwheel

I don't know how 'minimal' it really is to configure, but I will definitely be trying at least one of these projects out. I'll make an update once I settle on something.

mgeier · 2017-05-16T08:37:37Z

Thanks, that's cool stuff!

> I'm not sure what you mean by latest. [...] If you mean HEAD or otherwise experimental builds, PyPI would probably not be the right place for these IMO.

Yes, that's what I meant. Similar to the "latest" documentation builds on http://readthedocs.org.

But that's probably not very important anyway.

Deploying wheels only for tagged commits seems to be the right thing to do.

Running some unit tests on each platform is really helpful if it happens on every commit and even on PRs, but it's really not necessary to build the wheels each time.
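
The "tests on every commit, wheels only on tags" rule can be sketched as a small check that a CI script runs before uploading anything. This assumes Travis CI's TRAVIS_TAG and TRAVIS_PULL_REQUEST environment variables; should_deploy is a hypothetical helper for illustration, not something travis-ci or any project in this thread provides.

```python
import os


def should_deploy(env=None):
    """Return True only for builds of tagged commits that are not PRs."""
    env = os.environ if env is None else env
    # Assumption: TRAVIS_TAG is empty unless the build is for a git tag.
    is_tag = bool(env.get("TRAVIS_TAG"))
    # Assumption: TRAVIS_PULL_REQUEST is "false" for non-PR builds,
    # otherwise it holds the pull request number.
    is_pull_request = env.get("TRAVIS_PULL_REQUEST", "false") != "false"
    return is_tag and not is_pull_request
```

Unit tests would still run unconditionally; only the wheel build and upload steps would be guarded by this check.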

tgarc · 2017-06-11T23:39:11Z

Hey @mgeier just a quick update:
I've gotten about 95% of the way there for automated wheel building & deployment for my project:

https://github.com/tgarc/pastream

I'm using the cibuildwheel app I mentioned earlier. I've only got ~~two~~ one open issue:

1) On Linux, Python 2.7, using cibuildwheel on travis-ci fails to find libffi when doing a pip install of the package. This is really weird. It works fine when I run cibuildwheel from my machine with docker installed.

~~2) For some reason the deployment step on AppVeyor returns a 'build failure' even though it successfully deploys the wheels. I've got no leads on this error...~~

Otherwise, things look good. It was a pretty slow and painful process to get everything working on all platforms so if you decide to start deploying wheels hopefully this will give you a head start.

Edit

I fixed (2). Turns out, like most things Windows, PowerShell is evil: it interprets anything that is output to STDERR as a fatal error. I just used CMD instead.
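
To illustrate the underlying point: a process can write to its standard error stream and still exit successfully, so a wrapper should judge success by the exit code rather than by whether anything appeared on STDERR. The sketch below is a generic illustration, not a reproduction of the AppVeyor setup; run_step is a hypothetical helper and the child command is a made-up stand-in for an upload step.

```python
import subprocess
import sys


def run_step(args):
    """Run a command; report success from the exit code, not from stderr."""
    proc = subprocess.run(args, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, universal_newlines=True)
    return proc.returncode == 0, proc.stderr


# Stand-in child process: prints a progress message to stderr, exits with 0.
child = [sys.executable, "-c",
         "import sys; sys.stderr.write('uploading wheels\\n')"]
ok, stderr_text = run_step(child)
```

Here ok is True even though stderr_text is not empty, which is the behaviour one would want from a deployment step.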

mgeier · 2017-06-13T09:03:02Z

Thanks for sharing, that sounds very promising!

mgeier · 2017-07-12T12:39:05Z

I've started experimenting with changing sounddevice into a compiled module: spatialaudio/python-sounddevice#91.

It works on Linux but on macOS I'm getting a linker error. I didn't try anything else yet ...

mgeier · 2021-05-26T18:08:01Z

Ring buffer re-use is solved with https://github.com/spatialaudio/python-pa-ringbuffer.

API-mode experiments (that are not likely to be merged) are in spatialaudio/python-sounddevice#91.

In the meantime rtmixer wheel packages are automatically built on CI.

I'm closing this, but anyone should feel free to add further comments (or create a new issue) if desired.

mgeier mentioned this issue Jul 12, 2017

WIP: Investigate using CFFI's API mode spatialaudio/python-sounddevice#91

mgeier closed this as completed May 26, 2021
