alternative architecture for low latency audio streaming #1

Closed
tgarc opened this issue Feb 19, 2017 · 19 comments

Comments


tgarc commented Feb 19, 2017

I just thought it might be relevant to discuss an alternative implementation of low latency audio streaming I've been working towards.

The general idea can be explained in a few points:

  1. Audio callback is implemented in GIL-less C and does nothing but read and write data from and to a ring buffer (no processing)
  2. Python user communicates data to the audio callback via a cffi wrapped portaudio ring buffer.
  3. Audio processing is done using reader and writer threads. If low latency is not the highest priority, these can be implemented using python threads. Otherwise, they can be implemented using C or better yet, through Cython using with nogil.
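
The data flow in these three points can be sketched in pure Python. This is only a model of the idea, not the pastream or PortAudio implementation: the `RingBuffer` class and `audio_callback` function below are hypothetical stand-ins, and a real callback would be lock-free C code rather than Python.

```python
class RingBuffer:
    """Single-producer/single-consumer byte ring buffer (illustrative model only)."""

    def __init__(self, size):
        self._buf = bytearray(size)
        self._size = size
        self._read = 0   # total number of bytes consumed so far
        self._write = 0  # total number of bytes produced so far

    @property
    def read_available(self):
        return self._write - self._read

    @property
    def write_available(self):
        return self._size - self.read_available

    def write(self, data):
        """Copy as much of `data` as fits; return the number of bytes written."""
        n = min(len(data), self.write_available)
        for i in range(n):
            self._buf[(self._write + i) % self._size] = data[i]
        self._write += n
        return n

    def read(self, n):
        """Return up to `n` bytes, fewer if the buffer holds less."""
        n = min(n, self.read_available)
        out = bytes(self._buf[(self._read + i) % self._size] for i in range(n))
        self._read += n
        return out


def audio_callback(ring, nbytes):
    """Stand-in for the C callback: no processing, just copy data out.

    On underflow the remainder is padded with zeros (silence).
    """
    data = ring.read(nbytes)
    return data + bytes(nbytes - len(data))


ring = RingBuffer(8)
written = ring.write(b"abcdefghij")  # only 8 of the 10 bytes fit
first = audio_callback(ring, 4)      # consumes b"abcd"
ring.write(b"ij")                    # wraps around to the start of the buffer
second = audio_callback(ring, 8)     # 6 bytes available, 2 bytes of padding
```

In this model the writer side (point 3) is whatever code calls `ring.write()`, which could live in a Python thread or in compiled code, while the callback never does more than a bounded copy.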

My intention is to make a flexible architecture whose feature set can be extended relatively easily.

This is what I've been trying to achieve with this python module I've been working on: https://github.com/tgarc/pastream/tree/ringbuffer. Currently, I'm still using a few python threading synchronization primitives to synchronize with the audio callback since it made coding a bit simpler, but my intention is to move towards a scheme that doesn't require acquiring the GIL in the callback.

Anyway, I just wanted to bounce these ideas off you and see if anything sticks :)

Member

mgeier commented Feb 20, 2017

Thanks for your suggestions, but I don't really see what's "alternative" about them ...

Point 1 is already done. Data is read from a ringbuffer in the action PLAY_RINGBUFFER and written to a ringbuffer with RECORD_RINGBUFFER, see src/rtmixer.c.

Point 2 is also done. I've wrapped the PortAudio ringbuffer in rtmixer.RingBuffer, and the functions play_ringbuffer() and record_ringbuffer() allow passing such ringbuffers to the audio callback. See src/rtmixer.py.

I consider Point 3 out of scope for rtmixer, but users can do with their end of the ringbuffers whatever they want.

The main target for me is low-latency input and output, with an option for low jitter, too (by utilizing the time argument of the play/record functions). If low latencies are not necessary, users might as well use the sounddevice module directly and do everything in Python.

When starting work on the rtmixer module, I didn't really plan to make a module. I just wanted to create a little example of how a callback implemented in C could be used together with the sounddevice module. This turned out to be not quite as easy (or little) as I expected, therefore I decided to create a module that others can directly use without having to implement their own callback in C.

I'm only focusing on audio input and output, not on realtime processing. For that, people will probably have to write their own callbacks in C, but the rtmixer module can still serve as an example for how to set things up.
I still would like to make it simpler to start a new project based on this, without having to copy everything. I guess a first step would be to include the ringbuffer code in all the distributed PortAudio packages, then my wrapper could be included directly in the sounddevice module. I've asked the PortAudio mailing list, but haven't got an answer yet: https://lists.columbia.edu/pipermail/portaudio/2017-February/001070.html.

I like your idea of using Cython, but I have no experience with it. Do you think the whole callback could be implemented in Cython (without ever acquiring the GIL)?

BTW, if you are interested in realtime processing with Python, you should also have a look at PYO: https://github.com/belangeo/pyo.

Author

tgarc commented Feb 20, 2017

> I guess a first step would be to include the ringbuffer code in all the distributed PortAudio packages, then my wrapper could be included directly in the sounddevice module.

It'd be great to include RingBuffer as part of sounddevice. It's indeed unfortunate it's not part of the API. However, couldn't you do what you did with rtmixer and convert sounddevice to a 'compiled' module including the pa_ringbuffer source code?

> Do you think the whole callback could be implemented in Cython (without ever acquiring the GIL)?

If it can be written in C then there's no reason I can see that would block you from implementing it with Cython without the GIL. Functions can be defined with nogil.

Thanks for the reference to pyo, it looks like a great project but it's a bit overkill for my needs. I'm actually more interested in audio analysis, not synthesis.

Member

mgeier commented Feb 20, 2017

> couldn't you do what you did with rtmixer and convert sounddevice to a 'compiled' module including the pa_ringbuffer source code?

I guess I could. But I think this would mean more maintenance work and it would probably make the installation harder in some cases.
I would have to provide manylinux1 wheels and many more wheels (for different Python versions) for macOS and Windows.

But if you find a way to make this both easily maintainable (e.g. via automatic artifact generation on a free CI server) and not harder to install than it is now, please tell me!

It would probably also be harder for potential contributors, but I don't have that many of those anyway, so this is probably not a strong point.

On the plus side, this would probably reduce the module load time and it would be very simple to add the ringbuffer. And the CFFI people seem to like API mode much more than ABI mode.

Originally, I wanted to be able to choose freely between API and ABI modes, using the same code base.
I've asked about this on the CFFI mailing list: https://groups.google.com/forum/#!topic/python-cffi/oBMFw7R1sFI. I guess this isn't possible and it probably doesn't even make any sense.
But an answer there led me to the current design of rtmixer, which I think works quite well, except the re-use of the RingBuffer should be simpler IMHO.

> I'm actually more interested in audio analysis, not synthesis.

There are also several libraries for that. Again, I don't have experience with any of them, but you can try them if you like:

https://github.com/librosa/librosa
https://github.com/tyiannak/pyAudioAnalysis
etc.

Member

mgeier commented Feb 20, 2017

PS:
Originally I thought that a great advantage of using ABI mode was that users can install any version of PortAudio they want on their system and CFFI would find it. But the reality is that I got multiple GitHub issues where the sounddevice module couldn't be loaded because some unwanted version of PortAudio was found instead of the bundled one (mainly on Windows).
When using API mode (and ensuring static linking of PortAudio!), these problems would hopefully go away.
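
The lookup-order problem described here can be illustrated with a small sketch. It is not how sounddevice resolves PortAudio: `find_portaudio` is a hypothetical helper, and it only shows how the result changes depending on whether a bundled file or a system-wide search is consulted first. The only real library call assumed is the standard library's `ctypes.util.find_library`.

```python
import ctypes.util
import os


def find_portaudio(bundled_path, system_lookup=ctypes.util.find_library):
    """Prefer a bundled library file; fall back to a system-wide lookup.

    `bundled_path` is a hypothetical path to a library shipped with the
    package. `system_lookup` is injectable so the behaviour can be shown
    without PortAudio being installed.
    """
    if bundled_path and os.path.isfile(bundled_path):
        return bundled_path
    return system_lookup("portaudio")


# With a fake system lookup, a missing bundled file falls through to
# whatever the system search happens to return.
fallback = find_portaudio("/no/such/bundled/file",
                          system_lookup=lambda name: "system-" + name)
print(fallback)
```

With static linking in API mode, as suggested above, no such runtime lookup would be needed at all.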

Member

mgeier commented Feb 21, 2017

PPS:
Binary wheels won't work for Raspberry Pi, and PyPy support might nearly double the number of binary wheels to be provided.
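
To make the growth in the number of wheels concrete, here is a rough sketch that enumerates a build matrix. The tag lists are illustrative assumptions (in particular the PyPy tags are placeholders), not the actual targets of sounddevice or rtmixer.

```python
from itertools import product

# Hypothetical build matrix: example tags only, not a real project's targets.
CPYTHON_TAGS = ["cp27", "cp35", "cp36"]
PYPY_TAGS = ["pypy2", "pypy3"]  # placeholder names for PyPy builds
PLATFORM_TAGS = [
    "manylinux1_x86_64",
    "manylinux1_i686",
    "macosx_10_6_intel",
    "win32",
    "win_amd64",
]


def wheel_matrix(python_tags, platform_tags):
    """Return every (python tag, platform tag) pair that needs its own binary wheel."""
    return list(product(python_tags, platform_tags))


cpython_only = wheel_matrix(CPYTHON_TAGS, PLATFORM_TAGS)
with_pypy = wheel_matrix(CPYTHON_TAGS + PYPY_TAGS, PLATFORM_TAGS)
print(len(cpython_only), len(with_pypy))  # prints "15 25"
```

Every additional interpreter multiplies with the number of platforms, which is why each newly supported Python implementation adds a whole row of wheels rather than a single file.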

Author

tgarc commented Feb 22, 2017

Unfortunately, I can't really comment on how much harder it would make building/releasing distributions since I don't have any experience with wheels. I'm not sure I even understand why they're required (why can't they be built on the user machine?). I will have to familiarize myself with this in the coming days.

Member

mgeier commented Feb 22, 2017

> I'm not sure I even understand why they're required (why can't they be built on the user machine?).

Did you ever ask an average Windows or macOS user to install a C compiler?
Most likely, they don't even know what a compiler is.

And I would like even a below-average Windows/macOS user to be able to install the sounddevice module.

The rtmixer module is different, it is targeted at advanced users.

For the same reason, the RingBuffer probably doesn't belong in the sounddevice module anyway, since it is only meant for advanced use cases?

Binary wheels really simplify (and speed up!) the installation with pip, that's why NumPy et al. also provide them:
https://pypi.python.org/pypi/numpy
https://pypi.python.org/pypi/scipy
https://pypi.python.org/pypi/matplotlib

But, as you can see, there are no wheels for Raspberry Pi, or in fact any ARM or more exotic processor. And there are no wheels for PyPy.

OTOH, the sounddevice library can currently be installed on (a sufficiently recent version of) Raspbian and also for PyPy (probably even the combination of both?) without compiling anything!

Author

tgarc commented Mar 7, 2017

I see. Not having to compile anything is certainly advantageous. The RingBuffer module I find is quite useful for being able to seamlessly communicate data to portaudio and still be simple and clean to use in python. However, my needs go beyond just being able to play/record sounds which is the core of what sounddevice provides, and would most certainly fall outside of the common use case. I agree with you that it's best to keep sounddevice installation/distribution simple to allow more flexibility between platforms (and easier maintenance).

Frankly, if I end up distributing the module I'm working on, what I may do is "vendor" the ringbuffer code inside my package. The ringbuffer codebase is so small that this will likely make more sense. Besides, I haven't found setuptools/pip to be too friendly with VCS dependencies.

Member

mgeier commented Mar 7, 2017

I think vendoring the ring buffer code makes sense, but how do you combine it with your own code?
It's quite easy to use a pure Python module locally as a sub-module, but how do you handle all the CFFI stuff?

Are you talking about copying the code manually to your project or using git submodule?

What can I do to make vendoring easier?

Author

tgarc commented Mar 7, 2017 via email

Member

mgeier commented Mar 8, 2017

OK. If you make any new discoveries, please let me know!
And I'm very interested in how you'll solve the ring buffer re-use problem in the end!

Author

tgarc commented May 11, 2017

> I would have to provide manylinux1 wheels and many more wheels (for different Python versions) for macOS and Windows.
> But if you find a way to make this both easily maintainable (e.g. via automatic artifact generation on a free CI server) and not harder to install than it is now, please tell me!

FYI, I'm working on adding automatic wheel generation/deployment to my project and I've just successfully done it for manylinux1 (the only one supported by PyPI for Linux):

https://github.com/tgarc/pastream/tree/add_wheels

With this config travis-ci automatically generates the wheels for all python versions (2.6-3.6) using the manylinux docker image (see https://github.com/pypa/manylinux). Once the wheels are built, travis-ci automatically pushes them up to pypi as well.

Hopefully I can find another Docker image for macOS + Python that I can add as an additional step to the build. As for Windows, I'm really not sure right now.

Another thing to note is that I had to manually add a few portaudio source files to the MANIFEST so that they are included in the source build (sdist):

  • portaudio/include/portaudio.h
  • portaudio/src/common/pa_ringbuffer.h
  • portaudio/src/common/pa_memorybarrier.h

This avoids having to include the entire portaudio release with every source distribution.

Member

mgeier commented May 15, 2017

@tgarc Thanks for the update!

> Once the wheels are built, travis-ci automatically pushes them up to pypi as well.

Does this upload to PyPI on every commit?
Including all PRs?

I would expect only the releases to be uploaded there ... though I think some kind of "latest" wheels would also be interesting, but is PyPI the right place for this?

About the MANIFEST ... way ahead of you: edae8f3.

Author

tgarc commented May 15, 2017 via email

Member

mgeier commented May 16, 2017

Thanks, that's cool stuff!

> I'm not sure what you mean by latest. [...] If you mean HEAD or otherwise experimental builds, PyPI would probably not be the right place for these IMO.

Yes, that's what I meant. Similar to the "latest" documentation builds on http://readthedocs.org.

But that's probably not very important anyway.

Deploying wheels only for tagged commits seems to be the right thing to do.

Running some unit tests on each platform is really helpful if it happens on every commit and even on PRs, but it's really not necessary to build the wheels each time.
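
The policy agreed on here (test on every commit and PR, deploy only for tagged commits) could be expressed as a small guard script run by the CI job. This is a sketch under assumptions: `should_deploy` is a hypothetical helper, and it relies on the Travis CI environment variables `TRAVIS_TAG` (empty unless the build is for a tag) and `TRAVIS_PULL_REQUEST` (`"false"` unless the build is for a pull request).

```python
import os


def should_deploy(env):
    """Return True only for tagged commits that are not pull-request builds."""
    is_tagged = bool(env.get("TRAVIS_TAG"))
    is_pull_request = env.get("TRAVIS_PULL_REQUEST", "false") != "false"
    return is_tagged and not is_pull_request


# Outside of CI neither variable is set, so this reports "test only".
print("deploy" if should_deploy(os.environ) else "test only")
```

Keeping the decision in one function makes it easy to check the three cases (tagged commit, untagged commit, pull request) without pushing anything.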

Author

tgarc commented Jun 11, 2017

Hey @mgeier just a quick update:
I've gotten about 95% of the way there for automated wheel building & deployment for my project:

https://github.com/tgarc/pastream

I'm using the cibuildwheel app I mentioned earlier. I've only got two open issues (now down to one, see the edit below):

  1. On Linux, Python 2.7, using cibuildwheel on travis-ci fails to find libffi when doing a pip install of the package. This is really weird. It works fine when I run cibuildwheel from my machine with docker installed.

  2. For some reason the deployment step on AppVeyor returns a 'build failure' even though it successfully deploys the wheels. I've got no leads on this error...

Otherwise, things look good. It was a pretty slow and painful process to get everything working on all platforms so if you decide to start deploying wheels hopefully this will give you a head start.

Edit

I fixed (2). Turns out, like most things Windows, PowerShell is evil: it interprets anything that is output to stderr as a fatal error. I just used CMD instead.
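
The distinction behind this fix, output on stderr versus a failing exit code, can be reproduced with a short sketch. The child command below is a made-up stand-in for a deployment tool that logs to stderr while succeeding; it does not reproduce the AppVeyor setup itself.

```python
import subprocess
import sys

# Hypothetical child process: reports progress on stderr but exits with code 0.
CHILD = "import sys; sys.stderr.write('uploading wheel...\\n'); sys.exit(0)"

result = subprocess.run(
    [sys.executable, "-c", CHILD],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

wrote_to_stderr = bool(result.stderr)  # True: the tool was chatty on stderr
succeeded = result.returncode == 0     # True: but it did not actually fail
print("stderr output:", wrote_to_stderr, "| exit code ok:", succeeded)
```

A wrapper that judges success by the presence of stderr output would flag this run as failed, whereas one that checks the exit code would not.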

Member

mgeier commented Jun 13, 2017

Thanks for sharing, that sounds very promising!

Member

mgeier commented Jul 12, 2017

I've started experimenting with changing sounddevice into a compiled module: spatialaudio/python-sounddevice#91.

It works on Linux but on macOS I'm getting a linker error. I didn't try anything else yet ...

Member

mgeier commented May 26, 2021

Ring buffer re-use is solved with https://github.com/spatialaudio/python-pa-ringbuffer.

API-mode experiments (that are not likely to be merged) are in spatialaudio/python-sounddevice#91.

In the meantime rtmixer wheel packages are automatically built on CI.

I'm closing this, but anyone should feel free to add further comments (or create a new issue) if desired.

mgeier closed this as completed May 26, 2021