Zoned Network Sound-Streaming: The Problem

For a while now, I have been looking for a reliable way to manage zoned music playback around the house. The general idea is to play music from a central point and have it streamed over the network to a selection of receivers, each of which can be remotely turned on and off as required, while still allowing multiple receivers to play simultaneously.

Apple’s AirPlay has supported this for a while now, but it requires purchasing AirPlay-compatible hardware, which is expensive. It’s also very iTunes-based - and iTunes is something that I do not use.

Various open-source tools also allow network streaming. Icecast (through the use of Darkice) allows clients to stream from a multimedia server, but this causes pretty severe latency in playback between clients (ranging up to around 20 seconds, I’ve found) - not a good solution in a house!

PulseAudio is partly designed around being able to work over the network. It supports the discovery of other PulseAudio sinks on the LAN and the selection of a remote sound card to transmit to over TCP. This doesn’t seem to support multiple sound card sinks very well, however.

PulseAudio’s other network feature is its RTP broadcasting, and this seemed the most promising avenue for solving the problem. RTP runs over UDP, and PulseAudio effectively uses this to broadcast its sound to any devices on the network that might be listening on the broadcast address. This means that one server could be run and sink devices could be set up simply to receive the RTP stream on demand - perfect!
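To make the "sinks simply receive the RTP stream" idea a little more concrete, here is a minimal Python sketch of the first thing any receiver has to do with an incoming datagram: decode the 12-byte fixed RTP header defined in RFC 3550. It is only an illustration, not PulseAudio's own code. The names `RtpHeader` and `parse_rtp_header` are made up for this example, and the sample datagram is built by hand rather than captured from a real stream.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    """Selected fields of the fixed RTP header (hypothetical helper type)."""
    version: int
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header at the start of a UDP datagram."""
    if len(packet) < 12:
        raise ValueError("datagram too short to hold an RTP header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit synchronisation source identifier.
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=first >> 6,            # top two bits of the first byte
        payload_type=second & 0x7F,    # low seven bits (top bit is the marker)
        sequence=sequence,
        timestamp=timestamp,
        ssrc=ssrc,
    )


# Hand-built example datagram (an assumption, not captured traffic):
# version 2, payload type 10, followed by four bytes of dummy audio.
example = struct.pack("!BBHII", 0x80, 10, 1, 160, 0x1234) + b"\x00" * 4
header = parse_rtp_header(example)
print(header)
```

A real sink would read datagrams from a UDP socket bound to the stream's port, use the sequence number to spot lost or reordered packets, and use the timestamp to schedule playback.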

However, in practice, this turned out not to work very well. With RTP enabled, PulseAudio would entirely flood the network with sound packets. Although this isn’t a problem for devices with a wired connection, any devices connected wirelessly would be immediately disassociated from the access point, because PulseAudio’s packets completely saturated the airwaves.

This couldn’t be an option in a house where smartphones, games consoles, laptops, and so on require the WLAN. After researching this problem a fair bit (and finding many others experiencing the same issues), I found this page, which describes various methods for using RTP streaming from PulseAudio and includes (at the bottom) the key that could fix my problems - the notion of compressing the audio into MP3 format (or similar) before broadcasting it.

This technique worked perfectly, and did not flood the network anywhere near as severely as the uncompressed sound stream. Wireless clients no longer lost access to the network once the stream was started, and didn’t seem to suffer any noticeable loss of QoS at all. In addition, when multiple clients connected, their sound output would be almost entirely simultaneous (at least after a few seconds to warm up).
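A rough back-of-the-envelope calculation shows why compression helps so much. The figures below are assumptions rather than measurements from this setup: CD-quality PCM (44.1 kHz, 16-bit, stereo) for the uncompressed stream and a 128 kbit/s MP3 for the compressed one. Packet header overhead is ignored. The function name `pcm_bitrate_kbps` is just for illustration.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Raw PCM payload bitrate in kbit/s, excluding RTP/UDP/IP headers."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# Assumed stream parameters (not taken from the original setup).
uncompressed_kbps = pcm_bitrate_kbps(44_100, 16, 2)
mp3_kbps = 128.0

reduction = uncompressed_kbps / mp3_kbps

print(f"Uncompressed PCM: {uncompressed_kbps:.1f} kbit/s")
print(f"MP3:              {mp3_kbps:.1f} kbit/s")
print(f"Reduction:        about {reduction:.1f}x")
```

Under these assumptions the uncompressed stream needs roughly 1.4 Mbit/s of payload, around eleven times as much as the MP3 stream.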

Unfortunately, broadcasting still didn’t work well over WLAN (spluttering sound and periodic drop-outs), so the master server and any sound sinks would need to be on a wired network. This is a small price to pay, however, and I am happy to live with a few Ethernet-over-power devices around the house.

The next stage is to think about what to use as sinks. Raspberry Pis should be powerful enough and are significantly cheaper than Apple’s equivalent. They would also allow me to use existing sound systems in some rooms (e.g. the surround-sound in the living room) and simple speaker setups in others. I also intend to write a program around PulseAudio to streamline the streaming process, along with a server for discovering networked sinks.

I will write an update when I have made any more progress on this!
