How a macOS app achieves broadcast-grade audio quality by prioritizing SRT over WebRTC—and why the industry standard might have it backwards.
The Architecture Challenge
Remote DJ streaming presents an interesting engineering problem: how do you deliver uncompressed audio to multiple venues simultaneously, maintaining broadcast-grade quality, without requiring a $15,000 broadcast encoder at each endpoint?
The conventional approach combines video and audio into a single RTMP or HLS stream, relies on adaptive bitrate to handle network fluctuations, and accepts the 15-30 second latency that comes with segment-based delivery. DJing Stream, a macOS application designed for professional DJ-to-venue streaming, takes a radically different approach worth examining from a protocol architecture perspective.

Separate Streams, Separate Protocols
The core architectural decision is treating audio and video as fundamentally different media requiring different protocols:
| Stream | Protocol | Bitrate | Priority |
|---|---|---|---|
| Audio | SRT | ~2,304 kbps | Primary |
| Video | WebRTC | ~1,500 kbps | Secondary |
This inversion—audio bitrate higher than video—is virtually unheard of in streaming. Most platforms allocate 5-10x more bandwidth to video than audio. Here's the reasoning: for professional venue deployment, audio quality is the only thing that matters. A bar's sound system will expose every compression artifact. The video feed showing the DJ? That's supplementary—nice to have on screens, but not critical to the customer experience.
Why SRT for Audio?
SRT (Secure Reliable Transport) provides several properties essential for professional audio delivery. It also sidesteps a hard constraint of segment-based alternatives: HLS does not carry LPCM (Linear PCM) audio at all, expecting encoded codecs such as AAC or AC-3 instead, which rules it out for uncompressed delivery from the start.
Ordered delivery with retransmission: Unlike WebRTC's real-time model, which favors timeliness and will conceal or drop late packets rather than wait for them, SRT guarantees ordered delivery with automatic retransmission of lost packets. For audio, a dropped packet means an audible glitch. SRT's ARQ mechanism ensures that data lost in transit is retransmitted before the receive buffer drains.
Configurable latency/reliability trade-off: SRT exposes a latency parameter that directly controls the retransmission window. Higher latency = more time for packet recovery = higher reliability. DJing Stream exposes this as a user-facing slider:
Latency Configuration by Use Case:
├── Live venue deployment: 4-5 seconds (maximum reliability)
├── Interactive sessions: 2-3 seconds (accept occasional dropouts)
├── Home listening: 4-6 seconds (prioritize quality)
└── Challenging networks: 8-10 seconds (international, mobile, congested)
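As a rough illustration of how a user-facing slider can map onto SRT's latency setting, here is a minimal Swift sketch. The scenario type, the specific millisecond values, and the relay hostname are illustrative assumptions rather than DJing Stream's actual code; `latency` is the query parameter understood by libsrt-style SRT URIs.

```swift
import Foundation

// Hypothetical mapping from use case to SRT latency window (milliseconds),
// mirroring the guidance above. Not DJing Stream's actual implementation.
enum StreamingScenario {
    case liveVenue, interactive, homeListening, challengingNetwork

    var srtLatencyMs: Int {
        switch self {
        case .liveVenue:          return 4_500   // maximum reliability
        case .interactive:        return 2_500   // accept occasional dropouts
        case .homeListening:      return 5_000   // prioritize quality
        case .challengingNetwork: return 9_000   // international, mobile, congested
        }
    }
}

/// Builds a caller-side SRT URI. The `latency` query parameter is assumed to be
/// interpreted in milliseconds, per libsrt's URI convention (some tools differ).
func srtPublishURL(host: String, port: Int, scenario: StreamingScenario) -> URL? {
    var components = URLComponents()
    components.scheme = "srt"
    components.host = host
    components.port = port
    components.queryItems = [URLQueryItem(name: "latency",
                                          value: String(scenario.srtLatencyMs))]
    return components.url
}

// Example output: srt://relay.example.com:9000?latency=4500
if let url = srtPublishURL(host: "relay.example.com", port: 9000, scenario: .liveVenue) {
    print(url.absoluteString)
}
```

Whatever the transport library, the design point stands: the latency window is a first-class, user-visible parameter rather than a hidden tuning constant.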
Constant bitrate: there is no adaptive bitrate ladder. The stream maintains consistent quality and relies on SRT's latency buffer and retransmission to absorb network variation. This is critical for audio, where adaptive bitrate means audible quality fluctuations.
Why WebRTC for Video?
WebRTC remains the right choice for video, for different reasons:
- Real-time feedback: DJs want to see the crowd; venues may want to display the DJ performing. This requires low latency even at the cost of quality.
- NAT traversal: WebRTC's ICE/STUN/TURN infrastructure handles the complexity of peer-to-peer video between DJs and venues behind NATs.
- Acceptable degradation: Video quality fluctuations are visually tolerable in a way audio glitches are not.
The key insight: if video stutters, audio stays perfect. The streams are completely independent. Toggle video off entirely to save resources without affecting audio.
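The independence claim is easiest to see as structure. Below is a minimal sketch, assuming hypothetical `AudioTransport` and `VideoTransport` wrappers around the SRT and WebRTC pipelines (these types are not DJing Stream's actual API):

```swift
// Two independent pipelines behind separate protocols: video can be torn down
// or fail without touching audio, because the two share no state.
protocol AudioTransport { func start() throws; func stop() }
protocol VideoTransport { func start(); func stop() }

final class StreamController {
    private let audio: AudioTransport   // e.g. an SRT publisher
    private var video: VideoTransport?  // e.g. a WebRTC sender, optional by design

    init(audio: AudioTransport, video: VideoTransport? = nil) {
        self.audio = audio
        self.video = video
    }

    func goLive() throws {
        try audio.start()   // audio always runs; failure here is fatal
        video?.start()      // video is best-effort and never blocks audio
    }

    /// Disabling video frees encoder and bandwidth resources without touching
    /// the SRT audio stream.
    func setVideoEnabled(_ enabled: Bool) {
        if enabled { video?.start() } else { video?.stop() }
    }
}
```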
Uncompressed PCM Over SRT
Where most streaming platforms use AAC or Opus at 128-320 kbps, DJing Stream transmits 24-bit PCM audio:
Audio Specifications:
├── Format: Uncompressed 24-bit PCM
├── Sample rate: 44.1 kHz or 48 kHz (auto-detected)
├── Bitrate: ~2,304 kbps
├── Container: MPEG-TS
└── Transport: SRT
For context, Spotify's highest quality tier streams at 320 kbps using lossy compression. DJing Stream delivers more than seven times that bitrate with zero compression artifacts. The trade-off is bandwidth: each listener consumes approximately 2.5 Mbps for audio alone.
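For the curious, the headline number falls straight out of the PCM parameters. A quick back-of-the-envelope (stereo and the ~8% overhead factor are assumptions for illustration, not published figures):

```swift
// 24-bit PCM at 48 kHz, stereo:
let sampleRate = 48_000.0                               // samples per second
let bitDepth   = 24.0                                   // bits per sample
let channels   = 2.0                                    // stereo assumed

let pcmKbps = sampleRate * bitDepth * channels / 1_000  // 2,304 kbps
// (At 44.1 kHz the same math gives ≈ 2,117 kbps.)

// MPEG-TS packetization plus SRT/UDP/IP framing adds overhead, which is where
// the ~2.5 Mbps per-listener figure comes from (8% is a rough estimate).
let wireMbps = pcmKbps * 1.08 / 1_000                   // ≈ 2.49 Mbps
```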
Hub-and-Spoke Distribution
The network architecture uses a relay model rather than peer-to-peer:
DJ Mixer
│
▼ USB/Thunderbolt
macOS (AVFoundation capture)
│
▼ MPEG-TS/SRT
SRT Relay Server
│
├──────────────────┬──────────────────┐
▼                  ▼                  ▼
Venue 1            Venue 2            Venue N
(SRT Subscriber)   (SRT Subscriber)   (SRT Subscriber)
The DJ publishes a single stream regardless of listener count. The relay server handles fan-out distribution. This keeps upload bandwidth requirements constant for the DJ while enabling simultaneous multi-venue delivery.
Each venue then routes the received SRT stream through AVAudioEngine to its sound system or AirPlay endpoints.
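On the venue side, a minimal AVAudioEngine playback path might look like the sketch below. It assumes the SRT receive and MPEG-TS demux stages (not shown) deliver PCM buffers already converted to the engine's native float format; the class and method names are illustrative, not DJing Stream's actual code.

```swift
import AVFoundation

final class VenuePlayer {
    private let engine = AVAudioEngine()
    private let player = AVAudioPlayerNode()
    private let format: AVAudioFormat

    enum PlayerError: Error { case unsupportedFormat }

    init(sampleRate: Double = 48_000, channels: AVAudioChannelCount = 2) throws {
        // The engine mixes in Float32 internally; the stream's 24-bit integer
        // samples are assumed to be converted upstream.
        guard let fmt = AVAudioFormat(standardFormatWithSampleRate: sampleRate,
                                      channels: channels) else {
            throw PlayerError.unsupportedFormat
        }
        format = fmt

        engine.attach(player)
        engine.connect(player, to: engine.mainMixerNode, format: format)
        try engine.start()
        player.play()
    }

    /// Called for each decoded chunk of PCM pulled from the SRT receive buffer.
    func enqueue(_ buffer: AVAudioPCMBuffer) {
        player.scheduleBuffer(buffer, completionHandler: nil)
    }
}
```

From mainMixerNode the audio reaches the default output device, which the venue points at its sound system or an AirPlay destination.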
Apple Silicon as Broadcast Infrastructure
Traditional broadcast contribution encoders from manufacturers like Comrex or Tieline cost $3,000-$15,000 per endpoint. They achieve slightly lower latency (1-2 seconds) but operate point-to-point—requiring separate hardware for each venue connection.
DJing Stream runs on consumer Macs. Apple Silicon's unified memory architecture and hardware-accelerated media processing enable what previously required dedicated broadcast equipment:
- AVFoundation for low-latency audio capture from any USB/Thunderbolt interface
- Hardware-accelerated encoding for video (when enabled)
- Efficient SRT processing for reliable transport
A refurbished Mac mini M1 ($250-300) handles broadcast-grade streaming without breaking a sweat. The barrier to entry drops from thousands of dollars to existing Mac hardware.
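To make the capture bullet above concrete: a minimal AVAudioEngine input tap is enough to pull PCM from a class-compliant USB or Thunderbolt interface. Device selection and the MPEG-TS packetizer/SRT sender are assumed to live elsewhere; the names here are illustrative, not DJing Stream's actual code.

```swift
import AVFoundation

final class MixerCapture {
    private let engine = AVAudioEngine()

    /// `onPCM` receives raw buffers ready to be packetized into MPEG-TS and
    /// handed to the SRT sender (both outside this sketch).
    func start(onPCM: @escaping (AVAudioPCMBuffer) -> Void) throws {
        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)   // follows the interface's sample rate

        // A small tap buffer keeps capture granularity fine relative to the
        // multi-second SRT latency window downstream.
        input.installTap(onBus: 0, bufferSize: 512, format: format) { buffer, _ in
            onPCM(buffer)
        }
        try engine.start()
    }

    func stop() {
        engine.inputNode.removeTap(onBus: 0)
        engine.stop()
    }
}
```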
Comparison with Consumer Platforms
Why not just use Mixcloud Live, Twitch, or YouTube Live? Beyond the audio quality limitations (lossy compression, adaptive bitrate), there's a licensing consideration that streaming engineers should understand:
Consumer streaming platforms are licensed for personal listening—they hold public performance licenses for their platform delivery. However, venues playing that content through their sound systems create a secondary public performance that requires the venue's own PRO licensing (ASCAP, BMI, SESAC, SACEM, etc.). Many venues operating in this grey area don't realize the distinction.
DJing Stream positions itself as transport infrastructure for venues that already hold appropriate public performance licenses—the same licensing they need for any live DJ or background music system.
Technical Specifications Summary
| Parameter | Value |
|---|---|
| Audio format | Uncompressed 24-bit PCM |
| Audio sample rate | 44.1 kHz / 48 kHz (auto) |
| Audio bitrate | ~2,304 kbps |
| Audio transport | SRT (MPEG-TS container) |
| Video format | H.264 720p |
| Video transport | WebRTC |
| Default latency | 4-6 seconds E2E |
| Configurable range | 2-10 seconds |
| Platform | macOS 15+ (Sequoia) |
| Architecture | Apple Silicon recommended |
Implementation Considerations
For streaming engineers evaluating similar architectures, several design decisions are worth noting:
Protocol independence: Separating audio and video lets each stream use the optimal protocol without compromise. The architectural complexity is higher, but the quality benefits are substantial. Tight audio/video sync is not essential for DJ streaming, but real-time visual feedback is; segment-based protocols like HLS add 15-30 seconds of latency, which makes visual monitoring impractical. WebRTC solves this for video while SRT handles the audio quality requirements.
User-exposed latency control: Rather than hiding latency behind "low latency mode" toggles, exposing the actual parameter with use-case guidance lets operators make informed trade-offs.
Relay architecture vs. P2P: The hub-and-spoke model adds a relay hop but dramatically simplifies multi-destination delivery and keeps source bandwidth constant. For any application requiring one-to-many distribution, this is likely the correct choice.
Audio-first bitrate allocation: For any application where audio quality is the primary value proposition, consider whether the standard video-heavy bandwidth allocation makes sense for your use case.
Conclusion
DJing Stream represents an interesting departure from conventional streaming architecture: prioritizing SRT reliability over WebRTC speed for audio, allocating more bandwidth to audio than video, and leveraging Apple Silicon to democratize broadcast-grade transport.
Whether you're building venue streaming systems, remote production workflows, or any application where audio fidelity is critical, the architectural patterns here—separate protocols for separate media types, configurable latency trade-offs, and hub-and-spoke distribution—offer a template worth considering.
The application is available on the Mac App Store. More information at djing.com.