Libopus vs AAC-LC Encoding Latency Comparison
This article compares the algorithmic encoding latency of Opus, as implemented by libopus, with that of AAC-LC (Advanced Audio Coding, Low Complexity profile). By examining frame sizes, look-ahead, and configuration limits, it explains which codec is better suited to real-time, ultra-low-latency audio applications.
Inherent Algorithmic Delay and Frame Sizes
Two parameters determine a codec’s algorithmic latency: its frame size and the look-ahead the encoder needs beyond that frame.
- Libopus (Opus): Opus was designed for interactive real-time communication and is highly configurable. It supports frame sizes of 2.5 ms, 5 ms, 10 ms, 20 ms, 40 ms, and 60 ms. Its algorithmic delay equals the frame size plus the encoder look-ahead. With the default application modes, libopus uses a look-ahead of 6.5 ms (312 samples at 48 kHz). In restricted low-delay mode (`OPUS_APPLICATION_RESTRICTED_LOWDELAY`), which uses only the CELT layer, the look-ahead drops to 2.5 ms (120 samples). Combined with a 2.5 ms frame, that gives a minimum algorithmic delay of 5 ms. The 2.5 ms and 5 ms frame sizes are available only in CELT-only mode; hybrid mode requires 10 ms or 20 ms frames.
- AAC-LC: The Low Complexity profile of AAC targets efficient audio storage and streaming rather than interactive telecommunication. It uses a fixed frame size of 1024 samples, which at a sampling rate of 48 kHz represents 21.3 ms of audio. Because consecutive transform windows overlap by 50% (overlap-add), the encoder needs a further 1024 samples before it can complete a frame. The minimum algorithmic delay is therefore 2048 samples, or approximately 42.7 ms, before any hardware or system buffer overhead is added.
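The arithmetic behind these figures can be sketched in a few lines of Python. This is an illustrative helper, not part of libopus or any AAC encoder: the function name `algorithmic_delay_ms` and its parameters are assumptions made for the example, and the sample counts are the ones quoted above for 48 kHz.

```python
def algorithmic_delay_ms(frame_samples: int,
                         lookahead_samples: int,
                         sample_rate_hz: int = 48_000) -> float:
    """Return frame duration plus look-ahead, in milliseconds.

    Hypothetical helper for illustration: it models only the codec's
    algorithmic delay, not network, driver, or jitter-buffer delay.
    """
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    return (frame_samples + lookahead_samples) * 1000.0 / sample_rate_hz


# Opus at its minimum: 2.5 ms frame (120 samples) + 2.5 ms look-ahead (120 samples).
opus_minimum = algorithmic_delay_ms(120, 120)

# AAC-LC: 1024-sample frame + 1024 samples of window overlap.
aac_lc = algorithmic_delay_ms(1024, 1024)

print(f"Opus minimum: {opus_minimum:.1f} ms")
print(f"AAC-LC:       {aac_lc:.1f} ms")
```

Because the look-ahead is passed in as a parameter, the same function can be applied to other configurations, such as a 20 ms Opus frame, once the encoder's actual look-ahead is known.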
Strict Latency Comparison at 48 kHz
| Latency Metric | Libopus (Minimum Config) | Libopus (Standard Config) | AAC-LC (Standard Config) |
|---|---|---|---|
| Frame Size | 2.5 ms (120 samples) | 20 ms (960 samples) | 21.3 ms (1024 samples) |
| Look-Ahead / Overlap | 2.5 ms (120 samples) | 6.5 ms (312 samples) | 21.3 ms (1024 samples) |
| Total Algorithmic Delay | 5.0 ms | 26.5 ms | 42.7 ms |
Protocol and Application Suitability
Because of these structural differences, the two codecs suit very different latency requirements:
- Opus is the standard audio codec for WebRTC and is widely used in VoIP and interactive gaming. At its minimum of 5 ms, the codec's algorithmic delay consumes only a small part of the end-to-end delay budget, which is crucial for natural, two-way conversation.
- AAC-LC is poorly suited to real-time interactive communication. Variants such as AAC-LD (Low Delay) and AAC-ELD (Enhanced Low Delay) exist to reduce latency to acceptable real-time levels (around 15–20 ms), but standard AAC-LC is constrained by its 1024-sample frame and overlapping windows, making it best suited to one-way broadcasting and playback.
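One way to apply this comparison is to check candidate configurations against a codec delay budget. The sketch below is illustrative only: the `CodecConfig` type, the `fits_budget` function, and the 30 ms budget are hypothetical, and the AAC-LD/ELD entry uses the upper end of the approximate range quoted above rather than a measured value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodecConfig:
    name: str
    delay_ms: float  # algorithmic delay only, excluding network and buffering


def fits_budget(configs, budget_ms):
    """Return the names of configs within the budget, lowest delay first."""
    within = [c for c in configs if c.delay_ms <= budget_ms]
    return [c.name for c in sorted(within, key=lambda c: c.delay_ms)]


# Candidate figures taken from the discussion above (48 kHz).
CANDIDATES = [
    CodecConfig("AAC-LC, 1024-sample frame", 2048 * 1000 / 48_000),
    CodecConfig("Opus, 2.5 ms frame, low-delay", 5.0),
    CodecConfig("AAC-LD/ELD, upper end of quoted range", 20.0),
]

# Hypothetical budget: the share of end-to-end delay allotted to the codec.
print(fits_budget(CANDIDATES, budget_ms=30.0))
```

In a real system the budget would be derived from the end-to-end target after subtracting network, capture, and playback delays, so the threshold used here should be treated as a placeholder.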