Libopus 20 ms Frame Algorithmic Delay

This article explains the exact algorithmic delay introduced by the libopus encoder when configured with a standard 20 ms frame size. It breaks this latency down into its two components, the frame size and the encoder look-ahead, to give developers the precise figures needed for real-time audio latency budgets.

Understanding Algorithmic Delay

Algorithmic delay is the inherent latency introduced by an audio codec’s design, independent of hardware buffering, operating system processing, or network transmission. For the Opus codec (as implemented in libopus, its reference implementation), the total algorithmic delay is the sum of two terms:

Total Algorithmic Delay = Frame Size + Encoder Look-Ahead
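
As a minimal sketch of the formula above, the sum can be written as a small Python helper. The function name and parameters are illustrative and are not part of the libopus API.

```python
def algorithmic_delay_ms(frame_ms: float, lookahead_ms: float) -> float:
    """Total algorithmic delay = frame size + encoder look-ahead (both in ms)."""
    return frame_ms + lookahead_ms


# 20 ms frame with the 6.5 ms look-ahead discussed in this article.
print(algorithmic_delay_ms(20.0, 6.5))  # 26.5
```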

Component 1: Frame Size (20 ms)

The frame size is the duration of audio that must be buffered before the encoder can begin processing. With a 20 ms frame size, the encoder must wait for exactly 20 ms of incoming audio samples to accumulate before it can generate an encoded packet.
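
To relate the 20 ms figure to buffer sizes, the sketch below converts a frame duration to a per-channel sample count for the sample rates the Opus encoder accepts. The helper name is hypothetical; only the arithmetic is being illustrated.

```python
# Sample rates accepted by the Opus encoder, in Hz.
OPUS_SAMPLE_RATES_HZ = (8000, 12000, 16000, 24000, 48000)


def frame_samples(sample_rate_hz: int, frame_ms: int = 20) -> int:
    """Samples per channel in one frame of the given duration."""
    return sample_rate_hz * frame_ms // 1000


for rate in OPUS_SAMPLE_RATES_HZ:
    print(rate, frame_samples(rate))
# 48000 Hz gives 960 samples per channel for a 20 ms frame.
```

This per-channel count is the kind of value passed as the frame_size argument of opus_encode.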

Component 2: Encoder Look-Ahead (6.5 ms)

The libopus encoder requires a small “look-ahead” window of future audio samples to perform windowing, overlapping, and transient analysis. In the default configuration, this look-ahead is 6.5 ms.
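
libopus reports the look-ahead in samples through the OPUS_GET_LOOKAHEAD encoder control, so a conversion to milliseconds is usually needed. The sketch below shows only that conversion. The helper name is hypothetical, and the 312-sample input is an assumed value chosen because it corresponds to 6.5 ms at 48 kHz, not a reading taken from a live encoder.

```python
def lookahead_ms(lookahead_samples: int, sample_rate_hz: int) -> float:
    """Convert a look-ahead expressed in samples to milliseconds."""
    return lookahead_samples * 1000 / sample_rate_hz


# Assumed example: 312 samples at 48 kHz corresponds to 6.5 ms.
print(lookahead_ms(312, 48000))  # 6.5
```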

The Exact Latency Figures

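The exact algorithmic delay for a 20 ms frame depends on the encoder’s application mode. In the default configuration, the 6.5 ms look-ahead gives 20 ms + 6.5 ms = 26.5 ms. In restricted low-delay mode (OPUS_APPLICATION_RESTRICTED_LOWDELAY), the look-ahead drops to 2.5 ms, giving 20 ms + 2.5 ms = 22.5 ms.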

For most real-time VoIP and interactive audio applications running the default libopus configuration, the exact algorithmic delay to account for is 26.5 ms.
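
For budget calculations, the codec figure is one term among several. The following sketch subtracts the 26.5 ms algorithmic delay from an end-to-end target. The 150 ms target and the other component values are placeholder assumptions for illustration, not measurements or recommendations from this article.

```python
from typing import Mapping

CODEC_ALGORITHMIC_DELAY_MS = 26.5  # 20 ms frame + 6.5 ms look-ahead


def remaining_budget_ms(target_ms: float, other_delays_ms: Mapping[str, float]) -> float:
    """Latency left after the codec delay and all other listed delays."""
    return target_ms - CODEC_ALGORITHMIC_DELAY_MS - sum(other_delays_ms.values())


# Placeholder values; replace with figures measured on the real system.
others = {
    "capture_buffer": 10.0,
    "network": 40.0,
    "jitter_buffer": 40.0,
    "playback_buffer": 10.0,
}
print(remaining_budget_ms(150.0, others))  # 23.5
```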