Minimum Algorithmic Latency of libopus

This article explains the minimum algorithmic latency achievable when encoding audio with the libopus library. It outlines the components that contribute to this delay, shows how the 5 millisecond minimum is calculated, and discusses the configuration required to reach it.

Understanding Algorithmic Latency in Opus

Algorithmic latency is the inherent delay introduced by an audio codec’s design, independent of hardware processing speed or network transmission times. In the Opus audio codec (implemented via libopus), this latency is determined by two primary factors: the frame size (or packet duration) and the codec’s look-ahead buffer.

To achieve the lowest possible latency, the encoder must use its smallest supported frame size together with an application mode that keeps the look-ahead at its minimum.

The 5 Millisecond Minimum Limit

The absolute minimum algorithmic latency achievable with standard libopus is 5.0 milliseconds (ms).

This 5.0 ms limit is the sum of two distinct components:

  1. Frame Size (2.5 ms): The shortest standard audio frame duration supported by the Opus codec is 2.5 ms.
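  2. Look-ahead (2.5 ms): The MDCT (Modified Discrete Cosine Transform) used in the CELT layer of Opus works on overlapping windows, so the encoder needs 2.5 ms of look-ahead before the overlap-add step can cancel time-domain aliasing. This figure applies when the encoder runs in restricted low-delay mode; in the other application modes libopus reports a total look-ahead of 6.5 ms.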

With libopus configured for 2.5 ms frames in restricted low-delay mode, the total delay is:

\[\text{Algorithmic Latency} = \text{Frame Size} + \text{Look-ahead}\]
\[\text{Algorithmic Latency} = 2.5\text{ ms} + 2.5\text{ ms} = 5.0\text{ ms}\]
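The same sum can be checked in samples rather than milliseconds, which is the unit the encoder actually works in. The following is a minimal sketch, not libopus code: the helper names (us_to_samples, algorithmic_latency_samples, algorithmic_latency_ms) are hypothetical, and the two 2.5 ms durations are taken from the text above.

```c
/* Frame and look-ahead durations from the text, in microseconds so the
   arithmetic stays in integers. */
#define FRAME_US     2500
#define LOOKAHEAD_US 2500

/* Convert a duration in microseconds to a sample count at rate fs (Hz). */
static int us_to_samples(int fs, int duration_us)
{
    return (int)((long long)fs * duration_us / 1000000);
}

/* Algorithmic latency in samples: frame size plus look-ahead. */
static int algorithmic_latency_samples(int fs)
{
    return us_to_samples(fs, FRAME_US) + us_to_samples(fs, LOOKAHEAD_US);
}

/* The same latency converted back to milliseconds. */
static double algorithmic_latency_ms(int fs)
{
    return 1000.0 * algorithmic_latency_samples(fs) / fs;
}
```

At 48 kHz, each 2.5 ms component is 120 samples, so algorithmic_latency_samples(48000) gives 240 samples and algorithmic_latency_ms(48000) gives 5.0.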

How to Configure libopus for Minimum Latency

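To reach the 5.0 ms target in a practical application, two settings matter. The encoder must be created with the OPUS_APPLICATION_RESTRICTED_LOWDELAY application mode, which uses only the CELT layer and avoids the extra look-ahead of the other modes. Each call to the encoder must then pass a 2.5 ms frame, which is 120 samples per channel at 48 kHz.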
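A minimal sketch of such a configuration using the libopus C API is given here. The sample rate, channel count, and the helper names create_low_delay_encoder and encode_frame are assumptions chosen for illustration; the libopus calls themselves (opus_encoder_create, opus_encoder_ctl with OPUS_GET_LOOKAHEAD, opus_encode, opus_encoder_destroy) are part of the public API.

```c
#include <stddef.h>
#include <opus/opus.h>   /* some installations expose this as <opus.h> */

/* Assumed for illustration: 48 kHz mono input. */
#define SAMPLE_RATE   48000
#define CHANNELS      1
#define FRAME_SAMPLES (SAMPLE_RATE / 400)   /* 2.5 ms = 120 samples at 48 kHz */

/* Create an encoder in restricted low-delay mode. On success, *lookahead
   receives the look-ahead in samples as reported by libopus.
   Returns NULL on failure. */
static OpusEncoder *create_low_delay_encoder(opus_int32 *lookahead)
{
    int err = OPUS_OK;
    OpusEncoder *enc = opus_encoder_create(SAMPLE_RATE, CHANNELS,
                                           OPUS_APPLICATION_RESTRICTED_LOWDELAY,
                                           &err);
    if (err != OPUS_OK || enc == NULL)
        return NULL;

    /* Ask the encoder how much look-ahead it actually uses. */
    if (opus_encoder_ctl(enc, OPUS_GET_LOOKAHEAD(lookahead)) != OPUS_OK) {
        opus_encoder_destroy(enc);
        return NULL;
    }
    return enc;
}

/* Encode one 2.5 ms frame of 16-bit PCM (FRAME_SAMPLES samples per channel).
   Returns the packet length in bytes, or a negative Opus error code. */
static opus_int32 encode_frame(OpusEncoder *enc, const opus_int16 *pcm,
                               unsigned char *packet, opus_int32 max_bytes)
{
    return opus_encode(enc, pcm, FRAME_SAMPLES, packet, max_bytes);
}
```

Querying OPUS_GET_LOOKAHEAD is a useful sanity check: in this mode at 48 kHz the reported value should be 120 samples (2.5 ms), which together with the 120-sample frame gives the 5.0 ms total. The program must be linked against libopus (for example with -lopus), and the caller releases the encoder with opus_encoder_destroy when finished.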

Trade-offs of Ultra-Low Latency Encoding

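While a 5 ms algorithmic latency is ideal for real-time interactive applications such as musical collaboration or gaming, it comes with trade-offs. Restricted low-delay mode uses only the CELT layer, so the speech-oriented SILK and hybrid modes are unavailable. Frames of 2.5 ms also mean 400 packets per second, which raises per-packet header overhead and reduces coding efficiency, so a higher bitrate is typically needed for comparable quality.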
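One of these costs, the packet rate, can be quantified with simple arithmetic. The sketch below assumes the packets travel over RTP/UDP/IPv4, with 40 bytes of headers per packet (20 for IPv4, 8 for UDP, 12 for RTP); that transport is an assumption for illustration and is not part of libopus. The helper names are hypothetical.

```c
/* Packets per second for a given frame duration in microseconds. */
static int packets_per_second(int frame_us)
{
    return 1000000 / frame_us;
}

/* Header overhead in bits per second for a given per-packet header size.
   header_bytes is a transport assumption (e.g. 40 for RTP/UDP/IPv4). */
static long header_overhead_bps(int frame_us, int header_bytes)
{
    return (long)packets_per_second(frame_us) * header_bytes * 8;
}
```

Under that assumption, 2.5 ms frames produce 400 packets per second and 128,000 bit/s of header overhead, compared with 50 packets per second and 16,000 bit/s for 20 ms frames.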