Libopus Maximum Audio Packet Duration Limits

This article provides an overview of the maximum duration limits that the libopus library enforces on a single encoded audio packet. It explains the technical constraints dictated by the Opus specification, how frame sizes combine to reach this limit, and the underlying reasons for these design choices in real-time communication.

The Absolute Maximum Duration: 120 ms

The libopus library, the reference implementation of the IETF Opus audio codec (RFC 6716), limits a single encoded audio packet to at most 120 milliseconds (ms) of audio. A packet that carries more than 120 ms violates the Opus specification, and libopus rejects it as invalid when parsing.

How the 120 ms Limit is Structured

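An Opus packet consists of a one-byte Table of Contents (TOC) header followed by one or more audio frames. The codec supports six frame durations: 2.5, 5, 10, 20, 40, and 60 ms. Which of these are available depends on the coding mode: SILK-only frames are 10 to 60 ms, hybrid frames are 10 or 20 ms, and CELT-only frames are 2.5 to 20 ms.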

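To reach the maximum packet duration of 120 ms, several frames are packed into a single packet. The packing is constrained by the following rules: every frame in a packet shares the configuration in the TOC byte and therefore has the same duration; a packet holds at most 48 frames; and the total duration must not exceed 120 ms. The largest valid packets are thus 48 x 2.5 ms, 24 x 5 ms, 12 x 10 ms, 6 x 20 ms, 3 x 40 ms, or 2 x 60 ms.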
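As a rough illustration of how the frame durations combine under the 120 ms ceiling, the following Python sketch computes the largest frame count for each duration. The names FRAME_DURATIONS_US, MAX_PACKET_US, and max_frames_per_packet are hypothetical helpers made up for this example; they are not part of the libopus API.

```python
# Opus frame durations from RFC 6716, in microseconds so the arithmetic stays exact.
FRAME_DURATIONS_US = (2_500, 5_000, 10_000, 20_000, 40_000, 60_000)
MAX_PACKET_US = 120_000  # 120 ms ceiling on a single packet


def max_frames_per_packet(frame_us: int) -> int:
    """Return how many frames of one duration fit within the 120 ms ceiling."""
    if frame_us not in FRAME_DURATIONS_US:
        raise ValueError(f"not an Opus frame duration: {frame_us} us")
    return MAX_PACKET_US // frame_us


if __name__ == "__main__":
    for frame_us in FRAME_DURATIONS_US:
        count = max_frames_per_packet(frame_us)
        print(f"{frame_us / 1000} ms x {count} = {frame_us * count / 1000} ms")
```

Every supported duration divides 120 ms evenly, so each configuration can fill the limit exactly.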

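Packets containing more than 120 ms of audio are invalid rather than unrepresentable. The TOC byte selects the frame duration and a frame-count code (one frame, two frames, or an arbitrary number of frames); in the arbitrary case a separate frame-count byte carries a 6-bit count. RFC 6716 requires that this count be nonzero and that the count multiplied by the frame duration not exceed 120 ms, which is what caps the count at 48 for 2.5 ms frames.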
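The duration check can be sketched in Python by reading only the header bytes of a packet. This is a simplified illustration based on the TOC layout in RFC 6716 (5-bit configuration, 1 stereo bit, 2-bit frame-count code); it does not validate frame lengths or padding as a full parser would, and the function names are hypothetical rather than libopus calls.

```python
MAX_PACKET_US = 120_000  # 120 ms, in microseconds


def frame_duration_us(toc: int) -> int:
    """Frame duration selected by the 5-bit config field of a TOC byte."""
    config = toc >> 3
    if config < 12:   # SILK-only configs: 10, 20, 40, 60 ms
        return (10_000, 20_000, 40_000, 60_000)[config & 0x3]
    if config < 16:   # hybrid configs: 10 or 20 ms
        return (10_000, 20_000)[config & 0x1]
    return 2_500 << (config & 0x3)  # CELT-only configs: 2.5, 5, 10, 20 ms


def frame_count(packet: bytes) -> int:
    """Number of frames implied by the frame-count code in the TOC byte."""
    code = packet[0] & 0x3
    if code == 0:
        return 1
    if code in (1, 2):
        return 2
    if len(packet) < 2:
        raise ValueError("code 3 packet is missing its frame-count byte")
    return packet[1] & 0x3F  # low 6 bits of the frame-count byte


def packet_duration_us(packet: bytes) -> int:
    """Total audio duration of a packet, rejecting counts of 0 or more than 120 ms."""
    if not packet:
        raise ValueError("empty packet")
    count = frame_count(packet)
    total = count * frame_duration_us(packet[0])
    if count == 0 or total > MAX_PACKET_US:
        raise ValueError(f"invalid packet duration: {total} us")
    return total
```

For example, the two header bytes 0x83 0x30 describe a CELT-only packet with 48 frames of 2.5 ms, which is exactly 120 ms; raising the count to 49 makes the sketch reject the packet.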

Minimum Packet Duration

While the maximum limit is 120 ms, the minimum duration of an Opus packet is 2.5 ms. This ultra-low duration is designed for applications requiring minimal algorithmic delay, such as live musical performances or high-speed gaming communication.
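Because audio buffers are usually sized in samples rather than milliseconds, it can help to translate the 2.5 ms and 120 ms bounds into per-channel sample counts. The sketch below does this for the sample rates Opus supports; samples_per_channel is a hypothetical helper written for this example, not a libopus function.

```python
# Sample rates supported by Opus, in Hz.
OPUS_SAMPLE_RATES_HZ = (8_000, 12_000, 16_000, 24_000, 48_000)


def samples_per_channel(duration_us: int, sample_rate_hz: int) -> int:
    """Convert a duration in microseconds to a per-channel sample count."""
    if sample_rate_hz not in OPUS_SAMPLE_RATES_HZ:
        raise ValueError(f"unsupported sample rate: {sample_rate_hz} Hz")
    samples, remainder = divmod(duration_us * sample_rate_hz, 1_000_000)
    if remainder:
        raise ValueError("duration is not a whole number of samples")
    return samples


if __name__ == "__main__":
    for rate in OPUS_SAMPLE_RATES_HZ:
        shortest = samples_per_channel(2_500, rate)    # 2.5 ms minimum
        longest = samples_per_channel(120_000, rate)   # 120 ms maximum
        print(f"{rate} Hz: {shortest} to {longest} samples per channel")
```

At 48 kHz this gives 120 samples for the shortest packet and 5760 samples for the longest, so a buffer of 5760 samples per channel is enough to hold any single decoded packet.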

Why Libopus Imposes the 120 ms Limit

The 120 ms limitation is a deliberate design choice aimed at balancing compression efficiency, network reliability, and latency: