Libopus Encoder State Internal Data Structure
This article explains the internal data structure that holds the
encoder state in libopus, the reference implementation of
the Opus audio codec. It covers how the main state structure is
defined, how it accommodates the hybrid nature of the codec, and how its
memory is laid out so that the whole encoder lives in a single allocation.
The primary internal data structure that holds the encoder state in
libopus is struct OpusEncoder
(exposed to applications only as the opaque typedef OpusEncoder).
Because the Opus codec is a hybrid format that combines two distinct
technologies—SILK (optimized for voice) and CELT (optimized for general
audio and low latency)—the OpusEncoder structure acts as a
master controller that encapsulates the states of both underlying
encoders.
Memory Layout and Offsets
To avoid multiple dynamic memory allocations, libopus
structures OpusEncoder as a single, contiguous block of
memory. The structure itself contains configuration parameters, followed
by the actual state data for the SILK and CELT sub-encoders appended
directly to the end of the struct.
To navigate this contiguous memory block,
struct OpusEncoder uses two member variables:
- silk_enc_offset: An integer giving the byte offset from the start of the OpusEncoder memory block to the beginning of the SILK encoder state.
- celt_enc_offset: An integer giving the byte offset from the start of the OpusEncoder memory block to the beginning of the CELT encoder state.
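The offset scheme described above can be illustrated with a small, self-contained sketch. It is not the libopus source: the `Toy*` types and `toy_*` functions are hypothetical stand-ins, and the pointer-size alignment rule is an assumption chosen for the example. Only the idea of storing byte offsets in a header and adding them to the base address is taken from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the SILK and CELT sub-encoder states. */
typedef struct { int frame_count; } ToySilkState;
typedef struct { float energy[2]; } ToyCeltState;

/* Toy header: configuration fields first, sub-states appended after it. */
typedef struct {
    int silk_enc_offset;  /* bytes from start of block to the SILK state */
    int celt_enc_offset;  /* bytes from start of block to the CELT state */
    int channels;
} ToyEncoder;

/* Round a size up to a multiple of the pointer size (assumed alignment). */
static int toy_align(int size)
{
    int a = (int)sizeof(void *);
    return (size + a - 1) / a * a;
}

/* Total bytes needed: header + SILK state + CELT state, each padded. */
static int toy_get_size(void)
{
    return toy_align((int)sizeof(ToyEncoder))
         + toy_align((int)sizeof(ToySilkState))
         + toy_align((int)sizeof(ToyCeltState));
}

/* Zero the caller's block and record where each sub-state begins. */
static void toy_layout(ToyEncoder *st, int channels)
{
    assert(st != NULL);
    memset(st, 0, (size_t)toy_get_size());
    st->channels = channels;
    st->silk_enc_offset = toy_align((int)sizeof(ToyEncoder));
    st->celt_enc_offset = st->silk_enc_offset
                        + toy_align((int)sizeof(ToySilkState));
}

/* Recover typed pointers by adding the stored byte offset to the base. */
static ToySilkState *toy_silk(ToyEncoder *st)
{
    return (ToySilkState *)((char *)st + st->silk_enc_offset);
}

static ToyCeltState *toy_celt(ToyEncoder *st)
{
    return (ToyCeltState *)((char *)st + st->celt_enc_offset);
}
```

Because the offsets are relative to the start of the block, a block laid out this way holds no internal pointers, so the whole state can be copied with a single `memcpy` and the copy remains valid.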
Key Members of struct OpusEncoder
Inside src/opus_encoder.c, the structure contains
several critical fields:
- Encoder Mode and Configuration: It stores the sampling rate, number of channels, application type (e.g., VoIP, Audio, or Restricted Low Delay), and the current operational mode (SILK-only, CELT-only, or Hybrid).
- Sub-Encoder States:
  - SILK State: Located at silk_enc_offset, this memory area is handed to the SILK layer, which treats it as its own encoder state structure (silk_encoder).
  - CELT State: Located at celt_enc_offset, this memory area is cast to CELTEncoder.
- Bitrate and Bandwidth Controllers: Variables that track the target bitrate, maximum allowed bandwidth, and rate control parameters (such as whether variable bitrate (VBR) or constrained VBR is in use).
- Stream States: Variables that track the channel layout, stereo width, and state transitions when switching between SILK and CELT modes.
Allocation and Initialization
Because the size of OpusEncoder depends on the number of channels and
on build-time configuration, it is not hardcoded. Instead, the library
provides functions to compute the size, initialize the structure, and
allocate it:
- opus_encoder_get_size(int channels): Calculates the exact number of bytes required to hold the OpusEncoder structure, including the required padding and the sub-encoder states for the specified number of channels.
- opus_encoder_init(): Initializes the pre-allocated memory block, setting up the offsets and default state parameters.
- opus_encoder_create(): Dynamically allocates the required memory using malloc and initializes it.
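The division of labor among these three functions can be sketched with a toy encoder. Everything named `demo_*` or `Demo*` below is hypothetical and exists only for illustration, and the `-1`/`NULL` error convention is an assumption of the sketch rather than the library's. The point is the pattern: one function reports the size, one initializes memory the caller already owns, and one combines allocation with initialization.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-channel state; real sub-encoder states are far larger. */
typedef struct { float history[4]; } DemoChannelState;

/* Hypothetical header; the per-channel states follow it in the same block. */
typedef struct {
    int channels;
    int sample_rate;
    int state_offset;   /* byte offset to the first per-channel state */
} DemoEncoder;

/* Round a byte count up to a multiple of the pointer size (assumed rule). */
#define DEMO_ALIGN(n) \
    ((int)(((n) + sizeof(void *) - 1) / sizeof(void *) * sizeof(void *)))

/* Bytes needed for the header plus one state per channel; 0 if invalid. */
int demo_encoder_get_size(int channels)
{
    if (channels < 1 || channels > 2)
        return 0;
    return DEMO_ALIGN(sizeof(DemoEncoder))
         + channels * DEMO_ALIGN(sizeof(DemoChannelState));
}

/* Initialize a block the caller allocated; returns 0 on success, -1 on error. */
int demo_encoder_init(DemoEncoder *st, int sample_rate, int channels)
{
    int size = demo_encoder_get_size(channels);
    if (st == NULL || size == 0 || sample_rate <= 0)
        return -1;
    memset(st, 0, (size_t)size);
    st->channels = channels;
    st->sample_rate = sample_rate;
    st->state_offset = DEMO_ALIGN(sizeof(DemoEncoder));
    return 0;
}

/* Convenience wrapper: allocate the right size, then initialize it. */
DemoEncoder *demo_encoder_create(int sample_rate, int channels)
{
    int size = demo_encoder_get_size(channels);
    DemoEncoder *st;
    if (size == 0)
        return NULL;
    st = malloc((size_t)size);
    if (st == NULL)
        return NULL;
    if (demo_encoder_init(st, sample_rate, channels) != 0) {
        free(st);
        return NULL;
    }
    return st;
}
```

Splitting sizing and initialization from allocation lets a caller that cannot or does not want to use the heap (for example, an embedded target with a static buffer) place the encoder in its own memory, while `create` remains the convenient path for everyone else.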