Audio often needs to be converted between formats for different applications.
Mu-law is a popular compressed format that saves space, while PCM is linear and can be consumed directly by applications.
Sampling rates can also differ, so we need to be aware of them.
One such application is the OpenAI Realtime API, which usually takes PCM audio at a 24 kHz sampling rate.
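To make the space trade-off concrete, here is a small sketch of the size arithmetic. It assumes mono audio, 1 byte per mu-law sample, and 2 bytes per PCM sample; the constant names and the `sizes_for` helper are made up for illustration.

```python
import base64

# Assumed parameters: mono audio, 8-bit mu-law in, 16-bit PCM out.
ULAW_RATE_HZ = 8000
PCM_RATE_HZ = 24000
PCM_WIDTH_BYTES = 2


def sizes_for(seconds):
    """Return (mu-law bytes, PCM bytes, Base64 characters) for a whole number of seconds."""
    ulaw_bytes = ULAW_RATE_HZ * seconds  # one byte per mu-law sample
    pcm_bytes = PCM_RATE_HZ * PCM_WIDTH_BYTES * seconds
    # Encode a silent buffer of the right length to measure the Base64 size.
    b64_chars = len(base64.b64encode(bytes(pcm_bytes)))
    return ulaw_bytes, pcm_bytes, b64_chars


print(sizes_for(1))  # (8000, 48000, 64000)
```

Under these assumptions, one second of audio grows from 8,000 bytes of mu-law to 48,000 bytes of PCM, and to 64,000 characters once Base64-encoded.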
The example below shows the full process: converting mu-law (8 kHz) data to 16-bit linear PCM, resampling it to 24 kHz, and finally encoding it as a Base64 string:
import audioop
Explanation:
ulaw2lin(data, width): Converts mu-law data to linear PCM with the specified sample width (in bytes). Here width=2 (16-bit).
ratecv(data, width, channels, in_rate, out_rate, state): Resamples linear PCM to a new sample rate. It requires the current sample width, number of channels, input rate, and output rate. The state can typically be None on the first call; the function returns the converted data together with a new state that can be passed to later calls.
b64encode(...): Encodes the binary data in Base64. The .decode("utf-8") step converts it into a UTF-8 string instead of bytes.
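The three steps above can also be sketched without audioop, which was removed from the standard library in Python 3.13. The following is a minimal pure-Python illustration, not the original example: the function names are hypothetical, the input is assumed to be mono, the output is assumed to be little-endian 16-bit PCM, and the resampler is plain linear interpolation with no filtering, so its samples will not match ratecv exactly.

```python
import base64
import struct


def ulaw_byte_to_linear(u):
    """Decode one G.711 mu-law byte to a signed 16-bit sample."""
    u = ~u & 0xFF  # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def upsample_linear(samples, factor=3):
    """Upsample by an integer factor using linear interpolation (crude, no low-pass filter)."""
    out = []
    for i, cur in enumerate(samples):
        # Hold the last sample flat, since there is no next sample to interpolate toward.
        nxt = samples[i + 1] if i + 1 < len(samples) else cur
        for k in range(factor):
            out.append(cur + (nxt - cur) * k // factor)
    return out


def ulaw8k_to_pcm24k_base64(ulaw_bytes):
    """mu-law (8 kHz, mono) -> 16-bit PCM (24 kHz) -> Base64 text."""
    samples_8k = [ulaw_byte_to_linear(b) for b in ulaw_bytes]
    samples_24k = upsample_linear(samples_8k, 3)  # 8 kHz * 3 = 24 kHz
    pcm = struct.pack("<%dh" % len(samples_24k), *samples_24k)
    return base64.b64encode(pcm).decode("utf-8")


if __name__ == "__main__":
    # 0xFF is mu-law silence, so this encodes six zero-valued PCM samples.
    print(ulaw8k_to_pcm24k_base64(b"\xff\xff"))
```

Where audioop is available, the first two helpers roughly correspond to audioop.ulaw2lin(data, 2) and audioop.ratecv(pcm, 2, 1, 8000, 24000, None)[0].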