Using SDL_AudioStream

From the dawn of time until SDL 2.0.6, there was only one way to convert audio through SDL: the SDL_AudioCVT structure.

It's a usable API, for various needs, but it has a few problems:

We have a better API that SDL has been using internally for a while now. SDL needs to bridge data between the app's audio callbacks and the platform APIs that consume and produce data, and that data might come and go in any size and format, at inexact times. Not only does this API have to convert and resample data on the fly, it also needs to buffer it when one end produces data at a different rate than the other end consumes it.

For SDL 2.0.7, we've cleaned up these internal APIs and made them available to apps. We call it SDL_AudioStream.

To avoid confusion: this is strictly an optional API, even if you use SDL for audio playback or capture. SDL might use it behind the scenes if it silently converts data between your callback and the platform, but that isn't your concern. If you don't like callbacks and just want to feed SDL audio data as you have more to give it, and let SDL figure it out, you can do that too, but that's a different API (SDL_QueueAudio() and friends).

Here are some immediate uses for SDL_AudioStream:

Using SDL_AudioStream is pretty simple. First, you create one. Let's say you want to produce mono data in Sint16 format at 22050Hz, for something that wants to consume stereo data in Float32 format at 48000Hz.

// You put data at Sint16/mono/22050Hz, you get back data at Float32/stereo/48000Hz
SDL_AudioStream *stream = SDL_NewAudioStream(AUDIO_S16, 1, 22050, AUDIO_F32, 2, 48000);
if (stream == NULL) {
    printf("Uhoh, stream failed to create: %s\n", SDL_GetError());
} else {
    // We are ready to use the stream!
}

Now all you have to do is feed your stream data!

Sint16 samples[1024];
int num_samples = read_more_samples_from_disk(samples); // whatever.
// you tell it the number of _bytes_, not samples, you're putting!
int rc = SDL_AudioStreamPut(stream, samples, num_samples * sizeof (Sint16));
if (rc == -1) {
    printf("Uhoh, failed to put samples in stream: %s\n", SDL_GetError());
    return;
}

// Whoops, forgot to add a single sample at the end...!
//  You can put any amount at once, SDL will buffer
//  appropriately, growing the buffer if necessary.
Sint16 onesample = 22;
SDL_AudioStreamPut(stream, &onesample, sizeof (Sint16));
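Since the put call takes a length in bytes as an int, it can help to do the samples-to-bytes conversion in one place. The following is a minimal sketch, assuming a helper named samples_to_bytes() that is not part of SDL and exists only for illustration. It refuses sizes that would not fit in an int rather than letting the multiplication wrap.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (NOT part of SDL): converts a sample count into the
 * byte length that a call like SDL_AudioStreamPut() expects.
 * Returns 0 and stores the length in *out_len on success, or -1 (leaving
 * *out_len untouched) if the byte count would not fit in an int. */
int samples_to_bytes(size_t num_samples, size_t bytes_per_sample, int *out_len)
{
    assert(out_len != NULL);
    assert(bytes_per_sample > 0);

    if (num_samples > (size_t) INT_MAX / bytes_per_sample) {
        return -1;  /* too big for one put; the caller should send smaller chunks */
    }

    *out_len = (int) (num_samples * bytes_per_sample);
    return 0;
}
```

With this, 1024 samples of Sint16 data (2 bytes each) come out as a length of 2048 bytes, which is what you would pass as the last argument of the put.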

As you add data to the stream, SDL will convert and resample it. You can ask how much converted data is available:

int avail = SDL_AudioStreamAvailable(stream);  // this is in bytes, not samples!
if (avail < 100) {
    printf("I'm still waiting on %d bytes of data!\n", 100 - avail);
}
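If you want a rough idea of how many bytes to expect, the arithmetic can be sketched as below. This is only an estimate under stated assumptions: AudioSpecSketch and estimate_output_bytes() are hypothetical names, not SDL types or functions, and the number SDL actually reports can be smaller at any given moment because of internal buffering in the resampler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one side of a stream; NOT an SDL type. */
typedef struct {
    int bytes_per_sample;  /* 2 for Sint16, 4 for Float32 */
    int channels;          /* 1 for mono, 2 for stereo */
    int rate;              /* sample frames per second */
} AudioSpecSketch;

/* Rough estimate of how many converted bytes a given amount of input will
 * eventually produce. A "frame" is one sample for every channel. */
int64_t estimate_output_bytes(int64_t input_bytes, AudioSpecSketch src, AudioSpecSketch dst)
{
    assert(input_bytes >= 0);
    assert(src.bytes_per_sample > 0 && src.channels > 0 && src.rate > 0);
    assert(dst.bytes_per_sample > 0 && dst.channels > 0 && dst.rate > 0);

    const int64_t src_frame_size = (int64_t) src.bytes_per_sample * src.channels;
    const int64_t dst_frame_size = (int64_t) dst.bytes_per_sample * dst.channels;
    const int64_t in_frames = input_bytes / src_frame_size;
    const int64_t out_frames = (in_frames * dst.rate) / src.rate;  /* rounds down */

    return out_frames * dst_frame_size;
}
```

For the stream in this tutorial (Sint16/mono/22050Hz in, Float32/stereo/48000Hz out), 2048 bytes of input is 1024 frames, which works out to 2229 output frames, or 17832 bytes.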

And when you have enough data to be useful, you can read out samples in the requested format:

float converted[100];
// this is in bytes, not samples!
int gotten = SDL_AudioStreamGet(stream, converted, sizeof (converted));
if (gotten == -1) {
    printf("Uhoh, failed to get converted data: %s\n", SDL_GetError());
} else {
    write_more_samples_to_disk(converted, gotten); /* whatever. */
}

Of course, you don't have to read it all at once. The stream buffers converted data internally, so you can read less than is available and come back for the rest:

int gotten;
do {
    float converted[100];
    // this is in bytes, not samples!
    gotten = SDL_AudioStreamGet(stream, converted, sizeof (converted));
    if (gotten == -1) {
        printf("Uhoh, failed to get converted data: %s\n", SDL_GetError());
    } else {
        // (gotten) might be less than requested in SDL_AudioStreamGet!
        write_more_samples_to_disk(converted, gotten); /* whatever. */
    }
} while (gotten > 0);
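What you get back is a byte count, and for stereo output the samples are interleaved (left, right, left, right, ...). As a minimal sketch of turning that byte count back into frames, here is a hypothetical split_stereo() helper. It is not part of SDL, and it assumes Float32 stereo output like the stream created above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (NOT part of SDL): splits interleaved stereo Float32
 * data into separate left and right arrays.
 * nbytes is a byte count, such as the value returned by a stream get.
 * left and right must each have room for nbytes / 8 floats.
 * Returns the number of complete frames copied; a trailing partial frame
 * is ignored, and an error value (negative) copies nothing. */
int split_stereo(const float *interleaved, int nbytes, float *left, float *right)
{
    assert(interleaved != NULL && left != NULL && right != NULL);

    if (nbytes <= 0) {
        return 0;
    }

    const int frames = nbytes / (int) (2 * sizeof (float));
    for (int i = 0; i < frames; i++) {
        left[i] = interleaved[2 * i];       /* even slots: left channel */
        right[i] = interleaved[2 * i + 1];  /* odd slots: right channel */
    }
    return frames;
}
```

With the 100-float buffer used above, a full read of 400 bytes is 50 stereo frames.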

In terms of performance: buffer allocations, conversion, and resampling happen during stream puts. Getting from the stream is a little bookkeeping and some memcpy() calls. Plan accordingly.

The one gotcha of this interface: you might notice that you have less available than you expect (possibly even zero bytes available!). When resampling, SDL keeps a buffer of padding available so that data sent through in chunks still resamples smoothly. Rather than try to predict the future, it just holds onto the first little piece you feed into the stream, and then starts converting that part after it's received more data, holding a tiny bit back each time to keep the stream sounding smooth.

There are two ways to deal with this: if you're planning to stream forever, don't do anything. Just keep feeding more data as you have it, and reading more data from the stream as it becomes available, and it'll all work out.

If you are simply at the end of the data you want to stream, you can communicate this to SDL and it will convert any buffered data it's been holding onto internally, making it available to be read with SDL_AudioStreamGet().

    SDL_AudioStreamFlush(stream);

Note that if you flush a stream, you can then feed it more data, but there will likely be gaps in the audio output, as the resampler will use silence for the padding at the end. You really only want to flush to finish off a stream and get the last few samples out of it.
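To make the hold-back behavior concrete, here is a toy model of it. This is emphatically not SDL's implementation: ToyStream, the toy_* functions, and the HOLDBACK_FRAMES value are all made up for illustration, and the model counts frames instead of converting real audio. It only shows why the available amount can lag behind what you've put, and what a flush releases.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; NOT how SDL does it. The hold-back size is made up. */
enum { HOLDBACK_FRAMES = 4 };

typedef struct {
    int ready;  /* frames converted and readable */
    int held;   /* frames kept back as resampler padding */
} ToyStream;

/* Putting data releases everything except the newest HOLDBACK_FRAMES frames. */
void toy_put(ToyStream *s, int frames)
{
    assert(s != NULL && frames >= 0);
    s->held += frames;
    if (s->held > HOLDBACK_FRAMES) {
        s->ready += s->held - HOLDBACK_FRAMES;
        s->held = HOLDBACK_FRAMES;
    }
}

int toy_available(const ToyStream *s)
{
    assert(s != NULL);
    return s->ready;
}

/* Flushing gives up on future data and releases the held-back tail. */
void toy_flush(ToyStream *s)
{
    assert(s != NULL);
    s->ready += s->held;
    s->held = 0;
}

/* Reads up to `frames` frames; returns how many were actually read. */
int toy_get(ToyStream *s, int frames)
{
    assert(s != NULL && frames >= 0);
    const int n = (frames < s->ready) ? frames : s->ready;
    s->ready -= n;
    return n;
}
```

In this model, putting 3 frames leaves nothing available, putting 10 more makes 9 available, and the last 4 only show up after a flush. The real stream behaves the same way in spirit, though the actual amounts depend on the formats and rates involved.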

If, for whatever reason, you want to throw a stream's contents away without reading it, you can:

    SDL_AudioStreamClear(stream);

This will remove any data you've put to the stream without reading, and reset internal state (so the resampler will be expecting a fresh buffer instead of resampling against data you previously wrote to the stream). This is useful if you plan to reuse a stream for a different source, or just decided that the current source wasn't working out; maybe you're muting an offensive person on a VoIP app.

When you are done with the stream, you can destroy it:

    SDL_FreeAudioStream(stream);

This frees up internal state and buffers. You don't have to drain the stream before freeing it. The SDL_AudioStream pointer you've been using is invalid after this call.

That's all!