macOS: How to map AVAudioDevice audio inputs to AVCaptureMovieFileOutput output tracks?

The Situation

I'm on macOS and I have an AVCaptureSession with camera and audio device inputs which are fed into an AVCaptureMovieFileOutput. What I am looking for is a way to map audio device input channels to file output audio channels, preferably using an explicit channel map.

By default, AVCaptureMovieFileOutput takes (presumably) the maximum number of input channels from an audio device that matches an audio format supported by the capture output, and records all of them. This works as expected for mono devices like the built-in microphone and stereo USB mics, the result being either a 1ch mono or a 2ch stereo audio track in the recorded media file.

However, the user experience breaks down for 2ch input devices that have an input signal on only one channel, which is a reasonable setup for a 2ch audio interface with one mic connected. This produces a stereo track with the one input channel panned hard to one side. It gets even weirder for multichannel interfaces. For example, an 8ch audio input device results in a 7.1 audio track in the recorded media file, with each input mapped to a separate surround channel. This is far from ideal during playback, where audio sources unexpectedly come from seemingly random directions.

The Favored Solution

Ideally, users should be able to select which channels of their audio input device will be mapped to which audio channel in the recorded media file via UI. The resulting channel map would be configured somewhere on the capture session.
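To make the idea of a channel map concrete, here is a minimal sketch in plain Swift, independent of any AVFoundation API. The `ChannelMap` type and the `applyChannelMap` function are hypothetical names made up for illustration. The convention assumed here (one entry per output channel naming its source input channel, with -1 meaning silence) is the one Core Audio channel maps use, as far as I know.

```swift
/// Hypothetical channel map: element i names the input channel that feeds
/// output channel i; -1 leaves that output channel silent.
typealias ChannelMap = [Int]

/// Rearranges interleaved frames that are `inputChannels` wide into
/// interleaved frames that are `map.count` wide.
func applyChannelMap(_ map: ChannelMap,
                     to samples: [Float],
                     inputChannels: Int) -> [Float] {
    precondition(inputChannels > 0 && samples.count % inputChannels == 0,
                 "samples must hold whole frames")
    let frameCount = samples.count / inputChannels
    var output = [Float](repeating: 0, count: frameCount * map.count)
    for frame in 0..<frameCount {
        // Entries that are negative or out of range stay silent (zero).
        for (outChannel, inChannel) in map.enumerated()
        where inChannel >= 0 && inChannel < inputChannels {
            output[frame * map.count + outChannel] =
                samples[frame * inputChannels + inChannel]
        }
    }
    return output
}
```

For a 2ch interface with a mic on the first input only, a map of `[0, 0]` would place that one input on both output channels instead of panning it hard to one side.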

The Workaround

I have found that AVCaptureFileOutput does not respond well to channel layouts that are not standard audio formats like mono, stereo, quadraphonic, 5.1, and 7.1. This means channel descriptions and channel bitmaps are out of the question.

What does work is configuring the output with one of the supported channel layouts and disabling audio channels via AVCaptureConnection.

With that, the output's encoder produces reasonable results for mono and stereo input devices if the configured channel layout is kAudioChannelLayoutTag_Stereo, but anything else is mixed down to mono. I am somewhat sympathetic to this solution insofar as, in lieu of an explicit channel map, the best guess the audio encoder could make is mixing every enabled channel down to mono. But, as described above, this breaks for 2ch input devices where only one channel is connected to a signal source. The result is a stereo track with audio panned hard to one side.
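For reference, the workaround might look roughly like the sketch below. It is not the exact code in use: the helper name `configureStereoOutput`, the AAC format, and the 48 kHz sample rate are assumptions chosen for illustration, and whether the output honors every key in the settings dictionary is untested here.

```swift
import AVFoundation

/// Hypothetical helper: asks the movie output for a stereo track and leaves
/// only the selected input channels enabled on the audio connection.
func configureStereoOutput(_ output: AVCaptureMovieFileOutput,
                           enabledInputChannels: Set<Int>) {
    guard let connection = output.connection(with: .audio) else { return }

    // Describe a standard stereo layout; only the layout tag is set.
    var layout = AudioChannelLayout()
    layout.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo
    let layoutData = Data(bytes: &layout,
                          count: MemoryLayout<AudioChannelLayout>.size)

    // Format and sample rate are assumptions, not values from the post.
    let settings: [String: Any] = [
        AVFormatIDKey: kAudioFormatMPEG4AAC,
        AVNumberOfChannelsKey: 2,
        AVSampleRateKey: 48_000,
        AVChannelLayoutKey: layoutData
    ]
    output.setOutputSettings(settings, for: connection)

    // Disable every input channel that was not selected (settable on macOS).
    for (index, channel) in connection.audioChannels.enumerated() {
        channel.isEnabled = enabledInputChannels.contains(index)
    }
}
```

Calling this with `enabledInputChannels: [0]` on a 2ch interface leaves only the first input enabled, which is the case that ends up panned to one side as described above.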

The Question

Is there a way to implement the described favored solution with AVCapture* API only, and if not, what's the preferred way of dealing with this scenario - going directly for AVAudioEngine and AVAssetWriter?

  • Correction: Title should read AVCaptureDevice, not AVAudioDevice, of course.
