AV1 Hardware Decoding

Recently I've been trying to play some AV1-encoded streams on my iPhone 15 Pro Max. First, I check for hardware support:

VTIsHardwareDecodeSupported(kCMVideoCodecType_AV1); // YES

Then I need to create a CMFormatDescription in order to pass it into a VTDecompressionSession. I've tried the following:

{
 mediaType:'vide' 
 mediaSubType:'av01' 
 mediaSpecific: {
  codecType: 'av01'  dimensions: 394 x 852 
 } 
 extensions: {{
    CVFieldCount = 1;
    CVImageBufferChromaLocationBottomField = Left;
    CVImageBufferChromaLocationTopField = Left;
    CVPixelAspectRatio =     {
        HorizontalSpacing = 1;
        VerticalSpacing = 1;
    };
    FullRangeVideo = 0;
 }}
}

but VTDecompressionSessionCreate gives me error -8971 (codecExtensionNotFoundErr, I assume).

So it has something to do with the extensions dictionary? I can't find any documentation of which extensions are required for it to work 😿.

Core Media has convenience functions for creating format descriptions of AVC and HEVC streams (CMVideoFormatDescriptionCreateFromH264ParameterSets and CMVideoFormatDescriptionCreateFromHEVCParameterSets), but nothing equivalent for AV1.


As of today I am using Xcode 15.0 with the iOS 17.0.0 SDK.

  • It would be great if Apple could provide an example of how to decode AV1 video on an iPhone 15 Pro Max or one of the new M3 MacBook Pros, using VideoToolbox.

    The information appears not to be anywhere.


Replies

These links might be of use:

  1. https://chromium.googlesource.com/chromium/src/+/master/media/gpu/mac/vt_video_decode_accelerator_mac.cc
  2. https://chromium.googlesource.com/chromium/src/+/HEAD/media/gpu/mac/vt_config_util.mm

Specifically, see the function CreateVideoFormatAV1 in 1 and the function CreateFormatExtensions in 2.

You must set 'SampleDescriptionExtensionAtoms' in 'extensions', and it must contain an 'av1C' entry holding the AV1 extradata. Example:

{
	mediaType:'vide' 
	mediaSubType:'av01' 
	mediaSpecific: {
		codecType: 'av01'		dimensions: 720 x 1280 
	} 
	extensions: {{
    BitsPerComponent = 8;
    CVFieldCount = 1;
    CVImageBufferChromaLocationBottomField = Left;
    CVImageBufferChromaLocationTopField = Left;
    CVImageBufferColorPrimaries = "ITU_R_709_2";
    CVImageBufferTransferFunction = "ITU_R_709_2";
    CVImageBufferYCbCrMatrix = "ITU_R_709_2";
    Depth = 24;
    FormatName = "'av01'";
    FullRangeVideo = 0;
    RevisionLevel = 0;
    SampleDescriptionExtensionAtoms =     {
        av1C = {length = 20, bytes = 0x81050c000a0e0000002cd59f3fddaf9901010104};
    };
    SpatialQuality = 0;
    TemporalQuality = 0;
    VerbatimISOSampleEntry = {length = 124, bytes = 0x0000007c 61763031 00000000 00000001 ... 000a6669 656c0100 };
    Version = 0;
}}
}
  • Thank you for your solution. But how can I get the correct SampleDescriptionExtensionAtoms for av1C? What do the length and bytes mean here?

  • If you use FFmpeg, you can get 'av1C' from the AVCodecContext's extradata. If you use AVFoundation, you can get it from the AVAssetTrack's formatDescriptions.
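
On the "length and bytes" question: here is a rough sketch, in C, of how the first four bytes of the av1C payload break down, going by the AV1CodecConfigurationRecord layout in the AV1-ISOBMFF spec. The names Av1cHeader and av1c_parse_header are made up for illustration and are not part of any Apple or FFmpeg API. Everything after those four bytes is the configOBUs field, which in the dump above is the sequence header OBU.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields packed into the first four bytes of an av1C payload.
   Hypothetical struct, named here only for this example. */
typedef struct {
    uint8_t version;                /* 7 bits, expected to be 1 */
    uint8_t seq_profile;            /* 3 bits */
    uint8_t seq_level_idx_0;        /* 5 bits */
    uint8_t seq_tier_0;             /* 1 bit each from here on... */
    uint8_t high_bitdepth;
    uint8_t twelve_bit;
    uint8_t monochrome;
    uint8_t chroma_subsampling_x;
    uint8_t chroma_subsampling_y;
    uint8_t chroma_sample_position; /* ...except this one: 2 bits */
    size_t  config_obus_len;        /* bytes following the 4-byte header */
} Av1cHeader;

/* Returns 0 on success, -1 if the buffer is too short
   or the leading marker bit is not set. */
int av1c_parse_header(const uint8_t *data, size_t len, Av1cHeader *out)
{
    if (data == NULL || out == NULL || len < 4) return -1;
    if ((data[0] & 0x80) == 0) return -1;   /* marker bit must be 1 */

    out->version                = data[0] & 0x7F;
    out->seq_profile            = data[1] >> 5;
    out->seq_level_idx_0        = data[1] & 0x1F;
    out->seq_tier_0             = (data[2] >> 7) & 1;
    out->high_bitdepth          = (data[2] >> 6) & 1;
    out->twelve_bit             = (data[2] >> 5) & 1;
    out->monochrome             = (data[2] >> 4) & 1;
    out->chroma_subsampling_x   = (data[2] >> 3) & 1;
    out->chroma_subsampling_y   = (data[2] >> 2) & 1;
    out->chroma_sample_position = data[2] & 0x03;
    out->config_obus_len        = len - 4;
    return 0;
}
```

Fed the 20 bytes from the dump above (81 05 0c 00 ...), this gives version 1, seq_profile 0, seq_level_idx_0 5, 8-bit 4:2:0 (both subsampling flags set), and 16 bytes of configOBUs.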


Take a look at this issue. They implemented AV1 hardware decoding on the iPhone 15 Pro using the same method as Chromium.

https://github.com/moonlight-stream/moonlight-ios/issues/585

Just posting back here as I got all this working in the end.

In case it's useful, here are the stumbling blocks I encountered. These are probably more a reflection of my own lack of understanding, but maybe they'll help someone.

To construct the AV1 Codec Configuration Box without FFmpeg or similar, this describes the structure:

  1. https://aomediacodec.github.io/av1-isobmff/#av1codecconfigurationbox-section

The information needed comes from parsing the Sequence Header OBU:

  2. https://aomediacodec.github.io/av1-spec/#general-sequence-header-obu-syntax

If you're writing from scratch (i.e. not using FFmpeg or similar), then you need to write or find code to parse the sequence header OBU.

Once you've written the 4 header bytes described in the AV1 Codec Configuration Box link above, you also need to append the sequence header OBU data directly after them. If you don't, the decoder setup will fail.
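
A minimal sketch of that packing step in C, assuming the sequence header fields have already been parsed out by your own code. Av1SeqInfo and av1c_build are hypothetical names for this example; the bit layout follows the AV1CodecConfigurationRecord from the AV1-ISOBMFF spec, with initial_presentation_delay_present left at 0.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values taken from the parsed sequence header OBU.
   Hypothetical struct, named here only for this example. */
typedef struct {
    uint8_t seq_profile;            /* 3 bits */
    uint8_t seq_level_idx_0;        /* 5 bits */
    uint8_t seq_tier_0;             /* 1 bit */
    uint8_t high_bitdepth;          /* 1 bit */
    uint8_t twelve_bit;             /* 1 bit */
    uint8_t monochrome;             /* 1 bit */
    uint8_t chroma_subsampling_x;   /* 1 bit */
    uint8_t chroma_subsampling_y;   /* 1 bit */
    uint8_t chroma_sample_position; /* 2 bits */
} Av1SeqInfo;

/* Writes the 4-byte record header followed by the raw sequence header OBU
   (OBU header and size field included) into out.
   Returns the number of bytes written, or 0 if an argument is missing
   or out_cap is too small. */
size_t av1c_build(const Av1SeqInfo *info,
                  const uint8_t *seq_hdr_obu, size_t seq_hdr_len,
                  uint8_t *out, size_t out_cap)
{
    if (info == NULL || seq_hdr_obu == NULL || out == NULL) return 0;
    if (out_cap < 4 || seq_hdr_len > out_cap - 4) return 0;

    out[0] = 0x80 | 0x01;  /* marker = 1, version = 1 */
    out[1] = (uint8_t)(((info->seq_profile & 0x07) << 5) |
                        (info->seq_level_idx_0 & 0x1F));
    out[2] = (uint8_t)(((info->seq_tier_0 & 1) << 7) |
                       ((info->high_bitdepth & 1) << 6) |
                       ((info->twelve_bit & 1) << 5) |
                       ((info->monochrome & 1) << 4) |
                       ((info->chroma_subsampling_x & 1) << 3) |
                       ((info->chroma_subsampling_y & 1) << 2) |
                        (info->chroma_sample_position & 0x03));
    out[3] = 0x00;         /* reserved bits, no initial presentation delay */

    /* configOBUs: the sequence header OBU, appended unchanged. */
    memcpy(out + 4, seq_hdr_obu, seq_hdr_len);
    return 4 + seq_hdr_len;
}
```

With seq_level_idx_0 = 5, both chroma subsampling flags set, and the 16-byte OBU from the earlier reply's dump (0a 0e 00 00 ...), this reproduces that reply's 20-byte av1C value.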

This is then added to the extensions dictionary, along with all the other basic information needed to initialise the decoder (the Chromium references above detail all this information).

You then create the video format description using CMVideoFormatDescriptionCreate, passing in the extensions.

I then got caught out with a decode error because I didn't realise that I also had to pass in the Sequence Header OBU with the first frame data I attempted to decode. It wasn't enough that I had already given the same Sequence Header OBU when creating the video format description (via the extensions).

After that it worked.
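
For anyone hitting the same thing, the "pass the Sequence Header OBU with the first frame" step can be sketched in C roughly like this. The function name is hypothetical, and the sketch simply puts the sequence header OBU bytes in front of the first frame's data, as described above; whether your stream also needs a temporal delimiter ahead of it is not something I can confirm from this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated buffer holding seq_hdr followed by frame,
   and stores its size in *out_len. The caller frees the buffer.
   Returns NULL on bad arguments, size overflow, or allocation failure. */
uint8_t *av1_prepend_sequence_header(const uint8_t *seq_hdr, size_t seq_hdr_len,
                                     const uint8_t *frame, size_t frame_len,
                                     size_t *out_len)
{
    if (seq_hdr == NULL || frame == NULL || out_len == NULL) return NULL;

    size_t total = seq_hdr_len + frame_len;
    if (total == 0 || total < seq_hdr_len) return NULL;  /* empty or overflow */

    uint8_t *buf = malloc(total);
    if (buf == NULL) return NULL;

    memcpy(buf, seq_hdr, seq_hdr_len);              /* sequence header OBU first */
    memcpy(buf + seq_hdr_len, frame, frame_len);    /* then the frame's OBUs */
    *out_len = total;
    return buf;
}
```

The resulting buffer is what would go into the sample buffer for the first decode call only; later frames are passed through untouched.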

Decoding itself is slightly simpler than with HEVC, in that you don't need to parse the OBUs, you just pass the data straight to the decoder. With HEVC, you had to parse the NALUs and only pass in slice segments, while also doing some minor conversion of the way the NALU's length is presented to the decoder.

It would be helpful, Apple, if you could consider writing something like CMVideoFormatDescriptionCreateFromAV1SequenceHeaderOBU similar to the existing CMVideoFormatDescriptionCreateFromH264ParameterSets and CMVideoFormatDescriptionCreateFromHEVCParameterSets.

This would lower the bar a little for AV1 hardware decoding.

  • Were you getting error -12911 when trying to decode the frame? How are you supposed to pass in that Sequence Header OBU? I haven't found the relevant line in the Chromium source above.

  • Yeah, can you share what error you were getting? Did you just stuff the same extradata into the decode stream before any frames?


Sorry for the late reply; I hadn't set up notifications (I have now).

  1. If you don't append the sequence header OBU data to the end of the AV1CodecConfigurationRecord, then decoder initialisation will fail with error -12911.

  2. If you don't include the sequence header OBU data at the start of the first frame you decode, then you will get error -12909 inside the decompression callback.

In my scenario, I am in control of the encoder (NVENC), so I get the sequence header OBU when I initialise the encoder and send it across as my "extra data" before I send any encoded frame data. On the decode side, I then use this in 1 and 2 above. My encoder doesn't send any further sequence header OBUs at all, as in my case the format doesn't change.

In a more general scenario, there might be a sequence header with every IDR frame. You'd just need to wait until you get the first sequence header so that you can initialise the decoder as per 1 above. Since the sequence header is then already part of the encoded packet data, 2 would be taken care of anyway.
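
For that general case, here is a rough C sketch of scanning an incoming packet for the first sequence header OBU (obu_type 1). The function names are hypothetical, and it assumes every OBU in the packet carries a size field (obu_has_size_field = 1); it gives up on anything else.

```c
#include <stddef.h>
#include <stdint.h>

/* Decodes an unsigned LEB128 value (at most 8 bytes, as in AV1).
   Returns the number of bytes consumed, or 0 on error. */
size_t av1_read_leb128(const uint8_t *p, size_t len, uint64_t *value)
{
    uint64_t v = 0;
    for (size_t i = 0; i < 8 && i < len; i++) {
        v |= (uint64_t)(p[i] & 0x7F) << (i * 7);
        if ((p[i] & 0x80) == 0) {
            *value = v;
            return i + 1;
        }
    }
    return 0;  /* ran out of data or value too long */
}

/* Walks the OBUs in data and looks for the first sequence header.
   On success returns 1 and reports the offset and total length
   (OBU header + size field + payload) of that OBU.
   Returns 0 if none is found or the packet cannot be walked. */
int av1_find_sequence_header(const uint8_t *data, size_t len,
                             size_t *obu_offset, size_t *obu_len)
{
    size_t pos = 0;
    while (pos < len) {
        uint8_t hdr   = data[pos];
        int obu_type  = (hdr >> 3) & 0x0F;
        int has_ext   = (hdr >> 2) & 1;
        int has_size  = (hdr >> 1) & 1;
        size_t hdr_len = 1 + (size_t)has_ext;

        if (!has_size) return 0;            /* assumption: size field present */
        if (hdr_len > len - pos) return 0;  /* truncated header */

        uint64_t payload = 0;
        size_t n = av1_read_leb128(data + pos + hdr_len,
                                   len - pos - hdr_len, &payload);
        if (n == 0) return 0;

        size_t prefix = hdr_len + n;
        if (payload > len - pos - prefix) return 0;  /* truncated payload */

        if (obu_type == 1) {                /* OBU_SEQUENCE_HEADER */
            *obu_offset = pos;
            *obu_len = prefix + (size_t)payload;
            return 1;
        }
        pos += prefix + (size_t)payload;
    }
    return 0;
}
```

Once this returns 1, the bytes at data + obu_offset for obu_len bytes are what get appended to the 4-byte av1C header when initialising the decoder.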