Streaming is available in most browsers,
and in the WWDC app.
Export HDR media in your app with AVFoundation
Discover how to author and export high dynamic range (HDR) content in your app using AVFoundation. Learn about high dynamic range and how you can take advantage of it in your app. We'll show you how to implement feature sets that allow people to export HDR content, go over supported HDR formats, review current restrictions, and explore the Apple platforms that support HDR export.
- Core Media
- Have a question? Ask with tag wwdc20-10010
- Learn more about AVFoundation
- Search the forums for tag wwdc20-10010
Hello, and welcome to WWDC.
Hi, everyone. I’m Kevin Matsumura, and I’ll be explaining the simple steps you need to follow to preserve High Dynamic Range in your exported videos. In this session, we’ll review some basic points on what High Dynamic Range video is, what HDR formats we support in export, how simple it is to export HDR video content with AV-Asset-Export-Session, the additional flexibility that AV-Asset-Writer provides when exporting HDR, and finally touch on which Apple platforms support HDR export. All right, let’s get started. High Dynamic Range extends the dynamic range of the video above and beyond what is capable by Standard Dynamic Range video. It does so via a significantly increased range of luminance values.
SDR video is limited to up to 100 candelas per square meter, also commonly referred to as nits. HDR video, on the other hand, can go all the way up to 10,000 nits. That’s two orders of magnitude greater.
This results in videos that can display brighter whites and deeper blacks. HDR also offers better contrast, resulting in more details in shadows and striking highlights.
While this is technically all that High Dynamic Range means, HDR is often implicitly paired with Wide Color gamut. The color gamut, in combination with the dynamic range, defines the color volume, or the total number of colors that can be described.
To accommodate the larger luminance and color ranges, HDR is also typically associated with higher bit depths, with 10-bit common for distribution media. HDR is all about the transfer function. Transfer functions describe how scene linear light is mapped to non-linear code values and then back to display light.
The goal throughout this overall process is to preserve the artistic intent of the original scene all the way to the end user’s display.
Opto-Electronic Transfer Functions, or OETFs, are on the encode or capture-side and map from scene light to code values.
Electro-Optical Transfer Functions, or EOTFs, are on the decode or display-side and map from code values back to display light.
Note that the EOTF is not necessarily an exact inverse of the OETF.
HDR introduced two new transfer functions. The Hybrid Log-Gamma, or HLG, transfer function was designed to be backwards compatible with SDR. This means that HLG content will display on SDR displays but any pixels exceeding 100 nits would be clipped.
HLG is a scene-referred system, which means the encoded code values are relative to the normalized scene light.
The Perceptual Quantizer, or PQ, transfer function was designed based on how the human visual system works, specifically with regard to contrast sensitivity.
Some common PQ HDR formats that you are likely familiar with are Dolby Vision and HDR10. While an HDR transfer function is sufficient to categorize a video as HDR, there are HDR formats that insert additional information into the output bitstream. This additional information, let’s call it HDR metadata, allows better mapping of the encoded code values back to display light and is used by the EOTF in the decoder to account for differences between the mastering display environment and the current display environment. The contents of HDR metadata can generally be categorized as stream, scene, or frame statistics.
An example of an HDR format with no metadata is HLG.
Some formats add a bit of static metadata that is constant over the entire movie. An example of this is HDR10.
And then there are formats that add dynamic metadata that varies per scene, or even on every video frame. This is what Dolby Vision does. Note that dynamic metadata can also be paired with static metadata. I mentioned earlier that HDR is often associated with Wide color gamut. This diagram illustrates the differences in the color gamuts. This first triangle represents both the BT.709 and sRGB color spaces, which were originally developed for the standard definition broadcast television industry. This next larger triangle represents the P3 color space, originated by Digital Cinema Initiatives for the film industry. P3 can represent more nuanced colors over BT.709. This last and largest triangle represents BT.2020, which offers even more colors. BT.2020 was developed as a standard for ultra-high definition television. Wide Color gamut, in combination with HDR transfer functions and 10-bit formats, result in videos that are more vibrant and representative of the original scene.
Now that we have a basic understanding of HDR, let’s go over which HDR formats are supported by export.
AVFoundation supports HDR via the HLG and HDR10 formats, which are both industry standards and widely supported by consumer displays.
This is important, because we want your user to be able to capitalize on all the benefits of HDR all the way to their display.
So regardless of if your user is viewing your content on an Apple device that natively supports HDR, like an iPhone, or iPad, or MacBook, or iMac, or if they are displaying to an external HDR monitor or TV, your exported videos will render in HDR for your user. Note that Apple already brought support for HDR export to our Pro users back in 2017, specifically via Final Cut Pro X and Compressor. This year we are bringing HDR export to you, the developers.
While HDR video content has typically been limited to high-end video cameras to date, we anticipate that HDR content will become more common in the future and want to position you with the tools to take full advantage of it.
Now let’s talk about the main export interface that most of you will be dealing with, AV-Asset-Export-Session. AV-Asset-Export-Session is the most common interface of export. It assumes a file transcoding workflow. You can use it to simply export a single asset in a sharing use case, or you can do some editing by temporally combining multiple track segments via AV-Composition, or performing spatial compositions using AV-Video-Composition prior to exporting.
AV-Asset-Export-Session is designed to make exporting simple. AV-Asset-Export-Session is preset-based. The presets define what the output encoding parameters are, and AV-Asset-Export-Session takes care of the rest for you. If needed, it will perform color space conversion, downscaling, frame rate conversion, and so on. AV-Asset-Export-Session offer presets for H.264, HEVC and Apple ProRes codecs.
There are presets that conform to specific maximum video resolutions, and there are also a limited number of presets that allow for selecting different video quality levels.
Here’s a code snippet that illustrates the simplicity of AV-Asset-Export-Session.
You first create the export session with a source asset and a preset. In this case, I’ve chosen the HEVC-Highest-Quality preset.
You then specify the output file name as a URL and the outputFileType.
That’s it. In the minimal case, this is all you need to kick-off the export. So what do you need to change in order to export HDR videos with AV-Asset-Export-Session? I think you’ll like the answer to this. Nothing, if you are using HEVC or Apple ProRes presets. We have upgraded the existing HEVC presets to preserve the source’s HDR format.
For example, HDR10 source assets will export as HDR10, HLG exports to HLG, and SDR source assets, of course, will continue to export as SDR.
The Apple ProRes presets will also preserve the HDR format of the source.
In order to convert from HDR to SDR, we recommend that you use the H.264 presets which will also maximize backwards compatibility.
Compositions, whether they are temporal or spatial, need additional handling when mixing dynamic range formats.
Let’s consider the case where we have a mix of HDR and SDR content.
AV-Asset-Export-Session will inspect all source assets presented to it, searching for HDR content. Upon finding the HDR content, the exported file will be HDR, and any SDR assets in the composition will be converted to HDR.
If there's a mix of differing HDR formats, export’s default policies are to first prefer HLG over PQ formats. Note that AV-Asset-Export-Session knows how to convert properly between PQ and HLG formats.
If further refinement is needed, then export prefers HDR formats with metadata over those with no metadata. Graphically, the export composition policies I just described are shown in this table. The first row and column show the preference of HDR over SDR. The second row and column show the preference of HLG over PQ. And the logic reflected in this table is extensible to any number of source assets.
While we expect AVAssetExportSession to be sufficient for many developers' needs, we know it doesn’t work for everyone.
That brings us to AVAssetWriter. Here are two common workflows for AVAssetWriter: The first workflow employs reading samples from AVAssetReader that were optionally edited via AVComposition and/or AVVideoComposition and then exporting with AVAssetWriter.
The next workflow represents an app that uses AVAssetWriter to export camera-captured frames. AVAssetWriter puts you in complete control of the export process. You can explicitly specify the video codec, what bit rate you want to use, the frame rate of the exported video, the exact dimensions of the video frames, the color space and the dynamic range. On Macs with multiple video encoders, you can even specify which video encoder to use for the export.
Note this is not an exhaustive list by any means, but it gives you an idea of some common parameters that you may want to configure.
Since AVAssetWriter offers many controls, I’ll now discuss two methods that can simplify the configuration process. This code snippet shows the simplest invocation of AVAssetWriter. First you create the AssetWriter with an output URL and file type. Then you need to provide an outputSettings dictionary, which controls all facets of the export. However, here we are using the AVAssetWriterInput interface which accepts a sourceFormatHint in the form of the videoFormatDescription of the source. In this case, the only required key in the outputSettings dictionary is the video codec type.
AVAssetWriter will construct reasonable defaults for the remaining parameters based on the source videoFormatDescription provided in the hint. This next code snippet shows how to configure AVAssetWriter with AVOutputSettingsAssistant. AVOutputSettingsAssistant offers presets to choose from similar to AVAssetExport-Session. The presets provide a template dictionary that you can either use directly or as a starting point for configuring AVAssetWriter-Input.
Here I’ve selected the hevc1920x1080 dimensional preset. We also strongly recommend that you provide the sourceVideoFormatDescription to the settings assistant prior to retrieving the video-Settings. By providing the sourceVideoFormat, AVOutputSettingsAssistant can combine the source characteristics with the preset settings, and return settings very similar to what AVAssetExportSession would configure for an analogous export.
It is quite possible that these settings are sufficient for you, in which case you can pass them unmodified to AVAssetWriter-Input. But you also have the opportunity to alter the settings at this point, possibly changing the output video dimensions, or the video bit rate, as examples. Now let’s talk about the various keys within the AVAssetWriter-Input output-Settings dictionary that are relevant to HDR. There are three such keys that you need to take care in configuring when dealing with HDR: the video codec, color properties and compression properties keys. Let’s go into more detail on each of these.
The AVVideoCodecKey. AVFoundation supports HDR via two codecs: HEVC and Apple ProRes.
HEVC is a very common distribution format for HDR. HEVC provides good compression efficiency, so the exported file sizes will be manageable.
Apple ProRes is a mezzanine format used in professional video workflows. The AVVideoColorPropertiesKey is a dictionary with three keys: transfer function, color primaries and YCbCr-Matrix.
Transfer function we’ve already gone over. HLG and PQ are the two supported HDR transfer functions.
The Color-Primaries-Key is typically set to a Wide Color gamut format. BT.2020 is common for HDR content.
The YCbCr-Matrix-Key represents how to convert between YUV and RGB color representations. BT.2020 non-constant luminance system is typical for HDR.
The AVVideoCompressionPropertiesKey is a dictionary that contains most of the keys that you may want to vary in the output settings.
However, the only relevant HDR key in this is the VideoProfileLevelKey which must be set to HEVC_Main10_AutoLevel when exporting HDR using the HEVC codec.
Note that 8-bit HEVC HDR is not supported, and this key is not applicable to ProRes exports.
All right, let’s take the time now to summarize how you would configure the keys I just discussed when outputting to two common HDR formats: HLG and HDR10. This table shows what the relevant HDR settings are for exporting an HLG file.
You must select either HEVC or Apple ProRes for the codec. You can specify any value for color primaries and YCbCr-Matrix, but they are both typically set to ITU_R_2020. The transfer function must obviously be set to HLG, since we desire an HLG file.
And if you’ve selected the HEVC codec, you’ll also need to specify the Profile-Level as HEVC_Main10_AutoLevel.
Now let’s tackle HDR10, which is a bit more complicated. As before, select either HEVC or Apple ProRes as the codec type.
While HDR10 is conventionally HEVC-based, we also use HDR10 to define a ProRes asset that has MasteringDisplayColorVolume and ContentLightLevelInfo specified. More on that in a bit.
As in the HLG case, the Color-Primaries and Matrix are typically set to ITU_R_2020, and the transfer function must be set to PQ.
Again, set the Profile-Level to HEVC_Main10_AutoLevel for HEVC HDR10 exports.
Finally for HDR10, there are two optional keys that we recommend you also specify. Let’s discuss these in more detail. Mastering Display Color Volume, or MDCV, describes the display on which the content was mastered. Content Light Level Information, or CLLI, provides information about the overall light levels of the movie.
Ideally both the values should be specified. This will allow more accurate HDR rendering by the EOTF of the decoder. If absent, the EOTF of the decoder will be forced to assume a default for them, which may not match what you intended.
MDCV and CLLI SEI message formats are defined in this referenced ISO specification.
At this point, I’d like to briefly touch on which Apple platforms can support HDR export. iOS supports HEVC hardware encoding on devices with Apple A10 Fusion chips or newer. Fortunately A10 devices have been around for a while, dating back to the iPhone 7, iPads released in 2018, and the 2019 iPod touch.
In regards to Macs, both HEVC and Apple ProRes software encoders are available on all Macs. HEVC hardware encoding is generally available on 2017 and newer Macs running the new macOS.
Hardware encoding will make the export significantly faster.
All right, time to wrap this up. We’ve briefly reviewed what HDR is and how it is designed to preserve the artistic intent of the original scene all the way to the end user’s display. With increased dynamic range, HDR can create videos with brighter whites and deeper blacks, as well as increased details in shadows and highlights. Due to all the benefits that HDR video provides, you need to ensure that your apps preserve this vibrant HDR content in your exports.
The steps are simple. Like with any export, you can use AVAssetExportSession or AVAssetWriter to do this. Select either an HEVC or Apple ProRes codec or preset.
With AVAssetWriter, remember to set the transfer function to HLG or PQ. Explicitly set the video profile level to HEVC_Main10_AutoLevel for HEVC HDR exports, and finally, use a Wide Color gamut, preferably ITU_R_2020.
Here are some additional references on HDR. Note there is a companion session to this one, addressing HDR editing and playback using AVFoundation.
Thank you for watching my presentation. I hope you enjoy the rest of WWDC 2020.
Looking for something specific? Enter a topic above and jump straight to the good stuff.
An error occurred when submitting your query. Please check your Internet connection and try again.