Your C++ games and apps can now tap into the power of Metal. We'll show you how metal-cpp helps you bridge your C++ code to Metal, explore how each language manages object lifecycles, and demonstrate utilities that can help these languages cooperate in your app. We'll also share best practices for designing app architecture to elegantly integrate Objective-C and C++ together.
Hi, my name is Keyi Yu, and I'm an engineer from the Metal Ecosystem team. Today, it's my pleasure to introduce metal-cpp. We created metal-cpp for anyone who uses C++ and wants to build Metal applications for Apple platforms. Metal-cpp is a low-overhead library that connects your C++ applications to Metal. First, I'll start with an overview of what metal-cpp is and how it works, and then I'll cover some details about the lifecycles for Objective-C objects. C++ and Objective-C handle lifecycles a bit differently, and I'll show you how to handle those differences. Xcode and metal-cpp have some great utilities that can help you manage the object lifecycles in your apps. And finally, I'll show you how to integrate C++ code with Objective-C classes. So here's a look at metal-cpp and how it works. Metal is the foundation for accelerated graphics and compute on Apple platforms, enabling your apps and games to tap into the incredible power of the GPU. It was originally designed using the powerful features and conventions offered by Objective-C. But if your code base is in C++, you may need something to bridge between your code and Metal's Objective-C code. Introducing metal-cpp! It serves as a hub between your C++ application and Objective-C Metal. With metal-cpp in your application, you can use Metal classes and functions in C++, and metal-cpp can help you call Objective-C functions at runtime. metal-cpp is a lightweight Metal C++ wrapper. I say it's lightweight because it's implemented as a header-only library with inline function calls. It provides 100 percent coverage of the Metal API by implementing a one-to-one mapping of C++ calls to Objective-C APIs. To do this, metal-cpp wraps parts of the Foundation and CoreAnimation frameworks. It's open source under the Apache 2.0 license, so you can easily modify the library and include it in your applications. metal-cpp uses C to call directly into the Objective-C runtime. 
This is the exact same mechanism that the Objective-C compiler uses to execute Objective-C methods. So this wrapper introduces little overhead. Since metal-cpp implements a one-to-one mapping of C++ to Objective-C calls, it follows the same Cocoa memory-management rules. I will discuss this in more detail later. This one-to-one mapping also allows all of the developer tools to work seamlessly, including GPU Frame Capture and the Xcode debugger. These are the series of calls necessary to draw a triangle with metal-cpp. If you are familiar with C++, it's a good time to learn Metal, because you don't need to worry about language syntax. If you've already used Metal with Objective-C, in terms of function calls, there's very little difference between the Objective-C interface of Metal and metal-cpp. I am going to demonstrate how easy it is to use metal-cpp. First, I create a command buffer, which I will fill with commands for the GPU to execute. I can simply use a raw pointer in C++ where Objective-C uses id. From the command buffer, I can create a render command encoder and write render commands. The C++ function renderCommandEncoder and the Objective-C method renderCommandEncoderWithDescriptor are the same. The only differences are the naming conventions of the languages. I then set a render pipeline state object, which contains the vertex and fragment shaders and various other rendering states. Then I encode my draw call to render a single triangle. Then I indicate that I've finished encoding render commands. I present the drawable, so the triangle is displayed onscreen. Finally, I commit my command buffer. This tells the GPU that it can begin executing my commands. Obviously, metal-cpp and Objective-C Metal are almost the same. With metal-cpp, you don't need to worry about language syntax; you can look directly into the Metal documentation to learn the concepts and usage of Metal. You may have already played with this deferred lighting sample before. 
We now provide a new version of this deferred lighting sample which uses metal-cpp. We hope this can help you learn how to code with metal-cpp in practice. I'm also excited to introduce a series of incremental C++ samples that introduce the Metal API and show you how to accomplish different tasks with it.
So now that you know a little bit about metal-cpp, how do you actually use it? We published metal-cpp last year. Here's the webpage where you can find the downloads and instructions. Let me show you the steps you will need to take. After downloading metal-cpp, you should tell Xcode where to find it. Here, I put metal-cpp under the current project. Then, you need to set C++17 or higher as the C++ language dialect. Next, add three frameworks to the project: Foundation, QuartzCore, and Metal. Now there's only one thing left to do before using the C++ interfaces of those frameworks. There are three headers in metal-cpp. Since metal-cpp is a header-only library, you need to generate their implementations before importing the header files. To do this, define three macros: NS_PRIVATE_IMPLEMENTATION, CA_PRIVATE_IMPLEMENTATION, and MTL_PRIVATE_IMPLEMENTATION. If you are interested in what metal-cpp does with the macros under the hood, please check out the header bridge files in the metal-cpp folder. You can use the headers separately or put them in a single header. You can import the header files whenever you need them. But remember, do not define the NS, CA, or MTL_PRIVATE_IMPLEMENTATION macros more than once. Otherwise, you may cause duplicate definition errors. To use metal-cpp effectively, you'll need to know Cocoa's memory management rules, how to use the great utilities that can help you manage object lifecycles, and how to design your application architecture when you interface with other frameworks. I'll start with object lifecycle management. During your application's operation, you typically need to allocate and release memory. You also need to manage command buffers, pipeline objects, and resources. To help manage this memory, Objective-C and Cocoa objects include a reference count. This is also present in metal-cpp. Reference counting helps you manage your memory. With reference counting, every object has a retainCount property. 
Components in an app increase the count to keep objects they're interacting with alive and decrease it when they are done with them. When the retainCount hits zero, the runtime deallocates the object. There are two types of reference counting in Objective-C. One is called Manual Retain-Release (MRR); the other is Automatic Reference Counting (ARC). When compiling code with the ARC feature, the compiler takes the references you create and automatically inserts calls to the underlying memory-management mechanism. metal-cpp objects are manually retained and released. So you need to understand Cocoa's conventions to know when to retain and release objects. Unlike ordinary C++ objects, metal-cpp objects are neither created with new nor destroyed with delete. With Cocoa's conventions, you own any object you create with a method whose name starts with alloc, new, copy, mutableCopy, or create. You can take ownership of an object using retain. When you no longer need an object you own, you must relinquish ownership of it. You can release it immediately or release it afterwards. You must not relinquish ownership of an object you do not own, as you risk a double free. Next, I'll walk through an example of these Cocoa conventions. In class A, a method uses alloc to create an object and init to initialize this object. Remember, never call init on an object twice. Class A takes ownership and is responsible for deallocating it. Now the retain count for this object is one. Next, class B uses retain to get the object and takes ownership of this object. So far, I have two objects that share ownership of this object, represented by the orange cube. The retain count increases by one.
Class A doesn't need this object anymore, so class A should manually call release for it. As a result, the retain count decreases by one. Now, only class B owns the object. OK, finally, class B wants to release this object too. Now the retain count is zero, so the runtime frees the object. Here's a situation where a method in class B returns an object. You still need this object in the rest of the program. In other words, you want to relinquish ownership of an object in a method in class B, but you don't want it to be deallocated immediately. In this case, you should call autorelease in class B. The retain count is still one after you call autorelease, and thus, you can still use the object later. Here's the question: since class B does not own this object anymore, who is responsible for deallocating it? The Foundation framework provides an important object, called the AutoreleasePool. The autorelease API puts the object into an AutoreleasePool. Now, the AutoreleasePool takes ownership of the object. The AutoreleasePool decrements the receiver's retain count when the AutoreleasePool is destroyed. You are not the only one who can create autoreleased objects. Metal creates several autoreleased objects as part of its operation. All methods that create temporary objects add them to AutoreleasePools by calling autorelease under the hood. It is the AutoreleasePool's responsibility to release them. In other words, with an AutoreleasePool, you can code in a more elegant way. You can have an AutoreleasePool for the main application. We also encourage you to create and manage additional AutoreleasePools at smaller scopes to reduce your program's working set. You also need an AutoreleasePool for every thread you create. Here's an example showing how to use an AutoreleasePool and autoreleased objects. In this sample, an AutoreleasePool is created by alloc, which means you take ownership and must release it manually. Now we have an AutoreleasePool. 
As we discussed in the beginning, you should create a command buffer. It's not created with alloc or create, so you don't own it. Instead, it's an autoreleased object created by Metal. This command buffer will be put into the AutoreleasePool. It's the AutoreleasePool's responsibility to deallocate it. You can use it as you wish until you release the AutoreleasePool. Then you need to create a RenderPassDescriptor. This RenderPassDescriptor will be put into the AutoreleasePool as well. The same goes for the RenderCommandEncoder. It's also an autoreleased object created by Metal. Don't forget this currentDrawable object. It will be put into the AutoreleasePool too. At the end of this piece of code, I use pPool->release to release the AutoreleasePool. Before being deallocated, the AutoreleasePool releases everything that it owns; in this case, it releases the CommandBuffer, RenderPassDescriptor, RenderCommandEncoder, and currentDrawable. Then the AutoreleasePool is released. So far, you've gotten to know Cocoa's conventions, autoreleased objects, and AutoreleasePools. It's important to correctly manage object lifecycles to avoid memory leaks and zombie objects, and we have great tools to help you avoid and debug these issues. I'll focus on two utilities: NS::SharedPtr and NSZombie. NS::SharedPtr is a new utility that can help you manage the object lifecycle. You can find it under the Foundation framework in the metal-cpp folder. Notice that it is not exactly the same as std::shared_ptr, so there's no dependency on the C++ standard library and no extra cost for storing the reference count. Here's what NS::SharedPtr is like. The transfer and retain functions clearly express the intent of consuming an object. Transfer creates a SharedPtr without increasing the pointee's reference count, effectively transferring ownership to the SharedPtr. The retain function sends a retain to the passed-in object. 
Use this function to keep alive objects that are in AutoreleasePools and to express that the pointer's owner has a vested interest in the pointee's lifecycle. You can access the underlying object as expected via get and via operator->. SharedPtr copy, move, construction, and assignment work as expected, with copy increasing the retainCount. Moves are fast and do not affect the retain count in the general case. SharedPtrs always send exactly one release to the pointee on destruction. You can avoid this if you want by calling the detach function. Going back to the top, it's important to know the differences between creating a pointer by transferring or retaining it. So for TransferPtr, suppose I have an MRR object with a reference count of 1. After I pass it to the TransferPtr function, the SharedPtr takes ownership of the object, but its retainCount doesn't change. When the pointer goes out of scope, the SharedPtr's destructor runs and calls release on the MRR object, which decrements the retainCount to 0. Another function is NS::RetainPtr. When you want to avoid deallocating an object because you want to use it later, you should use NS::RetainPtr. Suppose we have this MRR object; the retainCount is one. After we pass it to the RetainPtr function, the retainCount increases by one. When the SharedPtr goes out of scope, it calls release for this MRR object, so the retainCount is back to one. In general, NS::TransferPtr takes ownership of an object for you, but NS::RetainPtr helps you retain an object when you don't want it to be deallocated. When you pass an object to these two functions, NS::TransferPtr doesn't change the reference count, but NS::RetainPtr increases the reference count by one, as it calls retain for you under the hood. The destructors of the SharedPtrs returned by both functions call release for the passed-in object, so the reference count decreases by one. If the reference count hits zero, the object will be freed at runtime. 
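The difference between transferring and retaining can be sketched with a toy model in plain C++. Nothing below is metal-cpp API: ToyObject, ToySharedPtr, toyTransferPtr, toyRetainPtr, g_liveToyObjects, and the two demo functions are hypothetical names invented for this illustration. The toy object is created with new and freed with delete only to keep the sketch self-contained, which real metal-cpp objects are not.

```cpp
#include <cassert>
#include <utility>

// Toy model only: none of these names come from metal-cpp.
int g_liveToyObjects = 0;   // lets the demos observe deallocation

// Stand-in for a manually retained-released (MRR) object.
class ToyObject {
public:
    static ToyObject* create() { return new ToyObject(); }   // caller owns it, count is 1
    void retain() { ++retainCount_; }
    void release() {
        assert(retainCount_ > 0);              // guards against releasing too often
        if (--retainCount_ == 0) delete this;  // count hit zero: free the object
    }
    int retainCount() const { return retainCount_; }
private:
    ToyObject() { ++g_liveToyObjects; }
    ~ToyObject() { --g_liveToyObjects; }
    int retainCount_ = 1;
};

// Holds one raw pointer and sends exactly one release when destroyed.
class ToySharedPtr {
public:
    explicit ToySharedPtr(ToyObject* adopted) : ptr_(adopted) {}   // adopts, no retain
    ToySharedPtr(const ToySharedPtr& other) : ptr_(other.ptr_) { if (ptr_) ptr_->retain(); }
    ToySharedPtr(ToySharedPtr&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }
    ToySharedPtr& operator=(ToySharedPtr other) noexcept { std::swap(ptr_, other.ptr_); return *this; }
    ~ToySharedPtr() { if (ptr_) ptr_->release(); }
    ToyObject* get() const { return ptr_; }
    ToyObject* operator->() const { return ptr_; }
    void detach() { ptr_ = nullptr; }   // give up the pointer without releasing it
private:
    ToyObject* ptr_ = nullptr;
};

// Transfer: take over the caller's existing ownership; the count is unchanged.
ToySharedPtr toyTransferPtr(ToyObject* p) { return ToySharedPtr(p); }

// Retain: add a new ownership stake; the count goes up by one.
ToySharedPtr toyRetainPtr(ToyObject* p) { if (p) p->retain(); return ToySharedPtr(p); }

struct DemoResult { int countInScope; int liveAfterScope; };

DemoResult transferDemo() {
    ToyObject* obj = ToyObject::create();
    int inScope = 0;
    {
        ToySharedPtr sp = toyTransferPtr(obj);
        inScope = sp->retainCount();          // still 1
    }                                         // sp releases: count 0, object freed
    return {inScope, g_liveToyObjects};
}

DemoResult retainDemo() {
    ToyObject* obj = ToyObject::create();
    int inScope = 0;
    {
        ToySharedPtr sp = toyRetainPtr(obj);
        inScope = sp->retainCount();          // 2: ours plus the pointer's
    }                                         // sp releases: count back to 1
    int live = g_liveToyObjects;              // object is still alive here
    obj->release();                           // we still own it, so we release it
    return {inScope, live};
}
```

In this model, transferDemo() reports a count of 1 inside the scope and no live objects afterwards, while retainDemo() reports a count of 2 inside the scope and one live object afterwards, which mirrors the two walkthroughs above.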
Here's an example of NS::TransferPtr. When I talked about the render pass, which drew a single triangle, I needed this render pipeline state. Here are the calls to create a render pipeline state object. These are the attributes that a render pipeline descriptor needs. According to Cocoa's conventions, since these calls start with new and alloc, I own these objects. So I need to call release for these objects. With NS::SharedPtr, I don't need to call release for those MRR objects because the NS::SharedPtr takes ownership of these objects. So here, I pass raw pointers to the TransferPtr function. After doing that, there's no need to call release as I did in the previous slide. If you are familiar with ARC, you may find that MRR used with NS::SharedPtr is similar to using ARC. You may encounter use-after-free bugs when handling memory manually. They occur when you try to use an object that has already been released. NSZombie is a good way to check for those bugs. When a use-after-free bug occurs, it triggers a breakpoint and provides you with a stack trace. You can enable Zombies very easily with an environment variable. Just set NSZombieEnabled to YES. Or, if you're using Xcode, you can enable Zombies in the scheme. Here's how it works. I want to create a new render pipeline state object with the same render pipeline settings. So in this newRenderPipelineState function, I reuse the pDesc object.
After clicking on run, Xcode triggers a breakpoint and shows me the stack trace. That means I got something wrong. Hm, what's the problem? Maybe NSZombie can help here, so I enable NSZombie in the scheme.
When I run the program again, NSZombie triggers a breakpoint. I get something new in the console output: "message sent to deallocated instance." Oh, I reused an object that I have already released. And it's the render pipeline descriptor. So I need to use this render pipeline descriptor before calling release. By doing that, I fix the problem. More tools and details are covered in this year's talk, "Profile and optimize your game's memory." For example, you can learn how to track retainCount in Allocations in Instruments. Feel free to check out other tools on Apple platforms. You will find that they can help you debug your game and improve performance. Now you know how to manage object lifecycles in metal-cpp. But you may still need to interface with other frameworks, like Game Controller and audio. These are still in Objective-C. How can you interface with those APIs and design an elegant application architecture? Say you wrote a ViewController in Objective-C, but you wrote a renderer in C++ with metal-cpp. You need to call renderer methods, like draw, from the ViewController. The challenge here is to nicely separate the two languages but have them work together. The solution is to create an adapter class which calls C++ from Objective-C files. By doing this, you can focus on Objective-C or C++ in the files where you implement features. For example, I can create a RendererAdapter class in Objective-C. And down in the implementation, I add an Objective-C method so that I can call it directly from the ViewController. Inside of the interface, I declare a C++ pointer to a renderer object. Inside the body of the method, I directly call the renderer's C++ method. This method needs to pass the MTK::View as a C++ object into the draw method, so it casts the view to a C++ type by using the __bridge keyword. I'll talk more about this cast later. Conversely, you may need to call MTKView, which is written in Objective-C, from the Renderer, which is written in C++. 
It's challenging as well. Similarly, the solution is to create an adapter class. With this class, in C++ files, you can call Objective-C methods using a C++ interface. For example, I can create a ViewAdapter class. I write the interfaces in C++, so in the Renderer class, I can call those C++ view methods easily. In the implementation, I call Objective-C methods from MTKView, including currentDrawable and depthStencilTexture. You may notice there are some __bridge keywords here. I use them to cast between metal-cpp objects and Objective-C objects. As you learned in the beginning, metal-cpp objects are manually retained and released, but objects created by Objective-C use automatic reference counting. You need to move objects from MRR to ARC and from ARC to MRR. Here are three types of bridge casting which can help you cast between Objective-C and C++. They can also help you transfer ownership. __bridge casting casts between Objective-C and metal-cpp objects. There is no transfer of ownership between them. __bridge_retained casting casts an Objective-C pointer to a metal-cpp pointer and takes ownership from ARC. __bridge_transfer casting moves a metal-cpp pointer to Objective-C and transfers ownership to ARC. Going back to the problem, you need to cast between metal-cpp objects and Objective-C objects. If there's no transfer of ownership, you can use a __bridge cast. If you want to cast from metal-cpp to Objective-C objects and transfer ownership to Objective-C, you should use a __bridge_transfer cast. If you want to cast from Objective-C to metal-cpp objects and take ownership out of ARC, you should use a __bridge_retained cast. Here's a case when I have to use MetalKit to leverage the asset loading code. That means in my C++ application, I need a texture as a metal-cpp object, but it is created by Objective-C methods. I need the ability to transfer ownership out of ARC so I can manually release it. And in this case, I need to pick the __bridge_retained cast. 
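The ownership effect of the three casts can be sketched with a toy model in plain C++. This is not Objective-C and not metal-cpp: ToyObject, ArcRef, toyBridge, toyBridgeRetained, toyBridgeTransfer, and the three demo functions are hypothetical names invented for this illustration. ArcRef stands in for a strong variable managed by ARC, a raw ToyObject pointer stands in for a manually managed metal-cpp pointer, and release only decrements the count without freeing anything, so that the count can still be inspected afterwards.

```cpp
#include <cassert>

// Toy model only: plain C++ stand-ins, not real Objective-C or metal-cpp types.
struct ToyObject {
    int retainCount = 1;                       // a new object has one owner
    void retain() { ++retainCount; }
    void release() {
        assert(retainCount > 0);               // guards against releasing too often
        --retainCount;                         // only counts, never frees
    }
};

// Stand-in for a strong ARC variable: it releases automatically at scope end.
class ArcRef {
public:
    static ArcRef adopt(ToyObject* p) { return ArcRef(p); }   // takes over one existing count
    ArcRef(ArcRef&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }
    ArcRef(const ArcRef&) = delete;
    ArcRef& operator=(const ArcRef&) = delete;
    ~ArcRef() { if (ptr_) ptr_->release(); }
    ToyObject* raw() const { return ptr_; }
private:
    explicit ArcRef(ToyObject* p) : ptr_(p) {}
    ToyObject* ptr_;
};

// Models __bridge: change the type, leave ownership alone.
ToyObject* toyBridge(const ArcRef& ref) { return ref.raw(); }

// Models __bridge_retained: the caller gets its own count and must release it.
ToyObject* toyBridgeRetained(const ArcRef& ref) { ref.raw()->retain(); return ref.raw(); }

// Models __bridge_transfer: the automatic side takes over the manual count.
ArcRef toyBridgeTransfer(ToyObject* manual) { return ArcRef::adopt(manual); }

int countAfterPlainBridge() {
    ToyObject obj;
    ToyObject* raw = nullptr;
    {
        ArcRef arc = ArcRef::adopt(&obj);      // the automatic side owns the only count
        raw = toyBridge(arc);                  // no ownership change
    }                                          // automatic release: count drops to 0
    return raw->retainCount;                   // 0: a real object would already be gone
}

int countAfterBridgeRetained() {
    ToyObject obj;
    ToyObject* raw = nullptr;
    {
        ArcRef arc = ArcRef::adopt(&obj);
        raw = toyBridgeRetained(arc);          // count 2: one automatic, one manual
    }                                          // automatic release: count back to 1
    int count = raw->retainCount;              // still owned by the manual side
    raw->release();                            // the manual side releases when done
    return count;
}

int countAfterBridgeTransfer() {
    ToyObject obj;                             // manually owned, count 1
    {
        ArcRef arc = toyBridgeTransfer(&obj);  // ownership moves to the automatic side
    }                                          // automatic release: count drops to 0
    return obj.retainCount;                    // 0 without any manual release
}
```

In this model, only the retained variant leaves the manual side holding a count of one after the automatic reference goes away, which is why it suits the case of returning a texture out of ARC to C++ code that will release it manually.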
I have this C++ function that loads a texture from the catalog and I want to return a metal-cpp texture. But inside, I'm calling some Objective-C functions in MetalKit. I need to define the options that a texture loader needs. Then I create a texture loader by calling an Objective-C method from MetalKit. With that loader, I can create a texture object and load a texture from the catalog. This method is also an Objective-C method from MetalKit. Now that I have an Objective-C texture, I need to cast it to the metal-cpp object and take it out of ARC. With these steps in mind, it's time to code, and I'll show you how casting works in practice. The first step is to define the texture loader options that a texture loader needs. I can safely cast the metal-cpp storage mode and usage to the Objective-C types, because metal-cpp defines them with the same values. Here I create a texture loader. I have a device that is a metal-cpp object, and I need to pass it to the initWithDevice method. Because the metal-cpp object is an Objective-C object, I can cast it like a toll-free bridged object. There is no transfer of ownership. Now I use the texture loader options and a texture loader to create a texture. And I want to return the loaded texture as a metal-cpp object. So I need to take it out of ARC and cast it to the corresponding pointer type. This is done with a __bridge_retained cast. After this, I can use this texture like any metal-cpp object. I am responsible for releasing it. In this section, I provided an adapter pattern which can help you handle two different languages in your program. I also showed how to interface between Objective-C and C++ with three types of casts. To summarize, metal-cpp is a lightweight and very efficient Metal C++ wrapper. I talked about how to manage object lifecycles when using metal-cpp, how to interface with Objective-C in an elegant manner, and how our developer tools can help you debug. Download metal-cpp and play with all the amazing samples now! 
See what you can create with Metal. We look forward to seeing your C++ applications running across all Apple platforms. Thanks for watching!