Use Accelerate to improve performance and incorporate encrypted archives
The Accelerate framework helps you perform large-scale mathematical computations and image calculations, optimized for high performance and low energy consumption. Explore the latest updates to Accelerate and its Basic Neural Network Subroutines library, including additional layers, activation functions, and improved optimizer support. Check out improvements to simd.h that include better support for C++ templates. Discover support for Apple Encrypted Archive, an extension to Apple Archive that combines compression with powerful encryption and a digital signature. And learn how you can keep your data safe and secure without compromising on performance.
♪ Bass music playing ♪

Jonathan Hogg: Hello, and welcome to this session on Accelerate and its associated frameworks. I'm Jonathan from Apple's Vector & Numerics team, and today I'm going to talk to you briefly about the Accelerate framework before telling you what's new in our machine learning library, BNNS. I'll then cover improvements to simd.h, and introduce Apple Archive and our new Apple Encrypted Archive containers.

So let's get started with a brief overview of the Accelerate framework. Accelerate provides high-performance numerical computation across all Apple platforms: macOS, iOS, iPadOS, watchOS, and tvOS. Accelerate also provides access to the machine learning accelerators in Apple silicon Macs and recent iPhone and iPad devices. The only way to leverage this hardware is by calling Accelerate, either directly or through higher-level frameworks such as Core ML.

Accelerate is composed of several parts. vDSP provides primitives for signal processing, such as DFT and FFT routines. vImage provides routines for image processing, such as format conversion and convolution. vForce provides vectorized versions of transcendental functions, such as sine and cosine. BLAS and LAPACK provide high-performance implementations of the standard dense matrix algebra routines, while Sparse BLAS and our Sparse Solvers provide similar functionality for sparse matrices. Finally, BNNS provides support for machine learning.

I'm also going to talk to you today about some related frameworks. simd.h provides computations on small vectors and matrices, such as those encountered in graphics programming, whilst Compression and Apple Archive provide support for lossless data compression. In order to use these frameworks, simply add the relevant include or import statement to your code and add the framework to your Xcode project.

Now, let me tell you about BNNS in more detail. BNNS stands for Basic Neural Network Subroutines, and provides performance primitives for machine learning on the CPU.
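To make that include-and-link step concrete, here is a minimal sketch that calls one vForce routine (vvsinf) and one vDSP routine (vDSP_vadd). It builds only on Apple platforms with the Accelerate framework linked (for example, with "-framework Accelerate"), so it is shown purely for illustration rather than as a portable example; the buffer contents are arbitrary.

```cpp
// Illustrative only: requires an Apple SDK and the Accelerate framework.
#include <Accelerate/Accelerate.h>
#include <stdio.h>

int main(void) {
    float x[4] = {0.0f, 0.5f, 1.0f, 1.5f};
    float s[4] = {0};
    float sum[4] = {0};
    int n = 4;

    vvsinf(s, x, &n);                  // vForce: s[i] = sin(x[i])
    vDSP_vadd(x, 1, s, 1, sum, 1, 4);  // vDSP:   sum[i] = x[i] + s[i]

    for (int i = 0; i < 4; ++i)
        printf("%.4f\n", sum[i]);
    return 0;
}
```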
For those of you unfamiliar with Apple's machine learning ecosystem, this diagram shows the lay of the land. There are three main hardware blocks: the CPU, which includes the machine learning accelerators I mentioned previously; the GPU; and the separate Neural Engine. BNNS provides performance primitives on the CPU in the same way as MPS provides performance primitives on the GPU. Above this layer, there are a number of frameworks that run on one or more of these backends. These include Apple's high-level machine learning frameworks -- Core ML and Create ML -- as well as specialist frameworks like Vision and Natural Language.

BNNS provides support for both training and inference across a wide range of layer types, as well as support for the optimizers shown at the bottom. In this release, we have added support for several new layer types -- embedding, random fill, and quantization -- as well as support for the AdamW optimizer. We've also improved existing layers, adding two new activation functions, SiLU and HardSwish, and support for new arithmetic functions: ternary select, multiply-add, and element-wise minimum and maximum.

Layer fusions provide the ability to consume the output of one layer immediately as the input to a subsequent layer, avoiding the need to write it out to memory and read it back in again. We've added fusions of the convolution and fully connected layers with the new quantization layer, as well as a fusion between the arithmetic and normalization layers. Other improvements include improved support for gradient clipping in the optimizers, which can also be used as standalone functions, as well as AMSGrad support for our Adam-based optimizers. Together, these improvements have expanded the range of network architectures we can accelerate even further.

Now, let me tell you about some improvements to simd.h.
simd.h provides computational primitives on small vectors and matrices that fit into the CPU's registers, including support for functions such as sine and cosine as well as useful geometric operations, including support for quaternions. The thing I really like about simd.h is it lets us jump in and get 90 percent of the benefit of vectorization with 10 percent of the effort. Here, let me show you.

Here's a neural network activation function I just invented. As you can see, it has three different branches. If the input is less than minus pi, I just return zero. If it's between minus pi and pi, I return two times the exponential of x, multiplied by x plus pi over two. Otherwise, if it's greater than pi, I return two times the exponential. That's great, but if I have a large amount of data, I may want to run it faster than a scalar loop allows. So let me show you how to implement that in simd.

I already have some boilerplate in place: an extension which allows me to easily write vectors to a buffer, and a simple loop that iterates over our output array in increments of one length-eight vector at a time. The interesting part is how to translate our scalar function into a simd equivalent.

Let's start by looking again at our scalar code. I see it has several branches. These don't work well for vectorization. Instead, let's construct this out of parts we can merge based on a mask. Looking more closely, if x is less than minus pi, I just return zero. If it's greater, I return an expression involving two times the exponential of x. Let's pull that out. Now, we want to construct a vector from that y, replacing with zero everywhere that x is element-wise less than minus pi. Next, we can look at the greater-than case. Here, we are either multiplying by one in the high region, or by x plus pi over two in the middle region. So let's write that in the same way.
We take our x-plus-pi-over-two expression, and this time we're replacing with one everywhere that x is element-wise greater than or equal to pi. Now all that remains is to multiply these two quantities together. Obviously, wherever the first quantity is zero, multiplying by any value of the second still returns zero. So let's run that and see how it looks.
Now, I can see -- looking down the console -- that my new simd version is almost three times faster than the previous scalar code.

So how has simd.h improved in this release? We've improved usability for C++ programmers using templates. We have added types and traits structures to allow you to move between the underlying scalar type and vector length and the concrete simd type, without complicated code structures or needing to implement similar types yourself. To simplify their use, we have also added convenient aliases to reduce the need for C++ boilerplate.

Here's an example of what they look like in use. The Vector and Matrix types allow us to go from an underlying type -- such as float or int -- and a vector length to a concrete type, and they also have members providing access to related types, such as the unaligned version and the mask type resulting from comparisons. The Vector_t and Matrix_t aliases provide simplified syntax to access the same definitions as we had before. The get_traits struct allows us to go in the other direction, moving from the concrete simd type to the generic one. And again, there are aliases to simplify the syntax for common use cases. We've also added templated versions of the make and convert functions to allow their use in templated code. These work the same as the existing functions, but their destination type is now a template parameter rather than part of the function name.

In addition to our C++ improvements, we have added several new functions supported across all our languages. These are classification functions -- like isfinite and isinf -- that provide vector versions of the scalar functions in libm, as well as new functions for calculating the gamma function and the trace of simd matrices.

Now, let me introduce Apple Archive and our new Apple Encrypted Archive formats. Apple Archive has been powering our system updates for the better part of a decade.
In the macOS 11 release, we gave you access to the compressed container and archive format. New in macOS 12, we have added encryption APIs to this support. The archive format itself provides a modern, modular approach, allowing you to select exactly which file attributes and metadata you want to store. It is streamable, which means you don't have to worry about fitting all of the data in memory at once. It also supports separate manifest files for indexing into large archives, like file system images.

The new Apple Encrypted Archive builds on this, combining compression, authenticated encryption, and a digital signature into a single secure package. It gives you state-of-the-art cryptography that's been designed and audited by our Security team, as well as by outside experts. Data confidentiality means that your data stays secret. Data authenticity means that you can be sure it hasn't been corrupted in transit. Sender authentication means you can be sure who sent it. Signature privacy means that, in a public key context, only you and the sender know who has signed it. We also obfuscate metadata -- such as the file lengths -- and include protection against re-signing attacks. Together, this means you can be confident that your data remains private and secure.

In order to facilitate correct deployment, we offer a number of different profiles for different use cases. The basic profile is a digital signature without encryption. This can be used for things like software updates, where the contents are not secret but you want to be sure the data hasn't been tampered with. Next, we have symmetric encryption, with or without a signature, using a securely shared binary key. This is similar to the next option, which uses a password rather than a binary key. Finally, we have full-blown public key encryption, again with or without a signature. In all profiles, compression is optional and data is always authenticated.
To work with these formats, we provide a number of command-line tools. For working with the compressed archive portion of the format, there is compression_tool, and for the encrypted archive, there is aea. The aa tool handles the entire container. There is, of course, also an API provided by the Apple Archive framework in both Swift and C. It is stream-based, allowing for both sequential and random access. Its implementation is multithreaded for blazingly fast performance.

So, let's see this API in action. Here, we have a simple demo app we have put together. The top portion of the window acts as a drag-and-drop target for things we want to encrypt, whilst the bottom part is a simple status pane. Let's say I want to encrypt this TopSecret directory. I just drag and drop this into the app. And, oh no! We get an error. We haven't implemented this function yet! Let's do that now.

So, what do we need to do to encrypt this with Apple Archive? First, we need an encryption context that describes the algorithm and profile to use, along with our encryption secret. We also need a file stream that we're going to write the archive to. We combine these to create an encryption stream. The encryption stream will encrypt a stream of bytes, so we need an adaptor that will translate the directory we want to encrypt into such a stream. This is the encoder stream. The data, of course, flows in the opposite direction to the object creation: we feed archive entries into the encoder stream, which transforms them into bytes for the encryption stream, which then outputs the encrypted data to the file stream.

Let's see how that looks in code. Here, we specify that we're using a symmetric profile, and the "none" tells us that we're going to use no digital signature. The initial portion of the enum just specifies the particular algorithm we want to use. Here, we're going to use "lzfse" to compress our data. With the context created, we just need to specify our symmetric encryption key.
Next, we create those three streams. First, we create the file stream, then we combine it with the context to create the encryptionStream. Finally, we derive the encoderStream. Now, it's important that we remember to close these streams in the correct order. In particular, closing the encryptionStream does a lot of work behind the scenes, as it signs and seals the archive. Finally, all that remains is for us to feed our files into the encoderStream. I specify the file attributes I want to encode and then call the writeDirectoryContents method. All that remains is to print a status message to the console with the encryption key.

Let's see if that worked. If I drop our TopSecret directory into the app, it succeeds, encrypts it, and prints out our encryption key. Now, if I drag and drop our encrypted archive into the app, it tries to decrypt it and asks for the encryption key. So let's copy and paste that encryption key and see what's inside. Mmm, delicious!

That's everything I have for you on Apple Encrypted Archive, so let's wrap up. Today, I talked to you about improvements to the Accelerate framework, including support for new layer types in BNNS, as well as expanded C++ support and other functionality in simd.h. I then gave you an introduction to the Apple Archive and new Apple Encrypted Archive formats and their support in the frameworks. Thank you, and enjoy the rest of WWDC. ♪