Build an Endpoint Security app

System Extensions improve the reliability and security of macOS. Learn about the modern replacement for Kernel Authorization KPIs and discover tips for making a great security product with the Endpoint Security framework.

Hello, and welcome to WWDC.
Hey. My name's Matthew, and I'm on the Security Engineering and Architecture team. Today, we're going to talk about the Endpoint Security Framework. The Endpoint Security Framework was first introduced last year in macOS Catalina. It is intended to be used as a replacement for the Kauth KPI, the unsupported MAC kernel framework, and the OpenBSM audit trail.
Kernel extensions, or KEXTs, can be difficult to develop and even more difficult to debug. Further, KEXTs pose a maintenance nightmare, as kernel interfaces frequently change.
They can also degrade overall system security and stability as even minor bugs often lead to kernel panics.
With the EndpointSecurity Framework, you no longer need to develop a kernel extension, and you can instead focus on the real goals of your products.
Products that use EndpointSecurity, or ES, are able to tap into a rich event stream from a normal application. Right now, ES supports roughly 100 event types, and we're constantly adding more.
There are two categories of event types. We have NOTIFY events, which inform you of an operation taking place, and AUTH events, which allow you to control whether or not an operation should be allowed to continue. We'll be discussing how the EndpointSecurity subsystem works, how your products can best utilize the event stream, as well as some more advanced features available to you.
While it's possible to distribute your ES application as a stand-alone product, we believe there are a lot of benefits to delivering your product as an EndpointSecurity-based system extension. System extensions were also introduced last year in macOS Catalina. Several types of extensions are supported, including Network Extensions for writing applications such as VPNs and content filters, DriverKit for controlling hardware, and EndpointSecurity targeted at products like endpoint detection and response. We're not going to cover system extensions in detail here, but you should check out the WWDC 2019 session titled "System Extensions and DriverKit" for more information.
First, when the extension is installed, it becomes protected by System Integrity Protection, preventing accidental or malicious tampering with your extension and assets.
We're also able to provide a greater level of protection for your daemon, similar to that of system daemons. This means that we prevent even root users from being able to unload your launchd job.
Also, there are some EndpointSecurity features that products can only use if they are a system extension, such as the ability, during system start-up, to execute and set up an event stream before other third-party applications are able to execute.
I always like starting with an example, so let's take a quick look at the key components needed to get started with EndpointSecurity. One thing to note, this framework is provided as a C library, which is a little different than many other Apple frameworks you might encounter. Using C allowed us greater control over some memory and performance characteristics. Also, our goal was to be able to provide a library that would be quickly adoptable by existing products. By using C, this library is callable from many languages including Swift, Objective-C and Rust.
First off, in this example, you can see we start by initializing a new event stream using the es_new_client API.
This function returns an es_client_t handle as an out parameter, but, importantly, it also defines the event handler block. The block passed here will be invoked whenever there is an event ready to be processed.
In this example, you can see the block simply prints out the event type. Next, we set up the subscriptions for the event stream using the es_subscribe API. In this example, we have a single event in the array, the NOTIFY_EXEC event.
Also in this example, we see the es_delete_client API being used when something goes wrong setting up the subscriptions. This API can be used to clean up resources previously obtained from a call to es_new_client.
Finally, we call dispatch_main, which will allow the program to continue executing and process events via the event handler block submitted above.
To help you all get an understanding of how EndpointSecurity works, let's take a look at a high-level architecture design.
When your process connects to the EndpointSecurity subsystem using the es_new_client API, a channel is set up through which messages will be enqueued for processing. In this diagram here, there are two different ES-based applications.
One has set up two event streams using multiple calls to es_new_client, and the second one has a single channel. We typically refer to these channels as ES clients.
Each ES client is able to have its own set of subscriptions to control which events it receives.
When an event is triggered in the kernel, and assuming our client is subscribed to that event, EndpointSecurity intercepts and enriches this event with information that is typically helpful for ES clients performing analysis.
We'll discuss the information collected in more detail in a second, but once collected, this data is then wrapped in a message envelope. Next, that message is then sent to all appropriate ES clients, enqueueing the message for the event handler blocks to process.
The message is enqueued for all clients simultaneously. So for those of you already familiar with EndpointSecurity, this is a slight change in behavior, where before, events were delivered to clients serially.
Some events require a response, which we'll talk about shortly, in which case the original operation is held up in the kernel until a response is received.
For normal notification events, or after receiving a response for AUTH events, the operation is immediately unblocked in the kernel and allowed to continue.
The last thing to note in this diagram is that EndpointSecurity attempts to cache as much as possible in order to reduce the number of messages that are required to be delivered for AUTH events. We have a lot more information about the cache later on.
The message envelope I just mentioned refers to the es_message_t structure, which is the wrapper for all ES events delivered to ES clients via the event handler block.
Each message contains three primary categories of information. First, message metadata. This includes things like the event type, message generation time, whether it's an AUTH or a NOTIFY event, and a message version number used for compatibility, which we'll touch on later.
Next, all messages include information about the process that caused the event to be triggered in the first place. Process information includes information about the executable file, including full stat info, code signing information, process ID and user ID information, and a whole lot more.
Last, each message contains information specific to the event that occurred. For example, the SIGNAL event includes information about the process being signaled and the signal number. The EXEC event includes information such as the file that will be executed and the executable arguments, and the OPEN event includes information about the file being opened and the open flags.
Time-of-check time-of-use issues are extremely important for you to be aware of when developing your ES products. Quickly, for those who may not be familiar, this refers to a class of bugs that occur when a program makes assumptions about some state that were previously true but may no longer be true. Classic example here is first checking if a file exists and then opening the file as a two-step process. If someone on the system maybe deleted the file after your program checked its existence, but before your program opened it, and your program didn't handle open failures correctly, it could lead to unexpected behavior.
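A minimal C sketch of the file example above, assuming a POSIX system. The function names are made up for illustration. The first function checks and then opens as two separate steps, which leaves a window for the file to change; the second simply attempts the open and treats failure as a normal outcome.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Racy pattern: the file can be deleted or replaced between access() and
 * open(), so the earlier check says nothing reliable about the later open. */
int open_checked_racy(const char *path) {
    assert(path != NULL);
    if (access(path, R_OK) != 0) {
        return -1;
    }
    /* Window: another thread or process may change the file right here. */
    return open(path, O_RDONLY);
}

/* Safer pattern: skip the separate check, attempt the operation once, and
 * handle failure at the point of use. Returns a descriptor or -1. */
int open_direct(const char *path) {
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1; /* Missing or unreadable: an expected outcome to handle. */
    }
    return fd;
}
```

Both functions can still fail, but only the second one makes no assumption that an earlier observation still holds when the file is used.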
The information provided in ES messages reflects a snapshot in time.
The system continues to process multiple threads simultaneously, and those threads may do things that can change system state before your application has an opportunity to inspect a message.
This means, in some cases, parts of the message may have different values than if you were to query that information on your own.
Products need to be cautious of things like time-of-check time-of-use issues. This is no different than those issues faced by kernel extension-based products today.
One last thing to note before diving into a demo: EndpointSecurity has a handful of runtime requirements that programs must satisfy in order to successfully establish an ES event stream.
First, an application must have the proper EndpointSecurity entitlement. This is a restricted entitlement, and you can request a provisioning profile via the link shown here.
If your application is bundled as a system extension, there is an additional entitlement required for the containing app bundle in order to install the extension. This information can be found within the system extension documentation.
Also, system extensions require approval from the user to complete installation.
Once an extension begins the installation process by using the system extension API, the user must approve the installation from the Security and Privacy Preferences dialogue on the General pane.
Next, in order to increase user privacy, we require that your applications obtain user consent in the form of Full Disk Access permissions, which is found on the Privacy pane of the Security and Privacy Preferences dialogue.
If you deploy as a system extension, on installation, we pre-populate your extension in this dialogue to make it easier on users to enable these permissions.
If your product is being deployed on managed devices, there are two MDM payloads available to aid distribution. First, there's a payload to define extensions that will be automatically allowed to install without user approval. And second, there is a privacy preferences payload that can automatically enable Full Disk Access permissions.
All right. Now, let's jump into our first demo where we're going to look at notification messages and expand upon an existing ES extension.
So over here we're looking at our main function for a NOTIFY demo, and we can see right away this code looks very similar to the example code that we saw a few slides back. We start by setting up our new client event stream with es_new_client, and our handler block here just calls our handle_event function, which we'll look at in a second. Next, we set up our subscriptions here and call es_subscribe. Right now we have the NOTIFY_EXEC and NOTIFY_FORK event types.
Then, once the subscriptions are in place, we call dispatch_main to allow our program to continue executing and processing these events. So let's jump up to handle_event.
Up here, you can see we use a switch statement on the message event_type, and then we have a case for each one to just log a message to the screen. So this extension is already installed and running and approved on the system, so let's go take a look at what's happening. So we're going to set up our log stream here first, and there's a predicate involved that allows us to just focus on the executions that I'm doing right now. Next, I'm gonna use the ps utility as an example.
So now, when we execute ps, we see two messages come through. We first see that our Z shell is forking a new child process, and then from that forked process, it wants to execute the new executable /bin/ps. All right. Now let's expand upon this demo and add the EXIT event to get full process life cycle information. So here in our event array, we're going to add the ES_EVENT_TYPE_NOTIFY_EXIT to our list of subscriptions.
In our handle_event function, we need to add this new case.
So I'm gonna go ahead and add the break statement so I don't forget to do that later, and now we want to log this information. So I'll set up some of the boilerplate code for the os_log function here, and for the format string, let's go ahead and add the instigating process information like we did above, along with the pid.
And then next, I know that the EXIT event includes some event specific information related to the exit status. So let's get that as well.
All right. So now, to fill in the data for this format string, we need to first look for the instigating process, and we can do that looking at the process member of the message structure.
From there, we can look at the executable, and then finally grab the path.
Next, we need the pid for that process, and for processes, EndpointSecurity provides audit tokens for that information, which includes things like the user ID and the process ID. So you can use the libbsm function, audit_token_to_pid, to extract that.
So, once again, we'll dive into the process field and pull out the audit token.
And finally, we want the event-specific information for the EXIT event, which is the exit status code.
So to do that, we'll look at the event member of the message. From there we can pull out the EXIT event and then, finally, the stat field for that status code.
All right. So we're going to build this extension, and now we need to install it. A quick note that I've set up a system extension developer mode on this system, and that just allows us to more quickly deploy test extensions.
The other thing to note here is that system extensions require that the containing app bundle live in /Applications to be able to install the extension. So we're gonna drag this containing app we just built over to /Applications.
And then, once there, we can run that app and finally install the extension. So this is now just performing the upgrade, and the new extension is now running and live on the system. So, once again, let's use the ps utility as an example. Now you can see we get a third exit message, that includes the status code of zero, indicating the process exited successfully. As a quick example, we could pass an invalid flag to ps and we can see now the status code is 256 indicating there might have been an issue with the process.
Now that we've seen notification events in action, it's time to talk about AUTH events. A major feature of EndpointSecurity is the ability to authorize operations on the system. AUTH events are synchronous, which means after we enqueue the message for the ES clients, the operation is held up in the kernel and unable to resume until a response is received or a deadline expires.
Each AUTH message received has its own deadline field contained inside the message structure. Your ES clients must inspect this value for each message and ensure that the work it performs is completed before the deadline expires. If a client fails to respond before the deadline, the application will be terminated.
If your application is a system extension, the launchd job we submit for you will be automatically restarted. It's important to emphasize that each individual message has its own deadline, and there is no guarantee that each message allows for the same amount of response time. These values can and will change over time.
When a deadline is missed, an implicit ALLOW is applied as the response, but the result will not be cached, allowing future operations to be reevaluated.
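One way a client might act on the deadline is to check its remaining budget before starting expensive work. The sketch below is only a model: the function names are hypothetical, and `now` and `deadline` are assumed to be tick counts read from the same clock as the message's deadline field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Ticks left before the deadline, or 0 if the deadline has already passed. */
uint64_t ticks_remaining(uint64_t now, uint64_t deadline) {
    return deadline > now ? deadline - now : 0;
}

/* Decides whether work estimated at `cost` ticks can start while still
 * keeping `margin` ticks in reserve for sending the response. */
bool can_start_work(uint64_t now, uint64_t deadline,
                    uint64_t cost, uint64_t margin) {
    uint64_t left = ticks_remaining(now, deadline);
    if (left == 0) {
        return false; /* Already late: nothing can be started. */
    }
    /* Subtract instead of adding cost + margin, which could overflow. */
    return left >= margin && left - margin >= cost;
}
```

Because every message carries its own deadline, a check like this would run per message rather than once at start-up.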
There are two APIs available for responding to AUTH events. Most event types require a simple ALLOW or DENY response using the es_respond_auth_result API.
The second response API, es_respond_flags_result is used when an event has a range of options that may be permitted or denied, such as being able to allow a file to be opened with read-only permissions instead of read/write.
You should look at the documentation to know which API to use for each event type; however, as of today, only the AUTH_OPEN event uses the flags result API.
If there are multiple clients on a system that subscribe to the same AUTH event, the responses from all the clients are combined by applying the most restrictive response, so if you have four clients respond to an event with ALLOW but one wants to DENY, the overall result is to deny the operation.
Similarly for flags responses, only the subset of flags set by all clients will be allowed.
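The two combining rules can be modeled in a few lines of C. This is a sketch of the behavior as described, not the framework's implementation; the enum and function names are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one client's ALLOW or DENY answer. */
typedef enum { RESULT_ALLOW, RESULT_DENY } auth_result;

/* Most restrictive answer wins: a single DENY denies the operation. */
auth_result combine_auth_results(const auth_result *results, size_t count) {
    assert(results != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (results[i] == RESULT_DENY) {
            return RESULT_DENY;
        }
    }
    return RESULT_ALLOW;
}

/* Only flags allowed by every client survive: a bitwise AND of all masks. */
uint32_t combine_flag_results(const uint32_t *masks, size_t count) {
    assert(masks != NULL || count == 0);
    uint32_t allowed = UINT32_MAX;
    for (size_t i = 0; i < count; i++) {
        allowed &= masks[i];
    }
    return allowed;
}
```

With four ALLOW answers and one DENY, combine_auth_results returns RESULT_DENY, matching the example above.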
You should also be aware that ES does not send introspective AUTH events as this would lead to trivial deadlocks. The events will be implicitly allowed. We will, however, send NOTIFY messages for events instigated by your process.
One quick caveat to call out with the flags result API is that your ES client process should respond with all the flags it will ever permit, not necessarily only the flags requested for an individual event. This has to do with caching, and we'll discuss this in more detail soon.
With muting, ES clients are able to prevent receiving messages from processes that may not be of interest to the application. There are two main ways to do this.
First, with es_mute_process, the ES application provides an audit token that uniquely identifies a process. Typically, the client will obtain the audit token from a previous message. The EndpointSecurity subsystem will automatically track process exits and remove them from your client's set of muted processes. There is no need to manually use the corresponding es_unmute_process API unless your client wants to start receiving messages again from a process that it previously muted.
ES also supports muting by path literals, that is complete paths, with es_mute_path_literal as well as path prefixes with es_mute_path_prefix.
As the names suggest, this mechanism will prevent messages from instigating processes which match the provided paths. We recommend that ES clients use caution with these last two APIs here. While we employ data structures with fast lookup operations, adding a large number of paths can potentially have an adverse effect on performance. Muting by process audit token is generally a better way to mute events without as much overhead.
Many of the AUTH events can cache the combined result from the ES client responses, and this is stored in a single global cache shared across all ES clients.
The cache strategy used by EndpointSecurity is best effort, and entries may expire at any time. EndpointSecurity tracks several operations, and will automatically invalidate cached entries, for example, if a file is written, truncated, deleted and many more.
While we don't guarantee responses will be cached, if an ES client sets the cache flag in the response APIs to false, ES will guarantee not to cache that result, even if another client does request that it be cached.
Some reasons that an ES client may want to deny caching might be for the EXEC events when the executable is very dependent upon the executable arguments or environment variables, such as interpreters where you may want to inspect the script that the interpreter will execute.
ES applications are able to clear the entire cache with the es_clear_cache API, however we do not support invalidating individual cache entries.
Also note that ES will automatically clear the cache when a new client connects or an existing client disconnects. This allows for various scenarios where the remaining clients might respond with a different result than the previously combined cache result.
A few slides back I mentioned an important caveat that you need to be aware of when using the es_respond_flags_result API.
While at first it might sound unintuitive, the response must contain all flags an ES client might ever allow for an operation. This is because the cache for flags responses operates the same way as AUTH responses.
When a future event occurs, the cache is first consulted. If a cached entry exists, the flags from that entry are compared against the requested flags for the new operation, and the result is applied without generating an AUTH message.
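The lookup just described can be modeled as a subset test on flag bits. Everything in this sketch is an assumption made for illustration: the type names, the flag values, and the exact comparison the system performs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative read and write flag bits, defined only for this sketch. */
#define SKETCH_FREAD  0x1u
#define SKETCH_FWRITE 0x2u

/* A cached flags response: the set of flags the clients authorized. */
typedef struct {
    bool     present;
    uint32_t authorized_flags;
} cached_flags_entry;

typedef enum { LOOKUP_ALLOW, LOOKUP_DENY, LOOKUP_ASK_CLIENTS } lookup_outcome;

/* With no cached entry the clients are asked. Otherwise the request is
 * allowed only if every requested flag is in the cached set, and no AUTH
 * message is generated either way. */
lookup_outcome check_open(const cached_flags_entry *entry, uint32_t requested) {
    if (entry == NULL || !entry->present) {
        return LOOKUP_ASK_CLIENTS;
    }
    bool all_authorized = (requested & ~entry->authorized_flags) == 0;
    return all_authorized ? LOOKUP_ALLOW : LOOKUP_DENY;
}
```

The key point of the model is the middle branch: once an entry exists, the answer comes from the cached flags alone and the client is never consulted.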
So consider the case where a process opens a file with read-only flags set, and suppose the file being opened isn't a concern to your application. That is, your application wants to allow that process to manipulate the file however it needs. When your ES client process responds with the flags result, it should set all appropriate flags, including the WRITE flag, even though it wasn't requested for this particular event. Because what would happen if you did not set the WRITE flag? Well, if the process opened the same file for writing in the future, and the cached result still existed, the operation would be automatically denied without generating an AUTH message, even though that wasn't what your product intended.
Now that we've gone over the two, let's look at a quick comparison between AUTH and NOTIFY event types.
NOTIFY events are always asynchronous. We enqueue the message for the client, but the operation immediately continues and could be completed before the ES client has an opportunity to see the message.
AUTH events, as previously stated, are synchronous and the operation is held up until a response is received.
Regarding delivery, NOTIFY messages are always delivered. However, AUTH messages are only delivered if there is no cached result and the message isn't being delivered to the same process that instigated the event.
Neither AUTH nor NOTIFY messages are delivered if the instigating process is muted.
Finally, there is a slight difference in message structure between the two. NOTIFY messages contain the result as applied by EndpointSecurity. That is, the ALLOW or DENY result from the corresponding AUTH message. If there were no ES clients subscribed to the AUTH variant, or no AUTH variant exists for the event, the result is implicitly allowed.
For AUTH messages, no result information exists yet, obviously. However, a message will have a deadline value and also requires a response from the ES client process.
Now, let's do another demo, this time looking at AUTH event types and seeing how we can write a client to handle and respond to these messages. So, once again, we're looking at our main function here. You'll notice right away it looks very similar to the previous demo setup that we had. One slight difference here, we're going to be performing some asynchronous work on an asynchronous dispatch queue. So, we're going to go ahead and set up that queue first. But afterwards, we'll go ahead and initialize our new client event stream with es_new_client. And our handler block will once again call our handle_event function.
For this demo, we're going to be looking at AUTH_EXEC and AUTH_OPEN event types. And after subscribing, we once again call into dispatch_main to allow our program to continue executing and processing these events.
So, let's look at the handle_event function.
Once again, we see a switch statement here where we call an appropriate handle function for each individual event type. Let's start with EXEC.
First thing to note here is that we call es_respond_auth_result with ES_AUTH_RESULT_ALLOW for all EXECs. But we've left a TODO for ourselves to deny new EXECs matching the signing ID. Now, we typically expect most products will want more restrictive policies, such as matching on cdhashes, but for demonstration purposes, this is a little more clear to see what we're trying to block. So, how can we do this? Well, one way, we could just use a simple string comparison on the new process that we'll be executing, then compare that signing ID to the one that we want to block.
So, because we're looking at the event-specific data, we'll dive into the event union. And then from here, we can look at the target process and pull out the signing_id data.
So, now we'll compare that to the signing_id_to_block, and if they're equal, we want to deny that. So, we can do that with the es_respond_auth_result API. We need to pass our client and message arguments here. And next we need to define our result, which is simply ES_AUTH_RESULT_DENY.
Go ahead and allow this to be cached. And to complete this out, we'll put this previous existing response here in the else statement. So, now, if our new executable matches the signing ID we want to block, we will respond with a "deny" result, and everything else will be allowed.
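The comparison at the heart of this handler can be sketched on its own. The string_token type below is a hypothetical stand-in for a length-plus-pointer string as carried in a message, and the function names are illustrative; comparing by length avoids depending on how the data is terminated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a message string: a byte count plus a pointer. */
typedef struct {
    size_t      length;
    const char *data;
} string_token;

/* True only when the token equals the NUL-terminated string `wanted`
 * exactly; a shared prefix is not a match. */
bool token_equals(string_token token, const char *wanted) {
    assert(wanted != NULL);
    size_t wanted_len = strlen(wanted);
    return token.data != NULL &&
           token.length == wanted_len &&
           memcmp(token.data, wanted, wanted_len) == 0;
}

/* Policy from the demo: deny when the signing ID matches, else allow. */
bool should_deny_exec(string_token signing_id, const char *signing_id_to_block) {
    return token_equals(signing_id, signing_id_to_block);
}
```

In the handler, the boolean would select between the DENY and ALLOW responses shown in the demo.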
So now let's look at the open case.
So in handle_open, this may look a little less familiar, and there's some concepts we need to introduce that we haven't quite talked about yet. It is recommended that your ES clients perform as little work as possible in their event handler block. This typically means that you shouldn't perform a lot of I/O or large CPU-intensive tasks. And the goal here is that your event handler block should be as quick as possible so that it can return and continue dequeuing messages in order to keep that message queue size small, and that will help prevent dropped messages.
So in this function, you see we copy our message with the es_copy_message API, and we'll discuss that more later on.
After we've copied our message, we asynchronously call our handle_open_worker function, and then once that's complete, we then go ahead and free that copied message.
So, what does our worker function do? We have three cases that we're concerned about. Our first case is a test for an EICAR file, and for those not familiar, an EICAR file is a test file that antivirus products typically use to test their operation without needing to introduce real malicious code on the system.
So, it's not important what this function does for this demo purposes. Just know that it needs to open and inspect the file contents, and it was a big driver for why we're doing this work asynchronously.
So, assuming this is an EICAR file, we will use the es_respond_flags_result API. And you can see we've defined a mask of zero. And this essentially clears all bits and will deny all open operations for that file.
Next, you can see we left a TODO for ourselves.
So, what's going on here? In this if statement, you can see we're inspecting the open event specific data for the file that is being opened, and we compare that against the read only prefix here defined as /usr/local/bin. So we want to prevent anybody from manipulating files on these directories.
So, to do that, we can once again use the es_respond_flags_result API.
We'll also pass our client and message arguments. Now we need to define the flags. We know that the only operation that we want to deny here is the write operation. So we can define the mask that starts with all bits set and then use some bitwise operations to clear the write flag.
One thing to note here you might notice is that EndpointSecurity provides flags for the open event that are the kernel versions of flags, not the oflags that you might be familiar with from the open system call. This is documented in the open event, and you should check that out.
Last, we'll allow this to be cached. And then the final case here already written, is that for all other open operations, we have our mask set here with all bits set and will allow all those open operations to continue successfully.
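The three cases in the worker boil down to choosing a flags mask. The sketch below models only that choice; the function names are hypothetical, SKETCH_FWRITE is defined locally with the value FWRITE is assumed to have in macOS's <sys/fcntl.h>, and the directory-boundary check is an addition of this sketch rather than something shown in the demo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Kernel-style write flag, redefined here so the sketch builds anywhere. */
#define SKETCH_FWRITE 0x0002u

/* True if `path` is `dir` itself or lies underneath it. Requiring a '/'
 * boundary keeps "/usr/local/binary" from matching "/usr/local/bin". */
bool path_is_under(const char *path, const char *dir) {
    assert(path != NULL && dir != NULL);
    size_t n = strlen(dir);
    if (strncmp(path, dir, n) != 0) {
        return false;
    }
    return path[n] == '\0' || path[n] == '/';
}

/* Picks the mask to send as the flags response for an open. */
uint32_t authorized_open_flags(const char *path, bool is_eicar) {
    if (is_eicar) {
        return 0; /* Clear every bit: deny all opens of this file. */
    }
    if (path_is_under(path, "/usr/local/bin")) {
        /* Start with all bits set, then clear only the write flag. */
        return UINT32_MAX & ~SKETCH_FWRITE;
    }
    return UINT32_MAX; /* Everything else: allow all open flags. */
}
```

Returning every flag the client would ever permit, rather than echoing the requested flags, is what keeps a cached response correct for later opens.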
So, let's recap what we've done.
So, of the three things that our program was intending to do, the first was to deny the TextEdit application from opening. We haven't installed the extension yet, so we can see how all of this behaves right now. So TextEdit can successfully open.
We also wanted to deny writing to files in /usr/local/bin.
So, as an example here, we have set up a script called hi.sh that simply prints "hi" to the screen.
And just to see that running, we can execute it. It does exactly what you'd expect it to do. Finally, in our home directory, I have an EICAR file. So, we'll print that out here, and we can see the standard EICAR test file definition.
All right, so, now let's go over, build our extension.
We once again need to install the system extension, so we're gonna drag it over to /Applications.
Now we'll launch the containing app bundle and install that new extension, and our AUTH demo is now live and running. So, we can first try to open the TextEdit application. We can see the icon bounces, but nothing's able to launch anymore.
Next we want to try and modify a file in /usr/local/bin. So, here we can see I'm trying to modify the contents of the hi script to say "Bye" instead. However, this is now no longer allowed. But you can see this file is still allowed to be opened for read-only purposes, and the script is able to still execute.
And finally we have the EICAR file that we want to prevent all operations on. So we can see even trying to read that file is no longer permitted.
After seeing how easy it is to write an ES client to handle AUTH events, let's dive into some more advanced topics. The message structure contains a version field that will allow your ES-based applications to maintain compatibility when deployed across different operating system releases.
New fields can be added to the various EndpointSecurity structures over time, allowing us to provide additional information to your ES clients. The version number is a single integer value, and all messages in an OS release will share the same version number.
If an OS release modifies structures in a way that could affect compatibility, the version number will be increased. Clients using newer fields must first check the message version to ensure the field is available. Failing to do so will result in undefined behavior. Please pay close attention to the header docs for the ES structures as they indicate in which version new fields were added.
As a quick example, let's take a look at the acl field for the create event. This field didn't exist in the initial release, and, when it was added, we bumped the message version to 2. In the small example here, we have the handle_notify_create function that operates on NOTIFY_CREATE events.
We see that before accessing the acl member, the function first checks that the message version is greater than or equal to two.
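The guard pattern can be shown with a simplified model. The struct below is hypothetical and much smaller than the real message; it only illustrates checking the version before touching a field that was added later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified message layout for illustration only. */
typedef struct {
    uint32_t    version; /* message version */
    const char *path;    /* present in every version */
    const void *acl;     /* only meaningful when version >= 2 */
} sketch_create_message;

/* Returns the ACL pointer if this message version carries one, else NULL.
 * In a real older message the field would not exist at all, which is why
 * the version must be checked before the field is read. */
const void *create_acl_or_null(const sketch_create_message *msg) {
    assert(msg != NULL);
    if (msg->version >= 2) {
        return msg->acl;
    }
    return NULL;
}
```

Funneling access through one helper like this keeps the version check from being forgotten at individual call sites.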
The so-called early boot feature is a powerful mechanism of EndpointSecurity. And, as a reminder, this feature is only available to ES-based products that are system extensions, not to stand-alone ES applications. An extension opts into this feature by setting the NSEndpointSecurityEarlyBoot key in the extension's Info.plist. When there are registered early boot clients on system start-up, the system still comes up normally. However, no additional third-party applications will be allowed to execute until all of the early boot extensions are ready. To signal readiness, your extension needs to make at least one call to the es_subscribe API.
Once all early boot clients have made their first subscriptions, the third-party applications are finally allowed to execute. This means that your client should be prepared to subscribe to all necessary events in a single call to es_subscribe.
If you split your events across multiple calls, third-party applications may begin executing before the additional subscriptions are in place and your client can miss events.
It's also important to note that there is a deadline enforced by EndpointSecurity subsystem by which all early boot clients must signal readiness. Once the deadline expires, all third-party executions are automatically allowed to continue.
Your extensions should be careful not to perform long initialization steps before making a subscription to prevent missing events.
We do not consider the deadline value to be API, and it is subject to change over time, but we believe the value is suitable for your products to come up and become ready without having a large impact on end users.
Those of you who have already worked with the EndpointSecurity framework have likely noticed that we do not provide events related to networking operations. This is intentional as these are better covered by the NetworkExtension framework.
There's a small exception related to UNIX domain sockets, and ES does provide events for these.
You should also know that it's possible to use both the EndpointSecurity and NetworkExtension frameworks from a single combined system extension. For these products, you should use the system extension APIs and install flow as you normally would for extensions that are of a single extension type.
Additionally, it is safe to combine keys in your extension's Info.plist that are specific to either network extensions or EndpointSecurity extensions, and the system will take care of applying those values as appropriate.
ES provides messages to clients in the same order as they occurred on the system. For example, if a client subscribes to FORK and EXEC events, in the common scenario, when a new process is spawned, the client's handler block will always receive the fork event before the corresponding exec.
Also note that message ordering applies to subscriptions on an individual ES client. If your application creates multiple ES clients and splits subscriptions across them, the events are only sequenced relative to the subscriptions for each ES client.
If desired, you can reconstruct global ordering by using the message generation time contained in the message struct.
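One way to picture this reconstruction is the following C sketch, which sorts records collected from several clients by generation time. The demo_record_t type and both function names are hypothetical and exist only for this example; the sketch assumes the generation time has been copied out of each message as a struct timespec.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical record kept per received message; not a real ES type. */
typedef struct {
    struct timespec time;  /* generation time copied out of the message */
    int client_id;         /* which of our ES clients received it */
    int event_type;        /* placeholder for the event type */
} demo_record_t;

/* qsort comparator: the earlier generation time sorts first. */
static int demo_compare_time(const void *lhs, const void *rhs)
{
    const demo_record_t *a = lhs;
    const demo_record_t *b = rhs;
    if (a->time.tv_sec != b->time.tv_sec) {
        return (a->time.tv_sec < b->time.tv_sec) ? -1 : 1;
    }
    if (a->time.tv_nsec != b->time.tv_nsec) {
        return (a->time.tv_nsec < b->time.tv_nsec) ? -1 : 1;
    }
    return 0;
}

/* Sorts records gathered from several clients into a single timeline. */
static void demo_sort_by_time(demo_record_t *records, size_t count)
{
    qsort(records, count, sizeof(records[0]), demo_compare_time);
}
```

Note that qsort is not a stable sort, so two records with identical timestamps may come out in either order; a real product would need its own tie-breaking rule.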
Messages are delivered to ES clients one at a time, and messages are delivered only when the ES client's event handler block returns from processing the previous message. The message structure received by the handler block is only guaranteed to be valid for the duration of the handler block's invocation for that message.
Once you return from the block, you should not continue to access the message as doing so will result in undefined behavior.
If you need to extend the lifetime of the message, as we needed to do in the demo, you can do so using the es_copy_message API. This will return a handle to the message that will live until the corresponding es_free_message API is called.
An important feature this implies is that AUTH events do not necessarily need to be replied to before the handler block returns. It also means that AUTH events need not necessarily be responded to in the same order as presented to the ES client.
It might be important for your application to first inspect future messages in order to make a decision on a previous message. In this case, the ES application can first copy the message and return from the handler block to begin inspecting additional messages, and then later reply to that copied message with the appropriate response.
Don't forget to finally free the message when it's no longer needed. If you need to do some sort of asynchronous message processing, such as using an asynchronous dispatch queue, be sure you initialize the dispatch queue with a quality-of-service class appropriate for your application.
This will likely be dependent on many factors, such as whether or not you subscribe to AUTH events, expected event volume or how much CPU or system resources you wish to consume.
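The copy, return, reply-later, and free pattern described above can be sketched in plain C. Everything here is a hypothetical model: demo_copy_message, demo_free_message and demo_reply merely stand in for es_copy_message, es_free_message and the real response call, and the "deny the earlier message if the next one touches the same path" rule is an arbitrary policy chosen only to show a later message influencing an earlier decision.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an AUTH message; not the real ES type. */
typedef struct {
    unsigned long id;
    char path[64];
} demo_auth_msg_t;

typedef enum { DEMO_ALLOW, DEMO_DENY } demo_result_t;

#define DEMO_LOG_MAX 8
static unsigned long g_reply_ids[DEMO_LOG_MAX];   /* ids in reply order */
static demo_result_t g_reply_results[DEMO_LOG_MAX];
static int g_reply_count = 0;
static demo_auth_msg_t *g_pending = NULL;         /* copied, unanswered message */

/* Stand-in for es_copy_message: the copy outlives the handler call. */
static demo_auth_msg_t *demo_copy_message(const demo_auth_msg_t *msg)
{
    demo_auth_msg_t *copy = malloc(sizeof(*copy));
    if (copy != NULL) {
        memcpy(copy, msg, sizeof(*copy));
    }
    return copy;
}

/* Stand-in for es_free_message. */
static void demo_free_message(demo_auth_msg_t *msg) { free(msg); }

/* Stand-in for the reply call: just records the decision. */
static void demo_reply(const demo_auth_msg_t *msg, demo_result_t result)
{
    if (g_reply_count < DEMO_LOG_MAX) {
        g_reply_ids[g_reply_count] = msg->id;
        g_reply_results[g_reply_count] = result;
        g_reply_count++;
    }
}

/* Handler: msg is only valid until this function returns. */
static void demo_handle_message(const demo_auth_msg_t *msg)
{
    if (g_pending == NULL) {
        g_pending = demo_copy_message(msg);  /* extend the lifetime */
        if (g_pending == NULL) {
            demo_reply(msg, DEMO_ALLOW);     /* could not defer, answer now */
        }
        return;                              /* return without replying */
    }
    /* A later message arrived: use it to decide the earlier one. */
    demo_result_t verdict =
        (strcmp(msg->path, g_pending->path) == 0) ? DEMO_DENY : DEMO_ALLOW;
    demo_reply(g_pending, verdict);
    demo_free_message(g_pending);            /* done with the copy */
    g_pending = NULL;
    demo_reply(msg, DEMO_ALLOW);
}
```

In a real client the deferred reply would still have to arrive before the response deadline, so the amount of look-ahead has to be bounded.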
Next, we wanted to share some tips that we think will help you develop great EndpointSecurity-based products.
One field you might've noticed in the message structure is the is_es_client Boolean member. ES applications are not able to authorize their own actions as these introspective events are automatically allowed. However, there might be multiple ES clients on a system, and ES clients do have the ability to authorize the actions of other ES clients. For both AUTH and NOTIFY messages, the is_es_client field will be set to "true" if the instigating process holds the EndpointSecurity client entitlement.
ES clients should inspect this field and take appropriate action to ensure they don't inappropriately interfere with those clients' actions or perform actions that could create feedback loops.
Path muting is a fantastic way to prevent your client from being inundated with messages. It also helps keep the overall system more performant by reducing the number of messages that need to be processed or authorized. For instance, if your client isn't concerned about the Spotlight service on macOS indexing files, muting the appropriate indexing processes can greatly reduce the number of messages received for some event types.
Applications that take advantage of the cache should be aware that caching should be used for performance purposes only, and caching should never be used for policy.
This is because, as stated earlier, cache entries may expire at any time. For example, if your application decides to deny a process from opening any files and allows the response to be cached, you should then not mute the process. Consider what happens if the cache entry is later invalidated or otherwise removed. If that process attempts to reopen files, no AUTH events will be generated because your ES client has muted the process and those open operations would be automatically allowed.
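The pitfall in that scenario can be shown with a toy model in C. This is not how the system is implemented; the toy_state_t type and toy_authorize_open function are invented for this sketch, and the assumption that a valid cache entry is consulted before the mute state is inferred only from the scenario described above.

```c
#include <stdbool.h>

typedef enum { TOY_ALLOW, TOY_DENY } toy_result_t;

/* Toy model of the state relevant to one process; purely illustrative. */
typedef struct {
    bool cache_valid;     /* a cached decision currently exists */
    toy_result_t cached;  /* the cached decision, if any */
    bool muted;           /* the client has muted this process */
    toy_result_t policy;  /* what the client would answer if it were asked */
} toy_state_t;

/* Outcome of an open attempt under the assumed ordering. */
static toy_result_t toy_authorize_open(const toy_state_t *s)
{
    if (s->cache_valid) {
        return s->cached;  /* cached answer is reused, client is not asked */
    }
    if (s->muted) {
        return TOY_ALLOW;  /* no AUTH event is generated, operation proceeds */
    }
    return s->policy;      /* client receives the AUTH event and decides */
}
```

With a cached deny and a muted process, the open is denied only while the cache entry survives. Once the entry is gone, the muted process is allowed, which is why the policy decision has to stay with the client rather than the cache.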
The next tip is to be careful when debugging your ES application when subscribed to AUTH events. There is no way to disable or otherwise extend the response deadline. If an AUTH event is enqueued while your application hits a breakpoint, you're still required to respond before the deadline, or your process will be terminated as per the normal rules.
Finally, it is possible for EndpointSecurity to drop messages without delivering them to clients if the message queue is full. Clients can determine if this has happened by inspecting the sequence number field in the message structure. This is a per-client and per-event-type sequence number, so you can determine exactly how many of each event type may have been missed. Clients that see higher drop rates are encouraged to take advantage of more advanced EndpointSecurity API usage patterns such as muting, asynchronous processing and splitting subscriptions across multiple ES clients.
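A small C sketch of gap detection on those sequence numbers might look like the following. The demo_drop_tracker_t type, the demo_track function and the table size of 128 are assumptions made for this example; the sketch treats the first message seen for an event type as its baseline rather than assuming a particular starting value.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_EVENT_TYPE_COUNT 128  /* assumed upper bound for this sketch */

/* Per-client bookkeeping, one slot per event type. */
typedef struct {
    bool seen[DEMO_EVENT_TYPE_COUNT];
    uint64_t next_expected[DEMO_EVENT_TYPE_COUNT];
    uint64_t dropped[DEMO_EVENT_TYPE_COUNT];
} demo_drop_tracker_t;

/* Records one message and returns how many messages of the same
 * event type were skipped immediately before it. */
static uint64_t demo_track(demo_drop_tracker_t *t, uint32_t event_type,
                           uint64_t seq_num)
{
    if (event_type >= DEMO_EVENT_TYPE_COUNT) {
        return 0;  /* outside the table, not tracked */
    }
    uint64_t missed = 0;
    if (t->seen[event_type] && seq_num > t->next_expected[event_type]) {
        missed = seq_num - t->next_expected[event_type];
    }
    t->seen[event_type] = true;
    t->next_expected[event_type] = seq_num + 1;
    t->dropped[event_type] += missed;
    return missed;
}
```

Because the counter is per client and per event type, each ES client would keep its own tracker, and the running totals in dropped show which event types are being lost most often.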
This year we're announcing that the audit subsystem is being deprecated. This mainly refers to functionality related to audit events, such as events written to audit trail files, typically those found in the /var/audit directory, as well as events sent to the auditpipe pseudo-device for applications that wanted a live event stream. This does not apply to things like audit tokens and audit sessions, which will continue to be a core part of the system. Products that rely on audit events should migrate to using the EndpointSecurity framework.
So what's new in macOS Big Sur? First up, there is some new event-specific API for the EXEC event. ES now provides the list of file descriptors and file descriptor types that a newly executing process will begin with. We also provide unique identifiers specifically for pipe file descriptors that will allow you to track how multiple processes might be communicating via the pipe interprocess communication mechanism. Due to performance limitations, not all file descriptors may be provided. The main focus was on the standard in, standard out and standard error descriptors, but we also provide several more if they exist, and we tell clients whether there are additional descriptors that we may not have enumerated.
Regarding performance, we think early adopters of the EndpointSecurity framework will have a lot to look forward to in macOS Big Sur. We've rewritten many of our data structures to reduce memory allocations and increase event throughput. We've tuned the caches to eliminate invalidation bottlenecks and improved our overall memory performance. Combined, these enhancements have led to better system performance and fewer drops. We think this will provide much better performance characteristics for your EndpointSecurity products as well.
There were also several new event types that were added in macOS Big Sur. Some of these are additional AUTH variants for previously existing NOTIFY events and some are brand new. I wanted to call out a couple here. First, we added the trace event that was widely requested. This will notify your client when a process is being debugged.
Second, we're excited to introduce the CS_INVALIDATED event. As many of you know, as a signed process executes, the kernel is constantly validating that the code it pages in matches the individual page hashes of that binary. If a mismatch is found, the process has its code signing flags updated to clear the CS_VALID bit. Previously, ES clients had to wait for future messages and inspect the code signing flags of the instigating process to determine if it went invalid. However, with this event, clients can get an immediate notification.
One thing to note: binaries that use hardened runtime features, such as those that have the CS_KILL code signing flag set, will still automatically be terminated by the system if a hash mismatch is encountered.
So, where do you go from here? First, you can use the top link shown to begin the process of requesting the restricted EndpointSecurity entitlement. Next, you can begin reviewing the official EndpointSecurity and System Extension API documentation on the Apple developer website. I want to also encourage you to carefully read the EndpointSecurity header docs in the SDK. We've added a lot of comments and detail that may not be on the websites that we think answers common questions and will help you build great products. Lastly, we'll be making sample code available largely based on today's demos to help you get started working with the EndpointSecurity framework. With that, thanks for joining us, and we look forward to seeing what you build. And enjoy WWDC 2020.