Develop kernel-resident device drivers and kernel extensions using Kernel.

Kernel Documentation

Posts under Kernel tag

45 Posts
Post not yet marked as solved
1 Replies
404 Views
Hello, I have tried to create a thread with the thread_create_running API. It works, but I would like to suspend this thread. I can call thread_suspend, but my thread has already started by the time I call it. Is there a way to create a thread without running it automatically? Thanks
Post not yet marked as solved
2 Replies
591 Views
Hello, My goal is to understand how macOS works. Here is what I've done: I wrote a C program on an M1 CPU with these lines: printf("Before breakpoint\n"); asm volatile("brk #0"); printf("After breakpoint\n"); When I run this program under lldb, a breakpoint is hit on the second line, so I suppose lldb writes a "brk #0" instruction when we set a breakpoint manually. I can't continue to the next line with the lldb "c" command; PC stays on the brk instruction, and I need to manually set PC to the next instruction in lldb. Now, what I want to do is create my own debugger (I want to understand how lldb works). I have managed to ptrace the target program, and I was able to catch an event with waitpid when "brk #0" is hit. But I don't know how to advance PC and continue execution, because I can't do this on Apple silicon: ptrace(PTRACE_GETREGS, child_pid, NULL, &regs); ptrace(PTRACE_SETREGS, child_pid, NULL, &regs); kill(child_pid, SIGCONT); So my question is: how does lldb manage to change the ARM64 registers of a remote process? Thanks
Post not yet marked as solved
0 Replies
499 Views
Simple question: I want to determine the number of performance cores in a Python script (or better, a Python app frozen with PyInstaller, which could make a difference). There are some ways to get the number of CPUs/cores, like os.cpu_count(), multiprocessing.cpu_count() or psutil.cpu_count() (the latter allowing discrimination between physical and virtual cores). However, Apple silicon CPUs are split into performance and efficiency cores, whose counts you can get with (e.g.) sysctl hw.perflevel0.logicalcpu_max for performance cores and sysctl hw.perflevel1.logicalcpu_max for efficiency cores. Is there any way to get this in Python besides running sysctl and parsing the shell output? Maybe using the pyobjc package?
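One possible approach to the question above, offered as a hedged sketch rather than a confirmed answer: call sysctlbyname directly through ctypes, which avoids both the shell and pyobjc. The helper names sysctl_int and performance_core_count are made up for illustration, and the code assumes the hw.perflevel0.logicalcpu_max value is a 32-bit integer. On systems where the C library has no sysctlbyname, or where the name is unknown, the helper returns None.

```python
import ctypes
import ctypes.util


def sysctl_int(name):
    """Return the integer value of the named sysctl, or None if unavailable.

    Hypothetical helper: assumes the sysctl holds a 32-bit integer.
    """
    # find_library may return None; CDLL(None) then opens the current process.
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    # sysctlbyname lives in libSystem on macOS; other C libraries may lack it.
    sysctlbyname = getattr(libc, "sysctlbyname", None)
    if sysctlbyname is None:
        return None
    sysctlbyname.argtypes = [
        ctypes.c_char_p,                  # name
        ctypes.c_void_p,                  # oldp (output buffer)
        ctypes.POINTER(ctypes.c_size_t),  # oldlenp (in/out buffer size)
        ctypes.c_void_p,                  # newp (unused, we only read)
        ctypes.c_size_t,                  # newlen
    ]
    sysctlbyname.restype = ctypes.c_int
    value = ctypes.c_int32(0)
    size = ctypes.c_size_t(ctypes.sizeof(value))
    rc = sysctlbyname(name.encode(), ctypes.byref(value), ctypes.byref(size), None, 0)
    if rc != 0:
        return None
    return value.value


def performance_core_count():
    """Logical performance-core count, or None when the sysctl is missing."""
    return sysctl_int("hw.perflevel0.logicalcpu_max")


if __name__ == "__main__":
    print("performance cores:", performance_core_count())
    print("efficiency cores:", sysctl_int("hw.perflevel1.logicalcpu_max"))
```

Because this only uses the standard library, it should not need extra PyInstaller hooks, though that has not been verified here.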
Posted by gernophil.
Post not yet marked as solved
1 Replies
428 Views
My app uses a lot of memory, so I often get "memory limit" crashes. I enabled the Increased Memory Limit capability in Certificates, Identifiers & Profiles and in my myAppName.entitlements file, but the result of print(os_proc_available_memory()) is the same. Where is my mistake? Thank you. iPhone 8, iOS 15.2
Posted by Koder228.
Post not yet marked as solved
0 Replies
331 Views
So upstream went and added a mutex_enter_interruptible(), which Linux calls mutex_lock_interruptible() and FreeBSD sx_xlock_sig(lock). I was simply going to point it to lck_mtx_lock(), ignore the interruptible bit, and call it a day, but I am curious whether there is a way to achieve something similar on XNU. In this case, the goal is to be able to hit ^C in userland, get a signal, and have lck_mtx_lock() or a variant give up and return an error.
Posted by lundman.
Post not yet marked as solved
3 Replies
545 Views
I found out that this code fails on Sonoma on Apple silicon:

#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <cassert>
#include <iostream>

int main() {
    const char* filename = "data_file";
    int dataSize = 1024; // 1 kilobyte
    int fd;

    // Create or overwrite the file
    fd = open(filename, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR | S_IXUSR);
    if (fd == -1) {
        perror("Error creating file");
        return 1;
    }

    // Make the file 1 KB in size
    if (ftruncate(fd, dataSize) == -1) {
        perror("Error setting file size");
        close(fd);
        return 1;
    }

    // Map the file into memory for writing
    int* writeData = (int*)mmap(NULL, dataSize, PROT_WRITE, MAP_SHARED, fd, 0);
    if (writeData == MAP_FAILED) {
        perror("Error mmaping for write");
        close(fd);
        return 1;
    }

    // Write some integer data
    for (int i = 0; i < dataSize/sizeof(int); ++i) {
        writeData[i] = i;
    }

    // Close the file and unmap memory
    if (munmap(writeData, dataSize) == -1) {
        perror("Error unmapping writeData");
    }
    close(fd);

    // Reopen the file for reading and executing
    fd = open(filename, O_RDONLY);
    if (fd == -1) {
        perror("Error opening file for read|exec");
        return 1;
    }
    int* readData = (int*)mmap(NULL, dataSize, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
    if (readData == MAP_FAILED) {
        perror("Error mmaping for read|exec");
        close(fd);
        return 1;
    }

    // Assert the integer data is the same
    for (int i = 0; i < dataSize/sizeof(int); ++i) {
        assert(readData[i] == i);
    }
    std::cout << "Data verification succeeded!\n";

    // Clean up
    if (munmap(readData, dataSize) == -1) {
        perror("Error unmapping readData");
    }
    close(fd);
    unlink(filename); // Delete the file
    return 0;
}

mmap with PROT_READ | PROT_EXEC fails with EACCES. Digging around the internet led me to this commit: https://github.com/python/cpython/pull/109929/files What was the reasoning behind this change in the API, and where is it documented? It's quite unpleasant to find changes like that in such crucial low-level calls.
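The post does not establish why the mapping is refused, so the following is only a sketch of one defensive pattern, not an explanation of the behaviour: request PROT_EXEC only where it is really needed, and fall back to a read-only mapping when the kernel denies it. It assumes the data only needs to be read back, as in the program above. The function names are hypothetical, and the temporary file is created by mkstemp rather than with the original's permission bits.

```python
import mmap
import os
import struct
import tempfile


def write_then_map(prot, size=1024):
    """Write int32 test data to a temp file, reopen it read-only, map it with `prot`.

    Returns the integers read back through the mapping.
    """
    count = size // 4
    payload = struct.pack(f"<{count}i", *range(count))
    fd, path = tempfile.mkstemp()
    try:
        try:
            os.write(fd, payload)
        finally:
            os.close(fd)
        # Reopen read-only, mirroring the second open() in the program above.
        fd = os.open(path, os.O_RDONLY)
        try:
            with mmap.mmap(fd, size, flags=mmap.MAP_SHARED, prot=prot) as view:
                return list(struct.unpack(f"<{count}i", view[:size]))
        finally:
            os.close(fd)
    finally:
        os.unlink(path)


def map_with_fallback(size=1024):
    """Try a read+exec mapping first; fall back to read-only if it is denied."""
    try:
        return "read+exec", write_then_map(mmap.PROT_READ | mmap.PROT_EXEC, size)
    except PermissionError:
        # EACCES and EPERM both surface as PermissionError in Python.
        return "read-only", write_then_map(mmap.PROT_READ, size)


if __name__ == "__main__":
    mode, data = map_with_fallback()
    print(mode, "mapping verified:", data == list(range(256)))
```

Which branch is taken depends on the platform and on how the temporary directory is mounted, so the example reports the mode instead of assuming one.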
Posted by byteshift.
Post marked as solved
8 Replies
1.3k Views
I am trying to get the ports used by processes. It can be done via lsof on macOS; I am trying to do it via libproc.

#include <iostream>
#include <libproc.h>

int main(int argc, const char * argv[]) {
    pid_t pids[3072];
    int count = proc_listpids(PROC_ALL_PIDS, 0, pids, sizeof(pids));
    for (int i = 0; i < count; i++) {
        char buffer[1024];
        for (int j = 1; j < 50000; j++) { // port range
            int ret = proc_pidfileportinfo(pids[i], j, PROC_PIDFILEPORTVNODEPATHINFO, buffer, sizeof(buffer));
            if (ret != 0) {
                printf("proc_pidfileportinfo returned %d bytes of data\n", ret);
                printf("%s\n", name);
            }
        }
    }
    return 0;
}

The proc_pidfileportinfo function is not working for any port; I tried iterating up to 50K. What am I doing wrong with proc_pidfileportinfo? How do I use proc_pidfileportinfo properly?
Post marked as solved
1 Replies
483 Views
Hello, I found the following curious behaviour. If I try to run the following code from within Xcode (pressing Run):

#include <unistd.h>

int main(int argc, const char * argv[]) {
    char *args[] = {"/bin/ls", "-r", "-t", "-l", (char *) 0 };
    execv(args[0], args);
    return 0;
}

the program does not print the expected list of files and folders but instead exits with:

Message from debugger: Terminated due to signal 5
Program ended with exit code: 5

But if I run the exact same compiled program from the terminal, it works as expected. I lost so many hours wondering what I was doing wrong, but apparently it is the Xcode console that does not play nicely with exec? Could it be that changing the process image throws a wrench into Xcode? Does anybody have any idea why this could be? Thanks.
Posted by CShen.
Post not yet marked as solved
2 Replies
491 Views
On iOS we are trying to log the energy use of our network extension to help us isolate trouble areas. task_info with TASK_POWER_INFO_V2 does return a task_energy value, and it grows over time, but it's not clear what exactly it means or how it is supposed to correlate with the battery usage screen in iOS Settings. Does anybody know anything about this?
Posted by tandre.
Post not yet marked as solved
0 Replies
850 Views
Hi! Any ETA for a KDK for 23A344? The most recent available KDK I see is 22G91. Thanks!
Post not yet marked as solved
4 Replies
1.4k Views
I know that with the Disk Arbitration framework I can use DARegisterDiskMountApprovalCallback to prevent external disks from mounting. Those disks include thumb drives, external hard disks, etc., but there are many other types of peripherals out there, like a USB wireless receiver or a USB Ethernet adapter. Is there any other framework we can use to enable/disable peripherals based on their I/O Registry properties? Thanks!
Post marked as solved
1 Replies
450 Views
I created a driver using DriverKit on Intel macOS 12.6.1 and Xcode 13.3. I enabled auto-manage signing, and set the signing certificate to 'Sign to Run Locally'. Then, I created a provision profile for the driver and selected my M1 test device. After installing the profile, I ran the app on the M1 device and successfully activated the driver. When I plugin the USB device, I can see the following log: DK: epusbfilter-0x100009dce::start(IOUSBHostInterface-0x10000946d) ok epusbfilter - init com.injection.epusbfilter.dext[57573] Corpse failure, too many 6 I also found a crash log ------------------------------------- Translated Report (Full Report Below) ------------------------------------- Process: com.injection.epusbfilter.dext [53185] Path: /Library/SystemExtensions/*/com.injection.epusbfilter.dext Identifier: com.injection.epusbfilter.dext Version: 1.0 (1) Code Type: ARM-64 (Native) Parent Process: launchd [1] User ID: 270 Date/Time: 2023-09-19 15:01:01.8502 +0800 OS Version: macOS 13.2 (22D49) Report Version: 12 Anonymous UUID: 5EB7EBD9-A435-FC45-73E6-C2C5844A8082 Time Awake Since Boot: 79000 seconds System Integrity Protection: disabled Crashed Thread: 1 Dispatch queue: Root Exception Type: EXC_CRASH (SIGABRT) Exception Codes: 0x0000000000000000, 0x0000000000000000 Application Specific Information: abort() called Thread 0: 0 libsystem_kernel.dylib 0x1d5043b78 __semwait_signal_nocancel + 8 1 libsystem_c.dylib 0x1d4fcfec8 nanosleep$NOCANCEL + 212 2 libsystem_c.dylib 0x1d4fee204 sleep$NOCANCEL + 48 3 libdispatch.dylib 0x1d4f807b4 _dispatch_queue_cleanup2 + 200 4 libsystem_pthread.dylib 0x1d50fbc50 _pthread_tsd_cleanup + 132 5 libsystem_pthread.dylib 0x1d50f3220 _pthread_exit + 88 6 libsystem_pthread.dylib 0x1d50f4180 pthread_exit + 88 7 libdispatch.dylib 0x1d4f7bbcc dispatch_main + 128 8 DriverKit 0x1d4d33178 DriverExecutableMain + 84 9 dyld 0x104e95e50 start + 2544 Thread 1 Crashed:: Dispatch queue: Root 0 libsystem_kernel.dylib 0x1d5043720 __pthread_kill + 8 1 
libsystem_pthread.dylib 0x1d50f40ec pthread_kill + 268 2 libsystem_c.dylib 0x1d5033cac abort + 180 3 DriverKit 0x1d4d5f890 panic + 256 4 DriverKit 0x1d4d5fa60 __assert_rtn + 88 5 DriverKit 0x1d4d60010 OSMetaClassBase::Invoke(IORPC) (.cold.1) + 44 6 DriverKit 0x1d4d32064 OSMetaClassBase::Invoke(IORPC) + 1396 7 DriverKit 0x1d4d32c5c Server(void*, mach_msg_header_t*, mach_msg_header_t*) + 520 8 DriverKit 0x1d4d3b420 uiomessage(void*) + 180 9 DriverKit 0x1d4d34694 uiomachchannel(void*, dispatch_mach_reason_t, dispatch_mach_msg_s*, int) + 380 10 libdispatch.dylib 0x1d4f8868c _dispatch_mach_msg_invoke + 472 11 libdispatch.dylib 0x1d4f74484 _dispatch_lane_serial_drain + 380 12 libdispatch.dylib 0x1d4f89620 _dispatch_mach_invoke + 852 13 libdispatch.dylib 0x1d4f74484 _dispatch_lane_serial_drain + 380 14 libdispatch.dylib 0x1d4f75130 _dispatch_lane_invoke + 436 15 libdispatch.dylib 0x1d4f7640c _dispatch_workloop_invoke + 1784 16 libdispatch.dylib 0x1d4f7ff5c _dispatch_workloop_worker_thread + 652 17 libsystem_pthread.dylib 0x1d50f5024 _pthread_wqthread + 404 18 libsystem_pthread.dylib 0x1d50fc678 start_wqthread + 8 Thread 2: 0 libsystem_pthread.dylib 0x1d50fc670 start_wqthread + 0 Thread 3: 0 libsystem_kernel.dylib 0x1d504401c __sigsuspend_nocancel + 8 1 libdispatch.dylib 0x1d4f808b4 _dispatch_sigsuspend + 48 2 libdispatch.dylib 0x1d4f80884 _dispatch_sig_thread + 56 Thread 1 crashed with ARM Thread State (64-bit): x0: 0x0000000000000000 x1: 0x0000000000000000 x2: 0x0000000000000000 x3: 0x0000000000000000 x4: 0xffffa0016a011948 x5: 0x0000000000000010 x6: 0x00006000010481b0 x7: 0x0000000000000000 x8: 0x725b4b6e56620c88 x9: 0x725b4b6f3d67bc88 x10: 0x00000000000001b0 x11: 0x0000600001048000 x12: 0x0000000000000090 x13: 0x00000000ffffff92 x14: 0x00000000000007fb x15: 0x0000000080636ffb x16: 0x0000000000000148 x17: 0x00000001d7176c60 x18: 0x0000000000000000 x19: 0x0000000000000006 x20: 0x0000000000004003 x21: 0x000000016b05b0e0 x22: 0x0000000000000000 x23: 0x00006000010480e8 x24: 
0x0000600001048058 x25: 0xd200fde7d57ecca6 x26: 0x0000000000000085 x27: 0x000060000374c328 x28: 0x0000600001d4c000 fp: 0x000000016b059a90 lr: 0x00000001d50f40ec sp: 0x000000016b059a70 pc: 0x00000001d5043720 cpsr: 0x40001000 far: 0x0000600002c48000 esr: 0x56000080 Address size fault Binary Images: 0x1d503a000 - 0x1d5075fe3 libsystem_kernel.dylib (*) <60df52bd-fc1a-3888-b05b-24b44be3af15> /System/DriverKit/usr/lib/system/libsystem_kernel.dylib 0x1d4fc6000 - 0x1d5039fff libsystem_c.dylib (*) <eee04d9a-7574-3a74-8f4e-cfb05f89f7da> /System/DriverKit/usr/lib/system/libsystem_c.dylib 0x1d4f62000 - 0x1d4fadfff libdispatch.dylib (*) <4e310a5c-9629-305e-a1dd-6632bddd3362> /System/DriverKit/usr/lib/system/libdispatch.dylib 0x1d50ee000 - 0x1d50fdff3 libsystem_pthread.dylib (*) <c1ed564d-b480-3058-937e-b40c3d3df09d> /System/DriverKit/usr/lib/system/libsystem_pthread.dylib 0x1d4d27000 - 0x1d4d6b00d DriverKit (*) <839dc0a2-1e69-38e8-8bf5-ff0ecc531539> /System/DriverKit/System/Library/Frameworks/DriverKit.framework/DriverKit 0x104e90000 - 0x104f1bfff dyld (*) <fe8a9d9e-f65d-34ca-942c-175b99c0601b> /usr/lib/dyld Could anyone please help me with resolving this problem?
Posted by emerys.
Post marked as solved
3 Replies
954 Views
Hello, I have a Cocoa application from which I fork a new process (a helper of sorts), and it crashes on fork due to some cleanup code, probably registered with pthread_atfork() in the Network framework. This is the crash from the child process:

Application Specific Information:
*** multi-threaded process forked ***
BUG IN CLIENT OF LIBPLATFORM: os_unfair_lock is corrupt
Abort Cause 258
crashed on child side of fork pre-exec

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_platform.dylib 0x194551238 _os_unfair_lock_corruption_abort + 88
1 libsystem_platform.dylib 0x19454c788 _os_unfair_lock_lock_slow + 332
2 Network 0x19b1b4af0 nw_path_shared_necp_fd + 124
3 Network 0x19b1b4698 -[NWConcrete_nw_path_evaluator dealloc] + 72
4 Network 0x19af9d970 __nw_dictionary_dispose_block_invoke + 32
5 libxpc.dylib 0x194260210 _xpc_dictionary_apply_apply + 68
6 libxpc.dylib 0x19425c9a0 _xpc_dictionary_apply_node_f + 156
7 libxpc.dylib 0x1942600e8 xpc_dictionary_apply + 136
8 Network 0x19acd5210 -[OS_nw_dictionary dealloc] + 112
9 Network 0x19b1beb08 nw_path_release_globals + 120
10 Network 0x19b3d4fa0 nw_settings_child_has_forked() + 312
11 libsystem_pthread.dylib 0x100c8f7c8 _pthread_atfork_child_handlers + 76
12 libsystem_c.dylib 0x1943d9944 fork + 112
(...)

I'm trying to create a child process with boost::process::child, which basically does just a fork() followed by execv(), and I do it before -[NSApplication run] is called. Is this a known bug or behavior that I've run into? Also, what is the correct way to spawn child processes in Cocoa applications? As far as I understand, basically all the available APIs (e.g. POSIX, NSTask) should be more or less the same thing, calling the same syscalls. So forking the process early, before the main run loop starts, and not starting another NSApplication in the forked child should be OK ...or not?
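As a hedged illustration of one alternative to fork() followed by execv(): posix_spawn creates the child and loads the new program in a single call, so, as far as I know, no pthread_atfork handlers run in a copy of the multi-threaded parent. The sketch below shows the call shape using Python's os.posix_spawn, which wraps the C function of the same name. The helper name spawn_and_wait is made up for illustration, and whether this avoids the Network framework crash above has not been verified.

```python
import os


def spawn_and_wait(argv):
    """Start argv[0] via posix_spawn, wait for it, and return its exit code.

    Returns a negative signal number if the child was killed by a signal.
    """
    # argv[0] must be a full path; posix_spawn does not search PATH.
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


if __name__ == "__main__":
    code = spawn_and_wait(["/bin/sh", "-c", "exit 3"])
    print("child exit code:", code)
```

In C or C++ the equivalent would be a direct posix_spawn call, and in Cocoa code NSTask is the higher-level option the post already mentions.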
Post not yet marked as solved
3 Replies
1.8k Views
Trying to get some minimal development working again, I've been waiting to be able to run macOS in VMs on M1. Currently both VirtualBuddy and UTM can install macOS, and I can boot into Recovery to disable SIP and enable third-party extensions. My M1 runs: ProductVersion: 13.0 BuildVersion: 22A5331f. I've tested VM macOS versions of Monterey and Ventura. Here is my old kext (known to be working) loaded on the M1 (Ventura) bare metal:

250 0 0xfffffe0006b70000 0x862ac 0x862ac org.openzfsonosx.zfs (2.1.0) BE4DF1D3-FF77-3E58-BC9A-C0B8E175DD97 <21 7 5 4 3 1>

The same pkg, using the same steps in the VM, will, after clicking Allow, ask to reboot (suspiciously fast), then come up with: System Extension Error: An error occurred with your system extensions during startup and they need to be rebuilt before they can be used. Of course clicking Allow just does the same: reboot, fail, ask to approve again, reboot, fail... Directly on the hardware, the dialog "rebuilding cache" pops up for a few seconds, but with the VMs I do not see it. I'm unfamiliar with the new system, so I'm not sure which log files to look at, but here is the output from kmutil log, both at Allow and after reboot: https://www.lundman.net/kmutil-log.txt If I were to make an uneducated guess and pull out some lines at random, maybe:

2022-08-29 20:01:13.169897+0900 0x251 Error 0x0 100 0 kernelmanagerd: Kcgen roundtrip failed with: Boot policy error: Error creating linked manifest: code BOOTPOLICY_ERROR_ACM
2022-08-29 20:01:13.170200+0900 0x251 Error 0x0 100 0 kernelmanagerd: Kcgen roundtrip failed checkpoint saveAuxkc: status:error fatalError:Optional("Boot policy error: Error creating linked manifest: code BOOTPOLICY_ERROR_ACM")
2022-08-29 20:01:13.170201+0900 0x251 Error 0x0 100 0 kernelmanagerd: Kcgen roundtrip failed: missing last checkpoint or errors found
2022-08-29 20:01:13.170242+0900 0x251 Default 0x0 100 0 kernelmanagerd: Deleting Preboot content

Any workarounds? Loading kexts on my only M1 is a hard way to develop.
Posted by lundman.
Post marked as solved
3 Replies
631 Views
I have developed a kernel extension (KEXT) for driving SCSI devices and I am able to successfully use it to send commands to the underlying device. The driver class overrides the newUserClient method which gets called whenever IOServiceOpen is called from the user space so that apps can make use of the driver. Is there any way to restrict access to this kernel extension such that only my app would be able to open a user client to access the driver and communicate with it using IOConnectCallMethod?
Post not yet marked as solved
0 Replies
492 Views
Hi, How would I go about keeping a program at full capacity even when I leave the application? I am on a Mac M1 (Ventura 13.5) and working in Unreal Engine. Specifically, I want to be able to stream my scene while I am working in it or playing it, but this requires me to leave the application, as I am working with three applications at the same time when doing so. Whenever I leave UE it becomes a background application and the frame rate drops from 60 FPS to 3. I tried renice, but although UE is prioritized, it is nevertheless treated as a background application and the FPS drops. I suspect this has something to do with energy saving, but I need to find a way around it. Does anyone have an idea of how to sort this out?
Post marked as solved
1 Replies
939 Views
Hello! We develop an iOS application, and we need to check whether the user has changed the device time in Settings. We use the function "clock_gettime_nsec_np(CLOCK_MONOTONIC_RAW)" to check the time, and it works perfectly. But I found this function only in the Discussion section of this documentation page: https://developer.apple.com/documentation/kernel/1646199-mach_continuous_time and that page is for macOS. So I want to know whether I can use this function in an iOS application that we are going to submit to the App Store, or whether we might run into problems. For example, could this function fail to work on some iPhones or some iOS versions, or could it cause problems during App Store review? Or are there alternatives to "clock_gettime_nsec_np(CLOCK_MONOTONIC_RAW)" for iOS? We tried ProcessInfo.processInfo.systemUptime, but it only counts while the device is awake, so we can't use it to check whether the user changed the device time in Settings. Thank you in advance for any help!
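The detection idea described above can be sketched as follows: sample the wall clock and a monotonic clock together, and later compare how far each has advanced. If the wall clock moved much more or less than the monotonic clock, someone (or the system) adjusted it. This is a minimal Python illustration of that arithmetic, not iOS code. The function names and the two-second tolerance are assumptions, and the comparison is only meaningful between samples taken within the same boot.

```python
import time


def take_sample():
    """Return (wall_clock_ns, monotonic_ns) read as close together as possible."""
    wall = time.time_ns()
    raw_clock = getattr(time, "CLOCK_MONOTONIC_RAW", None)
    if raw_clock is not None:
        mono = time.clock_gettime_ns(raw_clock)
    else:
        # Fallback for platforms without CLOCK_MONOTONIC_RAW.
        mono = time.monotonic_ns()
    return wall, mono


def wall_clock_jump_ns(earlier, later):
    """How much further the wall clock advanced than the monotonic clock."""
    wall_delta = later[0] - earlier[0]
    mono_delta = later[1] - earlier[1]
    return wall_delta - mono_delta


def clock_was_changed(earlier, later, tolerance_ns=2_000_000_000):
    """True if the wall clock drifted from the monotonic clock by more than the tolerance."""
    return abs(wall_clock_jump_ns(earlier, later)) > tolerance_ns


if __name__ == "__main__":
    first = take_sample()
    time.sleep(0.1)
    second = take_sample()
    print("clock changed:", clock_was_changed(first, second))
```

The tolerance exists because small automatic time corrections also move the wall clock; how large it should be is a product decision rather than something the post specifies.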
Posted by VStep.
Post not yet marked as solved
0 Replies
795 Views
I regularly see folks confused by this point so I decided to write it up in a single place. If you have questions or comments about this, start a new thread here on DevForums. Tag your thread with something appropriate for the API you’re trying to use. If nothing leaps out, Kernel is a good option [1].

IMPORTANT I don’t work for App Review and can’t make definitive statements on their behalf. All of the following is about whether an API is available for you to use, not whether App Review will approve your specific usage.

Share and Enjoy
Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] Because of the KPI aspect of this, discussed below.

Availability of Low-Level APIs

Every now and again I see questions like this:

The developer documentation has no entry for getpid. How can that not be API?

Or this:

I want to call mach_absolute_time on iOS. Its documentation says that it’s only available on macOS. But it’s in the iOS SDK and it works just fine. Is it OK for me to use it in my iOS app?

These questions arise because:

Apple Developer Documentation focuses on Apple’s frameworks.

Most low-level APIs, specifically those with a BSD heritage, are documented in man pages. These man pages aren’t available in Apple Developer Documentation (r. 16512537). For information about how to access them, see Reading UNIX Manual Pages.

Some low-level APIs are documented in both man pages and Apple Developer Documentation. A classic example of this is Dispatch, which has comprehensive man pages (start at dispatch 3) and good documentation in Apple Developer Documentation.

Some low-level APIs, like Compression, are documented in Apple Developer Documentation but have no man pages.

Some low-level APIs aren’t documented in either Apple Developer Documentation or the man pages. A classic example of this is SQLite. In such cases the best documentation may be the doc comments in the API’s headers, or on a third-party website. For Mach APIs, you often find that the best documentation is in the Darwin open source!

Some low-level APIs have equivalent KPIs (Kernel Programming Interfaces). For example, mach_absolute_time is both an API and a KPI. Many KPIs are documented in Apple Developer Documentation, in a special area known as the Kernel framework [1]. Kernel development is only supported on macOS, so those KPIs are flagged as being macOS-only. However, their equivalent APIs are typically available on Apple’s other platforms.

These questions most often crop up in the context of obscure low-level APIs, but many obviously valid low-level APIs have the same issue and no one worries about those. For example, Apple Developer Documentation has no documentation for the printf API [2], but no one asks whether it’s OK to use printf!

Given the above, it’s clear that you can’t infer anything about the availability of an API based on its state in Apple Developer Documentation. Rather, the acid test for availability is:

Is it declared in a platform SDK header?

Is it not marked as unavailable on that platform? [3]

Does the platform SDK have a stub library [4] that makes its symbol available to the linker?

If the answer to all three questions is “yes”, the API is available on that platform.

[1] The kernel does not support frameworks but bundling these KPIs into Kernel.framework within the macOS SDK makes some degree of sense from a tooling perspective, and that logic flows through to the documentation.

[2] There’s a documentation page for the KPI, but that’s not the same thing.

[3] Sorry for the double negative but it’s the only way to capture an important subtlety. If the header contains a declaration with no availability markings, that API is available on all platforms. A classic example of this is printf.

[4] If you’re not familiar with the term stub library, see An Apple Library Primer.

Hints and Tips

In general, prefer high-level APIs over low-level ones. For example, prefer Date, or even gettimeofday or clock_gettime, over mach_absolute_time. However, that’s only a general guideline. In some cases the low-level API really is the right one to use.

Just because something is an API doesn’t mean that there aren’t any restrictions on it. mach_absolute_time is a perfect example of this. Using it for highly accurate performance analysis of your code is fine, but using it for fingerprinting is not. See Describing use of required reason API.

If you can’t find adequate documentation for an API you’re using, always look in the headers for doc comments. In some cases that’s the only source of documentation. However, even if the API is reasonably well documented, the headers might contain critical tidbits that slipped through the cracks.
Posted by eskimo.
Post not yet marked as solved
12 Replies
4k Views
"kmutil load" will fail to load a 3rd party kernel extension when booted into an OS on an external drive. The same kernel extension will load fine when booting from the internal "Macintosh HD" . "kmutil inspect" also fails with the same error message when booted using an external drive, but works fine when booting with the internal "Macintosh HD". Both errors indicate the "kernelcache" file cannot be found: sh-3.2/usr/bin/kmutil load -p /Library/Extensions/XXXX.kext Error Domain=KMErrorDomain Code=71 "Could not find: Unable to get contents of boot kernel collection collection at /System/Volumes/Preboot/3B670FAA-F124-41AB-98A8-7C3940B5ECAC/boot/16FE8A65862647F7F8752DA7C4EF320E4CADEB250FCF438FB84A55F822BEB4A98108829E21658BB08B1E314BBC85169A/System/Library/Caches/com.apple.kernelcaches/kernelcache" UserInfo={NSLocalizedDescription=Could not find: Unable to get contents of boot kernel collection collection at /System/Volumes/Preboot/3B670FAA-F124-41AB-98A8-7C3940B5ECAC/boot/16FE8A65862647F7F8752DA7C4EF320E4CADEB250FCF438FB84A55F822BEB4A98108829E21658BB08B1E314BBC85169A/System/Library/Caches/com.apple.kernelcaches/kernelcache} sh-3.2/usr/bin/kmutil inspect No variant specified, falling back to release Error Domain=KMErrorDomain Code=71 "Invalid argument: Unable to read contents of file at /System/Volumes/Preboot/3B670FAA-F124-41AB-98A8-7C3940B5ECAC/boot/904DF99E6EB1281F8E510B3FFB953383F530F313CCFB30A9C1F98231A81B02BDA5E6E0D2BF26FC5D4EB3D2E226B8BC1C/System/Library/Caches/com.apple.kernelcaches/kernelcache" UserInfo={NSLocalizedDescription=Invalid argument: Unable to read contents of file at /System/Volumes/Preboot/3B670FAA-F124-41AB-98A8-7C3940B5ECAC/boot/904DF99E6EB1281F8E510B3FFB953383F530F313CCFB30A9C1F98231A81B02BDA5E6E0D2BF26FC5D4EB3D2E226B8BC1C/System/Library/Caches/com.apple.kernelcaches/kernelcache} The "kernelcache" file path does not exist. In fact, there is no "904DF..." directory under "boot". 
There is no "kernelcache" file in the entire "/System/Volumes/Preboot" mount. sh-3.2find /System/Volumes/Preboot -name kernelcache find: /System/Volumes/Preboot/.Trashes: Operation not permitted This is on an M1 MacBook Air running 11.3. Is there a special way to load kernel extensions when booting from an external drive? Thanks.
Posted by Cardano.
Post not yet marked as solved
1 Replies
746 Views
FB9108925 FB10408005 Since Apple Silicon we've seen a lot of WebDAV instability in macOS 11.x, 12.x and now 13.x that isn't found on x86 Macs. Some were fixed in earlier minor OS upgrades (e.g. webdavfs-387.100.1 that added a missing mutex init), but it's still highly unreliable. The purpose of this post is to put more focus on the bug, see if there is something else we can do to help solve this, as well as hear about potential workarounds from other people experiencing the same problems. I've got a reproducible case described below that triggers a deadlock in VFS every time, requiring a hard reboot to fully recover. Before reboot I've captured this stack trace showing the WebDAV/VFS/UBC/VM layers getting tangled up (macOS 13.2 Build 22D49 running on Macmini9,1): Thread 0x16358 1001 samples (1-1001) priority 46 (base 31) 1001 thread_start + 8 (libsystem_pthread.dylib + 7724) [0x18d2e0e2c] 1001 _pthread_start + 148 (libsystem_pthread.dylib + 28780) [0x18d2e606c] 1001 ??? (diskarbitrationd + 99400) [0x100d5c448] 1001 unmount + 8 (libsystem_kernel.dylib + 55056) [0x18d2b2710] *1001 ??? (kernel.release.t8103 + 30712) [0xfffffe00083437f8] *1001 ??? (kernel.release.t8103 + 1775524) [0xfffffe00084ed7a4] *1001 ??? (kernel.release.t8103 + 7081508) [0xfffffe00089fce24] *1001 ??? (kernel.release.t8103 + 2522264) [0xfffffe00085a3c98] *1001 ??? (kernel.release.t8103 + 2523168) [0xfffffe00085a4020] *1001 vnode_iterate + 728 (kernel.release.t8103 + 2410988) [0xfffffe00085889ec] *1001 ??? (kernel.release.t8103 + 6095404) [0xfffffe000890c22c] *1001 ??? (kernel.release.t8103 + 1097172) [0xfffffe0008447dd4] *1001 ??? (kernel.release.t8103 + 1100852) [0xfffffe0008448c34] *1001 ??? (kernel.release.t8103 + 1024804) [0xfffffe0008436324] *1001 ??? (kernel.release.t8103 + 1025092) [0xfffffe0008436444] *1001 ??? (kernel.release.t8103 + 6497736) [0xfffffe000896e5c8] *1001 ??? 
(kernel.release.t8103 + 2705840) [0xfffffe00085d09b0] *1001 webdav_vnop_pageout + 432 (com.apple.filesystems.webdav + 16920) [0xfffffe000b2db7b8] *1001 webdav_vnop_close + 64 (com.apple.filesystems.webdav + 9492) [0xfffffe000b2d9ab4] *1001 webdav_vnop_close_locked + 96 (com.apple.filesystems.webdav + 19708) [0xfffffe000b2dc29c] *1001 webdav_close_mnomap + 264 (com.apple.filesystems.webdav + 20004) [0xfffffe000b2dc3c4] *1001 webdav_fsync + 404 (com.apple.filesystems.webdav + 20484) [0xfffffe000b2dc5a4] *1001 ubc_msync + 184 (kernel.release.t8103 + 6096856) [0xfffffe000890c7d8] *1001 ??? (kernel.release.t8103 + 1097172) [0xfffffe0008447dd4] *1001 ??? (kernel.release.t8103 + 1100728) [0xfffffe0008448bb8] *1001 lck_rw_sleep + 136 (kernel.release.t8103 + 505804) [0xfffffe00083b77cc] *1001 ??? (kernel.release.t8103 + 607656) [0xfffffe00083d05a8] *1001 ??? (kernel.release.t8103 + 613952) [0xfffffe00083d1e40] I've spent countless hours reading the xnu-8792.81.2 and webdavfs-392 sources trying to understand what happens. Symbols mapped back to the source code tell me it's trying to flush a dirty mmap'ed file back to the WebDAV host when the volume is about to get unmounted, but I suspect the pageout request is triggered recursively, perhaps because the mmap'ed file has shrunk and pages need to be released? The test case: Use Finder to connect to a WebDAV volume which holds a fairly large image (200 MB Photoshop file in my case). Navigate to this file in column mode so Finder renders a preview (using a QuickLook process). I believe this mmap's the file, but that alone isn't sufficient, so I think the Finder tries to write an updated thumbnail back to the volume as well. Click the Eject icon in the Finder to unmount the volume, which now deadlocks that file system. In the end something remains unreleased in the filesystem since the unmount request never completes, so whether that's a VNode lock or just open file refcount or something else I don't know. 
Now, why this deadlock is only seen on Apple Silicon is a mystery. Is Finder/QuickLook executing different code paths for generating or storing the thumbnail? Or are there yet more cases of uninitialized mutexes/locks that happen to be accidentally functional on x86 but expose a problem on AS? I've been through a lot of kernel source code trying to find any but have come up short. But since the above is easily reproduced, I'm hoping someone with filesystem/kernel debug capability can succeed in pinpointing the bug. It's at least positive that the overall architecture works on x86, so I'm hoping it is a simple fix in the end. The reason I'm debugging this is that we've got a lot of customers running WebDAV on M1/M2, and they find Finder file copying highly unreliable (i.e. writing many files to the WebDAV server, possibly overwriting existing files; some users have reported a need to reboot 20 times a day). I'm really looking for a bug that's common to all of these tasks, not just the mmap + unmount problem, which is a minimal test case that I've cooked up in the lab. The few spindumps I've seen from end users have also included the combination of webdav_vnop_pageout + webdav_fsync + ubc_msync + lck_rw_sleep, even if unmounting wasn't the initial op that forced the deadlock. This problem has been reproduced with different WebDAV server vendors, and there is a test account on a server running Apache provided in FB10408005 (though please select the PSD file, not just the tiny JPG). Thanks in advance!