Howto: efficiently get process info?

Given a pid_t, is there an efficient way to determine what child processes it has spawned?

I found proc_listchildpids() in <libproc.h>, but there is no documentation for it. (I've been able to figure out that the argument is an array of pid_t, but as far as I can tell there's no way to know up front how much space I should allocate.)

Somewhat related: given a pid_t, is there a way to get notified when that process spawns a child process, as well as when any child process exits? (I don't know in advance what processes will be created or when they'll terminate, so I can't keep track separately.) I know that DISPATCH_SOURCE_TYPE_PROC exists, and while that's in the general area, it looks like I'd have to do a fair amount of secondary bookkeeping to keep track.

Thanks for any advice. :-)

Accepted Reply

Given a pid_t, is there an efficient way to determine what child processes it has spawned?

Ideally you’d like to be notified of these events. You can use DISPATCH_PROC_{FORK,EXEC,EXIT} for this [1]. As with most Dispatch stuff, you need to use the notification as a trigger and then use some other API to catch up with reality. Which brings us to…

I found proc_listchildpids() in <libproc.h>, but there is no documentation for it.

The situation with libproc documentation is less than ideal. Even the techniques described in Availability of Low-Level APIs don’t help. I encourage you to file a bug requesting better docs.

Note: It may be better to file your bug against macOS rather than against the developer documentation. This sort of stuff is usually documented in doc comments or man pages.

If you do file a bug, please post your bug number, just for the record.

On the plus side, the libproc code is in Darwin, nested in XNU source.

as far as I can tell there's no way to know up front how much space I should allocate.

Right. This is one of those Unix-y APIs that require you to guess and then tell if you’re wrong )-: I see a few choices here:

  • Do the ‘call it twice’ dance.

  • Pick a number that’s likely to work and call it again if you were wrong. The vast majority of processes only have a few active children at a time, so a number like 100 will work in most cases.

  • Allocate a buffer based on the value returned by the kern.maxproc sysctl. That value is currently 9,000 so the buffer wouldn’t be too big. And, as you’re doing this a bunch, you could reuse it.

As to what the ‘call it twice’ dance actually looks like, the call doesn’t return the total number of child processes but the number of process IDs it actually wrote to your buffer. So, if the return value times sizeof(pid_t) matches the buffer size, the list may have been truncated and you know you have to call it again with a bigger buffer.
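A minimal sketch of that grow-and-retry loop, in C. The names copy_child_pids, child_lister, libproc_lister and fake_lister are hypothetical, as are the starting guess of 16 and the upper limit. Because <libproc.h> is undocumented, treating a negative return as failure is an assumption. The lister is passed in as a function pointer so the loop can be exercised with a stand-in on systems without libproc.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

#ifdef __APPLE__
#include <libproc.h>
#endif

/* Same shape as proc_listchildpids(): fills `buffer` (`buffersize` bytes)
 * and returns the number of pids written. */
typedef int (*child_lister)(pid_t ppid, void *buffer, int buffersize);

/* Returns a malloc'd array of child pids and stores its length in *count.
 * Returns NULL on failure. The caller frees the result. */
pid_t *copy_child_pids(child_lister list, pid_t ppid, int *count) {
    assert(list != NULL && count != NULL);
    int capacity = 16; /* starting guess, in pids */
    while (capacity <= (1 << 20)) {
        pid_t *buffer = malloc((size_t)capacity * sizeof(pid_t));
        if (buffer == NULL) return NULL;
        int n = list(ppid, buffer, (int)((size_t)capacity * sizeof(pid_t)));
        if (n < 0) { /* assumption: negative means failure */
            free(buffer);
            return NULL;
        }
        if (n < capacity) { /* spare room left, so nothing was cut off */
            *count = n;
            return buffer;
        }
        /* The buffer came back full, so the list may be truncated. */
        free(buffer);
        capacity *= 2;
    }
    return NULL; /* implausibly many children; give up */
}

#ifdef __APPLE__
/* Thin wrapper so the real call fits the child_lister shape. */
int libproc_lister(pid_t ppid, void *buffer, int buffersize) {
    return proc_listchildpids(ppid, buffer, buffersize);
}
#endif

/* Stand-in lister: pretends every process has 40 children, pids 1000..1039. */
int fake_lister(pid_t ppid, void *buffer, int buffersize) {
    (void)ppid;
    pid_t *pids = buffer;
    int room = (int)((size_t)buffersize / sizeof(pid_t));
    int n = room < 40 ? room : 40;
    for (int i = 0; i < n; i++) pids[i] = (pid_t)(1000 + i);
    return n;
}
```

On macOS you’d call copy_child_pids(libproc_lister, pid, &count). The list is a snapshot, so it can be stale by the time you look at it.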

given a pid_t, is there a way to get notified when that process spawns a child process … ?

See above.
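For reference, setting up such a source might look roughly like this in C. This is a sketch, not a tested recipe: watch_process and on_proc_event are hypothetical names, the function-pointer handler variant is used so that no blocks support is needed, and the non-Apple branch is only there so the file still compiles elsewhere.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#ifdef __APPLE__
#include <dispatch/dispatch.h>

/* Event handler. `context` is the source itself (set in watch_process). */
static void on_proc_event(void *context) {
    dispatch_source_t source = (dispatch_source_t)context;
    unsigned long events = dispatch_source_get_data(source);
    pid_t pid = (pid_t)dispatch_source_get_handle(source);

    if (events & DISPATCH_PROC_FORK) {
        /* Only a trigger: re-list the children to learn what changed. */
        printf("pid %d forked\n", (int)pid);
    }
    if (events & DISPATCH_PROC_EXIT) {
        printf("pid %d exited\n", (int)pid);
        /* A real program would also release the source, for example
         * from a cancel handler. */
        dispatch_source_cancel(source);
    }
}
#endif

/* Starts watching `pid` for fork and exit events.
 * Returns 0 on success, -1 on failure. */
int watch_process(pid_t pid) {
    assert(pid > 0);
#ifdef __APPLE__
    dispatch_queue_t queue =
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_source_t source = dispatch_source_create(
        DISPATCH_SOURCE_TYPE_PROC, (uintptr_t)pid,
        DISPATCH_PROC_FORK | DISPATCH_PROC_EXIT, queue);
    if (source == NULL) return -1;
    dispatch_set_context(source, source);
    dispatch_source_set_event_handler_f(source, on_proc_event);
    dispatch_resume(source);
    return 0;
#else
    return -1; /* DISPATCH_SOURCE_TYPE_PROC is only available on Apple platforms */
#endif
}
```

The process has to stay alive for events to arrive, for example by calling dispatch_main() or running a run loop. Note that the source watches only the one pid you give it, so you’d need a new source for each child you discover.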

while that's in the general area, it looks like I'd have to do a fair amount of secondary bookkeeping to keep track.

Yep. As I said, the nature of Dispatch makes this inevitable. The core issue is that these are notifications, which means that event N+1 can happen before you’re notified of event N. For example, a process can fork a child and that child can exit before you’ve even been told about the fork.
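The bookkeeping itself can be fairly small. One possible approach, sketched here in C, is to keep the last list of children you saw and, each time a notification fires, compare it against a fresh list. The name diff_children and its parameter layout are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

static int compare_pids(const void *a, const void *b) {
    pid_t x = *(const pid_t *)a;
    pid_t y = *(const pid_t *)b;
    return (x > y) - (x < y);
}

/* Compares the previously known children with a fresh snapshot.
 * Sorts both input arrays in place. Pids found only in `current` go to
 * `appeared`; pids found only in `known` go to `vanished`. The caller
 * provides room for n_current and n_known entries respectively. */
void diff_children(pid_t *known, int n_known,
                   pid_t *current, int n_current,
                   pid_t *appeared, int *n_appeared,
                   pid_t *vanished, int *n_vanished) {
    assert(n_known >= 0 && n_current >= 0);
    qsort(known, (size_t)n_known, sizeof(pid_t), compare_pids);
    qsort(current, (size_t)n_current, sizeof(pid_t), compare_pids);

    int i = 0, j = 0;
    *n_appeared = 0;
    *n_vanished = 0;
    while (i < n_known || j < n_current) {
        if (j == n_current || (i < n_known && known[i] < current[j])) {
            vanished[(*n_vanished)++] = known[i++]; /* gone since last look */
        } else if (i == n_known || current[j] < known[i]) {
            appeared[(*n_appeared)++] = current[j++]; /* new since last look */
        } else {
            i++; /* present in both lists */
            j++;
        }
    }
}
```

This only sees what survives between two snapshots. A child that is forked and exits in between never shows up, and a reused pid looks like the same child, which is exactly the kind of gap described above.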


Another way to slice this problem would be to use an authorisation mechanism rather than a notification mechanism. The go-to authorisation mechanism for this stuff is Endpoint Security. However, based on my best guess as to what you’re doing with this, I suspect that won’t work for you.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] I’m surprised that works. Historically this (well, the underlying kqueue primitive) was just not implemented. But, hey, I guess we fixed that in the decades since I last tried this (-:

Replies

Thanks for the guidance as always, I will try to muddle through. :-) I have filed FB13459188 regarding the absence of documentation for the APIs in <libproc.h>, and I will do some exploration with the dispatch sources.