In this guest blog post, my good friend Mikhail Sosonkin (@hexlogic) reverses Apple's screencapture utility in order to peek behind the (figurative) curtain and uncover how it works. He also looks at some Mac malware from 2013 that captured desktop images, and suggests methods for detecting screen capturing!
Enjoy his writeup: "Who Moved My Pixels?!"
How does the (screen) capture work?
If you wish to reproduce or follow the steps I've taken, linked below are the binaries that I used for the reverse engineering. The binaries are from MacOS High Sierra version 10.13.3.
MacOS comes with a utility for capturing the screen pixels into an image file: /usr/sbin/screencapture. It is a useful utility and, I'm guessing, screencapture is what gets executed when I press the right key combinations on the desktop to take full or partial screenshots. So, I decided to reverse it and see how it actually does the capturing. Turns out it wasn't so complicated.
Let's start the trace at the very beginning, where the command line arguments are processed; see __text:100002640 and __text:10000287E. There is a good chance that this is where we should start tracing.
To be user friendly, the utility plays a shutter sound to indicate that the screen has been captured. So, I turned up my speakers and started debugging! The sound would serve as a guiding light to help narrow down the useful code.
__text:100003D20 take_the_screenshot proc near
...
__text:100003D80 cmp cs:byte_100012553, 0
__text:100003D87 jnz short loc_100003D8E
__text:100003D89 call playTheScreenshotSound
Unfortunately, the sound is played very early in the process. At least, when I hear the sound, I know I'm on the right path.
I know this is the sound playing function because it is essentially the wrapper to these calls (below). Also, because I can hear the sound after the functions finish execution!
__text:100007DC0 call _AudioServicesSetProperty
__text:100007DC5 mov edi, [rbx]
__text:100007DC7 call _AudioServicesPlaySystemSound
Let's go back to the take_the_screenshot function (where the sound is played). Using a debugger, I step through a bunch of instructions (tedious!) when I notice a function that calls _CGDisplayCreateImage of the CoreGraphics framework. That looks promising!
I named this function doCapture but at this point I'm not 100% certain if the name is accurate. However, without going into that function, I notice that the calls after doCapture, within the take_the_screenshot function, record an image to disk. I'm guessing the image being written to disk is the screenshot in question. Seems like a reasonable assumption, so I decided to follow that thread.
I named this function writeImageToDisk. And if you look inside, there are all sorts of references to recording images to a file on disk. Particularly interesting are the error messages:
__text:1000073D9 lea rax, cfstr_YouDontHavePer
; "You dont have permission to save files \
in the location where screen shots are stored."
__text:1000073E0 mov cs:qword_1000139F8, rax
__text:1000073E7 mov rax, cs:___stderrp_ptr
__text:1000073EE mov rdi, [rax] ; FILE *
__text:1000073F1 lea rsi, aScreencaptur_6 ; "screencapture: cannot write file to int"
And so, this is more support that doCapture is the function that does all the interesting bits. Let's keep the name and dig into it some more.
__text:1000052EA call _CGRectIsEmpty
__text:1000052EF test al, al
__text:1000052F1 jz short loc_100005312
__text:1000052F3 mov edi, r12d
__text:1000052F6 call _CGDisplayCreateImage ; Returns an image containing the contents \
of the specified display
__text:1000052FB mov r15, rax
__text:1000052FE lea rbx, [rbp+var_C0]
__text:100005305 mov rdi, rbx
__text:100005308 mov esi, r12d
__text:10000530B call _CGDisplayBounds
__text:100005310 jmp short loc_100005343
CGDisplayCreateImage looks promising, but at this point it could have a number of meanings. However, I'm a reverse engineer, and I'm not afraid of going down a few rabbit holes! Well, this function is actually just a stub:
Ummm, what? That function doesn't look like it does anything useful! Worse, it does not look like it can even execute. What's going on here? Well, we go to our trusty LLDB debugger! Obviously, there is some sort of a runtime linking mechanism that replaces the CoreGraphics function with something else.
Looking at the stack trace, it becomes obvious that the actual implementation used is actually the similarly named SLDisplayCreateImage function from the SkyLight private framework. So, what we saw in the CoreGraphics framework was some sort of a stub - makes sense, since there is non-executable content in there! Let's keep digging :-)
Looking at the assembly of _SLDisplayCreateImage, I can see that it is essentially a wrapper function for _SLSHWCaptureDesktop.
Intuitively, I'd expect that the actual contents for the screen pixels will be in a buffer of some service. So, I would not expect the user application to access that buffer directly in order to capture an image. That means there should be some sort of an IPC mechanism between the user application and the GUI service. On OS X, IPC means MACH PORTS [0].
Below is the disassembly of the section of the function that sends a mach port message to the GUI Service in order to obtain the actual pixel content.
Looking at the references to _bootstrap_look_up2, two names show up that look interesting:
com.apple.windowserver.active
com.apple.windowserver
We need to find out which service publishes the ports with these names. I wasn't quite sure how to do that directly, so I took a slightly different approach: I set a breakpoint on _mach_msg and looked at the message header to obtain the remote port number:
Loading the WindowServer in IDAPro, I can see that it uses the same framework at its core as the screencapture utility. That's kinda cool!
The WindowServer program is basically a simple wrapper for the functionality in the SkyLight library that I've been analyzing all this time. This makes life easier in many ways. So, I looked for a corresponding capture function - just thinking that one should exist by, perhaps, a slightly different name. Doing a simple text search, I found _XHWCaptureDesktop. Without hesitation, I attached the debugger and set a breakpoint. This is the resulting backtrace which looks super interesting!
Setting a breakpoint on _XHWCaptureDesktop and triggering a screencapture, we get a nice trace that confirms the theory! This is great because if we want to keep an eye on who takes screenshots on the system, we can just look for calls to this function!
Detecting a screenshot
After analyzing the process of how the screencapture utility works, I became curious if there was a way to detect when my screen gets captured. One mechanism is to use the mdfind utility. This is what Dave DeLong used [6] in his method. However, it seems to depend on the capture utility to generate an image file and set the kMDItemIsScreenCapture = 1 attribute within the file. Fairly certain that malware wouldn't do that. Well, unless you're developing KitM.A malware (see the Malware section)! This section is my exploration for how to perform detection of someone capturing the pixels off of my screen using the method reverse engineered in this article.
Detecting if some process has requested a screen shot is actually quite easy with the right tools. Using lldb is too heavy and we don't really want to breakpoint a service that is being used. So, instead I decided to use Frida [1]. It is a great tool for dynamic analysis and uses techniques similar to those that would be applied by a production endpoint security tool.
# sudo frida-trace -a 'SkyLight!43287' WindowServer
Instrumenting functions...
sub_43287: Auto-generated handler at
"./__handlers__/SkyLight/sub_43287.js"
Started tracing 1 function. Press Ctrl+C to stop.
/* TID 0x307 */
6791 ms sub_43287()
For some reason Frida would not resolve the _XHWCaptureDesktop function; however, I was able to specify it by its offset into the dynamic library. The name resolution failure is probably some sort of a bug within Frida, because all the other tools I've used (IDAPro, lldb, nm) have resolved the symbol just fine.
Luckily for us, the mach message that contains the request from the client is passed in as an argument to the _XHWCaptureDesktop function. The pointer is passed in the RDI register.
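To make the header fields concrete, here is a minimal sketch of decoding the first 24 bytes that RDI points to. It assumes the layout of mach_msg_header_t on 64-bit macOS (six 32-bit fields, little-endian). The names msg_header_view and parse_msg_header are hypothetical, introduced only for this illustration, and the struct is redeclared locally so the snippet builds without the Mach headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Local mirror of the mach_msg_header_t fields (hypothetical type name).
 * Assumption: six consecutive 32-bit little-endian fields, 24 bytes total. */
typedef struct {
    uint32_t bits;         /* msgh_bits */
    uint32_t size;         /* msgh_size */
    uint32_t remote_port;  /* msgh_remote_port */
    uint32_t local_port;   /* msgh_local_port */
    uint32_t voucher_port; /* msgh_voucher_port */
    int32_t  id;           /* msgh_id: selects the RPC routine */
} msg_header_view;

/* Read one little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode a header from raw bytes (for example, a memory dump taken at the
 * address held in RDI). Returns 0 on success, -1 on bad input. */
int parse_msg_header(const uint8_t *buf, size_t len, msg_header_view *out) {
    if (buf == NULL || out == NULL || len < 24) {
        return -1;
    }
    out->bits         = read_le32(buf + 0);
    out->size         = read_le32(buf + 4);
    out->remote_port  = read_le32(buf + 8);
    out->local_port   = read_le32(buf + 12);
    out->voucher_port = read_le32(buf + 16);
    out->id           = (int32_t)read_le32(buf + 20);
    return 0;
}

/* Print the fields that matter for tracking down the requester. */
void print_msg_header(const msg_header_view *h) {
    printf("msgh_id=0x%08x local=0x%08x remote=0x%08x size=%u\n",
           (unsigned)h->id, (unsigned)h->local_port,
           (unsigned)h->remote_port, (unsigned)h->size);
}
```

Feeding this the 24 bytes read from the pointer in RDI and calling print_msg_header would show the message ID and port names in one line, which is all that is needed for the lsmp lookup that follows.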
We can see that the message ID is 0x0000732a (see the pseudocode above, in the screencapture reverse engineering section, for details) and that the local port is 0x000153ab, which is the port this request was sent from. Let's use lsmp to track this port.
# lsmp -a
Process (4460) : screencapture
name ipc-object rights identifier type
--------- ---------- ---------- -------- -----
0x00000103 0x56e57599 send TASK SELF (4460) screencapture
...
0x00000603 0x56e56729 recv
+ send-once 0x000153ab (157) WindowServer
There's not really a good way to format the output of lsmp, but if you scroll to the side you will see that 0x000153ab is connected to the WindowServer process. This is how we can derive the PID of the process that made the request.
Just to confirm, we can also see that the WindowServer process has a reference to this port as well:
Process (157) : WindowServer
name ipc-object rights identifier type
--------- ---------- ---------- -------- -----
0x000153ab 0x56e56729 send-once 0x00000603 (4460) screencapture
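The lookup above can be automated. The following is a rough sketch, assuming the abridged lsmp layout shown in these excerpts (real lsmp output has more columns, so treat the parsing as illustrative only). The function name lsmp_peer_pid is hypothetical. It searches only inside one process's section, since port names are local to each process.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: in lsmp-style text, find the row for port_name inside
 * the section of process owner_pid and return the PID printed in parentheses
 * on that row. The peer's name is copied to peer_name if it is non-NULL.
 * Returns -1 if no matching row is found. */
int lsmp_peer_pid(const char *text, int owner_pid, unsigned long port_name,
                  char *peer_name, size_t peer_name_len)
{
    int current_pid = -1;
    const char *p = text;

    while (p != NULL && *p != '\0') {
        /* Copy the current line into a local, NUL-terminated buffer. */
        const char *eol = strchr(p, '\n');
        size_t n = eol ? (size_t)(eol - p) : strlen(p);
        char line[256];
        if (n >= sizeof line) {
            n = sizeof line - 1;
        }
        memcpy(line, p, n);
        line[n] = '\0';
        p = eol ? eol + 1 : NULL;

        const char *s = line + strspn(line, " \t");
        int pid = -1;

        /* "Process (157) : WindowServer" starts a new section. */
        if (sscanf(s, "Process (%d)", &pid) == 1) {
            current_pid = pid;
            continue;
        }
        if (current_pid != owner_pid) {
            continue;
        }
        /* Data rows start with the port name in hex. */
        if (strncmp(s, "0x", 2) != 0 || strtoul(s, NULL, 16) != port_name) {
            continue;
        }
        /* The peer appears as "(pid) name" near the end of the row. */
        const char *paren = strchr(s, '(');
        char name[64] = "";
        if (paren != NULL && sscanf(paren, "(%d) %63s", &pid, name) >= 1) {
            if (peer_name != NULL && peer_name_len > 0) {
                snprintf(peer_name, peer_name_len, "%s", name);
            }
            return pid;
        }
    }
    return -1;
}
```

With the WindowServer excerpt above as input, asking for port 0x153ab in the section of PID 157 yields the requesting process's PID.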
Malware
Back in 2013 there was some malware that had 'screenshotting' as one of its features. I obtained this sample (called MAC.OSX.Backdoor.KitM.A by F-Secure), and by now, it is detected by everyone. You can download it here: KitM.zip (password infect3d).
Doing some quick reverse engineering, it's easy to see that the malware actually uses the screencapture utility that comes with the OS. It generates the screenshot images and uploads them somewhere. What's interesting is that it means these screen capture images could be found using the mdfind kMDItemIsScreenCapture:1 command.
Building the grabber
Let's say I was a Russian Hacker and I wanted to covertly steal your pixels. Using the screencapture utility would work, but I don't want to give myself away by playing the shutter sound. Luckily for me there's a super easy way of doing it myself! All I have to do is use the right libraries that are already on every OS X instance.
As it turns out, there is more than one way to grab screen pixels. In his blog [7], Felix Krause uses the CGWindowListCreateImage function to capture the image. He goes a step further and actually sends the image through an OCR tool to extract the text. Cool! Below is my code for leveraging the same mechanism that the screencapture utility uses, as revealed in the previous section.
#include <stdio.h>
#include <CoreFoundation/CoreFoundation.h>
#include <CoreGraphics/CoreGraphics.h>
#include <ImageIO/ImageIO.h>

void doCGCapture(void) {
    CGDirectDisplayID displays[256];
    uint32_t dispCount = 0;

    // get a list of all active displays
    if (CGGetActiveDisplayList(256, displays, &dispCount) != kCGErrorSuccess) {
        printf("Error getting display list\n");
        return;
    }

    // iterate screens and take the screenshots
    for (uint32_t i = 0; i < dispCount; i++) {
        CGDirectDisplayID dispId = displays[i];

        // get the raw pixels
        CGImageRef img = CGDisplayCreateImage(dispId);
        if (img == NULL) {
            printf("Failed to capture display %u\n", i);
            continue;
        }

        char path_str[1024];
        snprintf(path_str, sizeof(path_str), "./image%u.png", i);

        // output file; path_str is built at runtime, so create a real
        // CFString instead of using the constant-string helper
        CFStringRef cfPath =
            CFStringCreateWithCString(NULL, path_str, kCFStringEncodingUTF8);
        CFURLRef path =
            CFURLCreateWithFileSystemPath(NULL, cfPath,
                                          kCFURLPOSIXPathStyle, false);

        // file/format to save pixels to
        CGImageDestinationRef destination =
            CGImageDestinationCreateWithURL(
                path, CFSTR("public.png"), 1, NULL); //[4]

        if (destination != NULL) {
            // add our captured pixels
            CGImageDestinationAddImage(destination, img, NULL);
            // generate the image
            if (!CGImageDestinationFinalize(destination)) {
                printf("Failed to finalize\n");
            }
            CFRelease(destination);
        } else {
            printf("Failed to create image destination\n");
        }

        // release everything created in this iteration
        CFRelease(path);
        CFRelease(cfPath);
        CGImageRelease(img);
    }
}
Let's see how this code works in action:
This first video shows screen capturing via the command line or SSH.
Now, a second video:
This shows the same thing via a Cocoa App that is running from within a very restrictive sandbox: the sandbox configuration that you would get if you got an application from the App Store. Below is the screenshot of the sandbox configuration that the app was built with. I know it was taking effect because I had to allow the App to store files in the Downloads folder; otherwise it would get blocked by the sandbox.
No other permission was given to the App. By default the App pretty much cannot do anything on the system. This means that malware could come fully sandboxed and still steal your precious pixels!
As far as I could tell, pretty much any user and any process that has access to the GUI window server (which is a lot!) can request all the pixels. The closest thing to prevention that I found was to use the sandbox mechanism, via the sandbox-exec command, with a strongly defined policy.
I'm not really an OS X expert, but I read some blogs [2]. There I found that OSXReverser has developed a manual on how to configure the sandbox. The closest thing I could find was to prevent the process from looking up the WindowServer port via its name. However, this is not a practical mechanism because lots of applications will want to access the GUI and, more important, port numbers aren't that hard to bruteforce!
Instead, I really wish there was a mechanism to block mach messages with a specific message ID. For example, something like this:
(deny mach-msg
(mach-msg-id 0x732a))
Dare I say that we need a way to do deep message inspection and filtering on OS X? Ideally, there should be a mechanism where the WindowServer could white list the processes that are allowed to call certain RPC functions.
This way not every process would be allowed to steal pixels. Pixels that could contain private, confidential information like banking records, secret keys, or plans to the Lockheed Martin F-35 Lightning II [3]!
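As a thought experiment, the kind of check being wished for could look like the sketch below. Everything here is hypothetical: no such hook exists in the WindowServer as far as this post establishes, the function name capture_request_allowed is made up, and identifying callers by PID is a simplification chosen only to keep the example short. The only value taken from the analysis above is the message ID 0x732a.

```c
#include <stddef.h>
#include <stdint.h>

/* Message ID of the capture request observed earlier in this post. */
#define CAPTURE_DESKTOP_MSG_ID 0x732a

/* Hypothetical policy check: messages with any other ID pass through
 * untouched; a capture request is allowed only when the sender's PID is on
 * the allow list. Returns 1 if the message may proceed, 0 if it is denied. */
int capture_request_allowed(int32_t msg_id, int sender_pid,
                            const int *allowed_pids, size_t allowed_count)
{
    if (msg_id != CAPTURE_DESKTOP_MSG_ID) {
        return 1; /* not a capture request, nothing to enforce */
    }
    for (size_t i = 0; i < allowed_count; i++) {
        if (allowed_pids[i] == sender_pid) {
            return 1; /* sender is explicitly allowed to capture */
        }
    }
    return 0; /* capture request from a process that is not on the list */
}
```

With an allow list containing only the PID of screencapture, a capture request from any other process would be refused while all other window server traffic stays unaffected.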