Monday, December 22, 2014

Working with Visual Studio 2012 & pintools

Hello there fellow pinheads, this post is going to be short, but I think it will be useful for generations of pinheads to come.

I've been working with PIN for quite a long time now and I've reached the conclusion that having to edit the example Visual Studio solution is too annoying. It is also a problem for newcomers, as editing the sample solution may lead to problems that are difficult to debug.

For that reason I've delved into VS2012's dungeons and come up with a template that generates the skeleton of a pintool and all the build targets that make sense on Windows, that is, the 64-bit and 32-bit targets.

Installation is quite easy: you just need to download the template zip file from the link below and copy it to the following directory:
C:\Users\agustin\Documents\Visual Studio 2012\Templates\ProjectTemplates\Visual C++ Project
You also need to add to your environment the variable PIN_ROOT pointing to the installation directory of pin. Make sure to give the absolute path without spaces. This allows us to create our pintools outside pin's tools directory.
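A quick sanity check for the variable can save some head scratching. The following is only a sketch in Python: the helper name `check_pin_root` is made up for this example, and it tests nothing beyond the two requirements mentioned above (absolute path, no spaces).

```python
import ntpath  # Windows path rules, usable from any OS
import os


def check_pin_root(value):
    """Return a list of problems found in a candidate PIN_ROOT value."""
    if not value:
        return ["PIN_ROOT is not set"]
    problems = []
    if " " in value:
        problems.append("PIN_ROOT contains spaces")
    if not ntpath.isabs(value):
        problems.append("PIN_ROOT is not an absolute path")
    return problems


if __name__ == "__main__":
    # Report on the value found in the current environment.
    for problem in check_pin_root(os.environ.get("PIN_ROOT", "")):
        print(problem)
```

An empty list means the value looks usable; it does not prove that pin is actually installed there.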

The template can be downloaded here. It is important to know that pin is quite averse to version changes, so make sure you have at least pin Version 2.13 Revision 65163.

Let me know what you think. Cheers.

Sunday, May 06, 2012

IDA Pro 6.2 + Ubuntu 12.04 AMD64

Installing IDA Pro on Linux (AMD64) can be a pain. In previous versions of Ubuntu, what I had to do was build an IA32 chroot environment (following this guide). While effective, this is in my opinion not ideal.

Fortunately, in the latest version of Ubuntu it is possible to install almost all the IA32 dependencies by hand following a simple scheme.

First we need to see which dynamic libraries are not found by the loader. To do so we can use the `ldd` command and filter its output for the missing libraries:

$ ldd idaq | grep found
=> not found
=> not found
=> not found
=> not found
=> not found
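If you prefer to collect the names programmatically, the same filtering can be sketched in Python. The function name `missing_libraries` and the library names in the sample text are made up for illustration; only the `name => not found` line shape comes from `ldd` itself.

```python
def missing_libraries(ldd_output):
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, arrow, target = line.partition("=>")
        if arrow and target.strip() == "not found":
            missing.append(name.strip())
    return missing


if __name__ == "__main__":
    # Hypothetical ldd output, only to show the expected input format.
    sample = (
        "\tlibexample.so.1 => not found\n"
        "\tlibc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf7500000)\n"
    )
    print(missing_libraries(sample))
```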

Once we have the list of missing libraries, we need to find out which package each one comes from. One simple way is to use `dpkg`. So for each of the missing libraries we proceed like this:

$ dpkg -S
libxext6: /usr/lib/x86_64-linux-gnu/
As we can see, the file is provided by `libxext6`, but we need to take into account that we need the IA32 versions of the libraries.
Fortunately, Ubuntu does allow us to install both versions and it is just a matter of adding ":i386" at the end of the package name.

$ sudo apt-get install libxext6:i386
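With several libraries missing, the package name mapping gets repetitive. Here is a small Python sketch of that step; the helper name `i386_package` and the sample path are assumptions for illustration, and the only thing it relies on is that `dpkg -S` prints the package name before the first ": ".

```python
def i386_package(dpkg_line):
    """Turn one line of 'dpkg -S' output into the matching :i386 package name."""
    package = dpkg_line.split(": ", 1)[0]
    # Drop an architecture qualifier such as ':amd64' if dpkg printed one.
    package = package.split(":", 1)[0]
    return package + ":i386"


if __name__ == "__main__":
    # Hypothetical dpkg -S output line.
    line = "libxext6: /usr/lib/x86_64-linux-gnu/libXext.so.6"
    print("sudo apt-get install " + i386_package(line))
```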
Once we have installed each one of the libraries, IDA Pro will fire up, but we will receive a disappointing message saying that the IDAPython plugin is not working due to missing dependencies.
dlopen(/home/agustin/opt/idapro/plugins/python.plx): cannot open shared object file: No such file or directory
/home/agustin/opt/idapro/plugins/python.plx: can't load file
We need to proceed in the same way as before, but there is a slight difference. We need a dynamic library that comes from Python 2.6 and, as the release notes say, Python 2.6 has been deprecated.

In a previous iteration of this blog entry what I did was to download these packages from an old Ubuntu repository. This was not ideal since I always ended up breaking some dependencies and the package manager was not happy about it.
So I took another route and built it from source. The steps you need to follow are described below and need to be issued in the Python 2.6 source directory:

$ CC="gcc -m32" LDFLAGS="-L/lib32 -L/usr/lib32 \
-L`pwd`/lib32 -Wl,-rpath,/lib32 -Wl,-rpath,/usr/lib32" \
 ./configure --prefix=/opt/pym32 --enable-shared
$ make -j 8
$ sudo make install
This will install python in the directory `/opt/pym32` along with all the needed shared libraries for IDAPython to run.
The last step is to tell the loader where those libraries are. There are multiple options, but for simplicity's sake I chose to export the environment variable `LD_LIBRARY_PATH` and make it point to `/opt/pym32/lib`:
$ export LD_LIBRARY_PATH="/opt/pym32/lib"
$ /home/agustin/opt/ida/idaq64
And that's it: you now have a running copy of IDA Pro, with IDAPython working as it should.

Friday, March 04, 2011

The rise of the undead (WebKit Heap Exploitation)

I remember one day Pablo told me, "forget about attacking metadata, that just won't help you" (or something along those lines, sorry about the inaccuracy).
He was right, as he almost always is, in that case.

Long gone are the days when heap exploitation was all about attacking (overwriting) the internal structures of the heap allocator.

But are they really gone?

It is well known that nowadays attacking metadata on the Windows heap is a real pain in the rear. That's one of the reasons most current heap research papers focus on crafting the heap layout in order to exploit use-after-free conditions (there are some great exceptions, like Chris Valasek's research on the Low Fragmentation Heap).

But heap overflows are not restricted just to the operating system heap implementation. There are plenty of custom heap allocators in the world of Open Source Software and some of them are widely used (without most of the people noticing).

Our research case was TCMalloc. The reason? Mainly because it is used by WebKit, one of the most popular web browser engines out there. WebKit is responsible for rendering web pages in Chrome, Safari and a huge number of smartphones (yes, Android uses it).
I don't like to lie to myself, owning Bas's phone was enough reason to do the research, the rest was just a bonus.

Understanding the heap allocator (as usual) was the first step in the right direction that drove the whole research to a good end.

TCMalloc is a heap allocator initially designed by Sanjay Ghemawat and Paul Menage from Google. It was designed with two main premises in mind: speed and multi-threaded environments. These two premises drove the whole design of the internal structures to be almost lock free (in order to reduce lock contention) and thread local.
From an attacker's perspective, this means that we are going to need to control a lot of factors if we want to succeed at exploiting vulnerabilities that involve corruption of information between threads.

The allocator itself is not complex. It is divided into three main subsystems:
  • Page Allocator
  • Central Cache Allocator
  • Thread Allocator

The main place where memory is obtained is at the Page Allocator. This layer of abstraction deals with the system level memory allocators. It can allocate memory from several places depending on which operating system you are on. The options are VirtualAlloc, sbrk, mmap or even /dev/mem.
At this level, memory is handled in integral multiples of a page. These groups of pages are called Spans in TCMalloc terminology.
A global array of lists keeps track of each span the system has allocated. This array is indexed by the number of pages the span contains. That is, page_heap_freelist[N] will contain a list of free spans of N pages each.
There are 256 entries in the page_heap_freelist (the exact number depends on whether you are looking at Chrome, Safari, etc.). Spans with more than 256 pages are handled by a separate free list of what are called "really large spans".
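To make the indexing scheme concrete, here is a toy model in Python. It is a sketch of the idea described above, not TCMalloc's code: the class and method names are invented for the example, spans are plain `(start_page, num_pages)` tuples, and a real page heap would also split an oversized span and keep the remainder, which this model skips.

```python
MAX_PAGES = 256  # number of exact-size lists mentioned above; it varies per build


class PageHeap:
    """Toy page heap: one free list per span length, plus a list for large spans."""

    def __init__(self):
        # free_lists[n] holds free spans of exactly n pages (index 0 is unused).
        self.free_lists = [[] for _ in range(MAX_PAGES + 1)]
        self.large = []  # "really large" spans, longer than MAX_PAGES pages

    def insert_free_span(self, start_page, num_pages):
        span = (start_page, num_pages)
        if num_pages <= MAX_PAGES:
            self.free_lists[num_pages].append(span)
        else:
            self.large.append(span)

    def take_span(self, num_pages):
        """Return a free span of at least num_pages pages, or None."""
        for n in range(num_pages, MAX_PAGES + 1):
            if self.free_lists[n]:
                return self.free_lists[n].pop()
        for i, span in enumerate(self.large):
            if span[1] >= num_pages:
                return self.large.pop(i)
        return None
```

The point to take away is that finding a free span of a given length is an array lookup followed by a walk towards larger sizes, with the large list as a last resort.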

Spans serve two main purposes:
  • to hold a large object
  • to be used by the central cache

Memory in multiples of the page size is a little bit inconvenient to use directly from the application. The Central Cache allocator is in charge of making those Spans usable by the application.
The central cache keeps track of the status of spans and also splits them into units that are more manageable by the application.
To do so, the central heap has an array of free lists indexed by size class. Each free list contains a set of spans which are split into objects of the corresponding size class.
Those objects could have been handed directly to the application, but remember that this allocator is thread oriented, so in order to serve multiple threads, many of the structures of the Central Heap must be guarded by a lock to prevent races and other stuff that we *do not like*.

That's the reason why there is another level of abstraction. The Thread Heap is a per thread structure where the central allocator stores memory chunks. Because of the nature of per thread structures, there is no need to hold locks to access them.
The Thread Cache consists mainly of an array of free lists indexed by size class. These free lists are populated with objects by the Central Cache allocator and consumed by each thread that needs some objects.
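The relationship between the two layers can be sketched with a toy model in Python. Everything here is a simplification for illustration, not TCMalloc's implementation: the class names, the batch size and the use of the object size in bytes as the size class key are all assumptions, and objects are represented by plain integer addresses.

```python
import threading


class CentralCache:
    """Shared between threads, so every access takes the lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.free_lists = {}  # size class -> list of free object addresses

    def add_span(self, size_class, base, span_bytes):
        # Split a span into back-to-back objects of size_class bytes.
        with self.lock:
            objects = self.free_lists.setdefault(size_class, [])
            objects.extend(range(base, base + span_bytes - size_class + 1, size_class))

    def fetch_batch(self, size_class, count):
        with self.lock:
            objects = self.free_lists.get(size_class, [])
            self.free_lists[size_class] = objects[count:]
            return objects[:count]


class ThreadCache:
    """Owned by a single thread, so its free lists need no locking."""

    def __init__(self, central, batch=4):
        self.central = central
        self.batch = batch
        self.free_lists = {}

    def allocate(self, size_class):
        objects = self.free_lists.setdefault(size_class, [])
        if not objects:
            # Slow path: the only place where the central lock is taken.
            objects.extend(self.central.fetch_batch(size_class, self.batch))
        return objects.pop(0) if objects else None

    def free(self, address, size_class):
        # Freed objects go to the head of the list and are reused first.
        self.free_lists.setdefault(size_class, []).insert(0, address)
```

Note how, in this model, a freed object is the first one handed out again by the same thread, and how consecutive allocations of one size class come back adjacent in memory. Properties of that kind are what make a heap layout predictable.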

That is basically how the algorithm and structures are tied together. With an in-depth understanding of how these subsystems interact we as attackers can start to make some assumptions about how we can force TCMalloc to be our bitch^H^H^H^H^H friend.

The next step is to take a look at where all these structures are placed and how they are connected to each other, but that, my friend, is something that we are going to delve into at Infiltrate (and Immunity's Masters Class).

To give the readers a quick glance over what we have been doing lately, here is a screen-shot of a specially crafted heap layout used by a Canvas exploit to pwn your browser (yes, it will *penetrate* your phone, your windows and your fancy Mac):

To say the least, the outcome of this research was more than satisfying. From an attacker perspective, everything started to look exploitable.

Thanks for reading.


Friday, August 14, 2009

QEMU Minimal Linux Kernel Config

This is a minimal .config file for the Linux Kernel that allows you to run this QEMU Image.

Config File

One of the uses I gave to this minimal configuration is kernel development.
The benefits of using this .config file are that compile time is drastically reduced and the resulting kernel is smaller.

To run qemu with the bzImage of your kernel, copy the .config to your source directory and compile as usual (make all, maybe). Then use the freshly compiled image to run QEMU:

gr00vy@kenny:~/OSDev/Qemu$ qemu -hda linux-0.2.img -kernel ../linux- -k es -append "root=\"/dev/hda\" ro"

Tuesday, November 21, 2006

File Fuzzer

Hello, this time I'm writing to release a little file fuzzer.
It has been very useful for me, but now I'm bored of it, so it's time for other people to play with it.


- Can trace deep into children of child processes. (Fuzzer->Child0->Child1->...->ChildN) :)
- You can define the file structure and then pass it to the fuzzer.
- It can "learn" the file format (In the case of ASCII Input files).
- Pretty fast (Compared to other file fuzzers).
- Works on Linux (Full support) and FreeBSD (No ptrace support, but it is not difficult to make the port, just change some #defines and then you will be OK).
- No makefile :P

Sample Output:
gr00vy@kenny:~/ffuzer/src$ ./gwar -D -i ../iput.elf -o ../tmp/output.elf -r -52 -t 3 -m 5 "/usr/bin/readelf -a %FILENAME%"
[%] Logging to readelf.log
[%] Loaded 190 fuzzing variables
[%] Fuzzing from 0 to 52
[%] Number of files to be generated 10070
[%] Proceding with fuzzing
[%] Byte [ 52] FuzzString [ 0] Process [ 0] Bugs? [40]
[%] Time elapsed 79.000000
[%] Number of succesful executions 10070
[%] Skipped executions due to fuzzing string size 0
[%] Number of "bugs" found: 40
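The numbers in this run are consistent with one generated file per (offset, fuzzing variable) pair: 53 offsets (0 to 52 inclusive) times 190 variables gives 10070. The Python sketch below reproduces only that enumeration; the function name `mutations` and the overwrite-in-place strategy are assumptions made for the example and may not match what the fuzzer really does with each string.

```python
def mutations(data, fuzz_strings, first, last):
    """Yield (offset, index, mutated) for every offset / fuzz string pair."""
    for offset in range(first, last + 1):
        for index, fuzz in enumerate(fuzz_strings):
            # Overwrite len(fuzz) bytes of the input starting at offset.
            mutated = data[:offset] + fuzz + data[offset + len(fuzz):]
            yield offset, index, mutated


if __name__ == "__main__":
    original = bytes(64)  # stand-in for the input file contents
    variables = [bytes([i]) for i in range(190)]  # stand-in fuzzing variables
    print(sum(1 for _ in mutations(original, variables, 0, 52)))
```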

Sample Bug Report:

[i] Signal: Unknown signal 127
[i] Fuzzing string: 186 Offset: 52
[i] Detail: address not mapped to object - Address of exception: 0x4949e961
Registers dump:

eax = 0x4949e961 ebx = 0x00004141 ecx = 0x00000000
edx = 0x0806853c esi = 0x00001274 edi = 0x0808921f
ebp = 0xbfb8a398 esp = 0xbfb8a360 eip = 0x080685bf

Stack frame dump:

0xbfb8a360: 84 a3 b8 bf 5d 5c e4 b7 a4 a3 b8 bf 98 08 08 08 ....]\..........
0xbfb8a370: f4 cf f3 b7 04 00 00 00 1f 92 08 08 98 a3 b8 bf ................
0xbfb8a380: 03 68 e6 b7 c0 d4 f3 b7 8e 08 08 08 41 41 00 00 .h..........AA..
0xbfb8a390: 74 12 00 00 1f 92 08 08 t.......

Disassembly dump:

806858d ADD [EAX], AL
806858f ADD CL, CH
8068591 CMPSD
8068592 ADD [EAX], EAX
8068594 ADD [EBX-0x49f0f7bb], CL
806859a ADD [EDI], CL
806859c MOV DH, 0xd0
806859e MOV EAX, [EBP+0x8]
80685a1 INC EAX
80685a5 MOVZX EAX, AL
80685a8 SHL EAX, 0x8
80685ab OR EAX, EDX
80685ad MOV [EBP-0x20], EAX
80685b0 MOV DWORD [EBP-0x1c], 0x0
80685b7 JMP 0x806873c
80685bc MOV EAX, [EBP+0x8]
EIP -> 80685bf MOVZX EAX, BYTE [EAX]

That's all for now. If you have any suggestions, please leave a comment.