Manually dumping PE files from memory

In this post, I will show you how to manually dump (non-memory-mapped) PE files from memory using IDA.

Before you cringe and spam me with links to plugins such as OllyDumpEx, consider that this post is about understanding the PE file format a bit more deeply, not about reinventing the wheel.

The basic structure of a PE file

A PE file is basically a bunch of blocks of memory (sections) put one after another. The references between them are defined in terms of relative addresses (i.e. the number of bytes from the start of the file), and the overall layout looks like this:

+----------------------------------------+ addr = 0 (size = 200)
|MZ   This program...         PE         |
|                                        | <--- PE headers
|                                        |
+----------------------------------------+ addr = 200 (size = 100)
|      Section                           |
+----------------------------------------+ addr = 300 (size = 150)
|      Section                           |
+----------------------------------------+ addr = 450 (size = 170)
|      Section                           |
+----------------------------------------+ addr = 620 (size = 80)
|      Section                           |
+----------------------------------------+ addr = 700

(the values are imaginary for simplicity)

The PE headers contain a bunch of information, such as the entry point for the binary, compilation timestamp, section addresses and sizes, etc.

We’re only interested in the sections: if you look closely at the diagram above, you’ll see that adding a section’s address to its size gives you the address of the next section. And if you take the last section and add its address and size, you get the size of the file! (address: 620, size: 80; total file size: 700 bytes)
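To make the arithmetic concrete, here is a tiny Python sketch that uses the imaginary values from the diagram. The `sections` list and the `file_size` helper are names I made up for illustration; they are not part of any PE library.

```python
# (address, size) pairs copied from the imaginary layout in the diagram
sections = [(200, 100), (300, 150), (450, 170), (620, 80)]


def file_size(sections):
    """Return the end offset of the section that reaches furthest into the file."""
    return max(addr + size for addr, size in sections)


# Each section ends exactly where the next one begins
for (addr, size), (next_addr, _) in zip(sections, sections[1:]):
    assert addr + size == next_addr

print(file_size(sections))  # 700
```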

Inspecting the file

Assuming you already know the start address of the binary in memory, you have to parse its structures in order to find its size. Don’t be scared, however: IDA gives us all we need to do that quickly.

To dissect the structures, we’ll need to load the type libraries that contain them. IDA only loads the type libraries that match the file type being analyzed, so it won’t load e.g. OS X structures when analyzing a PE file, or vice versa. However, IDA sometimes doesn’t do the right thing, so I will cover all the steps, from scratch to dump.

To load the required structures, press Shift + F11 to open the Type Libraries window, press Ins to insert a new one, and then select the ntapi type library:

[Screenshot: type libraries]

After that, we need to load the structures we’ll use. In our case, we need IMAGE_DOS_HEADER, IMAGE_NT_HEADERS and IMAGE_SECTION_HEADER. To do this, press Shift + F9, then Ins to insert a new structure, then Alt + A to insert a pre-defined structure (so as to avoid redefining structures that IDA already knows about), and load the previously mentioned structures:

[Screenshot: needed structures]

As explained above, the last section’s address plus its size is the size of the binary. The section headers are located right after the PE header, and the PE header is found by adding e_lfanew to the address of the MZ header (i.e. the start of the binary).

Now, go to your in-memory file, put the cursor at the start of your MZ (DOS) header, and press Alt + Q to apply a structure at that address. Pick IMAGE_DOS_HEADER, and you’ll see this:

[Screenshot: DOS header of file]

As mentioned, the PE header is always at MZ + e_lfanew, so just take the base address, add e_lfanew to it (0xB8 in this case), and you’ll end up in the PE header, to which you should apply the IMAGE_NT_HEADERS structure. You’ll get:

[Screenshot: nt header]
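If you’d rather double-check the MZ + e_lfanew step outside the IDA UI, it can be sketched in plain Python over a buffer of bytes. The `find_nt_headers` function and the synthetic `buf` below are my own illustration (the 0xB8 value mirrors the example above); the only facts it relies on are that e_lfanew sits at offset 0x3C of the DOS header and that the PE header starts with the signature "PE\0\0".

```python
import struct


def find_nt_headers(data, mz_off=0):
    """Return the offset of the PE signature for the MZ header at `mz_off`."""
    if data[mz_off:mz_off + 2] != b"MZ":
        raise ValueError("no MZ signature at offset %#x" % mz_off)
    # e_lfanew is the last field of IMAGE_DOS_HEADER, at offset 0x3C
    (e_lfanew,) = struct.unpack_from("<i", data, mz_off + 0x3C)
    nt_off = mz_off + e_lfanew
    if data[nt_off:nt_off + 4] != b"PE\0\0":
        raise ValueError("no PE signature at offset %#x" % nt_off)
    return nt_off


# Synthetic buffer: just enough bytes to look like a DOS header plus PE signature
buf = bytearray(0x100)
buf[0:2] = b"MZ"
struct.pack_into("<i", buf, 0x3C, 0xB8)
buf[0xB8:0xBC] = b"PE\0\0"

print(hex(find_nt_headers(bytes(buf))))  # 0xb8
```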

Now, take note of the number of sections (after all, we’re looking for the last section of the binary) and scroll down to the end of the structure. Right there, you’ll find an array of contiguous structures, all of which are IMAGE_SECTION_HEADERs. There are FileHeader.NumberOfSections of them, which in our case is 5, so put the cursor on the first byte, apply the IMAGE_SECTION_HEADER structure to it, then press * to make it an array and specify 5 as the number of elements:

[Screenshot: array of structures]

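Scroll down to the last structure, add together PointerToRawData and SizeOfRawData, and that’s the size of your in-memory binary! Keep in mind that this assumes the last section header is also the one whose data ends furthest into the file, and that nothing extra (such as an overlay) is appended after it.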
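The whole walk from the MZ header to the final size can also be sketched in plain Python. This is only an illustration under a few assumptions: `raw_pe_size` is a name I invented, `data` is a buffer that already contains at least the headers, and the function takes the maximum over all sections instead of trusting that the last entry ends furthest into the file. The field offsets come from the IMAGE_FILE_HEADER and IMAGE_SECTION_HEADER layouts.

```python
import struct

FILE_HEADER_SIZE = 20     # sizeof(IMAGE_FILE_HEADER)
SECTION_HEADER_SIZE = 40  # sizeof(IMAGE_SECTION_HEADER)


def raw_pe_size(data, mz_off=0):
    """Return the raw (on-disk layout) size of the PE whose MZ header is at `mz_off`."""
    (e_lfanew,) = struct.unpack_from("<i", data, mz_off + 0x3C)
    nt_off = mz_off + e_lfanew
    file_hdr = nt_off + 4  # skip the 4-byte "PE\0\0" signature
    # NumberOfSections is at +2 and SizeOfOptionalHeader at +16 in IMAGE_FILE_HEADER
    (num_sections,) = struct.unpack_from("<H", data, file_hdr + 2)
    (opt_size,) = struct.unpack_from("<H", data, file_hdr + 16)
    # The section table starts right after the optional header
    table_off = file_hdr + FILE_HEADER_SIZE + opt_size
    end = 0
    for i in range(num_sections):
        hdr = table_off + i * SECTION_HEADER_SIZE
        # SizeOfRawData is at +16 and PointerToRawData at +20 in each header
        raw_size, raw_ptr = struct.unpack_from("<II", data, hdr + 16)
        end = max(end, raw_ptr + raw_size)
    return end
```

The result should match what you computed by hand from the last IMAGE_SECTION_HEADER, as long as the sections are laid out in order.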

Dumping the file

To dump the file, you can use IDAPython:

with open("dump.bin", "wb") as f:
    f.write(GetManyBytes(mz_addr, size, 1))  # '1' means 'read from debugger memory'

And done!

There’s a catch, however: if you’re using IDA 7.0 or newer, the above code may not work. The IDA 7.0 update made breaking changes to the API, renaming GetManyBytes among other functions, and only the earlier 7.x versions ship a compatibility layer that keeps the old names working. Making it work for IDA 7+ is left as an exercise for the reader :-)
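If you want something to compare your answer against, here is one possible sketch. It assumes the IDA 7+ replacement `idc.get_bytes(ea, size, use_dbg)`; `read_debugger_memory` and `dump_region` are helper names I made up, and the `reader` parameter exists only so the function can be tried outside IDA with a fake reader.

```python
def read_debugger_memory(ea, size):
    """Read `size` bytes at `ea` from the debugged process (IDA 7+ only)."""
    import idc  # imported lazily: this module only exists inside IDA
    return idc.get_bytes(ea, size, True)  # True = read from debugger memory


def dump_region(start, size, path, reader=read_debugger_memory):
    """Write `size` bytes starting at `start` to `path`; return the byte count."""
    data = reader(start, size)
    if data is None or len(data) != size:
        raise IOError("could not read %d bytes at %#x" % (size, start))
    with open(path, "wb") as f:
        f.write(data)
    return len(data)
```

From IDA’s Python console, the call would then look like dump_region(mz_addr, size, "dump.bin"), using the same mz_addr and size as before.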