BGET [0][0] is a simple heap allocator used in OP-TEE [0][0][0]. It is simpler than glibc’s allocator, but some interesting quirks, which might make exploitation interesting. Today, we will look into the inner workings of BGET.

This content already is almost a year old, and refers to OP-TEE 3.6.0 – right now, they are at 3.10.0, and so some things might have changed.

Applicability of Exploits

BGET

Reallocation and Use-After-Free of Dangling Pointers

As described in last post, the BGET heap initially behaves like a stack. In scenarios where a buffer on the heap are allocated and immediately freed afterwards, leaving a dangling pointer, its memory will be reallocated next. After the next allocation, the upper bytes and padding of the previously freed and the newly allocated buffer will overlap. This also applies if multiple buffers are allocated, if all of them are freed afterwards. The order of deallocation does not have an influence on the reallocation, yet it may cause more or less data to be overwritten during enqueuing of the released blocks in the freelist.

Depending on the usage of the dangling pointer, this reallocation may be used to extract or modify data stored later in the same location. Alternatively, write accesses following dangling pointers will allow modification of freelist links, to either exploit the unlinking process or add non-heap memory to the freelist, which is discussed later.

If buffers were allocated later than the buffer used after free, but are not yet freed, reallocation will only occur if the heap becomes sufficiently filled, without causing the TA to crash due to unhandled out of memory errors. In this case, the queue-like behaviour may reallocate the buffer, when all buffers freed earlier have either been completely reallocated or merged into free buffers freed later, or are too small for the requested allocation.

Double-free

If the protection measures on a GLIBC heap did not identify a double free, one possibility was the double creation of freelist cycles, which could cause GLIBC to return the same buffer upon multiple calls to malloc multiple times [0][0]

Reinsertion into the Freelist

With BGET, reinsertion of the same buffer into the freelist cannot result in simple double allocation: The double freeing of a buffer may lead to an infinite circular chain of links within the freelist, but also sets the block to declare a negative size, preventing allocation. This requires to find other mechanisms to exploit a doubly-freed buffer.

Also, these cyclic list links are not final, and may be resolved to a normal list by unlinking of the first free block following an element of that cycle, which will cause it to update the preceding block’s forward link. This cannot happen by reallocation, but may occur when the block is unlinked due to the block before being freed and merged.

Generally, reallocation from the freelist may be scarce. Further, if reallocation occurs, the first reallocation will either negate the block’s size, causing BGET to deem it to be too small for any further reallocation, or split the block and reduce its size, creating new buffers not within the freelist, which cannot be reallocated.

Instead, reallocation of the first block within that circle will allow arbitrary data to overwrite the list links. Depending on whether the ordinary or circular incoming link most recently updated the backwards-directed link, the other incoming link will not be unlinked upon reallocation. This could be used to add additional within non-secure shared memory into the freelist.

Doubly-Freed Buffer Preceded by Free Block

If the doubly-freed buffer is preceded by a free block, its own metadata structures will not be updated. Instead, the preceding free block will simply be expanded by the freed block’s size again. This will lead to an incorrect address calculation of the following block’s header address. The resulting potentially non-existing block header is then either merged, or its assumed prevfree field is updated, depending on the expected bsize field. This is shown in this graphic:

BGET memory deallocation

(Visualization of wrong address calculation upon double free of a buffer succeeding a free block. Dashed arrow: intended header dereferenciation as on first free. Solid arrows: actual dereferenciation on double free. White: Metadata. Gray: buffers or unused memory with freelist links. Striped: undefined, potentially user-controlled memory.)

Note that due to the memory layout shown presented in our ARES 2020 paper [0], depending on the buffer’s size, and the heap’s capacity and its location in the .bss section, this imaginary block header may also be located within other .bss memory, within the stack, or within shared memory.

This offers multiple primitives suitable for exploitation, if other memory’s corruption and dereferenications can be limited not to crash the TA too soon:

  • The reallocation of the merged block, offering memory for a buffer exceeding its size, overwriting heap memory above; and either
  • Misuse of BGET unlinking the imaginary block above; or
  • Misuse of BGET updating the prevfree field of the imaginary block above.
Doubly-Freed Buffer Preceded by Used Block

Since the offset calculation of GLIBC does not require negation of size fields, a double-free might be idempotent considering its neighboring blocks, if no merging takes place; only adding the free buffer a second time in the freelist, if security checks do not prevent this.

BGET however, if the block is not merged into another block, negates the block header’s size field upon any free operation, which leads to wrong address calculations, and may corrupt not only the freelist, but neighboring buffers too, even without any buffer actually being reused.

In contrary to the previous situation, on a double-freed buffer’s block, preceded by a block in use, would be updated and added to the freelist itself. The unexpectedly positive size field on a buffer expected to be in use would cause the second invocation of free to look and write into preceding memory when intending to update the following buffer instead, as shown in this graphic:

BGET memory deallocation

(Visualization of wrong address calculation upon double free of a buffer surrounded by blocks in use. Dashed arrow: intended header dereferenciation as on first free. Solid arrows: actual dereferenciation on double free. White: Metadata. Gray: buffers or unused memory with freelist links. Striped: undefined, potentially user-controlled memory.)

This also offers the same possibilities as of earlier, however, the a block size increase would have to be specified within the block expected above, if it is unlinked and merged.

Capabilities of Available Primitives

Now, the capabilities of the primitives provided can be analyzed:

Wrong Buffer Allocation

The merging with imaginary blocks described by headers read from user-controlled buffers offers resizing of blocks, which can cause memory to be doubly allocated. Also, the insertion of freelist elements can introduce fake blocks, which can cause buffers to be allocated in non-secure memory.

To exploit such blocks, the first elements of the freelist has to be removed until the oversized buffer has advanced into a location where it will be reallocated at the desired point of time. To accomplish this, two strategies can be used:

User-Defined Allocation Size or Number

If an attacker can control the size or number of allocations within suitable ranges, and if the heap’s current layout can be predicted using reverse engineering, a straightforward strategy is available for removal of the freelist’s first element: The size of the first block, minus 16 bytes of overhead, can be allocated with a single or multiple allocations, until is has been fully consumed. The allocations may then be released again, and will result in a new block at the end of the list, not standing in the way of the desired block’s reallocation anymore.

Advancing Depletion

If only a limited size and number of allocations is possible at a time, the freelist’s first element can be depleted using a slower strategy: Two buffers have to be allocated, then alternatingly be released and allocated again. This will slowly transfer free memory from the freelist’s first element to a new last element, with the other buffer preventing merging of those blocks. Howeveer, if the size of these two buffers is larger than 16 bytes, it may be impossible to deplete a 16 byte remainder of the first block, preventing the first element of the freelist from being removed.

Prevfree-Overwrite via Merging

If the fake imaginary block header within user-controlled memory contains a positive size field, it will be considered to describe a free block. This will lead to both this imaginary block being merged with the currently freed block. During this process, the size of the block resulting from the merge will be written in the 8 bytes after that block. Alteration of the fake size field of the merged buffer by up to 7 bytes may offer additional values that can be written. This may be useful when modifying a simple boolean, likely offering both the possibility of writing a null or non-null byte into the target memory. However, it does not offer much more control over the data written, especially when targeting a multi-byte integer field.

Repeated application with multiple double-freed buffers and buffer sizes chosen to result in a desired low-order byte of the size field is unlikely to allow writing multi-byte data like pointers, since the actual buffer’s sizes will always be a multiple of 16.

Within the same merging process, the imaginary block merged will be unlinked. As every free block, it is expected to contain a forward and backward link. To merge it, it first has to be unlinked from its position in the list, which is done by connecting the following and preceding block in the list. At the two target locations, no data is validated if assertions are disabled. This allows to fill two memory locations with pointers to approximately each other’s location, as shown in this graphic:

BGET memory deallocation

(Visualization of memory locations written (gray) depending on the two link pointers (dashed) in the metadata of a block to be unlinked.)

Since AArch64 also allows unaligned memory access, and the user-controlled pointers are not restricted to a 16-byte alignment, any 8 bytes can be overwritten with any pointer location to writable memory. This would allow writing arbitrary data: by setting one link to point into unused memory, such as low stack addresses, any low-order byte can be chosen for this pointer. The second link then pointing in successive addresses at the target location will allow overwriting successive bytes by repeating this attack.

Since the bytes reserved for the previsize and bsize fields at the target location are not changed, writing a pointer to a pointer or a pointer to a jump instruction can be simpler, since the back-pointer in the memory location of the other link will not be evaluated.

The latter two primitives’ usage is mostly limited by the corruption of the memory around the link’s origin. Additionally, undesired calls to release memory may crash the TA during attempted block merges.

An Example

Let’s assume the following heap layout:

BGET memory deallocation

(Visualization of the heap layout of the exemplary heap TA when executing the exploit code. White background: buffers and metadata allocated due to command invocations. Starred pointer: incorrect variant of pointer)

Heap

For the demonstation of the heap exploit, the unlinking of an imaginary element is used to be able to edit a byte. The heap TA implements an exemplary internal user and session management, which allows a client to mark its session as logged in using a password, to gain access to a secret. A buffer doubly freed during session deallocation is used in combination with arbitrary data in the username buffers to change the flag indicating a successful login on another open session.

The address of the heap can be calculated as shown in the previous section. The adresses of further objects are predictable, since they are placed contiguously. The size of unknown structures, such as passphrases for existing users in this example, can – if necessary – be guessed with low effort, since the buffer’s size is guaranteed to be a multiple of 16.

Given the heap layout shown above, we can calculate the necessary lengths ( x​ ) and ( y ) of the two username fields to have BGET search for the block header following the username1 buffer at the start of the username2 buffer. The buffer headers are named after the following buffer, with the star marking the incorrect value calculated by BGET upon double freeing the username1 buffer.

The goal is to have the incorrect pointer to \( bhead_{optee} \) pointing to \( username2 \) instead: \[ bhead_{optee}^* \overset{!}{=} username2 \]

The size of the username2 buffer \( y \) is constrained to be large enough to hold the block header and the links of the block intended to be unlinked, as well as the header for the following block, whose prevfree field will be updated: \[ y\geq\texttt{0x30} \]

Upon freeing \( username1 \), BGET will subtract 16 to get the pointer of the buffer’s header: \[ bhead_{username1} = username1 - sizeof(bhead) \&= username1 - \texttt{0x10} \]

Usually, BGET will then add the size of the block to get a pointer \( bhead_{optee} \) to the next block’s header: \[ bhead_{optee} = bhead_{username1} + (sizeof(bhead) + sizeof(username1)) \&= username1 + x \]

Since the block already has been freed, the size field is negated and the size will be subtracted instead: \[ bhead_{optee}^* = bhead_{username1} - (sizeof(bhead) + sizeof(username1)) \&= username1 - \texttt{0x20} - x \]

Given these preconditions and the offsets between these buffers, we can now determine a relation between the two buffers’ sizes: \[ username1 - \texttt{0x20} - x = username2 \]

\[ username1 - \texttt{0x20} - x = username1 - (3*sizeof(bhead) + sizeof(session1) + sizeof(password1) + sizeof(username2)) \]

\[ username1 - \texttt{0x20} - x = username1 - (3*\texttt{0x10} + \texttt{0x30} + \texttt{0x20} + y) \]

\[ username1 - \texttt{0x20} - x = username1 - \texttt{0x80} - y \]

\[ x = \texttt{0x60} + y \]

Since 0x30 is a suitable value for the size of the \( username2 \) buffer, we can now choose this value, and calculate the appropriate size of the \( username1 \) buffer: \[ y := \texttt{0x30} \] \[ x = \texttt{0x90} \]