Android kernel backdoor vulnerabilities

Looking back at the history of Android kernel vulnerabilities, most of them are memory-corruption bugs, while logic vulnerabilities are relatively rare. Memory-corruption bugs follow typical patterns, have obvious side effects, and are covered by mature detection methods, so they are comparatively easy to find. Logic vulnerabilities, by contrast, have no typical pattern (they are usually tied closely to a specific feature), have unpredictable side effects, and lack general-purpose detection methods, which makes them much harder to hunt. That difficulty is exactly what gives logic vulnerabilities their unique charm.

This article analyzes CVE-2021-28663, a logic vulnerability in the ARM Mali GPU driver, in depth. The vulnerability is powerful enough to be called a backdoor:

1 It cuts across Android fragmentation: it affects phones built on MediaTek, HiSilicon, and Exynos SoCs, so almost all phones released in recent years that use these chips are affected;

2 The attack is stealthy: the exploitation method is very different from common techniques, and as far as I know there is currently no public information on how to exploit this vulnerability;

3 An ordinary app can easily steal the runtime data of other apps or of the kernel, and even modify the code of other apps, without obtaining any additional permissions;

4 Privilege escalation to root has a 100% success rate.

Vulnerability impact

Since MediaTek, HiSilicon, and Exynos SoCs all use ARM Mali GPUs, phones built on these chips may be affected. I collected the source code for a number of mainstream chips and phones and found that all of them are affected:

Year	Vendor	Phone model	Chip model	Driver version
2021	SAMSUNG	S21	Exynos 2100	v_r20p0
2020	HUAWEI	Mate40	Kirin 9000	r23p0-01rel0
2020	Redmi	K30U	Dimensity 1000+	v_r21p0
2020	Redmi	10X	Dimensity 820	v_r21p0
2020	SAMSUNG	S20	Exynos 990	v_r25p1
2019	HUAWEI	Mate30	Kirin 990	b-r18p0-01rel0
2019	Redmi	Note8 Pro	Helio G90T	b_r20p0
2019	SAMSUNG	S10	Exynos 9820	b_r16p0
2018	HUAWEI	Mate20	Kirin 980	b-r18p0-01rel0
2018	Redmi	Redmi 6	Helio P22	m-r20p0
2018	SAMSUNG	S9	Exynos 9810	b_r19p0
2017	HUAWEI	Mate10	Kirin 970	b-r14p0-00cet0
2017	LENOVO	K8 Plus	Helio P25	r18p0
2017	SAMSUNG	S8	Exynos 8895	b_r16p0
2016	HUAWEI	Mate9	Kirin 960	b-r14p0-00cet0
2016	Meizu	M3x	Helio P20	r12p1
2016	SAMSUNG	S7	Exynos 8890	r22p0
2015	HUAWEI	Mate8	Kirin 950	r20p0-01rel0
2015	SAMSUNG	S6	Exynos 7420	r15p0

As mentioned in the overview, an ordinary app can use this vulnerability to carry out the following attacks:

1 Stealing the runtime memory data of other apps
2 Modifying the code of other apps
3 Stealing the kernel runtime memory data
4 Obtaining root permission reliably

Compared with common kernel vulnerabilities, this one not only yields root reliably but can also read the runtime data of other apps and of the kernel in a very stealthy way, and can even modify the code of other apps, all without any additional permissions. Judging from the attack process and its results, it deserves to be called a backdoor-level vulnerability.

Vulnerability analysis

Besides the CPU, an SoC contains many processors built for specific workloads, such as the GPU. The main job of the GPU is to render graphics. With the help of an IOMMU, the GPU can have its own virtual address space, and by mapping the same physical pages into both address spaces, data can be shared efficiently between the GPU and the CPU. All of this depends on the kernel driver.

GPU physical page mapping process: returning a fake virtual address

The GPU designed by ARM, Mali, is driven by the Mali kernel driver. One of the driver’s key jobs is to maintain the IOMMU page table for the GPU. When an application (running on the CPU) wants the GPU to process data or render graphics, the driver maps the physical pages holding that data into the GPU address space so that the GPU can “see” the data immediately. No extra copy is involved, which greatly improves efficiency. The Mali driver implements the following memory operations:

No.	Command	Function
1	KBASE_IOCTL_MEM_ALLOC	Allocate a memory region; its pages are mapped to the GPU and can optionally be mapped to the CPU as well
2	KBASE_IOCTL_MEM_QUERY	Query the attributes of a memory region
3	KBASE_IOCTL_MEM_FREE	Free a memory region
4	KBASE_IOCTL_MEM_SYNC	Synchronize data so that the CPU and GPU see the results of each other’s operations in time
5	KBASE_IOCTL_MEM_COMMIT	Change the number of physical pages in a memory region
6	KBASE_IOCTL_MEM_ALIAS	Create an alias for a memory region, so that multiple GPU virtual addresses point to the same region
7	KBASE_IOCTL_MEM_IMPORT	Map memory pages used by the CPU into the GPU address space
8	KBASE_IOCTL_MEM_FLAGS_CHANGE	Change the attributes of a memory region

The memory region mentioned in the table is a concept of the Mali driver: an object that holds the physical pages actually in use. The following analysis is based on the Samsung A71 source code.

Let me start with how the KBASE_IOCTL_MEM_ALLOC command is handled. It shows how the driver maps physical pages into the process (CPU) address space and the GPU address space.

The parameters received by this command are as follows:

The main input parameters are:

va_pages indicates the maximum number of physical pages that the memory region can hold. The driver reserves a virtual address range of the corresponding size in the GPU address space;

commit_pages indicates how many physical pages the driver should allocate to the memory region right away. The application can later call KBASE_IOCTL_MEM_COMMIT to adjust the number of pages as needed;

flags indicates the attributes of the memory region, such as whether it is mapped to the CPU and whether it is readable and writable;

The output parameters are:

gpu_va is the virtual address of the allocated memory region in the GPU address space; the GPU uses this address to access the corresponding physical pages;
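To make the parameter description concrete, here is a minimal sketch of the request using Python ctypes. Only va_pages, commit_pages, flags, and gpu_va come from the description above; the overall layout (a union whose output half overlays the input half, the extra extent field, and the output flags field) is an assumption based on the general shape of the kbase ioctl header and may differ between driver versions. The class names and make_alloc_request() are hypothetical.

```python
import ctypes


class MemAllocIn(ctypes.Structure):
    # Input half: what the application passes to the driver.
    _fields_ = [
        ("va_pages", ctypes.c_uint64),      # maximum pages the region can hold
        ("commit_pages", ctypes.c_uint64),  # pages to back immediately
        ("extent", ctypes.c_uint64),        # ASSUMED field, not discussed in the text
        ("flags", ctypes.c_uint64),         # mapping attributes
    ]


class MemAllocOut(ctypes.Structure):
    # Output half: what the driver writes back.
    _fields_ = [
        ("flags", ctypes.c_uint64),   # ASSUMED field
        ("gpu_va", ctypes.c_uint64),  # (fake) GPU virtual address
    ]


class MemAlloc(ctypes.Union):
    # "in" is a Python keyword, so the input half is called "inp" here.
    _fields_ = [("inp", MemAllocIn), ("out", MemAllocOut)]


def make_alloc_request(va_pages, commit_pages, flags):
    """Build a zero-initialized request and fill in the input half."""
    req = MemAlloc()
    req.inp.va_pages = va_pages
    req.inp.commit_pages = commit_pages
    req.inp.flags = flags
    return req
```

Because the two halves share storage in this sketch, the driver would overwrite the input values when it fills in the output, which is why an application reads gpu_va only after the call returns.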

The specific allocation process is as follows:

If the process is 64-bit, the driver creates the mapping with BASE_MEM_SAME_VA by default, which means that the CPU and GPU use the same virtual address. The allocation itself is implemented in kbase_mem_alloc().

It first calls kbase_check_alloc_flags() to check whether the flags (attributes) passed in by the application are legal:

The checks mainly concern the mapping attributes. From the code we can see that:

1 The memory region must be mapped to the GPU, and the mapping can be read-only, write-only, or readable and writable (line 2619);

2 At least one of the CPU and GPU must be able to read the memory region, otherwise allocating physical pages is meaningless (line 2593);

3 Similarly, at least one of them must be able to write to the memory region, otherwise allocating physical pages is meaningless (line 2597);
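The three rules above can be sketched as a small predicate. This is a simplified model, not the driver code: the function name check_alloc_flags() is hypothetical, the numeric bit values of the four protection flags are assumptions, and the real function performs further checks that are not covered here.

```python
# Assumed bit positions for the protection flags (illustrative values).
BASE_MEM_PROT_CPU_RD = 1 << 0
BASE_MEM_PROT_CPU_WR = 1 << 1
BASE_MEM_PROT_GPU_RD = 1 << 2
BASE_MEM_PROT_GPU_WR = 1 << 3


def check_alloc_flags(flags: int) -> bool:
    """Return True if the flags satisfy the three rules listed above."""
    # Rule 2: someone (CPU or GPU) must be able to read the region.
    if not flags & (BASE_MEM_PROT_CPU_RD | BASE_MEM_PROT_GPU_RD):
        return False
    # Rule 3: someone (CPU or GPU) must be able to write the region.
    if not flags & (BASE_MEM_PROT_CPU_WR | BASE_MEM_PROT_GPU_WR):
        return False
    # Rule 1: the GPU itself must have read or write access.
    if not flags & (BASE_MEM_PROT_GPU_RD | BASE_MEM_PROT_GPU_WR):
        return False
    return True
```

For example, a CPU-only request (CPU read plus CPU write) passes rules 2 and 3 but is rejected by rule 1, because a region that the GPU can never touch has no reason to go through this driver.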

After that, the driver calls kbase_alloc_free_region() to allocate a new memory region, a kbase_va_region:

I extracted the relevant fields:

nr_pages indicates the maximum number of physical pages that this region can contain;

cpu_alloc is used for the CPU address space mapping;

gpu_alloc is used for the GPU address space mapping;

kbase_reg_prepare_native() is responsible for initializing reg->cpu_alloc and reg->gpu_alloc:

Here reg->cpu_alloc and reg->gpu_alloc are made to point to the same object (line 567), which is of type kbase_mem_phy_alloc:

I only extracted the relevant fields:

kref indicates the number of references to the object;

gpu_mappings indicates how many virtual addresses are mapped to the allocation (think of the KBASE_IOCTL_MEM_ALIAS command mentioned earlier);

nents indicates how many physical pages are currently available;

pages is the array of physical pages;

reg points to the reg that contains the object;

type indicates the memory type, which here is KBASE_MEM_TYPE_NATIVE;
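The relationship between the two structures can be modelled in a few lines. This is a sketch using only the fields listed above; the Python class names and reg_prepare_native() are hypothetical stand-ins, and the assumption that the second pointer takes an extra reference (so kref ends up at 2) is mine rather than something stated in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MemPhyAlloc:
    """Stand-in for kbase_mem_phy_alloc (relevant fields only)."""
    kref: int = 1                 # references to this object
    gpu_mappings: int = 0         # GPU virtual addresses mapping these pages
    nents: int = 0                # physical pages currently available
    pages: List[int] = field(default_factory=list)
    # Back-pointer to the owning region; excluded from repr/compare to
    # avoid recursing through the reference cycle.
    reg: Optional["VaRegion"] = field(default=None, repr=False, compare=False)
    type: str = "KBASE_MEM_TYPE_NATIVE"


@dataclass
class VaRegion:
    """Stand-in for kbase_va_region (relevant fields only)."""
    nr_pages: int
    cpu_alloc: Optional[MemPhyAlloc] = None
    gpu_alloc: Optional[MemPhyAlloc] = None


def reg_prepare_native(reg: VaRegion) -> None:
    """Make cpu_alloc and gpu_alloc point to one shared allocation."""
    alloc = MemPhyAlloc(reg=reg)
    reg.cpu_alloc = alloc
    alloc.kref += 1               # ASSUMPTION: gpu_alloc holds its own reference
    reg.gpu_alloc = alloc
```

The point to keep in mind for later is that, for a native region, anything recorded in gpu_alloc->gpu_mappings is also visible through cpu_alloc, because both names refer to the same object.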

With the basic data structures in place, the driver calls kbase_alloc_phy_pages() to allocate physical pages for reg->cpu_alloc, and then stores reg in the kctx->pending_regions array:

The logic here is very simple: find a free slot in the kctx->pending_regions array (line 391) and save reg there (line 393). Note that the return value is not a real address (line 405) but a temporary value (lines 396 and 397), which is used in the subsequent steps.
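The slot lookup and the fake address can be illustrated with a small model. It assumes that free slots are tracked in a bitmap and that the fake address is the slot index shifted into a reserved range; the constant values, the Context class, and both helper names are assumptions for illustration, not taken from the driver source.

```python
PAGE_SHIFT = 12
BASE_MEM_COOKIE_BASE = 64 << PAGE_SHIFT   # ASSUMED start of the reserved range


class Context:
    """Stand-in for kctx: a bitmap of free slots plus the slot array."""
    def __init__(self, nslots=64):
        self.cookies = (1 << nslots) - 1      # bit i set means slot i is free
        self.pending_regions = [None] * nslots


def park_region(kctx, reg):
    """Store reg in the first free slot and return a fake virtual address."""
    if kctx.cookies == 0:
        raise MemoryError("no free slot in pending_regions")
    # Index of the lowest set bit, i.e. the first free slot.
    slot = (kctx.cookies & -kctx.cookies).bit_length() - 1
    kctx.cookies &= ~(1 << slot)
    kctx.pending_regions[slot] = reg
    # Not a real address: just the slot index relocated into the reserved range.
    return (slot + (BASE_MEM_COOKIE_BASE >> PAGE_SHIFT)) << PAGE_SHIFT


def lookup_pending(kctx, fake_va):
    """Translate a fake virtual address back to the parked region."""
    slot = (fake_va >> PAGE_SHIFT) - (BASE_MEM_COOKIE_BASE >> PAGE_SHIFT)
    return kctx.pending_regions[slot]
```

In this model the value handed back to the application is only a handle that encodes the slot number, which is why it has to be exchanged for a real mapping in a second step.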

This completes the analysis of the main flow of kbase_api_mem_alloc():

GPU physical page mapping process: creating the CPU and GPU mappings

How should an application use the fake virtual address? It passes it as a parameter to the mmap system call:

mmap

The mmap system call eventually reaches kbase_mmap(), the handler registered by the Mali driver. The function works as follows:

The normal semantics of the mmap system call are to map physical pages into the address space of the process. Because the driver specified BASE_MEM_SAME_VA, kbase_mmap() must, in addition to the normal mapping, also map these physical pages into the GPU address space. Note that the CPU and the GPU use the same virtual address for the mapping.

Only kbase_gpu_mmap() is analyzed here:

The main job of kbase_gpu_mmap() is to map the physical pages in the IOMMU by calling kbase_mmu_insert_pages(), and then to increase the alloc->gpu_mappings count by 1. This count is very important: the driver consults it to decide whether certain operations may be applied to the memory region. Finally, the return value of the mmap system call is the virtual address that is mapped in both the CPU and the GPU.

When the physical pages are allocated, they are not yet mapped into the virtual address space of the GPU, so reg->gpu_alloc->gpu_mappings is 0; when kbase_gpu_mmap() maps them into the GPU address space, reg->gpu_alloc->gpu_mappings is increased by 1. Semantically this is very reasonable: gpu_alloc->gpu_mappings reflects the mapping status of the physical pages in the memory region accurately and promptly. However, as features are added, the situation becomes more complicated.
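The two-step flow (allocate, then mmap) and its effect on gpu_mappings can be modelled as follows. This is a toy model under stated assumptions: page tables are plain dictionaries, page frame numbers are made up, and every class and function name is hypothetical.

```python
class PhyAlloc:
    def __init__(self, pages):
        self.pages = pages          # physical page frame numbers (made up)
        self.gpu_mappings = 0       # GPU virtual addresses mapping these pages


class Context:
    def __init__(self):
        self.pending = {}           # fake handle -> allocation awaiting mmap
        self.cpu_page_table = {}    # CPU virtual page -> physical page
        self.gpu_page_table = {}    # GPU virtual page -> physical page
        self.next_pfn = 0x1000


def mem_alloc(kctx, commit_pages, handle):
    """Step 1: allocate pages and park them; nothing is mapped yet."""
    pfns = list(range(kctx.next_pfn, kctx.next_pfn + commit_pages))
    kctx.next_pfn += commit_pages
    alloc = PhyAlloc(pfns)
    kctx.pending[handle] = alloc
    return alloc


def mmap_same_va(kctx, handle, va_page):
    """Step 2: map the pages at the same virtual address for CPU and GPU."""
    alloc = kctx.pending.pop(handle)
    for i, pfn in enumerate(alloc.pages):
        kctx.cpu_page_table[va_page + i] = pfn
        kctx.gpu_page_table[va_page + i] = pfn
    alloc.gpu_mappings += 1         # the count only changes in this step
    return va_page
```

The detail that matters later is that the counter changes only in the second step: between the two calls the allocation exists and owns pages, but gpu_mappings still reads 0.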

GPU physical page mapping process: the alias operation

As I mentioned before, the Mali driver implements the KBASE_IOCTL_MEM_ALIAS command, whose main function is to map the same memory region at multiple different virtual addresses. The alias flow is similar to KBASE_IOCTL_MEM_ALLOC and is likewise divided into two steps:

The main logic of kbase_api_mem_alias() is done by kbase_mem_alias(), which is implemented as follows:

First, kbase_mem_alias() checks the flags passed in by the user. As the check shows (line 1696), an alias mapping allows read-only access for the CPU and read and write access for the GPU. This condition restricts exploitation, and I will come back to it later. The function then allocates a new reg (line 1727) and allocates a gpu_alloc for it (line 1743). Note that the previously allocated reg (see kbase_mem_alloc()) is not reused directly; a new reg is created.

It then looks up the original reg from the handle passed in by the user (line 1777) and, after some checks, references the original allocation through reg->gpu_alloc->imported.alias.aliased[i].alloc. At the same time, kbase_mem_phy_alloc_get() increases the reference count of that allocation by 1.

Like kbase_mem_alloc(), kbase_mem_alias() stores the new reg in the kctx->pending_regions array (line 1831) and returns a fake virtual address (line 1853).

After that, the user again needs to call mmap, and kbase_gpu_mmap() performs the processing that corresponds to the type of the reg (KBASE_MEM_TYPE_ALIAS):

The main logic of kbase_gpu_mmap() looks very simple: it maps the memory regions collected by kbase_mem_alias() (line 1210) into the new address range (line 1211). If the mapping succeeds, the gpu_mappings count of each aliased allocation is increased by 1.

This completes the introduction of the two key operations on memory regions. Judging from the analysis so far, they are accurate and reasonable, with no obvious problems.

GPU physical page mapping process: changing attributes

Earlier I mentioned that the Mali driver implements the KBASE_IOCTL_MEM_FLAGS_CHANGE command, which modifies the attributes of a memory region. It is implemented in kbase_api_mem_flags_change():

The main purpose of this function is to support the BASE_MEM_DONT_NEED operation: when the application no longer needs the physical pages of a memory region, the driver can cache these pages and release them at a suitable time (line 894). The driver also supports the reverse operation: when the application wants to use the region again, the driver retrieves the cached physical pages or, if they have already been released, allocates new ones (line 898).

A prerequisite for the above operation is that reg->cpu_alloc->gpu_mappings must not be greater than 1. A larger value means that the pages are mapped at multiple virtual addresses, a complicated case that the Mali driver does not intend to handle. If the memory region meets this condition, kbase_mem_evictable_make() is called to clean up:

kbase_mem_evictable_make() first tears down the previously established CPU mapping (line 771), after which the application can no longer access these physical pages through the virtual address. It then adds gpu_alloc to the kctx->evict_list linked list, which is consumed by kbase_mem_evictable_reclaim_scan_objects():

The main job of kbase_mem_evictable_reclaim_scan_objects() is to traverse the kctx->evict_list linked list (line 638), tear down the previously established GPU mapping (line 641), and finally release all the physical pages (line 660).

At this point, the entire life cycle of a physical page has been analyzed. The vulnerability hides in the interaction between the KBASE_IOCTL_MEM_ALIAS and KBASE_IOCTL_MEM_FLAGS_CHANGE commands. As mentioned earlier, kbase_mem_flags_change() has a prerequisite: reg->cpu_alloc->gpu_mappings must not be greater than 1. The alias operation, however, is implemented in two steps, and the gpu_mappings count is only incremented in the second step, kbase_gpu_mmap(). What if we only call kbase_mem_alias() and then immediately call kbase_mem_flags_change()?

The answer is that we can map the released pages!
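The sequence can be replayed on a toy model of the counters. One way to read the flow described above: the check passes because the alias has not yet touched gpu_mappings, the pages are queued for release, the alias is then mapped, and when the queue is drained only the mapping of the original region is torn down. The sketch assumes exactly that; all names are hypothetical, page tables are dictionaries, and nothing here talks to a real driver.

```python
class PhyAlloc:
    def __init__(self, pages):
        self.pages = list(pages)    # physical page frame numbers (made up)
        self.gpu_mappings = 0
        self.owner_va = None        # GPU address of the region owning the pages


class Context:
    def __init__(self):
        self.gpu_page_table = {}    # GPU virtual page -> physical page
        self.evict_list = []
        self.free_pages = set()     # pages handed back to the system


def gpu_mmap(kctx, alloc, va):
    for i, pfn in enumerate(alloc.pages):
        kctx.gpu_page_table[va + i] = pfn
    alloc.gpu_mappings += 1


def mmap_native(kctx, alloc, va):
    alloc.owner_va = va
    gpu_mmap(kctx, alloc, va)


def mem_alias(source):
    # Step 1 of the alias: remember the source allocation.
    # Vulnerable behaviour: gpu_mappings is NOT incremented here.
    return {"aliased": source}


def mmap_alias(kctx, alias, va):
    # Step 2 of the alias: map the source pages at a second GPU address.
    gpu_mmap(kctx, alias["aliased"], va)


def flags_change_dont_need(kctx, alloc):
    if alloc.gpu_mappings > 1:      # the prerequisite discussed above
        return False
    kctx.evict_list.append(alloc)
    return True


def reclaim(kctx):
    # Only the owning region's mapping is torn down before the pages are freed.
    for alloc in kctx.evict_list:
        for i in range(len(alloc.pages)):
            kctx.gpu_page_table.pop(alloc.owner_va + i, None)
        kctx.free_pages.update(alloc.pages)
        alloc.pages = []
    kctx.evict_list.clear()
```

Running the steps in the order native mmap, mem_alias, flags_change_dont_need, mmap_alias, reclaim leaves an entry in gpu_page_table whose physical page is also in free_pages. If the alias is mapped before flags_change_dont_need is called, the count is already 2 and the request is refused.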

How to exploit

Through the call sequence above, we can map almost any page that the kernel can allocate into the CPU and GPU address spaces. As mentioned earlier, an alias mapping is read-only for the CPU and readable and writable for the GPU. We can therefore steal the contents of these pages through the virtual address space of the process, but we cannot modify them from the CPU. The GPU can both read and write these pages, so the following analysis focuses on how to make the GPU read and write physical pages.

Mesa

For Qualcomm’s Adreno GPU, both the KGSL driver and the freedreno project expose a large number of GPU-specific instructions that can be used to make the GPU read and write memory. For ARM’s Mali GPU, there is no public information about the instruction set (it is a trade secret). The only leads are the Bifrost and Panfrost projects led by Alyssa Rosenzweig. I spent a long time trying to write a piece of binary code that could run directly on the Mali GPU, and eventually found that this road was full of difficulties.

If there is no way to read and write physical pages from the GPU, this vulnerability can only achieve information disclosure. Are we really out of options?

We know that most software has a typical layered architecture and achieves complex functionality through successive abstraction. For the GPU in particular, even if we know nothing about the instruction set, we can still make it draw graphics. This is thanks to OpenGL, which abstracts the lower layers and hides the differences between hardware. However, OpenGL is graphics-oriented (points, lines, projection, clipping, and so on), and I did not find an interface in it that can freely access memory at a specific location.

In fact, today’s GPUs are not only used for drawing graphics; they are also used for compute-intensive work. In ordinary mathematical computation, reading a variable from memory and writing a result back to memory are basic operations. Can we make the GPU read and write physical pages through such a higher-level interface?

OpenCL

While browsing Wikipedia’s introduction to OpenCL, I saw hope:

There are many OpenCL code examples on the Internet, so I won’t introduce them in detail here. I only show the OpenCL code used in the exploit I implemented.

Fragment 1: Leaking a memory address

The OpenCL library allocates the relevant memory itself, and I need to know the address it allocated. The code above gives me that address.

Fragment 2: Reading from an arbitrary address

The code above implements arbitrary address reads on the GPU. Because so many physical pages are mapped, we can speed up the process through parallel programming 😉

I trust you have grasped the essentials by now, so arbitrary address writes are not shown here.

Root privilege escalation

Since we can map a large number of physical pages, these pages may hold application code or data, and may also hold kernel code or data. The kernel exposes a large number of data structures, so there are many ways to escalate to root; I won’t introduce them one by one here. The following is the local privilege escalation I achieved on one phone (100% success rate):

Patch

The root cause of the vulnerability is that, in the alias operation, the increment of gpu_alloc->gpu_mappings comes too late, which allows the relevant physical pages to be added to the to-be-released list before the mmap system call is made. The idea of the patch is to move the increment of gpu_alloc->gpu_mappings forward into kbase_mem_alias():
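The effect of the fix can be shown on a minimal model of the counter. This is a sketch of the idea only, not the actual patch: the function names are hypothetical and the real change lives in the driver source.

```python
class PhyAlloc:
    def __init__(self):
        self.gpu_mappings = 0


def gpu_mmap_native(alloc):
    # The original region is mapped once by its own mmap call.
    alloc.gpu_mappings += 1


def mem_alias_patched(source):
    # Patched behaviour: count the alias as a mapping as soon as it is
    # created, instead of waiting for the later mmap step.
    source.gpu_mappings += 1
    return {"aliased": source}


def flags_change_dont_need(alloc, evict_list):
    # Same prerequisite as before: refuse if mapped at more than one address.
    if alloc.gpu_mappings > 1:
        return False
    evict_list.append(alloc)
    return True
```

With the increment moved forward, an allocation that has an alias already counts two mappings, so the BASE_MEM_DONT_NEED request is refused and its pages never reach the evict list. An allocation without an alias behaves as before.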

Summary

This article has analyzed in detail a logic vulnerability in the ARM Mali GPU driver. The vulnerability allows an attacker to:

1 Steal the runtime memory data of other apps

2 Modify the code of other apps

3 Steal the kernel runtime memory data

4 Obtain root permission reliably

Prior to this, as far as I know, there was no public information on how to exploit this vulnerability. This article points out a feasible method: by using OpenCL to sidestep the private instruction set of the GPU, the GPU can be made to read and write arbitrary memory.