Compare commits


39 Commits

Author SHA1 Message Date
Daniel Micay
749640c274 update copyright notice 2024-02-15 02:57:33 -05:00
Dmitry Muhomor
7268189933 mte: use tag 0 for freed slots, stop reserving tag 15 2024-01-23 12:56:54 -05:00
Dmitry Muhomor
3c1f40aff0 amend memory tagging README section
Memory tagging is enabled by default in bionic, but can be disabled at any point.
Memory tagging can't be re-enabled after it's disabled.
2024-01-23 12:56:54 -05:00
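As a rough illustration of the bionic behavior described in that commit, a process can lower its heap tagging level at runtime through bionic's `mallopt()` extensions (the same interface `memtag_test.cc` below uses to select the ASYNC level), but once tagging is disabled it stays disabled. This is a hypothetical stand-alone sketch, not code from this repository:

```c
// Hedged sketch: assumes bionic's <malloc.h> extensions
// (M_BIONIC_SET_HEAP_TAGGING_LEVEL and the M_HEAP_TAGGING_LEVEL_* values).
#include <malloc.h>
#include <stdio.h>

int main(void) {
    // Lowering the level (here to NONE) is allowed at any point...
    if (mallopt(M_BIONIC_SET_HEAP_TAGGING_LEVEL, M_HEAP_TAGGING_LEVEL_NONE) != 1) {
        fprintf(stderr, "disabling heap tagging failed\n");
    }
    // ...but raising it again after it has been disabled is expected to be rejected.
    if (mallopt(M_BIONIC_SET_HEAP_TAGGING_LEVEL, M_HEAP_TAGGING_LEVEL_ASYNC) != 1) {
        fprintf(stderr, "re-enabling heap tagging was rejected, as described above\n");
    }
    return 0;
}
```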
Dmitry Muhomor
5fbbdc2ef8 memtag_test: add test for MADV_DONTNEED behavior 2024-01-23 12:56:54 -05:00
Dmitry Muhomor
7d2151e40c mte: remove util.h dependency from arm_mte.h
This is needed for including arm_mte.h in memtag_test.cc
2024-01-23 12:56:54 -05:00
Dmitry Muhomor
4756716904 memtag_test: move SEGV code checks to device-side binary 2024-01-23 12:56:54 -05:00
Daniel Micay
a3bf742c3e remove trailing whitespace 2024-01-03 14:44:08 -05:00
Julien Voisin
53a45b4661 Improve a bit the formulation of the MTE documentation 2024-01-03 13:40:42 -05:00
Daniel Micay
abe54dba27 update memory tagging documentation 2024-01-03 12:22:56 -05:00
Dmitry Muhomor
365ee6900d android: restore the default SIGABRT handler in fatal_error()
async_safe_fatal() calls abort() at the end, which can be intercepted by a custom SIGABRT handler.

In particular, crashlytics installs such a handler and tries to fork() after catching SIGABRT.

hardened_malloc uses pthread_atfork() to register fork handlers. These handlers try to lock internal
hardened_malloc mutexes. If at least one of those mutexes is already locked, which is usually the
case, the thread that called fatal_error() gets deadlocked, while the other threads (if there are any)
continue to run.
2023-12-31 11:21:28 -05:00
Christian Göttsche
7093fdc482 README: add note about AppArmor constraint on Debian 2023-12-14 09:06:32 -05:00
jvoisin
61821b02c8 Clarify a bit why a particular magic number was chosen 2023-11-16 14:25:54 -05:00
Daniel Micay
3c274731ba Revert "use safe_flag for -fstack-clash-protection"
This reverts commit 4171bd164e.
2023-11-14 16:19:33 -05:00
Daniel Micay
4171bd164e use safe_flag for -fstack-clash-protection 2023-11-08 14:21:04 -05:00
jvoisin
352c083f65 Run the testsuite on multiple compiler versions 2023-11-05 17:58:32 -05:00
Dmitry Muhomor
88b3c1acf9 memtag_test: fix sporadic failures of overflow/underflow tests 2023-11-01 17:33:20 -04:00
Daniel Micay
f793a3edf6 update README now that MTE is implemented 2023-10-30 14:23:48 -04:00
Dmitry Muhomor
fd75fc1ba8 mte: add scudo to CREDITS file 2023-10-30 14:20:53 -04:00
Dmitry Muhomor
72dc236d5f mte: add untag_pointer() variant for const pointers 2023-10-30 14:20:53 -04:00
Dmitry Muhomor
be08eeee2d mte: update comment about skipped tag array update in deallocate_small() 2023-10-30 14:20:53 -04:00
Dmitry Muhomor
25f0fe9c69 remove an always-true sizeof(u8) assert 2023-10-30 14:20:53 -04:00
Dmitry Muhomor
c75cb4c3f3 mte: refactor tag_and_clear_slab_slot()
Explicitly call is_memtag_enabled() before calling tag_and_clear_slab_slot() to make it clearer that
memory is not zeroed when MTE is disabled.
2023-10-30 14:20:53 -04:00
Dmitry Muhomor
b560431c01 mte: note why 0 tag is excluded 2023-10-30 14:20:53 -04:00
Dmitry Muhomor
009f2dad76 mte: note alignment requirements of arm_mte_tag_and_clear_mem() 2023-10-30 14:20:53 -04:00
Dmitry Muhomor
03883eb2ce mte: rename arm_mte_store_tags_and_clear() to arm_mte_tag_and_clear_mem() 2023-10-30 14:20:53 -04:00
Dmitry Muhomor
7a6dbd8152 mte: add comment about the reserved slab canary value 2023-10-30 14:20:53 -04:00
Dmitry Muhomor
f16ef601d4 memtag_test: improve capturing of test results
Using debuggerd + logcat parsing is unreliable and slow, print SEGV signal code to stderr instead.
2023-10-30 14:20:53 -04:00
Dmitry Muhomor
155800526a memtag_test: improve tag_distinctness test
- check that tag distinctness checks are actually reached (it was previously verified manually by
looking at the now-removed printf output)
- check that only non-reserved tags are used
- check that all of the non-reserved tags are used
- print tag usage statistics at the end of the run
2023-10-30 14:20:53 -04:00
Dmitry Muhomor
28d5d394cf memtag_test: remove usages of rand()
It didn't work correctly due to not being seeded and its usage wasn't necessary.
2023-10-30 14:20:53 -04:00
Dmitry Muhomor
577d9583eb mte: add licensing info for code that was copied from scudo 2023-10-30 14:20:53 -04:00
Dmitry Muhomor
93aa9eefe4 mte: make h_malloc_disable_memory_tagging() thread-safe 2023-10-30 14:20:53 -04:00
Dmitry Muhomor
01a199e19e mte: move is_memtag_enabled to read-only allocator data 2023-10-30 14:20:53 -04:00
Dmitry Muhomor
576328b1b4 android: add MTE tests
To run them, connect an MTE-enabled device via adb and execute `atest HMallocTest:MemtagTest`.

Since these tests are not deterministic (and neither is hardened_malloc itself), it's better to run
them multiple times, e.g. `atest --iterations 30 HMallocTest:MemtagTest`.

There are also CTS tests that are useful for checking correctness of the Android integration:
`atest CtsTaggingHostTestCases`
2023-10-30 14:20:53 -04:00
Dmitry Muhomor
5137d2da4d android: enable MTE on devices that declare having it 2023-10-30 14:20:53 -04:00
Dmitry Muhomor
f042a6b9b0 android: add function for disabling MTE at runtime
On Android, MTE is always enabled in Zygote, and is disabled after fork for apps that didn't opt-in
to MTE.

Depends on the slab canary adjustments in the previous commit.
2023-10-30 14:20:53 -04:00
Dmitry Muhomor
001fc86585 mte: disable slab canaries when MTE is on
Canary with the "0" value is now reserved to support re-enabling slab canaries if MTE is turned off
at runtime.
2023-10-30 14:20:53 -04:00
Dmitry Muhomor
70c91f4c3e mte: disable write-after-free check for slab allocations when MTE is on
Freed slab memory is tagged with a reserved tag value that is never used for live allocations.
2023-10-30 14:20:53 -04:00
Dmitry Muhomor
e3686ae457 add support for Arm MTE memory tagging
- tag slab allocations with [1..14] tags
- tag freed slab allocations with the "15" tag value to detect accesses to freed slab memory
- when generating tag value for a slab slot, always exclude most recent tag value for that slot
(to make use-after-free detection more reliable) and most recent tag values of its immediate
neighbors (to detect linear overflows and underflows)
2023-10-30 14:20:53 -04:00
Dmitry Muhomor
19a46e0f96 add helper functions for using u8 array as u4 array 2023-10-30 14:20:53 -04:00
18 changed files with 1121 additions and 79 deletions


@ -9,14 +9,28 @@ on:
jobs:
  build-ubuntu-gcc:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        version: [12]
    steps:
      - uses: actions/checkout@v4
      - name: Setting up gcc version
        run: |
          sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-${{ matrix.version }} 100
          sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-${{ matrix.version }} 100
      - name: Build
        run: make test
  build-ubuntu-clang:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        version: [14, 15]
    steps:
      - uses: actions/checkout@v4
      - name: Setting up clang version
        run: |
          sudo update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-${{ matrix.version }} 100
          sudo update-alternatives --install /usr/bin/clang clang /usr/bin/clang-${{ matrix.version }} 100
      - name: Build
        run: CC=clang CXX=clang++ make test
  build-musl:


@ -73,6 +73,9 @@ cc_library {
        debuggable: {
            cflags: ["-DLABEL_MEMORY"],
        },
        device_has_arm_mte: {
            cflags: ["-DHAS_ARM_MTE", "-march=armv9-a+memtag"]
        },
    },
    apex_available: [
        "com.android.runtime",

227
CREDITS

@ -54,3 +54,230 @@ libdivide:
random.c get_random_{type}_uniform functions are based on Fast Random Integer
Generation in an Interval by Daniel Lemire
arm_mte.h arm_mte_tag_and_clear_mem function contents were copied from storeTags function in scudo:
==============================================================================
The LLVM Project is under the Apache License v2.0 with LLVM Exceptions:
==============================================================================
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
---- LLVM Exceptions to the Apache 2.0 License ----
As an exception, if, as a result of your compiling your source code, portions
of this Software are embedded into an Object form of such source code, you
may redistribute such embedded portions in such Object form without complying
with the conditions of Sections 4(a), 4(b) and 4(d) of the License.
In addition, if you combine or link compiled forms of this Software with
software that is licensed under the GPLv2 ("Combined Software") and if a
court of competent jurisdiction determines that the patent provision (Section
3), the indemnity provision (Section 9) or other Section of the License
conflicts with the conditions of the GPLv2, you may retroactively and
prospectively choose to deem waived or otherwise exclude such Section(s) of
the License, but only in their entirety and only with respect to the Combined
Software.
==============================================================================


@ -1,4 +1,4 @@
Copyright © 2018-2023 GrapheneOS → Copyright © 2018-2024 GrapheneOS

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal

106
README.md

@ -159,6 +159,9 @@ line to the `/etc/ld.so.preload` configuration file:
The format of this configuration file is a whitespace-separated list, so it's
good practice to put each library on a separate line.

On Debian systems `libhardened_malloc.so` should be installed into `/usr/lib/`
to avoid preload failures caused by AppArmor profile restrictions.

Using the `LD_PRELOAD` environment variable to load it on a case-by-case basis
will not work when `AT_SECURE` is set such as with setuid binaries. It's also
generally not a recommended approach for production usage. The recommendation
@ -470,16 +473,16 @@ was a bit less important and if a core goal was finding latent bugs.
* Errors other than ENOMEM from mmap, munmap, mprotect and mremap treated
  as fatal, which can help to detect memory management gone wrong elsewhere
  in the process.
* Memory tagging for slab allocations via MTE on ARMv8.5+
  * random memory tags as the baseline, providing probabilistic protection
    against various forms of memory corruption
  * dedicated tag for free slots, set on free, for deterministic protection
    against accessing freed memory
  * store previous random tag within freed slab allocations, and increment it
    to get the next tag for that slot to provide deterministic use-after-free
    detection through multiple cycles of memory reuse
  * guarantee distinct tags for adjacent memory allocations by incrementing
    past matching values for deterministic detection of linear overflows

## Randomness
@ -721,77 +724,46 @@ freeing as there would be if the kernel supported these features directly.
## Memory tagging

Removed:

Integrating extensive support for ARMv8.5 memory tagging is planned and this
section will be expanded to cover the details on the chosen design. The approach
for slab allocations is currently covered, but it can also be used for the
allocator metadata region and large allocations.

Memory allocations are already always multiples of naturally aligned 16 byte
units, so memory tags are a natural fit into a malloc implementation due to the
16 byte alignment requirement. The only extra memory consumption will come from
the hardware supported storage for the tag values (4 bits per 16 bytes).

The baseline policy will be to generate random tags for each slab allocation
slot on first use. The highest value will be reserved for marking freed memory
allocations to detect any accesses to freed memory so it won't be part of the
generated range. Adjacent slots will be guaranteed to have distinct memory tags
in order to guarantee that linear overflows are detected. There are a few ways
of implementing this and it will end up depending on the performance costs of
different approaches. If there's an efficient way to fetch the adjacent tag
values without wasting extra memory, it will be possible to check for them and
skip them either by generating a new random value in a loop or incrementing
past them since the tiny bit of bias wouldn't matter. Another approach would be
alternating odd and even tag values but that would substantially reduce the
overall randomness of the tags and there's very little entropy from the start.

Once a slab allocation has been freed, the tag will be set to the reserved
value for free memory and the previous tag value will be stored inside the
allocation itself. The next time the slot is allocated, the chosen tag value
will be the previous value incremented by one to provide use-after-free
detection between generations of allocations. The stored tag will be wiped
before retagging the memory, to avoid leaking it and as part of preserving the
security property of newly allocated memory being zeroed due to zero-on-free.
It will eventually wrap all the way around, but this ends up providing a strong
guarantee for many allocation cycles due to the combination of 4 bit tags with
the FIFO quarantine feature providing delayed free. It also benefits from
random slot allocation and the randomized portion of delayed free, which result
in a further delay along with preventing a deterministic bypass by forcing a
reuse after a certain number of allocation cycles. Similarly to the initial tag
generation, tag values for adjacent allocations will be skipped by incrementing
past them.

For example, consider this slab of allocations that are not yet used with 15
representing the tag for free memory. For the sake of simplicity, there will be
no quarantine or other slabs for this example:

| 15 | 15 | 15 | 15 | 15 | 15 |

Three slots are randomly chosen for allocations, with random tags assigned (2,
7, 14) since these slots haven't ever been used and don't have saved values:

| 15 | 2 | 15 | 7 | 14 | 15 |

The 2nd allocation slot is freed, and is set back to the tag for free memory
(15), but with the previous tag value stored in the freed space:

| 15 | 15 | 15 | 7 | 14 | 15 |

The first slot is allocated for the first time, receiving the random value 3:

| 3 | 15 | 15 | 7 | 14 | 15 |

The 2nd slot is randomly chosen again, so the previous tag (2) is retrieved and
incremented to 3 as part of the use-after-free mitigation. An adjacent
allocation already uses the tag 3, so the tag is further incremented to 4 (it
would be incremented to 5 if one of the adjacent tags was 4):

| 3 | 4 | 15 | 7 | 14 | 15 |

The last slot is randomly chosen for the next allocation, and is assigned the
random value 14. However, it's placed next to an allocation with the tag 14 so
the tag is incremented and wraps around to 0:

| 3 | 4 | 15 | 7 | 14 | 0 |

Added:

Random tags are set for all slab allocations when allocated, with 4 excluded
values:

1. the reserved `0` tag
2. the previous tag used for the slot
3. the current (or previous) tag used for the slot to the left
4. the current (or previous) tag used for the slot to the right

When a slab allocation is freed, the reserved `0` tag is set for the slot.
Slab allocation slots are cleared before reuse when memory tagging is enabled.

This ensures the following properties:

- Linear overflows are deterministically detected.
- Use-after-free is deterministically detected until the freed slot goes through
  both the random and FIFO quarantines, gets allocated again, goes through both
  quarantines again and then finally gets allocated again for a 2nd time.
- Since the default `0` tag is reserved, untagged pointers can't access slab
  allocations and vice versa.

Slab allocations are done in a statically reserved region for each size class
and all metadata is in a statically reserved region, so interactions between
different uses of the same address space are not applicable.

Large allocations beyond the largest slab allocation size class (128k by
default) are guaranteed to have randomly sized guard regions to the left and
right. Random and FIFO address space quarantines provide use-after-free
detection. We need to test whether the cost of random tags is acceptable to
enable them by default, since they would be useful for:

- probabilistic detection of overflows
- probabilistic detection of use-after-free once the address space is
  out of the quarantine and reused for another allocation
- deterministic detection of use-after-free for reuse by another allocator

When memory tagging is enabled, checking for write-after-free at allocation
time and checking canaries are both disabled. Canaries will be more thoroughly
disabled when using memory tagging in the future, but Android currently has
[very dynamic memory tagging support](https://source.android.com/docs/security/test/memory-safety/arm-mte)
where it can be disabled at any time, which creates a barrier to optimizing
by disabling redundant features.
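A minimal sketch of the exclusion rule listed above, for illustration only; the allocator's actual implementation is `tag_and_clear_slab_slot()` in the `h_malloc.c` diff further down, which reads the stored tags from a packed per-slab array. The helper name and parameters here are hypothetical:

```c
// Illustrative only: builds the 4-value exclusion mask described above and asks
// the MTE IRG instruction (via the ACLE intrinsic) for a random tag outside it.
#include <arm_acle.h>   // __arm_mte_create_random_tag, requires -march=...+memtag
#include <stdint.h>

static void *pick_slot_tag(void *slot_ptr, uint8_t prev_tag,
                           uint8_t left_tag, uint8_t right_tag) {
    uint64_t exclusion_mask = (1u << 0)          // reserved 0 tag
                            | (1u << prev_tag)   // previous tag of this slot
                            | (1u << left_tag)   // tag of the left neighbor
                            | (1u << right_tag); // tag of the right neighbor
    return __arm_mte_create_random_tag(slot_ptr, exclusion_mask);
}
```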
## API extensions

25
androidtest/Android.bp Normal file

@ -0,0 +1,25 @@
java_test_host {
name: "HMallocTest",
srcs: [
"src/**/*.java",
],
libs: [
"tradefed",
"compatibility-tradefed",
"compatibility-host-util",
],
static_libs: [
"cts-host-utils",
"frameworks-base-hostutils",
],
test_suites: [
"general-tests",
],
data_device_bins_64: [
"memtag_test",
],
}


@ -0,0 +1,13 @@
<?xml version="1.0" encoding="utf-8"?>
<configuration description="hardened_malloc test">
<target_preparer class="com.android.compatibility.common.tradefed.targetprep.FilePusher">
<option name="cleanup" value="true" />
<option name="push" value="memtag_test->/data/local/tmp/memtag_test" />
</target_preparer>
<test class="com.android.compatibility.common.tradefed.testtype.JarHostTest" >
<option name="jar" value="HMallocTest.jar" />
</test>
</configuration>


@ -0,0 +1,17 @@
cc_test {
name: "memtag_test",
srcs: ["memtag_test.cc"],
cflags: [
"-Wall",
"-Werror",
"-Wextra",
"-O0",
"-march=armv9-a+memtag",
],
compile_multilib: "64",
sanitize: {
memtag_heap: true,
},
}


@ -0,0 +1,351 @@
// needed to unconditionally enable assertions
#undef NDEBUG
#include <assert.h>
#include <malloc.h>
#include <signal.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/utsname.h>
#include <unistd.h>
#include <map>
#include <set>
#include <string>
#include <unordered_map>
#include "../../arm_mte.h"
using namespace std;
using u8 = uint8_t;
using uptr = uintptr_t;
using u64 = uint64_t;
const size_t DEFAULT_ALLOC_SIZE = 8;
const size_t CANARY_SIZE = 8;
void do_context_switch() {
utsname s;
uname(&s);
}
u8 get_pointer_tag(void *ptr) {
return (((uptr) ptr) >> 56) & 0xf;
}
void *untag_pointer(void *ptr) {
const uintptr_t mask = UINTPTR_MAX >> 8;
return (void *) ((uintptr_t) ptr & mask);
}
void *set_pointer_tag(void *ptr, u8 tag) {
return (void *) (((uintptr_t) tag << 56) | (uintptr_t) untag_pointer(ptr));
}
// This test checks that slab slot allocation uses a tag that is distinct from the tags of its neighbors
// and from the tag of the previous allocation that used the same slot
void tag_distinctness() {
// tag 0 is reserved
const int min_tag = 1;
const int max_tag = 0xf;
struct SizeClass {
int size;
int slot_cnt;
};
// values from size_classes[] and size_class_slots[] in h_malloc.c
SizeClass size_classes[] = {
{ .size = 16, .slot_cnt = 256, },
{ .size = 32, .slot_cnt = 128, },
// this size class is used by allocations that are made by the addr_tag_map, which breaks
// tag distinctness checks
// { .size = 48, .slot_cnt = 85, },
{ .size = 64, .slot_cnt = 64, },
{ .size = 80, .slot_cnt = 51, },
{ .size = 96, .slot_cnt = 42, },
{ .size = 112, .slot_cnt = 36, },
{ .size = 128, .slot_cnt = 64, },
{ .size = 160, .slot_cnt = 51, },
{ .size = 192, .slot_cnt = 64, },
{ .size = 224, .slot_cnt = 54, },
{ .size = 10240, .slot_cnt = 6, },
{ .size = 20480, .slot_cnt = 1, },
};
int tag_usage[max_tag + 1];
for (size_t sc_idx = 0; sc_idx < sizeof(size_classes) / sizeof(SizeClass); ++sc_idx) {
SizeClass &sc = size_classes[sc_idx];
const size_t full_alloc_size = sc.size;
const size_t alloc_size = full_alloc_size - CANARY_SIZE;
// "tdc" is short for "tag distinctness check"
int left_neighbor_tdc_cnt = 0;
int right_neighbor_tdc_cnt = 0;
int prev_alloc_tdc_cnt = 0;
int iter_cnt = 600;
unordered_map<uptr, u8> addr_tag_map;
addr_tag_map.reserve(iter_cnt * sc.slot_cnt);
u64 seen_tags = 0;
for (int iter = 0; iter < iter_cnt; ++iter) {
uptr allocations[256]; // 256 is max slot count
for (int i = 0; i < sc.slot_cnt; ++i) {
u8 *p = (u8 *) malloc(alloc_size);
assert(p);
uptr addr = (uptr) untag_pointer(p);
u8 tag = get_pointer_tag(p);
assert(tag >= min_tag && tag <= max_tag);
seen_tags |= 1 << tag;
++tag_usage[tag];
// check most recent tags of left and right neighbors
auto left = addr_tag_map.find(addr - full_alloc_size);
if (left != addr_tag_map.end()) {
assert(left->second != tag);
++left_neighbor_tdc_cnt;
}
auto right = addr_tag_map.find(addr + full_alloc_size);
if (right != addr_tag_map.end()) {
assert(right->second != tag);
++right_neighbor_tdc_cnt;
}
// check previous tag of this slot
auto prev = addr_tag_map.find(addr);
if (prev != addr_tag_map.end()) {
assert(prev->second != tag);
++prev_alloc_tdc_cnt;
addr_tag_map.erase(addr);
}
addr_tag_map.emplace(addr, tag);
for (size_t j = 0; j < alloc_size; ++j) {
// check that slot is zeroed
assert(p[j] == 0);
// check that slot is readable and writable
p[j]++;
}
allocations[i] = addr;
}
// free some of allocations to allow their slots to be reused
for (int i = sc.slot_cnt - 1; i >= 0; i -= 2) {
free((void *) allocations[i]);
}
}
// check that all of the tags were used, except for the reserved tag 0
assert(seen_tags == (0xffff & ~(1 << 0)));
printf("size_class\t%i\t" "tdc_left %i\t" "tdc_right %i\t" "tdc_prev_alloc %i\n",
sc.size, left_neighbor_tdc_cnt, right_neighbor_tdc_cnt, prev_alloc_tdc_cnt);
// make sure tag distinctness checks were actually performed
int min_tdc_cnt = sc.slot_cnt * iter_cnt / 5;
assert(prev_alloc_tdc_cnt > min_tdc_cnt);
if (sc.slot_cnt > 1) {
assert(left_neighbor_tdc_cnt > min_tdc_cnt);
assert(right_neighbor_tdc_cnt > min_tdc_cnt);
}
// async tag check failures are reported on context switch
do_context_switch();
}
printf("\nTag use counters:\n");
int min = INT_MAX;
int max = 0;
double geomean = 0.0;
for (int i = min_tag; i <= max_tag; ++i) {
int v = tag_usage[i];
geomean += log(v);
min = std::min(min, v);
max = std::max(max, v);
printf("%i\t%i\n", i, tag_usage[i]);
}
int tag_cnt = 1 + max_tag - min_tag;
geomean = exp(geomean / tag_cnt);
double max_deviation = std::max((double) max - geomean, geomean - min);
printf("geomean: %.2f, max deviation from geomean: %.2f%%\n", geomean, (100.0 * max_deviation) / geomean);
}
u8* alloc_default() {
const size_t full_alloc_size = DEFAULT_ALLOC_SIZE + CANARY_SIZE;
set<uptr> addrs;
// make sure allocation has both left and right neighbors, otherwise overflow/underflow tests
// will fail when allocation is at the end/beginning of slab
for (;;) {
u8 *p = (u8 *) malloc(DEFAULT_ALLOC_SIZE);
assert(p);
uptr addr = (uptr) untag_pointer(p);
uptr left = addr - full_alloc_size;
if (addrs.find(left) != addrs.end()) {
uptr right = addr + full_alloc_size;
if (addrs.find(right) != addrs.end()) {
return p;
}
}
addrs.emplace(addr);
}
}
int expected_segv_code;
#define expect_segv(exp, segv_code) ({\
expected_segv_code = segv_code; \
volatile auto val = exp; \
(void) val; \
do_context_switch(); \
fprintf(stderr, "didn't receive SEGV code %i", segv_code); \
exit(1); })
// it's expected that the device is configured to use asymm MTE tag checking mode (sync read checks,
// async write checks)
#define expect_read_segv(exp) expect_segv(exp, SEGV_MTESERR)
#define expect_write_segv(exp) expect_segv(exp, SEGV_MTEAERR)
void read_after_free() {
u8 *p = alloc_default();
free(p);
expect_read_segv(p[0]);
}
void write_after_free() {
u8 *p = alloc_default();
free(p);
expect_write_segv(p[0] = 1);
}
void underflow_read() {
u8 *p = alloc_default();
expect_read_segv(p[-1]);
}
void underflow_write() {
u8 *p = alloc_default();
expect_write_segv(p[-1] = 1);
}
void overflow_read() {
u8 *p = alloc_default();
expect_read_segv(p[DEFAULT_ALLOC_SIZE + CANARY_SIZE]);
}
void overflow_write() {
u8 *p = alloc_default();
expect_write_segv(p[DEFAULT_ALLOC_SIZE + CANARY_SIZE] = 1);
}
void untagged_read() {
u8 *p = alloc_default();
p = (u8 *) untag_pointer(p);
expect_read_segv(p[0]);
}
void untagged_write() {
u8 *p = alloc_default();
p = (u8 *) untag_pointer(p);
expect_write_segv(p[0] = 1);
}
// checks that each memory location inside the buffer is tagged with expected_tag
void check_tag(void *buf, size_t len, u8 expected_tag) {
for (size_t i = 0; i < len; ++i) {
assert(get_pointer_tag(__arm_mte_get_tag((void *) ((uintptr_t) buf + i))) == expected_tag);
}
}
void madvise_dontneed() {
const size_t len = 100'000;
void *ptr = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_MTE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
assert(ptr != MAP_FAILED);
// check that 0 is the initial tag
check_tag(ptr, len, 0);
arm_mte_tag_and_clear_mem(set_pointer_tag(ptr, 1), len);
check_tag(ptr, len, 1);
memset(set_pointer_tag(ptr, 1), 1, len);
assert(madvise(ptr, len, MADV_DONTNEED) == 0);
// check that MADV_DONTNEED resets the tag
check_tag(ptr, len, 0);
// check that MADV_DONTNEED clears the memory
for (size_t i = 0; i < len; ++i) {
assert(((u8 *) ptr)[i] == 0);
}
// check that mistagged read after MADV_DONTNEED fails
expect_read_segv(*((u8 *) set_pointer_tag(ptr, 1)));
}
map<string, function<void()>> tests = {
#define TEST(s) { #s, s }
TEST(tag_distinctness),
TEST(read_after_free),
TEST(write_after_free),
TEST(overflow_read),
TEST(overflow_write),
TEST(underflow_read),
TEST(underflow_write),
TEST(untagged_read),
TEST(untagged_write),
TEST(madvise_dontneed),
#undef TEST
};
void segv_handler(int, siginfo_t *si, void *) {
if (expected_segv_code == 0 || expected_segv_code != si->si_code) {
fprintf(stderr, "received unexpected SEGV_CODE %i", si->si_code);
exit(139); // standard exit code for SIGSEGV
}
exit(0);
}
int main(int argc, char **argv) {
setbuf(stdout, NULL);
assert(argc == 2);
auto test_name = string(argv[1]);
auto test_fn = tests[test_name];
assert(test_fn != nullptr);
assert(mallopt(M_BIONIC_SET_HEAP_TAGGING_LEVEL, M_HEAP_TAGGING_LEVEL_ASYNC) == 1);
struct sigaction sa = {
.sa_sigaction = segv_handler,
.sa_flags = SA_SIGINFO,
};
assert(sigaction(SIGSEGV, &sa, nullptr) == 0);
test_fn();
do_context_switch();
return 0;
}


@ -0,0 +1,79 @@
package grapheneos.hmalloc;
import com.android.tradefed.device.DeviceNotAvailableException;
import com.android.tradefed.testtype.DeviceJUnit4ClassRunner;
import com.android.tradefed.testtype.junit4.BaseHostJUnit4Test;
import org.junit.Test;
import org.junit.runner.RunWith;
import java.util.ArrayList;
import static org.junit.Assert.assertEquals;
@RunWith(DeviceJUnit4ClassRunner.class)
public class MemtagTest extends BaseHostJUnit4Test {
private static final String TEST_BINARY = "/data/local/tmp/memtag_test";
private void runTest(String name) throws DeviceNotAvailableException {
var args = new ArrayList<String>();
args.add(TEST_BINARY);
args.add(name);
String cmdLine = String.join(" ", args);
var result = getDevice().executeShellV2Command(cmdLine);
assertEquals("stderr", "", result.getStderr());
assertEquals("process exit code", 0, result.getExitCode().intValue());
}
@Test
public void tag_distinctness() throws DeviceNotAvailableException {
runTest("tag_distinctness");
}
@Test
public void read_after_free() throws DeviceNotAvailableException {
runTest("read_after_free");
}
@Test
public void write_after_free() throws DeviceNotAvailableException {
runTest("write_after_free");
}
@Test
public void underflow_read() throws DeviceNotAvailableException {
runTest("underflow_read");
}
@Test
public void underflow_write() throws DeviceNotAvailableException {
runTest("underflow_write");
}
@Test
public void overflow_read() throws DeviceNotAvailableException {
runTest("overflow_read");
}
@Test
public void overflow_write() throws DeviceNotAvailableException {
runTest("overflow_write");
}
@Test
public void untagged_read() throws DeviceNotAvailableException {
runTest("untagged_read");
}
@Test
public void untagged_write() throws DeviceNotAvailableException {
runTest("untagged_write");
}
@Test
public void madvise_dontneed() throws DeviceNotAvailableException {
runTest("madvise_dontneed");
}
}

91
arm_mte.h Normal file

@ -0,0 +1,91 @@
#ifndef ARM_MTE_H
#define ARM_MTE_H
#include <arm_acle.h>
#include <stdint.h>
// Returns a tagged pointer.
// See https://developer.arm.com/documentation/ddi0602/2023-09/Base-Instructions/IRG--Insert-Random-Tag-
static inline void *arm_mte_create_random_tag(void *p, uint64_t exclusion_mask) {
return __arm_mte_create_random_tag(p, exclusion_mask);
}
// Tag the memory region with the tag specified in tag bits of tagged_ptr. Memory region itself is
// zeroed.
// tagged_ptr has to be aligned by 16, and len has to be a multiple of 16 (tag granule size).
//
// Arm's software optimization guide says:
// "it is recommended to use STZGM (or DCZGVA) to set tag if data is not a concern." (STZGM and
// DCGZVA are zeroing variants of tagging instructions).
//
// Contents of this function were copied from scudo:
// https://android.googlesource.com/platform/external/scudo/+/refs/tags/android-14.0.0_r1/standalone/memtag.h#167
//
// scudo is licensed under the Apache License v2.0 with LLVM Exceptions, which is compatible with
// the hardened_malloc's MIT license
static inline void arm_mte_tag_and_clear_mem(void *tagged_ptr, size_t len) {
uintptr_t Begin = (uintptr_t) tagged_ptr;
uintptr_t End = Begin + len;
uintptr_t LineSize, Next, Tmp;
__asm__ __volatile__(
".arch_extension memtag \n\t"
// Compute the cache line size in bytes (DCZID_EL0 stores it as the log2
// of the number of 4-byte words) and bail out to the slow path if DCZID_EL0
// indicates that the DC instructions are unavailable.
"DCZID .req %[Tmp] \n\t"
"mrs DCZID, dczid_el0 \n\t"
"tbnz DCZID, #4, 3f \n\t"
"and DCZID, DCZID, #15 \n\t"
"mov %[LineSize], #4 \n\t"
"lsl %[LineSize], %[LineSize], DCZID \n\t"
".unreq DCZID \n\t"
// Our main loop doesn't handle the case where we don't need to perform any
// DC GZVA operations. If the size of our tagged region is less than
// twice the cache line size, bail out to the slow path since it's not
// guaranteed that we'll be able to do a DC GZVA.
"Size .req %[Tmp] \n\t"
"sub Size, %[End], %[Cur] \n\t"
"cmp Size, %[LineSize], lsl #1 \n\t"
"b.lt 3f \n\t"
".unreq Size \n\t"
"LineMask .req %[Tmp] \n\t"
"sub LineMask, %[LineSize], #1 \n\t"
// STZG until the start of the next cache line.
"orr %[Next], %[Cur], LineMask \n\t"
"1:\n\t"
"stzg %[Cur], [%[Cur]], #16 \n\t"
"cmp %[Cur], %[Next] \n\t"
"b.lt 1b \n\t"
// DC GZVA cache lines until we have no more full cache lines.
"bic %[Next], %[End], LineMask \n\t"
".unreq LineMask \n\t"
"2: \n\t"
"dc gzva, %[Cur] \n\t"
"add %[Cur], %[Cur], %[LineSize] \n\t"
"cmp %[Cur], %[Next] \n\t"
"b.lt 2b \n\t"
// STZG until the end of the tagged region. This loop is also used to handle
// slow path cases.
"3: \n\t"
"cmp %[Cur], %[End] \n\t"
"b.ge 4f \n\t"
"stzg %[Cur], [%[Cur]], #16 \n\t"
"b 3b \n\t"
"4: \n\t"
: [Cur] "+&r"(Begin), [LineSize] "=&r"(LineSize), [Next] "=&r"(Next), [Tmp] "=&r"(Tmp)
: [End] "r"(End)
: "memory"
);
}
#endif


@ -14,6 +14,7 @@
#include "h_malloc.h" #include "h_malloc.h"
#include "memory.h" #include "memory.h"
#include "memtag.h"
#include "mutex.h" #include "mutex.h"
#include "pages.h" #include "pages.h"
#include "random.h" #include "random.h"
@ -75,6 +76,9 @@ static union {
struct region_metadata *regions[2]; struct region_metadata *regions[2];
#ifdef USE_PKEY #ifdef USE_PKEY
int metadata_pkey; int metadata_pkey;
#endif
#ifdef MEMTAG
bool is_memtag_disabled;
#endif #endif
}; };
char padding[PAGE_SIZE]; char padding[PAGE_SIZE];
@ -84,6 +88,12 @@ static inline void *get_slab_region_end(void) {
return atomic_load_explicit(&ro.slab_region_end, memory_order_acquire); return atomic_load_explicit(&ro.slab_region_end, memory_order_acquire);
} }
#ifdef MEMTAG
static inline bool is_memtag_enabled(void) {
return !ro.is_memtag_disabled;
}
#endif
#define SLAB_METADATA_COUNT #define SLAB_METADATA_COUNT
struct slab_metadata { struct slab_metadata {
@ -99,6 +109,18 @@ struct slab_metadata {
#if SLAB_QUARANTINE #if SLAB_QUARANTINE
u64 quarantine_bitmap[4]; u64 quarantine_bitmap[4];
#endif #endif
#ifdef HAS_ARM_MTE
// arm_mte_tags is used as a u4 array (MTE tags are 4-bit wide)
//
// Its size is calculated by the following formula:
// (MAX_SLAB_SLOT_COUNT + 2) / 2
// MAX_SLAB_SLOT_COUNT is currently 256, 2 extra slots are needed for branchless handling of
// edge slots in tag_and_clear_slab_slot()
//
// It's intentionally placed at the end of struct to improve locality: for most size classes,
// slot count is far lower than MAX_SLAB_SLOT_COUNT.
u8 arm_mte_tags[129];
#endif
}; };
static const size_t min_align = 16; static const size_t min_align = 16;
@ -447,6 +469,12 @@ static void write_after_free_check(const char *p, size_t size) {
return; return;
} }
#ifdef HAS_ARM_MTE
if (likely(is_memtag_enabled())) {
return;
}
#endif
for (size_t i = 0; i < size; i += sizeof(u64)) { for (size_t i = 0; i < size; i += sizeof(u64)) {
if (unlikely(*(const u64 *)(const void *)(p + i))) { if (unlikely(*(const u64 *)(const void *)(p + i))) {
fatal_error("detected write after free"); fatal_error("detected write after free");
@ -461,19 +489,48 @@ static void set_slab_canary_value(UNUSED struct slab_metadata *metadata, UNUSED
0x00ffffffffffffffUL; 0x00ffffffffffffffUL;
metadata->canary_value = get_random_u64(rng) & canary_mask; metadata->canary_value = get_random_u64(rng) & canary_mask;
#ifdef HAS_ARM_MTE
if (unlikely(metadata->canary_value == 0)) {
// 0 is reserved to support disabling MTE at runtime (this is required on Android).
// When MTE is enabled, writing and reading of canaries is disabled, i.e. canary remains zeroed.
// After MTE is disabled, canaries that are set to 0 are ignored, since they wouldn't match
// slab's metadata->canary_value.
// 0x100 was chosen arbitrarily, and can be encoded as an immediate value on ARM by the compiler.
metadata->canary_value = 0x100;
}
#endif
#endif #endif
} }
static void set_canary(UNUSED const struct slab_metadata *metadata, UNUSED void *p, UNUSED size_t size) { static void set_canary(UNUSED const struct slab_metadata *metadata, UNUSED void *p, UNUSED size_t size) {
#if SLAB_CANARY #if SLAB_CANARY
#ifdef HAS_ARM_MTE
if (likely(is_memtag_enabled())) {
return;
}
#endif
memcpy((char *)p + size - canary_size, &metadata->canary_value, canary_size); memcpy((char *)p + size - canary_size, &metadata->canary_value, canary_size);
#endif #endif
} }
static void check_canary(UNUSED const struct slab_metadata *metadata, UNUSED const void *p, UNUSED size_t size) { static void check_canary(UNUSED const struct slab_metadata *metadata, UNUSED const void *p, UNUSED size_t size) {
#if SLAB_CANARY #if SLAB_CANARY
#ifdef HAS_ARM_MTE
if (likely(is_memtag_enabled())) {
return;
}
#endif
u64 canary_value; u64 canary_value;
memcpy(&canary_value, (const char *)p + size - canary_size, canary_size); memcpy(&canary_value, (const char *)p + size - canary_size, canary_size);
#ifdef HAS_ARM_MTE
if (unlikely(canary_value == 0)) {
return;
}
#endif
if (unlikely(canary_value != metadata->canary_value)) { if (unlikely(canary_value != metadata->canary_value)) {
fatal_error("canary corrupted"); fatal_error("canary corrupted");
} }
@ -506,6 +563,38 @@ static inline void stats_slab_deallocate(UNUSED struct size_class *c, UNUSED siz
#endif #endif
} }
#ifdef HAS_ARM_MTE
static void *tag_and_clear_slab_slot(struct slab_metadata *metadata, void *slot_ptr, size_t slot_idx, size_t slot_size) {
// arm_mte_tags is an array of 4-bit unsigned integers stored as u8 array (MTE tags are 4-bit wide)
//
// It stores the most recent tag for each slab slot, or 0 if the slot was never used.
// Slab indices in arm_mte_tags array are shifted to the right by 1, and size of this array
// is (MAX_SLAB_SLOT_COUNT + 2). This means that first and last values of arm_mte_tags array
are always 0, which allows edge slots to be handled in a branchless way when the tag exclusion mask
// is constructed.
u8 *slot_tags = metadata->arm_mte_tags;
// tag exclusion mask
u64 tem = (1 << RESERVED_TAG);
// current or previous tag of left neighbor or 0 if there's no left neighbor or if it was never used
tem |= (1 << u4_arr_get(slot_tags, slot_idx));
// previous tag of this slot or 0 if it was never used
tem |= (1 << u4_arr_get(slot_tags, slot_idx + 1));
// current or previous tag of right neighbor or 0 if there's no right neighbor or if it was never used
tem |= (1 << u4_arr_get(slot_tags, slot_idx + 2));
void *tagged_ptr = arm_mte_create_random_tag(slot_ptr, tem);
// slot addresses and sizes are always aligned by 16
arm_mte_tag_and_clear_mem(tagged_ptr, slot_size);
// store new tag of this slot
u4_arr_set(slot_tags, slot_idx + 1, get_pointer_tag(tagged_ptr));
return tagged_ptr;
}
#endif
static inline void *allocate_small(unsigned arena, size_t requested_size) { static inline void *allocate_small(unsigned arena, size_t requested_size) {
struct size_info info = get_size_info(requested_size); struct size_info info = get_size_info(requested_size);
size_t size = likely(info.size) ? info.size : 16; size_t size = likely(info.size) ? info.size : 16;
@ -534,6 +623,11 @@ static inline void *allocate_small(unsigned arena, size_t requested_size) {
if (requested_size) { if (requested_size) {
write_after_free_check(p, size - canary_size); write_after_free_check(p, size - canary_size);
set_canary(metadata, p, size); set_canary(metadata, p, size);
#ifdef HAS_ARM_MTE
if (likely(is_memtag_enabled())) {
p = tag_and_clear_slab_slot(metadata, p, slot, size);
}
#endif
} }
stats_small_allocate(c, size); stats_small_allocate(c, size);
@ -566,6 +660,11 @@ static inline void *allocate_small(unsigned arena, size_t requested_size) {
void *p = slot_pointer(size, slab, slot); void *p = slot_pointer(size, slab, slot);
if (requested_size) { if (requested_size) {
set_canary(metadata, p, size); set_canary(metadata, p, size);
#ifdef HAS_ARM_MTE
if (likely(is_memtag_enabled())) {
p = tag_and_clear_slab_slot(metadata, p, slot, size);
}
#endif
} }
stats_slab_allocate(c, slab_size); stats_slab_allocate(c, slab_size);
stats_small_allocate(c, size); stats_small_allocate(c, size);
@ -588,6 +687,11 @@ static inline void *allocate_small(unsigned arena, size_t requested_size) {
void *p = slot_pointer(size, slab, slot); void *p = slot_pointer(size, slab, slot);
if (requested_size) { if (requested_size) {
set_canary(metadata, p, size); set_canary(metadata, p, size);
#ifdef HAS_ARM_MTE
if (likely(is_memtag_enabled())) {
p = tag_and_clear_slab_slot(metadata, p, slot, size);
}
#endif
} }
stats_slab_allocate(c, slab_size); stats_slab_allocate(c, slab_size);
stats_small_allocate(c, size); stats_small_allocate(c, size);
@ -612,6 +716,11 @@ static inline void *allocate_small(unsigned arena, size_t requested_size) {
if (requested_size) { if (requested_size) {
write_after_free_check(p, size - canary_size); write_after_free_check(p, size - canary_size);
set_canary(metadata, p, size); set_canary(metadata, p, size);
#ifdef HAS_ARM_MTE
if (likely(is_memtag_enabled())) {
p = tag_and_clear_slab_slot(metadata, p, slot, size);
}
#endif
} }
stats_small_allocate(c, size); stats_small_allocate(c, size);
@ -694,7 +803,16 @@ static inline void deallocate_small(void *p, const size_t *expected_size) {
if (likely(!is_zero_size)) { if (likely(!is_zero_size)) {
check_canary(metadata, p, size); check_canary(metadata, p, size);
bool skip_zero = false;
#ifdef HAS_ARM_MTE
if (likely(is_memtag_enabled())) {
arm_mte_tag_and_clear_mem(set_pointer_tag(p, RESERVED_TAG), size);
// metadata->arm_mte_tags is intentionally not updated, see tag_and_clear_slab_slot()
skip_zero = true;
}
#endif
if (ZERO_ON_FREE && !skip_zero) {
memset(p, 0, size - canary_size); memset(p, 0, size - canary_size);
} }
} }
@ -1074,13 +1192,14 @@ static inline void enforce_init(void) {
} }
} }
static struct mutex init_lock = MUTEX_INITIALIZER;

COLD static void init_slow_path(void) {
mutex_lock(&init_lock);
if (unlikely(is_init())) { if (unlikely(is_init())) {
mutex_unlock(&init_lock);
return; return;
} }
@ -1123,8 +1242,15 @@ COLD static void init_slow_path(void) {
if (unlikely(memory_protect_rw_metadata(ra->regions, ra->total * sizeof(struct region_metadata)))) { if (unlikely(memory_protect_rw_metadata(ra->regions, ra->total * sizeof(struct region_metadata)))) {
fatal_error("failed to unprotect memory for regions table"); fatal_error("failed to unprotect memory for regions table");
} }
#ifdef HAS_ARM_MTE
if (likely(is_memtag_enabled())) {
ro.slab_region_start = memory_map_mte(slab_region_size);
} else {
ro.slab_region_start = memory_map(slab_region_size);
}
#else
ro.slab_region_start = memory_map(slab_region_size); ro.slab_region_start = memory_map(slab_region_size);
#endif
if (unlikely(ro.slab_region_start == NULL)) { if (unlikely(ro.slab_region_start == NULL)) {
fatal_error("failed to allocate slab region"); fatal_error("failed to allocate slab region");
} }
@ -1164,7 +1290,7 @@ COLD static void init_slow_path(void) {
} }
memory_set_name(&ro, sizeof(ro), "malloc read-only after init"); memory_set_name(&ro, sizeof(ro), "malloc read-only after init");
mutex_unlock(&init_lock);
// may allocate, so wait until the allocator is initialized to avoid deadlocking // may allocate, so wait until the allocator is initialized to avoid deadlocking
if (unlikely(pthread_atfork(full_lock, full_unlock, post_fork_child))) { if (unlikely(pthread_atfork(full_lock, full_unlock, post_fork_child))) {
@ -1368,6 +1494,11 @@ EXPORT void *h_calloc(size_t nmemb, size_t size) {
if (!ZERO_ON_FREE && likely(p != NULL) && total_size && total_size <= max_slab_size_class) { if (!ZERO_ON_FREE && likely(p != NULL) && total_size && total_size <= max_slab_size_class) {
memset(p, 0, total_size - canary_size); memset(p, 0, total_size - canary_size);
} }
#ifdef HAS_ARM_MTE
// use an assert instead of adding a conditional to memset() above (freed memory is always
// zeroed when MTE is enabled)
static_assert(ZERO_ON_FREE, "disabling ZERO_ON_FREE reduces performance when ARM MTE is enabled");
#endif
return p; return p;
} }
@ -1385,11 +1516,14 @@ EXPORT void *h_realloc(void *old, size_t size) {
} }
} }
void *old_orig = old;
old = untag_pointer(old);
size_t old_size; size_t old_size;
if (old < get_slab_region_end() && old >= ro.slab_region_start) { if (old < get_slab_region_end() && old >= ro.slab_region_start) {
old_size = slab_usable_size(old); old_size = slab_usable_size(old);
if (size <= max_slab_size_class && get_size_info(size).size == old_size) { if (size <= max_slab_size_class && get_size_info(size).size == old_size) {
return old_orig;
} }
thread_unseal_metadata(); thread_unseal_metadata();
} else { } else {
@ -1502,7 +1636,7 @@ EXPORT void *h_realloc(void *old, size_t size) {
if (copy_size > 0 && copy_size <= max_slab_size_class) { if (copy_size > 0 && copy_size <= max_slab_size_class) {
copy_size -= canary_size; copy_size -= canary_size;
} }
memcpy(new, old_orig, copy_size);
if (old_size <= max_slab_size_class) { if (old_size <= max_slab_size_class) {
deallocate_small(old, NULL); deallocate_small(old, NULL);
} else { } else {
@ -1543,6 +1677,8 @@ EXPORT void h_free(void *p) {
return; return;
} }
p = untag_pointer(p);
if (p < get_slab_region_end() && p >= ro.slab_region_start) { if (p < get_slab_region_end() && p >= ro.slab_region_start) {
thread_unseal_metadata(); thread_unseal_metadata();
deallocate_small(p, NULL); deallocate_small(p, NULL);
@ -1566,6 +1702,8 @@ EXPORT void h_free_sized(void *p, size_t expected_size) {
return; return;
} }
p = untag_pointer(p);
expected_size = adjust_size_for_canary(expected_size); expected_size = adjust_size_for_canary(expected_size);
if (p < get_slab_region_end() && p >= ro.slab_region_start) { if (p < get_slab_region_end() && p >= ro.slab_region_start) {
@ -1619,11 +1757,13 @@ static inline void memory_corruption_check_small(const void *p) {
mutex_unlock(&c->lock); mutex_unlock(&c->lock);
} }
EXPORT size_t h_malloc_usable_size(H_MALLOC_USABLE_SIZE_CONST void *arg) {
if (arg == NULL) {
return 0; return 0;
} }
const void *p = untag_const_pointer(arg);
if (p < get_slab_region_end() && p >= ro.slab_region_start) { if (p < get_slab_region_end() && p >= ro.slab_region_start) {
thread_unseal_metadata(); thread_unseal_metadata();
memory_corruption_check_small(p); memory_corruption_check_small(p);
@ -2025,3 +2165,26 @@ COLD EXPORT int h_malloc_set_state(UNUSED void *state) {
return -2; return -2;
} }
#endif #endif
#ifdef __ANDROID__
COLD EXPORT void h_malloc_disable_memory_tagging(void) {
#ifdef HAS_ARM_MTE
mutex_lock(&init_lock);
if (!ro.is_memtag_disabled) {
if (is_init()) {
if (unlikely(memory_protect_rw(&ro, sizeof(ro)))) {
fatal_error("failed to unprotect allocator data");
}
ro.is_memtag_disabled = true;
if (unlikely(memory_protect_ro(&ro, sizeof(ro)))) {
fatal_error("failed to protect allocator data");
}
} else {
// bionic calls this function very early in some cases
ro.is_memtag_disabled = true;
}
}
mutex_unlock(&init_lock);
#endif
}
#endif


@ -99,6 +99,7 @@ int h_malloc_iterate(uintptr_t base, size_t size, void (*callback)(uintptr_t ptr
void *arg); void *arg);
void h_malloc_disable(void); void h_malloc_disable(void);
void h_malloc_enable(void); void h_malloc_enable(void);
void h_malloc_disable_memory_tagging(void);
#endif #endif
// hardened_malloc extensions // hardened_malloc extensions


@ -28,6 +28,20 @@ void *memory_map(size_t size) {
return p; return p;
} }
#ifdef HAS_ARM_MTE
// Note that PROT_MTE can't be cleared via mprotect
void *memory_map_mte(size_t size) {
void *p = mmap(NULL, size, PROT_MTE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
if (unlikely(p == MAP_FAILED)) {
if (errno != ENOMEM) {
fatal_error("non-ENOMEM MTE mmap failure");
}
return NULL;
}
return p;
}
#endif
bool memory_map_fixed(void *ptr, size_t size) { bool memory_map_fixed(void *ptr, size_t size) {
void *p = mmap(ptr, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED, -1, 0); void *p = mmap(ptr, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED, -1, 0);
bool ret = p == MAP_FAILED; bool ret = p == MAP_FAILED;


@ -11,6 +11,9 @@
int get_metadata_key(void); int get_metadata_key(void);
void *memory_map(size_t size); void *memory_map(size_t size);
#ifdef HAS_ARM_MTE
void *memory_map_mte(size_t size);
#endif
bool memory_map_fixed(void *ptr, size_t size); bool memory_map_fixed(void *ptr, size_t size);
bool memory_unmap(void *ptr, size_t size); bool memory_unmap(void *ptr, size_t size);
bool memory_protect_ro(void *ptr, size_t size); bool memory_protect_ro(void *ptr, size_t size);

50
memtag.h Normal file

@ -0,0 +1,50 @@
#ifndef MEMTAG_H
#define MEMTAG_H
#include "util.h"
#ifdef HAS_ARM_MTE
#include "arm_mte.h"
#define MEMTAG 1
// Note that bionic libc always reserves tag 0 via PR_MTE_TAG_MASK prctl
#define RESERVED_TAG 0
#define TAG_WIDTH 4
#endif
static inline void *untag_pointer(void *ptr) {
#ifdef HAS_ARM_MTE
const uintptr_t mask = UINTPTR_MAX >> 8;
return (void *) ((uintptr_t) ptr & mask);
#else
return ptr;
#endif
}
static inline const void *untag_const_pointer(const void *ptr) {
#ifdef HAS_ARM_MTE
const uintptr_t mask = UINTPTR_MAX >> 8;
return (const void *) ((uintptr_t) ptr & mask);
#else
return ptr;
#endif
}
static inline void *set_pointer_tag(void *ptr, u8 tag) {
#ifdef HAS_ARM_MTE
return (void *) (((uintptr_t) tag << 56) | (uintptr_t) untag_pointer(ptr));
#else
(void) tag;
return ptr;
#endif
}
static inline u8 get_pointer_tag(void *ptr) {
#ifdef HAS_ARM_MTE
return (((uintptr_t) ptr) >> 56) & 0xf;
#else
(void) ptr;
return 0;
#endif
}
#endif

3
util.c

@ -6,6 +6,8 @@
#ifdef __ANDROID__ #ifdef __ANDROID__
#include <async_safe/log.h> #include <async_safe/log.h>
int mallopt(int param, int value);
#define M_BIONIC_RESTORE_DEFAULT_SIGABRT_HANDLER (-1003)
#endif #endif
#include "util.h" #include "util.h"
@ -30,6 +32,7 @@ static int write_full(int fd, const char *buf, size_t length) {
COLD noreturn void fatal_error(const char *s) { COLD noreturn void fatal_error(const char *s) {
#ifdef __ANDROID__ #ifdef __ANDROID__
mallopt(M_BIONIC_RESTORE_DEFAULT_SIGABRT_HANDLER, 0);
async_safe_fatal("hardened_malloc: fatal allocator error: %s", s); async_safe_fatal("hardened_malloc: fatal allocator error: %s", s);
#else #else
const char *prefix = "fatal allocator error: "; const char *prefix = "fatal allocator error: ";

16
util.h

@ -57,6 +57,22 @@ static inline size_t align(size_t size, size_t align) {
return (size + mask) & ~mask; return (size + mask) & ~mask;
} }
// u4_arr_{set,get} are helper functions for using a u8 array as an array of unsigned 4-bit values.
// val is treated as a 4-bit value
static inline void u4_arr_set(u8 *arr, size_t idx, u8 val) {
size_t off = idx >> 1;
size_t shift = (idx & 1) << 2;
u8 mask = (u8) (0xf0 >> shift);
arr[off] = (arr[off] & mask) | (val << shift);
}
static inline u8 u4_arr_get(const u8 *arr, size_t idx) {
size_t off = idx >> 1;
size_t shift = (idx & 1) << 2;
return (u8) ((arr[off] >> shift) & 0xf);
}
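// (Illustrative usage sketch, not part of this diff: a 4-entry 4-bit array
// packs into 2 bytes, with even indices stored in the low nibble of each byte.)
//
//     u8 arr[2] = {0};
//     u4_arr_set(arr, 0, 0x3); // arr[0] == 0x03
//     u4_arr_set(arr, 1, 0xa); // arr[0] == 0xa3
//     u4_arr_set(arr, 2, 0xf); // arr[1] == 0x0f
//     u4_arr_get(arr, 1);      // returns 0xa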
COLD noreturn void fatal_error(const char *s); COLD noreturn void fatal_error(const char *s);
#if CONFIG_SEAL_METADATA #if CONFIG_SEAL_METADATA