## **ZENHAMMER: Rowhammer Attacks on AMD Zen-based Platforms**

Patrick Jattke\* Max Wipfli\* Flavien Solt

Michele Marazzi Matej Bölcskei Kaveh Razavi




33<sup>RD</sup> USENIX SECURITY SYMPOSIUM

August 14–16, 2024 Philadelphia, PA, USA

## **Executive Summary**



• Today, one in three x86 CPUs sold is from AMD.



- We find bit flips on 7/10 DIMMs (Zen 2) and 6/10 DIMMs (Zen 3).
- Up to 46x more bit flips on Zen 3 than on Coffee Lake.
- First bit flips on one **DDR5 DIMM** on Zen 4.

## Background

## **DIMM Organization**



## **Rowhammer**



## **DRAM** addressing



# C1 Recovering DRAM address mappings



DRAMA [1] could not recover mappings. ⇒ Functions only worked on **limited** memory regions.

**O1.** DRAM functions are non-linear and require an address offset.

**O2.** Memory blocks >1 GiB need to be accessed for mapping recovery.



[1] P. Pessl, D. Gruss, C. Maurice, M. Schwarz, and S. Mangard, "DRAMA: Exploiting DRAM Addressing for Cross-CPU Attacks," in USENIX Security '16.



# C1 Recovering DRAM address mappings

See our paper for more DRAM configurations!

#### Table 3. Reverse engineered address mappings and offsets for different single-DIMM, with the tuple indicating the DIMM's geometry (#ranks,

| Sys.  | Geometry (RK,BG,BA,R) | Size [GiB] | Offset [MiB] | Rank (RK)   | Bank Gro |
|-------|-----------------------|------------|--------------|-------------|----------|
| $Z_+$ | $(1, 4, 4, 2^{16})$   | 8          | 1024         | n/a         | 0x08888  |
|       | $(2, 4, 4, 2^{16})$   | 16         | 1024         | 0x3fffe0000 | 0x11110  |
|       | $(2, 4, 4, 2^{17})$   | 32         | 1024         | 0x7fffe0000 | 0x11110  |
| $Z_2$ | $(1, 4, 4, 2^{16})$   | 8          | 512          | n/a         | 0x08888  |
|       | $(2, 4, 4, 2^{16})$   | 16         | 512          | 0x3fffe0000 | 0x11110  |
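A minimal sketch of how such a mask and offset might be evaluated. It assumes, as in DRAMA-style mappings, that each function is the XOR (parity) of the physical address bits selected by the mask, and that the offset is subtracted from the physical address first; the slides only state that an offset is required. The helper name `dram_function_bit` and the example address are hypothetical.

```python
MIB = 1 << 20

def dram_function_bit(phys_addr: int, mask: int, offset_mib: int) -> int:
    """Return the parity (XOR) of the address bits selected by `mask`.

    Assumption: the offset is subtracted from the physical address before
    the mask is applied. This is an illustrative reading, not the tool's code.
    """
    adjusted = phys_addr - offset_mib * MIB
    if adjusted < 0:
        raise ValueError("address lies below the offset")
    return bin(adjusted & mask).count("1") & 1

# Masks and offset from the table row for the 16 GiB dual-rank DIMM.
RANK_MASK = 0x3fffe0000
BANK_GROUP_MASK = 0x11110  # the slide shows only this one (truncated) column
OFFSET_MIB = 1024

addr = 0x40000000 + 0x20010  # hypothetical physical address
rank = dram_function_bit(addr, RANK_MASK, OFFSET_MIB)
bank_group_bit = dram_function_bit(addr, BANK_GROUP_MASK, OFFSET_MIB)
```

With an affine mapping of this kind, two addresses that differ only by the offset-adjusted bits outside a mask evaluate to the same function bit, which is why applying the masks without the offset only works on limited memory regions.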

## Visualization of <1 RK, 4 BG, 4 BK, 2<sup>16</sup> rows> functions.



## **Testing for Rowhammer**

- We created *ZenHammer*, a fork of Blacksmith [2], using the DRAM address mappings we recovered.
- DIMMs from major manufacturers:



• 6h fuzzing runs on each DIMM.

⇒ Porting the DRAM address functions is insufficient
 to do Rowhammer on AMD Zen-based systems.

| DIMM           | Zen 2 #Patt. | Zen 2 #Bit Flips | Zen 3 #Patt. | Zen 3 #Bit Flips | Coffee Lake #Patt. | Coffee Lake #Bit Flips |
|----------------|--------------|------------------|--------------|------------------|--------------------|------------------------|
| S<sub>0</sub>  | 14           | 19               | 0            | 0                | 122                | 3'502                  |
| S<sub>1</sub>  | 4            | 4                | 0            | 0                | 102                | 1'374                  |
| S<sub>2</sub>  | 14           | 28               | 0            | 0                | 782                | 22'339                 |
| S<sub>3</sub>  | 0            | 0                | 0            | 0                | 3                  | 3                      |
| S<sub>4</sub>  | 4            | 5                | 0            | 0                | 47                 | 654                    |
| S<sub>5</sub>  | 6            | 7                | 0            | 0                | 155                | 4'131                  |
| H<sub>0</sub>  | 0            | 0                | 0            | 0                | 24                 | 35                     |
| M<sub>0</sub>  | 0            | 0                | 0            | 0                | 16                 | 23                     |
|                | **5**/10     | devices          | **0**/10     | devices          | **8**/10           | devices                |

[2] P. Jattke, V. van der Veen, P. Frigo, S. Gunter, and K. Razavi, "BLACKSMITH: Scalable Rowhammering in the Frequency Domain," in *IEEE S&P '22*.

# In-DRAM TRR: REF synchronization

• Rowhammer mitigations (TRR) act at the same time as periodic REFs.



## Requirement for doing Rowhammer: ⇒ proper synchronization with REFs (C2)

[3] F. de Ridder, P. Frigo, E. Vannacci, H. Bos, C. Giuffrida, and K. Razavi, "SMASH: Synchronized Many-sided Rowhammer Attacks From JavaScript," in USENIX Security '21.

[4] P. Jattke, V. van der Veen, P. Frigo, S. Gunter, and K. Razavi, "BLACKSMITH: Scalable Rowhammering in the Frequency Domain," in IEEE S&P '22.

# C2 Adapting timing-based **REF** synchronization

We measured the time between REFs. ⇒ Synchronization does not work on **Zen 3.** 

[Figure: timeline between two REFs with measurement points t0, t1, and t2.]



Solution: Continuous, nonrepeating refresh synchronization.

[Figure: REF sync timeline between consecutive REFs; legend: flush.]
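Measuring the time between REFs can be illustrated by post-processing a trace of access latencies: accesses that collide with a REF take noticeably longer, so the spacing of latency spikes approximates the refresh interval. The sketch below is not the ZenHammer implementation (which times accesses in native code); the function name `find_ref_gaps`, the spike threshold, and the synthetic trace are all assumptions made for illustration.

```python
from statistics import median

def find_ref_gaps(timestamps_ns, latencies_ns, spike_factor=2.0):
    """Return the time gaps between consecutive latency spikes.

    A sample counts as a spike when its latency exceeds `spike_factor`
    times the median latency of the trace (an assumed heuristic).
    """
    threshold = spike_factor * median(latencies_ns)
    spikes = [t for t, lat in zip(timestamps_ns, latencies_ns) if lat > threshold]
    return [later - earlier for earlier, later in zip(spikes, spikes[1:])]

# Synthetic trace: one access every 100 ns, slow accesses every 7800 ns.
timestamps = list(range(0, 40000, 100))
latencies = [400 if t % 7800 == 0 else 80 for t in timestamps]
gaps = find_ref_gaps(timestamps, latencies)
```

If the measured gaps are irregular rather than clustered around one interval, timing-based synchronization cannot lock onto the REFs, which is the symptom reported for Zen 3.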




# In-DRAM TRR: activation count



Synchronized hammering

Requirements for doing Rowhammer: ⇒ sufficient **activation count** to the aggressors (C3)

## C3 Increasing the ACT rate and preserving order


On average, ACTs/tREFI on Z+ (41.9) and Z3 (37.2) are halved compared to CL (76.8).

40 ACTs/tREFI gives 36K ACTs with 18 aggressors ⇒ too low for many devices.
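The 36K figure can be reproduced with a short calculation. This is a sketch under stated assumptions, none of which appear on the slide: a DDR4 refresh window containing 8192 refresh intervals, activations spread evenly over the aggressors, and a count that sums the two aggressors adjacent to a victim row.

```python
REFRESH_INTERVALS_PER_WINDOW = 8192  # assumed: DDR4, 64 ms window, ~7.8 us tREFI

def acts_per_aggressor(acts_per_trefi: float, aggressors: int) -> float:
    """Activations each aggressor receives within one refresh window."""
    return acts_per_trefi * REFRESH_INTERVALS_PER_WINDOW / aggressors

per_aggressor = acts_per_aggressor(40, 18)  # about 18.2K
per_victim = 2 * per_aggressor              # about 36.4K, two neighbouring aggressors
```

Under the same assumptions, the Coffee Lake rate of 76.8 ACTs/tREFI would give roughly twice this count, which illustrates why a halved activation rate matters.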

- **Systematic testing** of different hammering instruction sequences:
  - Cache flushing (e.g., CLFLUSH/CLFLUSHOPT, gathered/scattered)
  - Memory barriers (e.g., mfence, lfence, sfence)
  - Access types (e.g., load vs store)
  - Vector instructions (e.g., vpgatherdd)



# Best Rowhammer pattern

- Optimizations drastically increased #effective patterns.
- Higher #bitflips in 4 cases (Z2) and 5 cases (Z3) compared to Coffee Lake.
- Bit flips on DIMM H<sub>0</sub> on Z2 where we found none on Coffee Lake.
- We also analyzed the **impact on exploitation**, see our paper for results!

#### 7.2 Effectiveness and Exploitability

The results of our evaluation are presented in Table 9. We show for each tested platform (AMD Zen 2 and Zen 3, Intel Coffee Lake) and each DDR4 device, the number of effective patterns found ( $|\mathbb{P}^+|$ ) and the number found during fuzzing with the policy (SP<sub>m</sub>) that


| DIMM           | Zen 2 #Patt. | Zen 2 #Bit Flips | Zen 3 #Patt. | Zen 3 #Bit Flips | Coffee Lake #Patt. | Coffee Lake #Bit Flips |
|----------------|--------------|------------------|--------------|------------------|--------------------|------------------------|
| S<sub>0</sub>  | 51           | 6'945            | 31           | 17'775           | 122                | 6'782                  |
| S<sub>1</sub>  | 26           | 1'758            | 25           | 15'613           | 102                | 10'106                 |
| S<sub>2</sub>  | 97           | 12'893           | 45           | 79'306           | 782                | 1'708                  |
| S<sub>3</sub>  | 8            | 2'020            | 1            | 667              | 3                  | 0                      |
| S<sub>4</sub>  | 60           | 1'183            | 43           | 13               | 47                 | 18'357                 |
| S<sub>5</sub>  | 25           | 1'911            | 26           | 10'741           | 155                | 5'860                  |
| H<sub>0</sub>  | 6            | 182              | 0            | 0                | 0                  | 0                      |
| H<sub>1</sub>  | 0            | 0                | 0            | 0                | 24                 | 0                      |
| M<sub>0</sub>  | 0            | 0                | 0            | 0                | 0                  | 0                      |
| M<sub>1</sub>  | 0            | 0                | 0            | 0                | 16                 | 2                      |
|                | **7**/10     | devices          | **6**/10     | devices          | **8**/10           | devices                |
| -              | 5/10         | devices          | **0**/10     | devices          |                    |                        |
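The summary claims (higher bit-flip counts in 4 cases on Zen 2 and 5 cases on Zen 3, and up to 46x more bit flips on Zen 3) can be checked against the bit-flip columns of the results table. The short sketch below does this; the dictionary layout and variable names are illustrative choices, and the numbers are the per-DIMM bit-flip counts as read from the table.

```python
# Bit flips per DIMM: (Zen 2, Zen 3, Coffee Lake), as read from the table.
bit_flips = {
    "S0": (6945, 17775, 6782),
    "S1": (1758, 15613, 10106),
    "S2": (12893, 79306, 1708),
    "S3": (2020, 667, 0),
    "S4": (1183, 13, 18357),
    "S5": (1911, 10741, 5860),
    "H0": (182, 0, 0),
    "H1": (0, 0, 0),
    "M0": (0, 0, 0),
    "M1": (0, 0, 2),
}

zen2_higher = [d for d, (z2, _, cl) in bit_flips.items() if z2 > cl]
zen3_higher = [d for d, (_, z3, cl) in bit_flips.items() if z3 > cl]

# Ratio Zen 3 / Coffee Lake, only where Coffee Lake saw at least one flip.
ratios = {d: z3 / cl for d, (_, z3, cl) in bit_flips.items() if cl > 0}
best_dimm = max(ratios, key=ratios.get)
best_ratio = ratios[best_dimm]  # 79306 / 1708, about 46.4
```

The largest ratio comes from DIMM S<sub>2</sub>, which is where the "up to 46x" figure originates.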

# Demo: PTE Attack on AMD Zen 3

# Evaluation: ZENHAMMER on DDR5

- Upon the request of reviewers, we extended our evaluation to **Zen 4.**
- We repeated all experiments and tested **10 random DDR5 DIMMs.**



- We found bit flips on **1**/10 DIMMs:
  - **41'995 bit flips** during 256 MiB sweep

**Reviewer A:** Do you have any early results/thoughts on **Zen4** applicability?

**Reviewer C:** However, the newest microarchitecture that is evaluated is Zen 3 from 2020. Since then, there have been [...] **Zen 4** (2022) [...]

| Microarch. | Release Date   | CPU           |
|------------|----------------|---------------|
| Zen 4      | September 2022 | Ryzen 7 7700X |
| Zen 3      | November 2020  | Ryzen 5 5600G |
| Zen 2      | July 2019      | Ryzen 5 3600X |
| Zen+       | April 2018     | Ryzen 5 2600X |




## Current AMD Zen-based systems are equally vulnerable to Rowhammer as Intel systems.





- **DRAM address mappings** for Zen 2/3/4, including offsets.
- **ZenHammer bit flips:** Zen 2: 7/10 DIMMs, Zen 3: 6/10 DIMMs, Zen 4: 1/10 DIMMs. First ever reported DDR5 bit flips!
- **Up to 46x more bit flips** on Zen 3 compared to Coffee Lake.
- **Exploitation** in the best case (PTE) in just 6s (Zen 2) and 2s (Zen 3).
- End-to-end PTE exploit on Zen 3.

Check out our paper for more information!

Patrick Jattke, ETH Zürich

pjattke@ethz.ch · X: pjattke · linkedin.com/in/pjattke

# Are current AMD Zen-based platforms vulnerable to Rowhammer attacks?



### Our test systems

| Microarchitecture | Release Date  | CPU           |
|-------------------|---------------|---------------|
| Zen 3             | November 2020 | Ryzen 5 5600G |
| Zen 2             | July 2019     | Ryzen 5 3600X |
| Zen+              | April 2018    | Ryzen 5 2600X |

- We find bit flips on **7 of 10 DIMMs** on Zen 2 and **6 of 10 DIMMs** on Zen 3.
- Up to 46x more bit flips on Zen 3 than on Coffee Lake ⇒ devices are easier to exploit.
- First bit flips on one **DDR5 DIMM** on Zen 4.


# C3 Increasing the ACT rate and preserving order

We designed and evaluated six fence scheduling policies during 6h fuzzing runs on **all devices/platforms.** 



## Best Rowhammer pattern

• ZenHammer fuzzing in three stages



## C3 Increasing the ACT rate and preserving order

- On average, **ACTs/tREFI** on Z+ (41.9) and Z3 (37.2) are **halved** compared to CL (76.8).
  - 40 ACTs/tREFI gives a hammer count (HC) of 36K for n=18 aggressors ⇒ too low for many devices.
- **Systematic testing** of different hammering instruction sequences:
  - Cache flushing (R1) (e.g., CLFLUSH vs CLFLUSHOPT, gather vs scatter)
  - Memory barriers (e.g., mfence, lfence, sfence)
  - Access types (e.g., load vs store)
  - Vector instructions (e.g., vpgatherdd)



**O4.** Memory loads following a CLFLUSH(OPT) never incur cache hits on Zen 3, but they can on Zen+ and Zen 2.

## C3 Increasing the ACT rate and preserving order

• We designed six **fence scheduling** policies and evaluated them during 6h fuzzing runs on **all devices/platforms.** 

|                      | Policy             | Fencing Frequency                  | Pattern-Aware | Cache-Avoiding | Optimal Policy SP<sub>OPT</sub> |
|----------------------|--------------------|------------------------------------|---------------|----------------|---------------------------------|
| Stronger<br>ordering | SP<sub>NONE</sub>  | No fences                          | ✗             | ✗              |                                 |
|                      | SP<sub>BP</sub>    | Every base period                  | ✓             | ✗              |                                 |
|                      | SP<sub>BP/2</sub>  | Every half base period             | ✓             | ✗              |                                 |
|                      | SP<sub>PAIR</sub>  | Between different aggressor pairs  | ✓             | ✗              | ⇔ Zen 2 (75%), Zen 3 (43%)      |
|                      | SP<sub>REP</sub>   | Between aggressor pair repetitions | ✓             | ✓              |                                 |
|                      | SP<sub>FULL</sub>  | Every access (Blacksmith default)  | ✗             | ✓              | ⇔ Coffee Lake (100%)            |

# Evaluation: Exploitation



**4.4 Enabling Exploitation** On our Intel *Coffee Lake* system the bank, bank group, and rank bits all fall within the lower 21 bits, i.e., within a transparent huge page (THP). However, we noticed that the address functions on AMD *Zen 2* and *Zen 3*

- We simulate attacks using Hammertime: page table flipping (**PTE**;+PoC), flip feng shui (**RSA-2048**), sudo binary (**sudo**).
- High number of bit flips **significantly reduces** the time for exploitation and increases the number of exploitable devices.

|                | PTE   |       |       |         |             |      | RSA-2048 |         |       |         |             |        | sudo  |      |       |        |             |         |
|----------------|-------|-------|-------|---------|-------------|------|----------|---------|-------|---------|-------------|--------|-------|------|-------|--------|-------------|---------|
|                | Zen 2 |       | Zen 3 |         | Coffee Lake |      | Zen 2    |         | Zen 3 |         | Coffee Lake |        | Zen 2 |      | Zen 3 |        | Coffee Lake |         |
| DIMM           | #Ex.  | Time  | #Ex.  | Time    | #Ex.        | Time | #Ex.     | Time    | #Ex.  | Time    | #Ex.        | Time   | #Ex.  | Time | #Ex.  | Time   | #Ex.        | Time    |
| S<sub>0</sub>  | 7     | 6m 4s | 7     | 2m 55s  | 3           | 15s  | 17       | 2m 47s  | 37    | 46s     | 14          | 1m 36s | –     | –    | 4     | 3m 13s | 1           | 23m 49s |
| S<sub>1</sub>  | 90    | 9s    | 1'474 | 2s      | 846         | 2s   | 6        | 2m 2s   | 27    | 30s     | 21          | 26s    | –     | –    | 1     | 6m 50s | 1           | 1m 20s  |
| S<sub>2</sub>  | 641   | 21s   | 5'326 | 1s      | 126         | 11s  | 30       | 2m 16s  | 170   | 6s      | 6           | 1m 59s | –     | –    | 12    | 1m 17s | –           | –       |
| S<sub>3</sub>  | 142   | 9s    | 61    | 32s     | –           | –    | 7        | 2m 21s  | –     | –       | –           | –      | –     | –    | –     | –      | –           | –       |
| S<sub>4</sub>  | 220   | 28s   | 3     | 23m 52s | 2'658       | 1s   | 7        | 12m 29s | 1     | 23m 52s | 53          | 26s    | –     | –    | –     | –      | 4           | 5m 16s  |
| S<sub>5</sub>  | 102   | 6s    | 625   | 2s      | 330         | 4s   | 6        | 1m 14s  | 28    | 33s     | 11          | 5s     | –     | –    | 2     | 5m 58s | 3           | 2m 34s  |
| H<sub>0</sub>  | 11    | 53s   | –     | –       | –           | –    | –        | –       | –     | –       | –           | –      | –     | –    | –     | –      | –           | –       |
| Median         |       | 21s   |       | 17s     |             | 4s   |          | 2m 19s  |       | 33s     |             | 1m 5s  |       |      |       | 4m 36s |             | 3m 55s  |

## In-DRAM TRR: Order of accesses and ACT rate

Issue: the memory controller reorders accesses
 ⇒ enforce order by adding memory fences: which fence? where?



Issue: too few ACTs due to hammering "too slowly"
 ⇒ the ACT rate should be maximized to make bit flips more likely



## Conclusion



- AMD Zen-based systems are equally vulnerable to Rowhammer as Intel systems.
- We disclose the secret **DRAM mappings** for AMD Zen-based systems including their address offsets.
- We found bit flips on **7 DIMMs** (Zen 2) and **6 DIMMs** (Zen 3) compared to 8 DIMMs on Intel Coffee Lake.
- We show up to 46x more bit flips on Zen 3 than on Coffee Lake ⇒ devices are easier to exploit.
- In the best case, we only need 6s (Zen 2) and 2s (Zen 3) to mount an attack (PTE).

Check out our paper for more information!

