RISC-V HCF instructions

The authors of the GhostWrite academic research paper have recently updated it. The paper's highlight, and the source of its nickname, is the MMU bypass vulnerability in T-Head's C910 processor core.

But the updated paper also reveals a so-called Halt and Catch Fire (HCF) instruction in the newer SpacemiT X60 processor core. This comes on top of another HCF in T-Head's C908, which was already described in the first public preprint of the paper. This means that both the first and the second commercially available processor designs featuring the RISC-V Vector extension (RVV) are susceptible to an HCF.

By the way

RISC-V support and optimisation in the open-source FFmpeg multimedia framework and the VLC media player are currently almost at a standstill due to lack of time for code reviews and further work.

FFmpeg AVC/H.264 decoding optimisations were sponsored by RISE. Otherwise I have been acting as maintainer and main developer in my free time. Unfortunately, this is unsustainable and I am struggling to keep up, particularly (but not only) with regard to code reviews.

Decoding RISC-V instructions

The HCF instructions can be found within the GhostWrite artifacts on GitHub, written as RISC-V inline assembler:

T-Head C908:
.fill 1, 4, 0x20b00087

SpacemiT X60:
xor s0, s0, s0
.fill 1, 4, 0xe0815407

Even by the low standards of assembler, this is rather esoteric. So let's unpack this step-by-step.

SPOILER ALERT: if you have not read the GhostWrite paper, those instructions are invalid/reserved, but we can still try to decode them.

`.fill`

.fill is a pseudo-operation of the GNU assembler which repeats a pattern of a given byte size a given number of times. Here the pattern is repeated only once and is 4 bytes long. On RISC-V, instructions are 4 bytes by default, and since there is no actual repetition, the .insn pseudo-operation would have been simpler, e.g.:

.insn 0x20b00087
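As a quick sanity check, here is a minimal Python sketch of the bytes that such a directive places in memory, assuming the usual little-endian RISC-V target. The helper name `emitted_bytes` is mine, not something from the GhostWrite artifacts:

```python
import struct

# Machine words quoted from the GhostWrite artifacts above.
C908_HCF = 0x20B00087
X60_HCF = 0xE0815407

def emitted_bytes(word: int) -> bytes:
    """Bytes laid out by `.fill 1, 4, word` on a little-endian target (assumption)."""
    return struct.pack("<I", word)

for name, word in (("C908", C908_HCF), ("X60", X60_HCF)):
    # The least significant byte, which holds the opcode, comes first in memory.
    print(name, emitted_bytes(word).hex(" "))
```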

Thus now we have to actually make sense of those two hexadecimal values 0x20b00087 and 0xe0815407, which are the actual machine instructions.

Opcode `0x07`

According to the specifications, the least significant 7 bits of a RISC-V instruction hold the opcode. More precisely, the bottom 2 bits are always ones in a 32-bit instruction, the other values being reserved for compressed (16-bit) instructions, and the 5 bits above constitute the true opcode.

Either way, the bottom 7 bits are 0b0000111 (7) in both cases. If you look it up in the main RISC-V Unprivileged ISA specification, you will find that that is the opcode for "LOAD-FP" for floating point register load instructions.
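The opcode extraction can be sketched in a few lines of Python. The function name `split_opcode` is just an illustrative choice:

```python
C908_HCF = 0x20B00087
X60_HCF = 0xE0815407

def split_opcode(insn: int) -> tuple:
    """Return (bottom two bits, 5-bit major opcode) of a 32-bit instruction word."""
    low7 = insn & 0x7F          # bits 6-0
    return low7 & 0b11, low7 >> 2

for insn in (C908_HCF, X60_HCF):
    low_bits, major = split_opcode(insn)
    # low_bits == 0b11 marks a regular (non-compressed) 32-bit encoding.
    print(f"{insn:#010x}: low bits {low_bits:#04b}, major opcode {major:#07b}")
```

Both words yield the same pair, which is why a single opcode lookup covers both cores.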

`LOAD-FP` instruction format

If you already read the research paper, you can probably guess that the instruction is in fact not an FP load. But if we pretend for a minute that we do not know that yet, then we will need to check the floating point instruction format in the same ISA specification.

Bit fields	31-20	19-15	14-12	11-7	6-0
I-type	`imm[11:0]`	`rs1`	`funct3`	`rd`	`opcode`
F extension	offset[11:0]	base	width	dest	LOAD-FP
C908 HCF	001000001011	00000	000	00001	0000111
X60 HCF	111000001000	00010	101	01000	0000111
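The rows of this table can be reproduced with plain shifts and masks. The following is only a sketch: `decode_i_type` is a hypothetical helper, and it returns the immediate as an unsigned 12-bit value rather than sign-extending it:

```python
C908_HCF = 0x20B00087
X60_HCF = 0xE0815407

def decode_i_type(insn: int) -> dict:
    """Split a 32-bit word into the I-type bit fields (immediate left unsigned)."""
    return {
        "imm": (insn >> 20) & 0xFFF,    # bits 31-20
        "rs1": (insn >> 15) & 0x1F,     # bits 19-15
        "funct3": (insn >> 12) & 0x7,   # bits 14-12
        "rd": (insn >> 7) & 0x1F,       # bits 11-7
        "opcode": insn & 0x7F,          # bits 6-0
    }

for name, insn in (("C908", C908_HCF), ("X60", X60_HCF)):
    f = decode_i_type(insn)
    print(name, f"{f['imm']:012b} {f['rs1']:05b} {f['funct3']:03b} "
                f"{f['rd']:05b} {f['opcode']:07b}")
```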

We already identified the opcode, so the next step is to decode the 3-bit funct3 field. In these cases, it is called the width field: it indicates the floating point data type coded as the binary logarithm of the type's byte size:

0b001 (1): half precision (16-bit), requiring the Zfh extension
0b010 (2): single precision (32-bit), requiring the F extension
0b011 (3): double precision (64-bit), requiring the D extension
0b100 (4): quadruple precision (128-bit), requiring the Q extension

It was a foregone conclusion, but width values 0b000 and 0b101 are not actually defined in the floating point extensions but in the Vector (V) extension instead. They nominally indicate the vector element size:

0b000 (0): vector of 8-bit elements
0b101 (5): vector of 16-bit elements
0b110 (6): vector of 32-bit elements
0b111 (7): vector of 64-bit elements

Note that these four values are only assigned if bit 28 is zero. Otherwise they are reserved for future extensions (which could be either scalar or vector).
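Putting the two lists and the bit 28 rule together, the width field can be classified as follows. This is a sketch under the assumptions stated above; the table and function names are mine:

```python
C908_HCF = 0x20B00087
X60_HCF = 0xE0815407

# width (funct3) values defined by the scalar floating point extensions
FP_WIDTHS = {0b001: "half (Zfh)", 0b010: "single (F)",
             0b011: "double (D)", 0b100: "quadruple (Q)"}
# width values defined by the Vector extension: element size in bits
VECTOR_WIDTHS = {0b000: 8, 0b101: 16, 0b110: 32, 0b111: 64}

def classify_width(insn: int) -> str:
    """Describe what the width field of a LOAD-FP encoding nominally selects."""
    width = (insn >> 12) & 0x7
    bit28 = (insn >> 28) & 0x1
    if width in FP_WIDTHS:
        return f"scalar FP load, {FP_WIDTHS[width]} precision"
    if bit28:
        return "reserved"
    return f"vector load, {VECTOR_WIDTHS[width]}-bit elements"

print("C908:", classify_width(C908_HCF))
print("X60: ", classify_width(X60_HCF))
```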

Vector load instruction formats

The vector load instruction format breaks the 12-bit immediate bit-field into many small parts:

Bit fields	31-29	28	27-26	25	24-20	19-15	14-12	11-7	6-0
Vector Load	`nf`	`mew`	`mop`	`vm`	`lumop` / `rs2` / `vs2`	`rs1`	`width`	`vd`	`opcode`
C908 HCF	001	0	00	0	01011	00000	000	00001	0000111
X60 HCF	111	0	00	0	01000	00010	101	01000	0000111

Depending on the mop field value, bits 24-20 can indicate either an additional code lumop, a second source general-purpose register rs2, or a source vector register vs2 operand. In the latter two cases, the instruction layout is technically closer to the R-type than to the I-type. However, today we are concerned with mop value 0b00 for unit-stride loads, so we have a lumop rather than a second source operand.

More specifically, we now have two different cases, one for each core model:

X60's 0b01000 means whole register load,
C908's 0b01011 means mask load.
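For illustration, here is a small Python decoder for the vector load layout. The helper name and the `LUMOP_NAMES` table are my own; the table lists the unit-stride `lumop` codes as I understand them from the V extension specification:

```python
C908_HCF = 0x20B00087
X60_HCF = 0xE0815407

# lumop codes for unit-stride loads (mop == 0b00)
LUMOP_NAMES = {
    0b00000: "unit-stride load",
    0b01000: "whole register load",
    0b01011: "mask load",
    0b10000: "fault-only-first load",
}

def decode_vector_load(insn: int) -> dict:
    """Split a 32-bit word into the vector load bit fields."""
    return {
        "nf": (insn >> 29) & 0x7,      # bits 31-29
        "mew": (insn >> 28) & 0x1,     # bit 28
        "mop": (insn >> 26) & 0x3,     # bits 27-26
        "vm": (insn >> 25) & 0x1,      # bit 25
        "lumop": (insn >> 20) & 0x1F,  # bits 24-20
        "rs1": (insn >> 15) & 0x1F,    # bits 19-15
        "width": (insn >> 12) & 0x7,   # bits 14-12
        "vd": (insn >> 7) & 0x1F,      # bits 11-7
    }

for name, insn in (("C908", C908_HCF), ("X60", X60_HCF)):
    fields = decode_vector_load(insn)
    kind = LUMOP_NAMES.get(fields["lumop"], "reserved")
    print(name, fields, "->", kind)
```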

X60: vector whole register load

In the X60, we have a whole register load instruction: vlNreW.v. This is a somewhat exotic instruction category intended to restore vector registers in context switching code or with custom ABIs.

In fact, it is so exotic that the Linux kernel even fails to use it where it should in its context switching code, preferring the usual unit-stride instructions for that purpose.

Accounting for all the remaining fields, we have:

NF=7 meaning N=7+1=8 registers are loaded,
VM=0 meaning the instruction is masked,
RS1=2 meaning the base address operand is x2, a.k.a. the stack pointer sp,
VD=8 meaning the destination vector register is v8.

Conclusion: the instruction sequence is:

xor s0, s0, s0
vl8re16.v v8, (sp), v0.t

...or at least it would be if that was a legal sequence, but whole register loads cannot be masked. That is to say VM must equal 1.

Indeed the closest valid instruction sequence is:

xor s0, s0, s0
vl8re16.v v8, (sp)

...where the second instruction assembles as 0xE2815407 rather than 0xE0815407.
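To double-check the arithmetic, the fields listed above can be packed back into a 32-bit word. This is a sketch with a hypothetical `encode_vector_load` helper, not an assembler:

```python
def encode_vector_load(nf, mew, mop, vm, lumop, rs1, width, vd) -> int:
    """Pack vector load bit fields into a 32-bit LOAD-FP instruction word."""
    return ((nf << 29) | (mew << 28) | (mop << 26) | (vm << 25) |
            (lumop << 20) | (rs1 << 15) | (width << 12) | (vd << 7) | 0x07)

# Fields decoded from the X60 word: NF=7, whole register load, base sp (x2), v8.
x60_fields = dict(nf=7, mew=0, mop=0, lumop=0b01000, rs1=2, width=0b101, vd=8)

hcf = encode_vector_load(vm=0, **x60_fields)    # masked: not a legal encoding
valid = encode_vector_load(vm=1, **x60_fields)  # unmasked vl8re16.v v8, (sp)

print(f"HCF:   {hcf:#010x}")
print(f"valid: {valid:#010x}")
print(f"difference: bit {(hcf ^ valid).bit_length() - 1}")
```

The two words differ only in bit 25, the vm bit.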

C908: vector mask load

On the slightly older C908 processor, we have a mask load instruction: vlm.v. In this case, the instruction is completely nonsensical, as RS1=0 means that the base address register is the zero register. In other words, the instruction loads data from a NULL pointer.

Leaving that aside, we again see VM=0 when the instruction requires VM=1, making the instruction not only nonsensical but plainly invalid. Similarly, NF=1 is invalid, as vector load mask instructions cannot be segmented (NF must equal 0). If we ignore all of those considerations, the imaginary instruction could be something like:

vlseg2m.v v1, (zero), v0.t

The closest valid instruction that I could find is:

vlm.v v1, (zero)

...which assembles as 0x02b00087 instead of 0x20b00087, though even that nominally valid instruction makes no practical sense on account of using zero as source address.
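The two hexadecimal values look deceptively like a digit swap, so it helps to list the bit positions where they actually differ. A minimal sketch, with a helper name of my own choosing:

```python
C908_HCF = 0x20B00087
CLOSEST_VALID = 0x02B00087  # vlm.v v1, (zero)

def differing_bits(a: int, b: int) -> list:
    """Return the positions (0 = least significant) of the bits where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(32) if (diff >> bit) & 1]

print(differing_bits(C908_HCF, CLOSEST_VALID))
```

The result contains bit 25 (vm) and bit 29 (the lowest bit of nf), matching the two invalid fields discussed above.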

Notes

The exact 32-bit instructions are straight from the researchers' artefacts. It is likely that other similar 32-bit instructions would also halt, but I did not test this hypothesis.

Also, this should be obvious, but I must emphasise it: these instructions constitute bugs in two specific proprietary processor core designs.

They DO NOT constitute bugs in the RISC-V specifications.
They DO NOT affect other RISC-V design vendors.
They DO NOT affect other RISC-V processors from T-Head or SpacemiT.
