May 15, 2026

The SAM-32 ISA

The ISA for my CPU, and the why behind it!

Before making a CPU, it is important to decide how the CPU will perform the Fetch, Decode, Execute cycle. I want to make my CPU 32-bit (16 isn't enough to do cool stuff, and 64 is overkill).

The Name - SAM-32

Obviously, the most important part of designing an ISA is the name itself. I've chosen SAM-32: Super Awesome Machine, the greatest Little-Endian machine of all time.

The Base

Since I'm in high school, designing the entire ISA from scratch (surprisingly doable) isn't worth my time, especially with AP tests coming up. I've chosen to essentially modify RISC-V for my own needs and plans for the future, so that it becomes something Mine. I'm going to remove the parts I find unnecessarily complex and add features that I think make programming easier (like my GOAT, the status register).

Philosophy

Simplicity means I will actually complete this project, and a complete project is better than an incomplete one, no matter how impressive it is. Many of my decisions throughout this entire project, not just the ISA, are based on this.

Registers

Registers are high-speed memory located inside the CPU itself. Think of them as a list of values the CPU needs right now. The values in registers constantly get added, multiplied, incremented, etc. Having more is usually better, as it means your CPU can 'remember' more things at once (it can do this with RAM as well, but STORE-ing and LOAD-ing from RAM is much slower than manipulating values already in registers). I've decided to stick with the 32 registers RISC-V uses. That count is considered the sweet spot between latency (although I don't have to worry about signal distance), complexity, and ease of use.

Each of these 32 registers is 32 bits wide and can be accessed via r0-r31. r0 will be hardwired to 0 and cannot be overwritten (writes to this register will just be ignored). It serves as a constant zero and a write sink, and it lets the assembler synthesize pseudo-instructions like CLEAR and NOP.
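To make the r0 rule concrete, here is a minimal Python sketch of a register file that behaves this way. It is only a software model, not the hardware design, and the `RegisterFile` class and its method names are made up for illustration.

```python
class RegisterFile:
    """Hypothetical model of 32 general-purpose registers, each 32 bits wide."""

    MASK = 0xFFFFFFFF  # keeps every stored value within 32 bits

    def __init__(self):
        self._regs = [0] * 32

    def read(self, index):
        return self._regs[index]

    def write(self, index, value):
        if index == 0:
            return  # r0 is hardwired to 0, so the write is silently dropped
        self._regs[index] = value & self.MASK


regs = RegisterFile()
regs.write(5, 0x1_2345_6789)  # a 33-bit value is truncated to 32 bits
regs.write(0, 42)             # ignored
print(hex(regs.read(5)))      # 0x23456789
print(regs.read(0))           # 0
```

Dropping the write inside `write` keeps every other part of the model simple: nothing else has to special-case r0.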

The Program Counter (PC) is a dedicated register that tracks the current instruction's address. It is incremented at the end of each cycle so the program moves forward.

The Status Register (SR) is a new register that tracks the result of the Arithmetic Logic Unit (ALU). The ALU updates it automatically. It contains 4 primary flags, so it is 4 bits wide.

| Bit | Flag | Purpose |
| --- | --- | --- |
| 0 | Z | ALU result is exactly 0 |
| 1 | N | ALU result is negative |
| 2 | C | ALU addition carried over (result was >32 bits) |
| 3 | V | ALU math caused an overflow (bits wrapped around) |
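As a rough illustration of how the four flags could fall out of a single addition, here is a small Python sketch. The function names are hypothetical, and treating V as signed overflow (both inputs share a sign but the result does not) is an assumption about what "overflow" means here.

```python
MASK = 0xFFFFFFFF  # 32-bit wrap-around


def add_with_flags(a, b):
    """Add two 32-bit values; return (32-bit result, flags dict)."""
    a &= MASK
    b &= MASK
    full = a + b                 # Python ints don't overflow, so this may need 33 bits
    result = full & MASK
    sign_a, sign_b, sign_r = a >> 31, b >> 31, result >> 31
    flags = {
        "Z": result == 0,
        "N": sign_r == 1,
        "C": full > MASK,
        # Assumed meaning of V: inputs share a sign but the result's sign differs
        "V": sign_a == sign_b and sign_r != sign_a,
    }
    return result, flags


def pack_sr(flags):
    """Pack the flags into the 4-bit SR layout from the table (Z = bit 0 ... V = bit 3)."""
    return (flags["Z"] << 0) | (flags["N"] << 1) | (flags["C"] << 2) | (flags["V"] << 3)


result, flags = add_with_flags(0x7FFFFFFF, 1)
print(hex(result), bin(pack_sr(flags)))   # 0x80000000 0b1010
result, flags = add_with_flags(0xFFFFFFFF, 1)
print(hex(result), bin(pack_sr(flags)))   # 0x0 0b101
```

The two examples show why C and V are separate flags: the first overflows as a signed number without carrying, the second carries without overflowing.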

Instructions

There are 3 different types of instructions. Each starts with the opcode and the destination register, followed by its type-specific arguments.

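R-Type - Register to Register

| Opcode | Rd | Rs1 | Rs2 | Funct |
| --- | --- | --- | --- | --- |
| [31:25] | [24:20] | [19:15] | [14:10] | [9:0] |

I-Type - Immediate/Constants

| Opcode | Rd | Rs1 | Immediate |
| --- | --- | --- | --- |
| [31:25] | [24:20] | [19:15] | [14:0] |

J-Type - Jumps/Calls

| Opcode | Rd | Target Offset |
| --- | --- | --- |
| [31:25] | [24:20] | [19:0] |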
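The bit ranges above translate directly into shifts and masks. Here is a minimal Python sketch of packing and unpacking the three formats; the function names are hypothetical, and the ADD and ADDI opcode values are taken from the instruction tables later in this post.

```python
def encode_r(opcode, rd, rs1, rs2, funct):
    """R-Type: opcode[31:25] rd[24:20] rs1[19:15] rs2[14:10] funct[9:0]."""
    return (opcode << 25) | (rd << 20) | (rs1 << 15) | (rs2 << 10) | (funct & 0x3FF)


def encode_i(opcode, rd, rs1, imm):
    """I-Type: opcode[31:25] rd[24:20] rs1[19:15] imm[14:0]."""
    return (opcode << 25) | (rd << 20) | (rs1 << 15) | (imm & 0x7FFF)


def encode_j(opcode, rd, offset):
    """J-Type: opcode[31:25] rd[24:20] offset[19:0]."""
    return (opcode << 25) | (rd << 20) | (offset & 0xFFFFF)


def decode_r(word):
    """Split a 32-bit word back into its R-Type fields."""
    return {
        "opcode": (word >> 25) & 0x7F,
        "rd": (word >> 20) & 0x1F,
        "rs1": (word >> 15) & 0x1F,
        "rs2": (word >> 10) & 0x1F,
        "funct": word & 0x3FF,
    }


add = encode_r(0x01, rd=3, rs1=1, rs2=2, funct=0x001)   # ADD r3, r1, r2
print(hex(add))                                          # 0x2308801
print(decode_r(add))
print(hex(encode_i(0x08, rd=1, rs1=0, imm=-1)))          # ADDI r1, r0, #-1 -> 0x10107fff
```

Masking the immediate with `0x7FFF` stores a negative value as its 15-bit two's complement pattern, which is why `-1` shows up as `7fff` in the low bits.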

Control Flow

Branching

Branching lets the CPU execute if statements and other conditionals. Essentially, the way it works is: you tell the CPU to make a comparison, and give it a `Target Offset`. If the condition is true, the CPU 'jumps' (changes the PC) by the `Target Offset`, skipping or looping code.

Looping

Looping, as mentioned in the previous paragraph, is done entirely via software (decrement, compare, branch). Hardware loops are too complex for the scope of this project, and I don't really have plans to implement hardware loops in the future.
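To show what a software loop looks like in practice, here is a tiny Python interpreter that runs a countdown loop using only ADD, ADDI, and BNE. It is a sketch under a few assumptions: the offset is relative to the branch instruction's own address, each instruction is 4 bytes, and the decrement itself sets the Z flag (since the SR follows every ALU result), so no separate compare step appears. The `run` function and the tuple format are made up for this example.

```python
MASK = 0xFFFFFFFF


def run(program):
    """Execute (mnemonic, operands...) tuples; each tuple stands for one 4-byte instruction."""
    regs = [0] * 32
    pc = 0
    z = False                                # the Z flag from the SR
    while 0 <= pc < 4 * len(program):
        op, *args = program[pc // 4]
        if op in ("ADD", "ADDI"):
            rd, rs1, last = args
            operand = regs[last] if op == "ADD" else last
            result = (regs[rs1] + operand) & MASK
            z = result == 0                  # the ALU updates the flag on its own
            if rd != 0:                      # r0 stays 0
                regs[rd] = result
            pc += 4
        elif op == "BNE":
            (offset,) = args
            pc += 4 if z else offset         # taken only while Z is clear
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return regs


program = [
    ("ADDI", 1, 0, 5),    # r1 = 5 (loop counter)
    ("ADD", 2, 2, 1),     # r2 += r1           <- loop top, address 4
    ("ADDI", 1, 1, -1),   # decrement; sets Z when r1 reaches 0
    ("BNE", -8),          # address 12: back to address 4 while Z is clear
]
regs = run(program)
print(regs[1], regs[2])   # 0 15
```

The loop body runs five times and adds 5 + 4 + 3 + 2 + 1 into r2, all without any dedicated loop hardware.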

Instructions

Now, for the instructions themselves! These define the actual functions the CPU can perform!

Arithmetic and Logic

| Instruction | Format | Operands | Action | Opcode | Function Code |
| --- | --- | --- | --- | --- | --- |
| ADD | R | Rd, Rs1, Rs2 | Rd = Rs1 + Rs2 | 0x01 | 0x001 |
| ADDI | I | Rd, Rs1, #imm | Rd = Rs1 + imm | 0x08 | |
| SUB | R | Rd, Rs1, Rs2 | Rd = Rs1 - Rs2 | 0x01 | 0x002 |
| MUL | R | Rd, Rs1, Rs2 | Rd = Rs1 * Rs2 | 0x01 | 0x003 |
| MULI | I | Rd, Rs1, #imm | Rd = Rs1 * imm | 0x09 | |
| DIV | R | Rd, Rs1, Rs2 | Rd = Rs1 / Rs2 | 0x01 | 0x004 |
| DIVI | I | Rd, Rs1, #imm | Rd = Rs1 / imm | 0x0A | |
| MOD | R | Rd, Rs1, Rs2 | Rd = Rs1 % Rs2 | 0x01 | 0x005 |
| MODI | I | Rd, Rs1, #imm | Rd = Rs1 % imm | 0x0B | |
| LUI | J | Rd, #imm20 | Take 20-bit immediate, shift left by 12 bits and store in Rd | 0x12 | |
| AND | R | Rd, Rs1, Rs2 | Rd = Rs1 & Rs2 | 0x01 | 0x007 |
| ANDI | I | Rd, Rs1, #imm | Rd = Rs1 & imm | 0x0C | |
| OR | R | Rd, Rs1, Rs2 | Rd = Rs1 \| Rs2 | 0x01 | 0x008 |
| ORI | I | Rd, Rs1, #imm | Rd = Rs1 \| imm | 0x0D | |
| XOR | R | Rd, Rs1, Rs2 | Rd = Rs1 ^ Rs2 | 0x01 | 0x009 |
| XORI | I | Rd, Rs1, #imm | Rd = Rs1 ^ imm | 0x0E | |
| SLL | R | Rd, Rs1, Rs2 | Rd = Rs1 << Rs2 | 0x01 | 0x00A |
| SLLI | I | Rd, Rs1, #imm | Rd = Rs1 << imm | 0x0F | |
| SRL | R | Rd, Rs1, Rs2 | Rd = Rs1 >> Rs2 | 0x01 | 0x00B |
| SRLI | I | Rd, Rs1, #imm | Rd = Rs1 >> imm | 0x10 | |
| SRA | R | Rd, Rs1, Rs2 | Rd = Rs1 >> Rs2 (Preserves Sign Bit) | 0x01 | 0x00C |
| SRAI | I | Rd, Rs1, #imm | Rd = Rs1 >> imm (Preserves Sign Bit) | 0x11 | |
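Two rows in this table are easier to see with numbers: the difference between SRL and SRA, and how LUI pairs with ORI to build a full 32-bit constant. The Python sketch below models both. The function names are hypothetical, and the shift amount is assumed to be in the range 0 to 31.

```python
MASK = 0xFFFFFFFF


def srl(value, amount):
    """Logical right shift: vacated high bits become 0."""
    return (value & MASK) >> amount


def sra(value, amount):
    """Arithmetic right shift: vacated high bits copy the sign bit."""
    value &= MASK
    if value & 0x80000000:
        value -= 1 << 32          # reinterpret as a negative Python int
    return (value >> amount) & MASK


def lui(imm20):
    """Place a 20-bit immediate in the upper 20 bits; the low 12 bits are 0."""
    return ((imm20 & 0xFFFFF) << 12) & MASK


print(hex(srl(0xFFFFFFF0, 4)))    # 0xfffffff
print(hex(sra(0xFFFFFFF0, 4)))    # 0xffffffff  (-16 >> 4 is -1)
print(hex(lui(0x12345) | 0x678))  # 0x12345678  (LUI followed by ORI)
```

For a negative value, SRL produces a large positive number while SRA keeps it negative, which is what makes SRA usable as a signed divide by a power of two.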

Memory

| Instruction | Format | Operands | Action | Opcode | Function Code |
| --- | --- | --- | --- | --- | --- |
| LW | I | Rd, [Rs1 + #off] | Load Word: Read 32 bits from address Rs1 + #off into Rd | 0x18 | |
| SW | I | Rs2, [Rs1 + #off] | Store Word: Write 32 bits from Rs2 into address Rs1 + #off | 0x19 | |
| LB | I | Rd, [Rs1 + #off] | Load Byte: Read 8 bits from address Rs1 + #off, sign-extend to 32 bits, and store in Rd | 0x1A | |
| LBU | I | Rd, [Rs1 + #off] | Load Byte Unsigned: Read 8 bits from address Rs1 + #off, zero-extend to 32 bits, and store in Rd | 0x1B | |
| SB | I | Rs2, [Rs1 + #off] | Store Byte: Write the low 8 bits of Rs2 into address Rs1 + #off | 0x1C | |
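The difference between LB and LBU only shows up when the byte's top bit is set. Here is a small Python sketch of the five memory operations on a little-endian, byte-addressable memory. Byte addressing is an assumption (the byte instructions imply it), and the function names and the 16-byte memory size are made up.

```python
memory = bytearray(16)  # hypothetical tiny RAM


def sw(addr, value):
    """Store Word: 4 bytes, least significant byte at the lowest address."""
    memory[addr:addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")


def lw(addr):
    """Load Word: read 4 bytes back as one 32-bit value."""
    return int.from_bytes(memory[addr:addr + 4], "little")


def sb(addr, value):
    """Store Byte: only the low 8 bits of the value are written."""
    memory[addr] = value & 0xFF


def lbu(addr):
    """Load Byte Unsigned: upper 24 bits are 0."""
    return memory[addr]


def lb(addr):
    """Load Byte: upper 24 bits copy bit 7 of the byte."""
    byte = memory[addr]
    return (byte - 0x100 if byte & 0x80 else byte) & 0xFFFFFFFF


sw(0, 0x11223380)
print(hex(lw(0)))    # 0x11223380
print(hex(lbu(0)))   # 0x80
print(hex(lb(0)))    # 0xffffff80
sb(4, 0x1FF)
print(hex(lbu(4)))   # 0xff
```

Because the machine is little-endian, the byte at address 0 is the lowest byte of the stored word (`0x80`), not the highest.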

Control Flow & Branching

In the table below, "Jump" means setting the PC to the given address.

| Instruction | Format | Operands | Action | Opcode | Function Code |
| --- | --- | --- | --- | --- | --- |
| JMP | J | #target | Jump to PC + #target | 0x30 | |
| CALL | J | Rd, #target | Jump to PC + #target, Rd = PC + 4 | 0x31 | |
| RET | I | Rs1 | Jump to Rs1 | 0x32 | |
| BEQ | J | #target | Jump to #target if Z | 0x20 | |
| BNE | J | #target | Jump to #target if not Z | 0x21 | |
| BLT | J | #target | Jump to #target if N != V | 0x22 | |
| BGE | J | #target | Jump to #target if N == V | 0x23 | |
| BHI | J | #target | Jump to #target if not C and not Z | 0x24 | |
| BLS | J | #target | Jump to #target if C or Z | 0x25 | |
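The branch conditions are just boolean expressions over the SR flags. Here is a minimal Python sketch of that decision. Two things are assumptions: BHI is read as "neither C nor Z is set", which makes it the exact opposite of BLS, and a taken branch moves the PC by the offset (as the Branching section describes) while a branch that is not taken advances by 4. The function names are hypothetical.

```python
def branch_taken(mnemonic, z, n, c, v):
    """Return True if the branch condition holds for the given SR flags."""
    conditions = {
        "BEQ": z,
        "BNE": not z,
        "BLT": n != v,
        "BGE": n == v,
        "BHI": not (c or z),   # assumed reading: complement of BLS
        "BLS": c or z,
    }
    return conditions[mnemonic]


def next_pc(pc, mnemonic, offset, flags):
    """Assumed PC update: PC-relative when taken, next instruction otherwise."""
    return pc + offset if branch_taken(mnemonic, **flags) else pc + 4


flags = {"z": True, "n": False, "c": False, "v": False}
print(next_pc(100, "BEQ", -8, flags))   # 92
print(next_pc(100, "BNE", -8, flags))   # 104
```

Each pair (BEQ/BNE, BLT/BGE, BHI/BLS) is a condition and its negation, so exactly one of the two is taken for any flag state.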

System & Stack

Yes, you saw PUSH and POP right! I love those guys!

| Instruction | Format | Operands | Action | Opcode | Function Code |
| --- | --- | --- | --- | --- | --- |
| PUSH | I | Rs1 | R2 = R2 - 4, then store Rs1 to the memory address in R2 | 0x40 | |
| POP | I | Rd | Load from the memory address in R2 into Rd, then R2 = R2 + 4 | 0x41 | |
| MFSR | I | Rd | Copy SR to Rd, zero-padded to 32 bits | 0x42 | |
| MTSR | I | Rs1 | Copy bottom 4 bits of Rs1 to SR | 0x43 | |
| HALT | I | | Suspends the CPU clock (future use) | 0x44 | |
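Here is a rough Python model of PUSH and POP with R2 acting as the stack pointer. It assumes POP reads the word before adjusting R2 (the mirror image of PUSH, so a push followed by a pop returns the same value), and it starts the stack at the top of a made-up 64-byte memory. The helper names are hypothetical.

```python
SP = 2                    # the table uses R2 as the stack pointer

regs = [0] * 32
memory = bytearray(64)    # hypothetical tiny RAM
regs[SP] = 64             # assumed start: the stack grows down from the top


def push(rs1):
    """R2 = R2 - 4, then store the register at the new top of stack."""
    regs[SP] -= 4
    memory[regs[SP]:regs[SP] + 4] = (regs[rs1] & 0xFFFFFFFF).to_bytes(4, "little")


def pop(rd):
    """Load the word at the top of stack, then R2 = R2 + 4."""
    value = int.from_bytes(memory[regs[SP]:regs[SP] + 4], "little")
    regs[SP] += 4
    if rd != 0:           # writes to r0 are ignored
        regs[rd] = value


regs[5] = 0xDEADBEEF
regs[6] = 7
push(5)
push(6)
pop(7)                    # last in, first out: r7 gets 7
pop(8)                    # r8 gets 0xDEADBEEF
print(regs[7], hex(regs[8]), regs[SP])   # 7 0xdeadbeef 64
```

After two pushes and two pops, R2 is back where it started, which is the property that makes nested calls work.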

Custom Misc. Instructions!

These are instructions I think are important but that aren't part of the base RISC-V instruction set.

| Instruction | Format | Operands | Action | Opcode | Function Code |
| --- | --- | --- | --- | --- | --- |
| MAC | R | Rd, Rs1, Rs2 | Rd = Rd + (Rs1 * Rs2) | 0x01 | 0x006 |
| SLICE | R | Rd, Rs1, Rs2 | Extract bits based on start/length packed into Rs2, shift to bottom of Rd | 0x01 | 0x00D |
| BSWAP | R | Rd, Rs1 | Reverse byte order of Rs1 | 0x01 | 0x00E |
| CLZ | R | Rd, Rs1 | Count leading zeros in Rs1, stores count in Rd | 0x01 | 0x00F |
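A quick Python sketch of what each of these four instructions computes. The function names are hypothetical, and the way SLICE packs start and length into Rs2 is purely an assumption for this example (start in bits [4:0], length in bits [10:5]), since the table doesn't define the packing.

```python
MASK = 0xFFFFFFFF


def mac(rd, rs1, rs2):
    """Multiply-accumulate: add the product to the current value of Rd."""
    return (rd + rs1 * rs2) & MASK


def slice_bits(rs1, rs2):
    """Extract `length` bits starting at bit `start` and move them to the bottom."""
    start = rs2 & 0x1F            # assumed packing: bits [4:0]
    length = (rs2 >> 5) & 0x3F    # assumed packing: bits [10:5]
    return ((rs1 & MASK) >> start) & ((1 << length) - 1)


def bswap(rs1):
    """Reverse the order of the four bytes."""
    return int.from_bytes((rs1 & MASK).to_bytes(4, "little"), "big")


def clz(rs1):
    """Count zero bits above the highest set bit (32 for an input of 0)."""
    return 32 - (rs1 & MASK).bit_length()


print(mac(10, 3, 4))                              # 22
print(hex(slice_bits(0xABCD1234, (8 << 5) | 8)))  # 0x12
print(hex(bswap(0x11223344)))                     # 0x44332211
print(clz(1), clz(0x80000000), clz(0))            # 31 0 32
```

BSWAP is the one that matters most on a little-endian machine: it converts a word to and from big-endian data such as network byte order.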

Pseudo-Instructions

These instructions exist only in the assembler, which translates each one into one of the instructions above.

| Pseudo-Instruction | Assembled Instruction | Why? |
| --- | --- | --- |
| SUBI Rd, Rs1, #imm | ADDI Rd, Rs1, #-imm | Flips the immediate to negative and uses the hardware ADD circuit. |
| NOP | ADD R0, R0, R0 | Burns a clock cycle |
| CLEAR Rd | ADD Rd, R0, R0 | Wipes the register to zero |
| MOV Rd, Rs1 | ADD Rd, Rs1, R0 | Copies a register from Rs1 to Rd |
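An assembler pass for these could be as small as a text rewrite. Here is a minimal Python sketch of one; the `expand` function and the exact source syntax it accepts (comma-separated operands, `#` before immediates) are assumptions, not the actual SAM-32 assembler.

```python
def expand(line):
    """Rewrite one pseudo-instruction as a real instruction; pass others through."""
    mnemonic, _, rest = line.strip().partition(" ")
    ops = [op.strip() for op in rest.split(",")] if rest else []
    if mnemonic == "NOP":
        return "ADD R0, R0, R0"
    if mnemonic == "CLEAR":
        return f"ADD {ops[0]}, R0, R0"
    if mnemonic == "MOV":
        return f"ADD {ops[0]}, {ops[1]}, R0"
    if mnemonic == "SUBI":
        imm = int(ops[2].lstrip("#"), 0)   # base 0 accepts both 4 and 0x10
        return f"ADDI {ops[0]}, {ops[1]}, #{-imm}"
    return line.strip()                    # already a real instruction


for source in ["SUBI R1, R1, #4", "NOP", "CLEAR R9", "MOV R3, R7", "ADD R1, R2, R3"]:
    print(f"{source:18} -> {expand(source)}")
```

Since every pseudo-instruction maps to exactly one real instruction, the expansion never changes instruction addresses, so branch offsets computed from the source stay valid.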