I sat down during lunch today and doodled up a design for a replacement CPU for the Hack machine with the same API, but with a few more registers in it, and which has a more traditional machine language. Here's the design, hopefully my flat text document will come out. Comments on the design are most welcome!
Mack: Warren's Idea for a Slightly More Powerful CPU
Goal: to build a different CPU, using the existing builtin Tecs chips with some extra control logic. Keep the same ALU and the same branch logic, but have multiple registers, and some CPU state flags, as per more popular CPUs.
Basic architecture: ALU; four 16-bit registers: R0, R1, R2, R3; two flag bits (negative and zero) which are set by the last ALU operation. Each instruction is 16 bits. There is up to one RAM fetch and one RAM store in each instruction.
Instruction formats: there are 3 instruction formats: load R0 with a constant, perform an ALU operation, and branch or jump. Here they are:
Load R0 instruction
0 <15-bit unsigned constant> Load R0 <- constant
ALU operation instruction
10 idd iss oooooo ss meaning dst = src1 op src2
dst = src1 op src2 Same 6-bits to control existing ALU,
and 2 bits to identify the dst, src1
and src2 registers.
When i bit is set in iss, we load
from RAM[src1], not src1. When i bit
is set in idd, we save result to
RAM[dst], not the dst register.
Result of the ALU operation clears or
sets the two flag bits: zero, negative.
This instruction can encode a relative branch of up to 2048 instructions, or it can encode an absolute jump to the value store in R0.
11 jjj <11-bit signed constant> When any of the j bits are set,
they act like the Hack jump bits.
Examine the zero and negative flags.
If a branch, sign extend the constant
to 16 bits, and add it to the PC.
However, when all the j bits are off,
this is an unconditional jump. Load
the contents of R0 into the PC.
label: identifier followed by colon
R0 = constant Load R0 with constant instruction
[*]Rn = [[*]Rn] op Rn ALU instruction, * means indirect addressing,
[ ] means that this is optional
Bxx label Relative branch to label, +/- 2048 instructions
Bxx is BEQ, BNE, BLT, BLE, BGT, BGE, or BR
for an unconditional branch.
JMP label This is an absolute jump "macro" instruction.
The assembler translates this into: load R0
with the label, followed by a jump instruction.
Also another macro instruction:
Rn = constant If Rn != R0, do R0 = constant, Rn = R0.
Thus, R0 becomes a sort of "scratch" register.
Sum the Numbers from 1 to 100
R1 = 1 // i = 1
R2 = 0 // sum = 0
R2 = R2 + R1 // sum = sum + i
R1 = R1 + 1 // i++
R0 = 100
R0 = R1 - R0 // calculate i - 100
BLE loop // loop while i <= 100
JMP end // end with infinite loop
I found a bug. The general Hack computer architecture has only one address bus from CPU to memory, used to both identify the location of the data to fetch from RAM and write back to RAM. Therefore, the CPU cannot read from one address and write back to a different address in an instruction.
This means my CPU cannot do, for example, *R3 = *R2 + 1, i.e. load from the address which R2 points to, and write back to the address in R3.
So, the solution is, when the i bit is set in both idd and iss, the src1 value (i.e the ss in iss) is replaced with the destination value. This allow the CPU to do *R3 = *R3 + 1.
Along the way I discovered some interesting details.
Easily extended Jump instruction
As I was wiring up the PC load address for the Jump instruction (Branch with jjj = 0) I had a rather snaky
wire running its way to R0. It went past the ALU on it way and I got to thinking...
The Jump instruction, 11 000 xxxxxxxxxxx, has all the correct bits available in the undefined region
to turn it into a PC= instruction with full ALU computation including indirect src1. 11 000 iss oooooo ss
This may be quite useful for return IPs for assembly language subroutines, computed jumps and implementation
of call and return stack functions.
It appears useful if the PC= instruction does not set the status bits; there may be a way to use the
zr flag as a subroutine pass/fail return value.
R0 = RIP_1
R3 = R0
R0 = 0 // good result
PC = R3
Status bits and overloaded R0=n instruction
There is a problem with the above code snippet. Depending on how the assembler encodes the R0=0 instruction, it may be an A-instruction that does not set the
I changed the data routing for the A-instruction—instead of muxing the data directly into R0, I am muxing it
into the ALU x input, setting the ALU function to out=x, and loading the status register.
If R0 is considered a scratch register that the assembler my use at will, then additional useful macros are possible.
How does the relative branch work? For example in the code snippet shared:
// Sum the Numbers from 1 to 100
0: R1 = 1 // i = 1
1: R2 = 0 // sum = 0
2: R2 = R2 + R1 // sum = sum + i
3: R1 = R1 + 1 // i++
4: R0 = 100
5: R0 = R1 - R0 // calculate i - 100
6: BLE loop // loop while i <= 100
7: JMP end // end with infinite loop
What would be the 11bit constant used for the BLE?
loop is at address 2. The BLE instruction is at address 6. Given that for a branch we "sign extend the constant to 16 bits, and add it to the PC", we would have to add -4 to the PC to get to loop from the BLE instruction. -4 is 65532 which is more than can be represented with 11 bits...
-1 is bin(65535) is 0b1111111111111111
-2 is bin(65534) is 0b1111111111111110
-3 is bin(65533) is 0b1111111111111101
-4 is bin(65532) is 0b1111111111111100
-5 is bin(65531) is 0b1111111111111011
Conditional jumps to locations more than +/- 1023 away are quite rare because they are primarily only used in conditional and looping control structures. The dependent code in ifs and loops should be relatively small.
The only way to code an absolute conditional jump would be to use a relative "skip" around a jump.
BNE skip // BEQ far_away
The assembler would need to flag any conditional jumps that are too far away as errors and the programmer would need to fix them. The assembler can not do this automatically because the absolute jump uses R0, and it cannot know if R0 is safe to use.