Week: 5
Points: 25
Time: ~6 hours
Deliverable: verilog/cpu/ directory + synthesis report + UART output photo + diary/week-05.md
What you ship
- `verilog/cpu/decoder.v`: instruction decoder
- `verilog/cpu/immgen.v`: immediate generator
- `verilog/cpu/cpu.v`: top-level CPU module
- `verilog/cpu/top.v`: Tang Primer 25K top module (UART + reset + clock)
- `lab5_simulation_output.txt`: sum-to-N result = 55 in simulation
- `lab5_synthesis_report.txt`: Gowin/Apicula synthesis report (LUT count, Fmax, BRAM)
- `lab5_uart_output.jpg`: photo or screenshot of UART terminal showing result on silicon
- `lab5_seeded_bug_analysis.md`: description of the seeded bug and how you found it
- `diary/week-05.md`
Lab 5.1: Instruction decoder
Write decoder.v. Input: 32-bit instruction word. Outputs: control signals for every component in the data path.
Minimum required outputs:
| Signal | Width | Description |
|---|---|---|
| `reg_we` | 1 | Register file write enable |
| `mem_we` | 1 | Data memory write enable |
| `mem_re` | 1 | Data memory read enable |
| `alu_op` | 3 | ALU operation (matches lab-2 op encoding) |
| `alu_src` | 1 | 0 = rs2, 1 = immediate |
| `branch` | 1 | This is a branch instruction |
| `jump` | 1 | This is JAL or JALR |
| `mem_to_reg` | 2 | 0 = ALU, 1 = memory read, 2 = PC+4 (for JAL and JALR) |
| `branch_type` | 1 | 0 = BEQ, 1 = BNE |
Run `lab5_decoder_tb.v`, which feeds one representative instruction of each type and checks all control signals. All 11 instruction variants must produce correct outputs.
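One common way to structure a decoder like this is to assign a safe default to every output and then override only what each opcode needs. The sketch below shows that pattern for LW and SW only. The module name `decoder_sketch` and the `ALU_ADD` encoding are assumptions for illustration; use the op encoding from your own Lab 2 ALU, and note that only a subset of the required outputs is shown.

```verilog
// Sketch only: default-then-override pattern for two opcodes.
// ALU_ADD is a hypothetical encoding; substitute your Lab 2 value.
module decoder_sketch (
    input  wire [31:0] instr,
    output reg         reg_we,
    output reg         mem_we,
    output reg         mem_re,
    output reg  [2:0]  alu_op,
    output reg         alu_src,
    output reg  [1:0]  mem_to_reg
);
    localparam [2:0] ALU_ADD = 3'b000;  // assumption, not from the worksheet

    wire [6:0] opcode = instr[6:0];

    always @(*) begin
        // Safe defaults: nothing is written, so an unknown opcode changes
        // no state, and no latches are inferred.
        reg_we     = 1'b0;
        mem_we     = 1'b0;
        mem_re     = 1'b0;
        alu_op     = ALU_ADD;
        alu_src    = 1'b0;
        mem_to_reg = 2'd0;

        case (opcode)
            7'b0000011: begin           // LW: rd = mem[rs1 + imm]
                reg_we     = 1'b1;
                mem_re     = 1'b1;
                alu_src    = 1'b1;
                mem_to_reg = 2'd1;
            end
            7'b0100011: begin           // SW: mem[rs1 + imm] = rs2
                mem_we  = 1'b1;
                alu_src = 1'b1;
            end
            default: ;                  // remaining opcodes are left to the lab
        endcase
    end
endmodule
```

Because every output gets a value before the `case`, adding a new opcode only requires listing the signals that differ from the defaults.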
Lab 5.2: Immediate generator
Write immgen.v. Input: 32-bit instruction word. Output: 32-bit sign-extended immediate.
Handle all four immediate-producing formats: I-type (ADDI, LW, JALR), S-type (SW), B-type (BEQ, BNE), J-type (JAL).
Remember: B-type and J-type immediates have their bits reordered across the instruction word. The output must be the reconstructed, sign-extended offset.
```verilog
module immgen (
    input  wire [31:0] instr,
    output reg  [31:0] imm
);
    wire [6:0] opcode = instr[6:0];

    always @(*) begin
        case (opcode)
            7'b0010011, // I-type (ADDI, etc.)
            7'b0000011, // I-type (LW)
            7'b1100111: // I-type (JALR)
                imm = {{20{instr[31]}}, instr[31:20]};
            // ...
            default: imm = 32'b0; // no immediate; also avoids an inferred latch
        endcase
    end
endmodule
```
Run the immgen testbench for all four format types with positive and negative immediates.
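The bit reordering for B-type and J-type immediates follows the standard RISC-V base encoding and can be written as two concatenations. The sketch below is a standalone illustration, not the required structure of `immgen.v`; the module name `imm_bj_sketch` and its two output ports are hypothetical.

```verilog
// Sketch: reconstruct B-type and J-type offsets from a RISC-V instruction word.
module imm_bj_sketch (
    input  wire [31:0] instr,
    output wire [31:0] imm_b,   // branch offset (BEQ, BNE)
    output wire [31:0] imm_j    // jump offset (JAL)
);
    // B-type: imm[12|10:5] sits in instr[31:25], imm[4:1|11] in instr[11:7].
    // Bit 0 of the offset is always zero and is not stored.
    assign imm_b = {{19{instr[31]}}, instr[31], instr[7],
                    instr[30:25], instr[11:8], 1'b0};

    // J-type: imm[20|10:1|11|19:12] sits in instr[31:12].
    assign imm_j = {{11{instr[31]}}, instr[31], instr[19:12],
                    instr[20], instr[30:21], 1'b0};
endmodule
```

As a hand check, `beq x0, x0, -4` encodes as `0xFE000EE3`, and feeding that word in should give `imm_b = 0xFFFFFFFC`.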
Lab 5.3: CPU integration and simulation
Write cpu.v. Instantiate and connect:
- `pc_reg` (a 32-bit DFF holding the program counter)
- `imem` (instruction memory, initialized from a hex file)
- `decoder`
- `immgen`
- `regfile`
- `alu`
- `dmem` (data memory)
- Writeback mux (selects between ALU result, memory read, PC+4)
- Branch logic (computes next PC for branches and jumps)
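The writeback mux and branch logic in the list above are both small pieces of combinational logic. The following is a minimal sketch, assuming two signals that are not in the decoder table: `is_jalr` (distinguishes JALR from JAL) and `alu_zero` (high when the ALU comparison of rs1 and rs2 finds them equal). Your own design may derive these differently, and the module name `pc_wb_sketch` is hypothetical.

```verilog
// Sketch: next-PC selection and writeback mux as pure combinational logic.
// `is_jalr` and `alu_zero` are hypothetical signals, not part of the worksheet.
module pc_wb_sketch (
    input  wire [31:0] pc,
    input  wire [31:0] imm,
    input  wire [31:0] rs1_data,
    input  wire [31:0] alu_result,
    input  wire [31:0] mem_rdata,
    input  wire        branch,
    input  wire        branch_type,  // 0 = BEQ, 1 = BNE
    input  wire        jump,
    input  wire        is_jalr,      // assumption: 1 for JALR, 0 for JAL
    input  wire        alu_zero,     // assumption: 1 when rs1 == rs2
    input  wire [1:0]  mem_to_reg,
    output wire [31:0] next_pc,
    output reg  [31:0] wb_data
);
    wire [31:0] pc_plus4 = pc + 32'd4;

    // BEQ is taken when the operands are equal, BNE when they differ.
    wire taken = branch & (branch_type ? ~alu_zero : alu_zero);

    // JALR adds the immediate to rs1 and clears bit 0.
    // JAL and taken branches are PC-relative.
    wire [31:0] jalr_sum    = rs1_data + imm;
    wire [31:0] jalr_target = {jalr_sum[31:1], 1'b0};
    wire [31:0] pc_target   = pc + imm;

    assign next_pc = (jump & is_jalr) ? jalr_target :
                     (jump | taken)   ? pc_target   :
                                        pc_plus4;

    always @(*) begin
        case (mem_to_reg)
            2'd1:    wb_data = mem_rdata;
            2'd2:    wb_data = pc_plus4;
            default: wb_data = alu_result;   // 2'd0, and the unused 2'd3
        endcase
    end
endmodule
```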
Load sum-to-n.hex (your Lab 4.4 output) into imem. Simulate for 200 clock cycles. Verify that the data memory at address 0 contains 55 (0x00000037) after the program completes.
```shell
iverilog -o cpu_sim verilog/cpu/cpu.v verilog/cpu/decoder.v verilog/cpu/immgen.v \
    verilog/alu/alu.v verilog/mem/regfile.v verilog/mem/mem.v \
    worksheets/csa-110/lab5_cpu_tb.v
vvp cpu_sim | tee lab5_simulation_output.txt
# Expected: "[PASS] mem[0] = 0x00000037 (55)"
```
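Initializing `imem` from a hex file is usually done with `$readmemh`. The sketch below assumes that `sum-to-n.hex` contains whitespace-separated 32-bit hex words and that 256 words is enough for the program. The module name `imem_sketch`, the depth, and the port names are all illustrative choices, not requirements of the lab.

```verilog
// Sketch: word-addressed instruction ROM loaded from a hex file at time zero.
// Assumes the hex file holds whitespace-separated 32-bit words.
// The 256-word depth is an arbitrary choice for illustration.
module imem_sketch #(
    parameter HEX_FILE = "sum-to-n.hex"
) (
    input  wire [31:0] addr,    // byte address from the PC
    output wire [31:0] instr
);
    reg [31:0] mem [0:255];

    initial $readmemh(HEX_FILE, mem);

    // Drop the two low address bits: instructions are 4-byte aligned.
    // Bits [9:2] select one of the 256 words.
    assign instr = mem[addr[9:2]];
endmodule
```

The simulator resolves the file name relative to the directory where `vvp` runs, so run the simulation from the directory that contains the hex file or pass a full path through the parameter.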
Lab 5.4: Synthesize and boot
Write top.v for the Tang Primer 25K. Wrap your cpu.v with:
- PLL or clock-divider to generate your target clock (start at 4 MHz; increase if synthesis Fmax allows)
- UART transmitter (the academy provides `uart_tx.v` in `worksheets/csa-110/`)
- Reset logic (active-low reset from the Tang Primer's button)
- UART output: at program end (when PC reaches the infinite loop), transmit the value in data memory address 0 as an ASCII hex string
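Sending a 32-bit value as an ASCII hex string means converting each 4-bit nibble to a character, most significant digit first. The sketch below shows only that conversion. The interface of the provided `uart_tx.v` is not described here, so the sketch makes no assumption about it, and the module name `hex_digit_sketch` and its ports are hypothetical.

```verilog
// Sketch: select one hex digit of a 32-bit word and convert it to ASCII.
// index 0 is the most significant digit, index 7 the least significant.
module hex_digit_sketch (
    input  wire [31:0] word,
    input  wire [2:0]  index,
    output wire [7:0]  ascii
);
    // Shift the wanted nibble into the top four bits (shift amount = index * 4).
    wire [31:0] shifted = word << {index, 2'b00};
    wire [3:0]  nibble  = shifted[31:28];

    // 0..9 map to '0'..'9' (0x30..0x39); 10..15 map to 'A'..'F' (0x41..0x46).
    assign ascii = (nibble < 4'd10) ? (8'h30 + nibble) : (8'h37 + nibble);
endmodule
```

For the expected result `0x00000037`, stepping `index` from 0 to 7 yields the characters `00000037`. A small state machine in `top.v` can hand each character to the UART transmitter in turn.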
```shell
# Synthesize with Apicula
yosys -p "synth_gowin -top top -json top.json" verilog/cpu/top.v ...
nextpnr-himbaechel --device GW5A-LV25MG121 --json top.json --write top_pnr.json
gowin_pack --device GW5A-LV25MG121 top_pnr.json -o top.fs
openFPGALoader -b tangprimer25k top.fs  # board name for the Tang Primer 25K
```
Connect a UART terminal at 115200 baud. Observe the sum-to-N result. Take a photo or screenshot.
Record from the synthesis report: LUT count, Fmax estimate, BRAM blocks used.
Lab 5.5: Seeded failure drill
A seeded variant of the CPU testbench, `lab5_seeded_tb.v`, contains one deliberately broken instruction: a BEQ with an off-by-one branch offset. The broken version produces the wrong answer (not 55).
Your task: find the bug. Steps:
- Run the broken testbench: `iverilog ... worksheets/csa-110/lab5_seeded_tb.v`
- Observe the wrong answer
- Add `$display` statements or use GTKWave to trace execution
- Find the instruction with the wrong offset
- Fix it and verify the correct answer reappears
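The tracing step above can be sketched as follows. To keep the example runnable on its own, it uses a stand-in module (`toy_dut`, a counter that mimics a program counter) rather than the real CPU. In your testbench you would print your own CPU's PC and instruction signals through their hierarchical names, which depend on how you named things.

```verilog
`timescale 1ns/1ps
// Stand-in for the CPU so the sketch runs by itself; replace with your cpu.
module toy_dut (
    input  wire        clk,
    input  wire        rst_n,
    output reg  [31:0] pc
);
    always @(posedge clk or negedge rst_n)
        if (!rst_n) pc <= 32'd0;
        else        pc <= pc + 32'd4;
endmodule

module trace_tb;
    reg clk   = 1'b0;
    reg rst_n = 1'b0;
    wire [31:0] pc;

    toy_dut dut (.clk(clk), .rst_n(rst_n), .pc(pc));

    always #5 clk = ~clk;   // 10 ns clock period

    // One trace line per cycle once reset is released.
    always @(posedge clk)
        if (rst_n) $display("t=%0t pc=%h", $time, dut.pc);

    initial begin
        $dumpfile("trace.vcd");   // view with: gtkwave trace.vcd
        $dumpvars(0, trace_tb);   // record every signal under trace_tb
        #12 rst_n = 1'b1;
        #60 $finish;
    end
endmodule
```

With a PC trace in hand, a wrong branch offset shows up as the first cycle where the PC sequence departs from what the assembly listing predicts.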
Record the debugging session in `lab5_seeded_bug_analysis.md`: what symptom you observed, which tool or technique found the bug, and what the fix was.
Toolchain Diary
Record in diary/week-05.md:
- Your CPU's line count vs Arlet's `cpu.v` from CSA-101 (with explanation)
- Synthesis report: LUT count, Fmax, BRAM blocks
- The critical path (which module limits Fmax)
- The seeded bug: describe it and how you found it
Grading
| Component | Points |
|---|---|
| `decoder.v` testbench: all 11 instruction variants pass | 5 |
| `immgen.v` testbench: all four format types with positive and negative immediates | 3 |
| CPU simulation: sum-to-N produces 55 in simulation | 7 |
| Synthesis report: LUT count, Fmax, BRAM noted | 3 |
| UART output on silicon (photo or screenshot) | 4 |
| Seeded bug analysis: found, fixed, and explained | 3 |