Total points: 20
Estimated time: 2.5 hours
Prerequisites: Labs 3-5 complete; full optimizer pipeline working; browser access to godbolt.org
Overview
This lab uses godbolt.org to compare your CSA-201 compiler's output against gcc 14.1 at -O0, -O2, and -O3 on identical source. The goal is not to match production quality; it is to identify the specific optimization gaps and name the technique that would close each one.
Setup
Open godbolt.org. Set:
- Language: C
- Compiler: RISC-V rv32gc gcc 14.1
- Compiler flags: start with -march=rv32im -mabi=ilp32 -O0
Keep the compiler flags box open. You will change flags for each part.
Part A: Constant function (4 pts)
A1: Compile and compare
Compile the following function on godbolt.org at -O0, -O2, and -O3:
int square_sum(void) {
    return 3*3 + 4*4;
}
Also compile the equivalent in your compiler (use a zero-argument Virtus OS function that returns Math.multiply(3,3) + Math.multiply(4,4)).
Fill in:
| Compiler | Flags | Instructions emitted | Notes |
|---|---|---|---|
| Your compiler (full pipeline) | |||
| gcc | -O0 | ||
| gcc | -O2 | ||
| gcc | -O3 | | |
A2: Analysis
If gcc -O2 and your compiler both produce li a0, 25; ret, explain what optimization they both applied. If either produces more instructions, explain why.
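The grading table names constant folding as the optimization in question. As a reference point for your explanation, here is a minimal sketch of such a pass on a toy expression tree. The `Expr` type and `fold_constants` function are hypothetical names chosen for illustration; they are not the CSA-201 IR, and your own pass may be structured quite differently.

```c
#include <stddef.h>

/* Hypothetical expression node for illustration only (not the CSA-201 IR). */
typedef enum { EXPR_CONST, EXPR_VAR, EXPR_ADD, EXPR_MUL } ExprKind;

typedef struct Expr {
    ExprKind kind;
    int value;               /* meaningful only when kind == EXPR_CONST */
    struct Expr *lhs, *rhs;  /* meaningful only for EXPR_ADD / EXPR_MUL */
} Expr;

/* Fold constant subtrees bottom-up, in place.
   Returns 1 if the node is a constant after folding, 0 otherwise. */
int fold_constants(Expr *e) {
    if (e == NULL) return 0;
    if (e->kind == EXPR_CONST) return 1;
    if (e->kind == EXPR_VAR) return 0;

    /* Visit both children even if the first is not constant, so that
       constant subtrees on either side still get folded. */
    int left_const = fold_constants(e->lhs);
    int right_const = fold_constants(e->rhs);
    if (!(left_const && right_const)) return 0;

    /* Compute in unsigned arithmetic so the folder itself never performs
       a signed overflow; the conversion back to int is implementation-defined
       for out-of-range results. */
    unsigned a = (unsigned)e->lhs->value;
    unsigned b = (unsigned)e->rhs->value;
    e->value = (int)(e->kind == EXPR_ADD ? a + b : a * b);
    e->kind = EXPR_CONST;
    e->lhs = NULL;
    e->rhs = NULL;
    return 1;
}
```

This sketch only handles built-in add and multiply nodes. If your compiler sees Math.multiply(3,3) as an ordinary function call, it would also need to know that the call is pure and has a known meaning before it could fold it, which may be one source of a discrepancy in your table.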
Part B: Short arithmetic function (5 pts)
B1: Compile and compare
int clamp(int x, int lo, int hi) {
    if (x < lo) return lo;
    if (x > hi) return hi;
    return x;
}
Compile on godbolt.org at -O0 and -O2. Count instructions. Translate the function to your compiler's input language (either Virtus OS Jack or directly as assembly target) and compile with your full pipeline.
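Counting by hand gets error-prone on the longer -O0 listings. The following is a minimal sketch of a counter you could paste a listing into. It rests on several assumptions: the listing has one instruction per line, labels sit on their own line and end in a colon, directives start with a dot, and comment lines start with #. The names `AsmCounts` and `count_asm` are hypothetical. "Branches" here means conditional branches only (not j or jal), and "moves" means the mv pseudo-instruction only; adjust both if you define those columns differently.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for the comparison tables. */
typedef struct {
    int instructions; /* every line that is not a label, directive, or comment */
    int branches;     /* conditional branches only */
    int moves;        /* the mv pseudo-instruction only */
} AsmCounts;

/* True if the token of length len equals name exactly. */
static int token_is(const char *tok, size_t len, const char *name) {
    return strlen(name) == len && strncmp(tok, name, len) == 0;
}

/* RISC-V conditional branches plus the common assembler pseudo-forms. */
static int is_cond_branch(const char *tok, size_t len) {
    static const char *const names[] = {
        "beq", "bne", "blt", "bge", "bltu", "bgeu",
        "beqz", "bnez", "blez", "bgez", "bltz", "bgtz",
        "bgt", "ble", "bgtu", "bleu"
    };
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        if (token_is(tok, len, names[i])) return 1;
    }
    return 0;
}

/* Classify each line of an assembly listing by its first token. */
AsmCounts count_asm(const char *text) {
    AsmCounts c = {0, 0, 0};
    const char *p = text;
    while (*p != '\0') {
        const char *eol = strchr(p, '\n');
        const char *end = eol ? eol : p + strlen(p);

        /* Isolate the first whitespace-delimited token on the line. */
        while (p < end && isspace((unsigned char)*p)) p++;
        const char *tok = p;
        while (p < end && !isspace((unsigned char)*p)) p++;
        size_t len = (size_t)(p - tok);

        /* Skip blank lines, directives/local labels, comments, and labels. */
        int skip = len == 0 || tok[0] == '.' || tok[0] == '#'
                   || tok[len - 1] == ':';
        if (!skip) {
            c.instructions++;
            if (is_cond_branch(tok, len)) c.branches++;
            if (token_is(tok, len, "mv")) c.moves++;
        }
        p = eol ? eol + 1 : end;
    }
    return c;
}
```

Use the same counting rules for the gcc listings and for your compiler's output; otherwise the table rows are not comparable.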
Fill in:
| Compiler | Flags | Instructions | Branches | Moves |
|---|---|---|---|---|
| Your compiler | ||||
| gcc | -O0 | |||
| gcc | -O2 | | | |
B2: Analysis
Identify one specific optimization that gcc -O2 applies to clamp that your compiler does not. Name the optimization technique. Write about 100 words explaining what information gcc has that your compiler would need in order to apply the same transformation.
Part C: Loop function (7 pts)
C1: Compile the loop
int sum_array(int* arr, int n) {
    int s = 0;
    for (int i = 0; i < n; i++) {
        s += arr[i];
    }
    return s;
}
Compile on godbolt.org at -O0, -O2, and -O3. Record the instruction count at each level and note the most significant optimization that each level applies.
For your compiler: translate sum_array to the closest equivalent in your compiler's language. Compile with the full pipeline. Record instruction count.
C2: Gap analysis (4 pts)
For the loop function, complete this gap analysis table:
| Optimization | Present in gcc -O2? | Present in your compiler? | Technique name |
|---|---|---|---|
| Loop counter held in register (not reloaded from stack each iteration) | |||
| Array pointer incremented instead of arr[i] recomputed each iteration | | | |
| Loop body instructions reduced vs -O0 | | | |
| Any auto-vectorization (look for multiple lw/sw in a single iteration) | | | |
For each row where gcc has the optimization and yours does not, write one sentence naming the analysis your compiler would need to implement it.
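The second row of the table contrasts two ways of walking the array. If the difference is hard to spot in assembly, it can help to picture it at the source level first. The sketch below assumes the hypothetical names `sum_array_indexed` and `sum_array_ptr`; it shows the two forms side by side and is not a claim about what gcc emits for any particular flag setting.

```c
/* Indexed form: each iteration conceptually computes the address
   arr + i * sizeof(int) before loading. */
int sum_array_indexed(const int *arr, int n) {
    int s = 0;
    for (int i = 0; i < n; i++) {
        s += arr[i];
    }
    return s;
}

/* Pointer form: the address is kept in a variable and advanced by one
   element per iteration, so no multiply or shift is needed in the loop. */
int sum_array_ptr(const int *arr, int n) {
    int s = 0;
    if (n <= 0) return s;       /* match the indexed loop for n <= 0 */
    const int *end = arr + n;   /* one past the last element */
    for (const int *p = arr; p != end; p++) {
        s += *p;
    }
    return s;
}
```

In a listing, the indexed form tends to show a shift or multiply and an add feeding each load, while the pointer form tends to show a single addi on the address register. Check which pattern appears in each compiler's loop body before filling in the row.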
Part D: Your best function (4 pts)
Choose one function from your CSA-201 compiler's output where the result is close to or better than gcc -O0 on the same logic. Paste both outputs (godbolt.org output and your compiler's output). Explain why your compiler does well on this function: which passes from Labs 3-5 fire most helpfully?
Grading
| Part | Criteria | Points |
|---|---|---|
| A1 | Table complete with correct instruction counts | 2 |
| A2 | Correct identification of constant folding; explains any discrepancy | 2 |
| B1 | Table complete; all three instruction counts measured | 3 |
| B2 | Names a specific optimization; 100-word explanation is technically correct | 2 |
| C1 | Loop instruction counts at all three flag levels measured | 3 |
| C2 | Gap analysis table complete; technique names are correct | 4 |
| D | Example chosen is genuinely competitive; explanation cites specific passes | 4 |
| Total | | 20 |