Build Your Own E20 Assembler in Java: A Step-by-Step Programming Guide

Learn how to write an E20 assembler in Java that converts assembly language to machine code. This tutorial covers instruction formats, binary encoding, label handling, and output formatting, with timely examples from AI hardware and gaming trends.

Introduction: Why Build an Assembler in 2026?

Assemblers are the bridge between human-readable assembly language and the machine code that processors execute. In 2026, with custom AI accelerators and RISC-V based devices drawing attention, understanding low-level programming remains relevant. This tutorial walks you through building an E20 assembler in Java, a classic assignment that teaches instruction encoding, symbol tables, and bit manipulation. By the end, you'll have a working assembler that converts E20 assembly to 16-bit machine code, ready to load into an E20 simulator.

Understanding the E20 Instruction Set

The E20 is a 16-bit processor with a simple instruction set. Each instruction is 16 bits wide, with fields for opcode, registers, and immediates. Unlike the E15, the E20 has multiple instruction formats. For example, the addi instruction uses a 3-bit opcode, 3-bit source register, 3-bit destination register, and 7-bit immediate. The j (jump) instruction uses a 3-bit opcode and a 13-bit address. Your assembler must handle these formats correctly.

Key Instruction Formats

  • R-type: Arithmetic operations (e.g., add). Opcode (3 bits), rs (3), rt (3), rd (3), low 4 bits (4), which the E20 manual uses as a function code to select the operation.
  • I-type: Immediate operations (e.g., addi, jeq). Opcode (3), rs (3), rt (3), immediate (7).
  • J-type: Jump operations (e.g., j). Opcode (3), address (13).

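Check the E20 manual for the exact opcode values. For instance, addi is opcode 001, jeq is 110, and j is 010. halt has no opcode of its own: in the example later in this tutorial it is assembled as a j (010) to its own address, and movi is assembled as an addi (001) with $0 as the source register.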
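
The three field layouts listed above can be sketched as three small packing helpers. This is a minimal sketch, not the assignment's required design: the class name E20Formats and the method names are hypothetical, and each argument is masked to its field width so a stray high bit cannot spill into a neighboring field.

```java
// Hypothetical helper class: packs the three E20 field layouts into 16-bit words.
final class E20Formats {
    // R-type layout: opcode(3) | rs(3) | rt(3) | rd(3) | low 4 bits.
    static int encodeRType(int opcode, int rs, int rt, int rd, int low4) {
        return ((opcode & 0x7) << 13) | ((rs & 0x7) << 10) | ((rt & 0x7) << 7)
                | ((rd & 0x7) << 4) | (low4 & 0xF);
    }

    // I-type layout: opcode(3) | rs(3) | rt(3) | immediate(7).
    static int encodeIType(int opcode, int rs, int rt, int imm7) {
        return ((opcode & 0x7) << 13) | ((rs & 0x7) << 10) | ((rt & 0x7) << 7)
                | (imm7 & 0x7F);
    }

    // J-type layout: opcode(3) | address(13).
    static int encodeJType(int opcode, int addr13) {
        return ((opcode & 0x7) << 13) | (addr13 & 0x1FFF);
    }
}
```

For example, E20Formats.encodeJType(0b010, 1) packs a j to address 1, and E20Formats.encodeIType(0b001, 0, 1, 10) packs an addi that adds 10 to $0 and stores the result in $1.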

Step 1: Parsing Assembly Instructions

Your Java program reads an assembly file line by line. For each line, you need to extract the instruction mnemonic, registers, immediates, and labels. Use String.split() or a Scanner to tokenize. Ignore comments (anything after //).

String line = "addi $1, $2, 3";
// Treat commas as whitespace so "addi $1,$2,3" tokenizes the same way.
String[] parts = line.trim().replace(",", " ").split("\\s+");
// parts = ["addi", "$1", "$2", "3"]

Handle labels like beginning: by stripping the colon and storing the label with its current address in a symbol table. Use a HashMap<String, Integer> for labels.
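
The parsing rules in this step (drop the comment, peel off an optional label, then tokenize) might be combined as in the sketch below. The LineParser and Parsed names are hypothetical, and the sketch assumes at most one label per line and registers written as $0 through $7.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical parser for a single source line.
final class LineParser {
    /** The optional label and the instruction tokens found on one line. */
    static final class Parsed {
        final String label;         // null when the line carries no label
        final List<String> tokens;  // empty when the line carries no instruction

        Parsed(String label, List<String> tokens) {
            this.label = label;
            this.tokens = tokens;
        }
    }

    static Parsed parse(String line) {
        // Everything from "//" onward is a comment.
        int comment = line.indexOf("//");
        String text = (comment >= 0 ? line.substring(0, comment) : line).trim();

        // Assumption: at most one "label:" prefix per line.
        String label = null;
        int colon = text.indexOf(':');
        if (colon >= 0) {
            label = text.substring(0, colon).trim();
            text = text.substring(colon + 1).trim();
        }

        // Commas and runs of whitespace both separate operands.
        List<String> tokens = new ArrayList<>();
        if (!text.isEmpty()) {
            tokens.addAll(Arrays.asList(text.replace(",", " ").split("\\s+")));
        }
        return new Parsed(label, tokens);
    }

    /** Converts a register token such as "$3" to its number. */
    static int register(String token) {
        if (!token.startsWith("$")) {
            throw new IllegalArgumentException("Not a register: " + token);
        }
        int n = Integer.parseInt(token.substring(1));
        if (n < 0 || n > 7) {
            throw new IllegalArgumentException("Register out of range: " + token);
        }
        return n;
    }
}
```

Returning the label and the tokens separately lets the first pass look only at the label and the second pass look only at the tokens.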

Step 2: First Pass – Build Symbol Table

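In a two-pass assembler, the first pass scans all lines to assign addresses and record label locations. Start the address at 0 and increment it by 1 for each instruction, since memory is word-addressed and each instruction occupies one 16-bit word. Lines that hold no instruction (blank lines, comment-only lines, a label on its own) must not advance the address. Store labels and their addresses in the symbol table.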

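HashMap<String, Integer> symbolTable = new HashMap<>();
int address = 0;
for (String line : lines) {
    // Drop any trailing comment before looking for a label.
    int comment = line.indexOf("//");
    String text = (comment >= 0 ? line.substring(0, comment) : line).trim();
    int colon = text.indexOf(':');
    if (colon >= 0) {
        symbolTable.put(text.substring(0, colon).trim(), address);
        text = text.substring(colon + 1).trim();
    }
    // Only lines that still hold an instruction occupy a memory word.
    if (!text.isEmpty()) {
        address++;
    }
}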

Step 3: Second Pass – Encode Instructions

Now iterate again, this time generating machine code. For each instruction, identify its type and encode bits accordingly. Use bitwise operations in Java: <<, |, and masks.

Encoding Example: addi $1, $2, 3

Opcode addi = 001 (binary). Source register $2 = 010, destination $1 = 001, immediate 3 = 0000011 (7-bit two's complement). Combine:

int machineCode = (opcode << 13) | (rs << 10) | (rt << 7) | (imm & 0x7F);
// For addi: (1 << 13) | (2 << 10) | (1 << 7) | (3 & 0x7F) = 0x2883 (0010100010000011)

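For jumps, the address field is 13 bits. Use the symbol table to resolve label addresses. For j beginning, encode the target address (e.g., 1) into the lower 13 bits. jeq works differently: judging by the example output later in this tutorial, its 7-bit immediate holds the target address minus the address of the following instruction, so a jeq at address 1 that targets done at address 4 encodes 4 - (1 + 1) = 2.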
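
Label resolution for the two branch forms might look like the sketch below. The class and method names are hypothetical. It assumes an operand is either a label from the symbol table or a decimal literal, and that the jeq immediate is relative to the instruction after the jeq, an assumption inferred from the sample output in this tutorial rather than quoted from the E20 manual.

```java
import java.util.Map;

// Hypothetical helpers that resolve labels while encoding j and jeq.
final class LabelResolver {
    /** Returns a label's address, or parses the operand as a decimal literal. */
    static int resolve(String operand, Map<String, Integer> symbols) {
        Integer address = symbols.get(operand);
        if (address != null) {
            return address;
        }
        // An undefined label ends up here and raises NumberFormatException.
        return Integer.parseInt(operand);
    }

    /** j: opcode 010 with the absolute target in the low 13 bits. */
    static int encodeJump(String target, Map<String, Integer> symbols) {
        return (0b010 << 13) | (resolve(target, symbols) & 0x1FFF);
    }

    /** jeq: opcode 110; immediate assumed to be target - (pc + 1). */
    static int encodeJeq(int regA, int regB, String target, int pc,
                         Map<String, Integer> symbols) {
        int relative = resolve(target, symbols) - (pc + 1);
        return (0b110 << 13) | ((regA & 0x7) << 10) | ((regB & 0x7) << 7)
                | (relative & 0x7F);
    }
}
```

Passing pc explicitly keeps the relative-offset arithmetic in one place, which makes an off-by-one error easier to spot when comparing against expected output.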

Step 4: Output in Verilog Format

Your output must match the exact format: ram[address] = 16'bxxxxxxxxxxxxxxxx; // instruction. Use System.out.printf to format the binary string. Java's Integer.toBinaryString() can help, but pad to 16 bits with leading zeros.

// Mask to 16 bits so a stray high bit can never produce more than 16 digits.
String binary = String.format("%16s", Integer.toBinaryString(machineCode & 0xFFFF)).replace(' ', '0');
System.out.printf("ram[%d] = 16'b%s; // %s\n", address, binary, originalLine);
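
Building the output line as a string first, instead of printing it directly, makes the formatting easy to unit-test. The sketch below is one way to do that; VerilogOutput and formatWord are hypothetical names, and the single space around // follows the format string shown above, so adjust it if your starter files use different spacing.

```java
// Hypothetical formatter for one line of output.
final class VerilogOutput {
    static String formatWord(int address, int machineCode, String source) {
        // OR-ing in bit 16 forces a 17-digit string; dropping the leading '1'
        // leaves exactly 16 digits with the leading zeros preserved.
        String bits = Integer.toBinaryString(0x10000 | (machineCode & 0xFFFF)).substring(1);
        return "ram[" + address + "] = 16'b" + bits + "; // " + source;
    }
}
```

A call such as VerilogOutput.formatWord(3, 0x4001, "j beginning") returns a string you can compare directly against a line of the expected output.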

Example: Assembling loop2.s

Given the input file:

movi $1, 10
beginning: jeq $1, $0, done
addi $1, $1, -1
j beginning
done: halt

Your assembler should output:

ram[0] = 16'b0010000010001010; // movi $1,10
ram[1] = 16'b1100010000000010; // beginning: jeq $1,$0,done
ram[2] = 16'b0010010011111111; // addi $1,$1,-1
ram[3] = 16'b0100000000000001; // j beginning
ram[4] = 16'b0100000000000100; // done: halt
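
To see both passes work together on this program, here is a deliberately small sketch that supports only the five mnemonics used in loop2.s and returns the 16-bit words as binary strings. The MiniE20 class is hypothetical. It assumes movi is an addi with $0 as the source, halt is a j to its own address, and the jeq immediate is relative to the next instruction; all three assumptions are read off the expected output above rather than taken from the E20 manual.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical two-pass assembler covering only movi, addi, jeq, j, and halt.
final class MiniE20 {
    static int reg(String token) {
        return Integer.parseInt(token.substring(1)) & 0x7;   // "$3" -> 3
    }

    static int value(String token, Map<String, Integer> symbols) {
        Integer address = symbols.get(token);
        return address != null ? address : Integer.parseInt(token);
    }

    static List<String> assemble(List<String> lines) {
        // Pass 1: strip comments, record labels, keep one entry per instruction.
        Map<String, Integer> symbols = new HashMap<>();
        List<String> instructions = new ArrayList<>();
        for (String raw : lines) {
            int comment = raw.indexOf("//");
            String text = (comment >= 0 ? raw.substring(0, comment) : raw).trim();
            int colon = text.indexOf(':');
            if (colon >= 0) {
                symbols.put(text.substring(0, colon).trim(), instructions.size());
                text = text.substring(colon + 1).trim();
            }
            if (!text.isEmpty()) {
                instructions.add(text);
            }
        }

        // Pass 2: encode each instruction at its address (pc).
        List<String> words = new ArrayList<>();
        for (int pc = 0; pc < instructions.size(); pc++) {
            String[] t = instructions.get(pc).replace(",", " ").split("\\s+");
            int word;
            switch (t[0]) {
                case "movi":   // assumed: addi with $0 as the source register
                    word = (0b001 << 13) | (reg(t[1]) << 7) | (value(t[2], symbols) & 0x7F);
                    break;
                case "addi":
                    word = (0b001 << 13) | (reg(t[2]) << 10) | (reg(t[1]) << 7)
                            | (value(t[3], symbols) & 0x7F);
                    break;
                case "jeq":    // assumed: immediate is target - (pc + 1)
                    word = (0b110 << 13) | (reg(t[1]) << 10) | (reg(t[2]) << 7)
                            | ((value(t[3], symbols) - (pc + 1)) & 0x7F);
                    break;
                case "j":
                    word = (0b010 << 13) | (value(t[1], symbols) & 0x1FFF);
                    break;
                case "halt":   // assumed: a jump to its own address
                    word = (0b010 << 13) | (pc & 0x1FFF);
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported mnemonic: " + t[0]);
            }
            words.add(Integer.toBinaryString(0x10000 | word).substring(1));
        }
        return words;
    }
}
```

Feeding the five lines of loop2.s to MiniE20.assemble yields five binary strings that can be checked against the ram[0] through ram[4] lines above. A full solution would add the remaining instructions, error reporting, and the ram[...] formatting.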

Handling Edge Cases

Your assembler must handle negative immediates (e.g., -1) by encoding them in 7-bit two's complement, which covers the signed range -64 to 63. For addi $1, $1, -1, the immediate -1 becomes 0x7F (1111111). Also, treat labels as case-sensitive, and skip lines that contain only a comment so they do not consume an address.
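
The two's complement conversion can be sketched as a mask plus a range check. The Imm7 class is a hypothetical name, and rejecting values outside -64 to 63 is an assumption about how strict you want the assembler to be; the decode method is included only to show that the encoding round-trips.

```java
// Hypothetical helper for 7-bit two's complement immediates.
final class Imm7 {
    /** Encodes a signed value into a 7-bit field, rejecting values that do not fit. */
    static int encode(int value) {
        if (value < -64 || value > 63) {
            throw new IllegalArgumentException("Immediate out of 7-bit range: " + value);
        }
        return value & 0x7F;   // keeps the low 7 bits, e.g. -1 -> 0x7F
    }

    /** Sign-extends a 7-bit field back to a Java int. */
    static int decode(int field) {
        // Move bit 6 into the sign position, then shift back arithmetically.
        return ((field & 0x7F) << 25) >> 25;
    }
}
```

Without the range check, a value such as 64 would be silently masked to the bit pattern for -64, which is a hard bug to find once it is buried in machine code.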

Testing with Provided Examples

Use the starter test files to verify your output matches exactly. Develop your own test cases for each instruction type. Remember, the autograder will compare your output character by character.
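
Because the comparison is character by character, it helps to find the first line where your output diverges from the expected file. The sketch below is one way to do that; OutputDiff is a hypothetical helper, and it assumes you have already read both outputs into lists of lines.

```java
import java.util.List;

// Hypothetical helper that locates the first differing line between two outputs.
final class OutputDiff {
    /** Returns the index of the first differing line, or -1 when the outputs match. */
    static int firstMismatch(List<String> expected, List<String> actual) {
        int shared = Math.min(expected.size(), actual.size());
        for (int i = 0; i < shared; i++) {
            if (!expected.get(i).equals(actual.get(i))) {
                return i;
            }
        }
        // Same content so far: they differ only if one output has extra lines.
        return expected.size() == actual.size() ? -1 : shared;
    }
}
```

Printing both versions of the mismatching line side by side usually reveals spacing or padding differences that are easy to miss by eye.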

Trend Connection: Why Low-Level Skills Matter in 2026

With specialized hardware such as NVIDIA GPUs and Apple's M-series chips now widespread, understanding assembly and machine code gives you an edge in optimizing performance. Even in gaming, engine developers sometimes hand-tune critical loops with intrinsics or inline assembly. This assignment isn't just academic; it's a foundation for careers in embedded systems, compiler design, and hardware verification.

Conclusion

Building an assembler in Java teaches you instruction encoding, symbol tables, and binary output formatting. Follow the steps: parse, first pass, second pass, output. Test thoroughly. Once your assembler works, you'll have a tool that can assemble E20 programs into machine code ready for simulation. Good luck!