Help with the Assembler

Alex93
I'm writing the Assembler in Python. So far, I have a Translate() function that can successfully translate a single command into binary:

# Function that turns an A command into a binary value
def Acommand(input):
    input = input[1:]
    integer = int(input)
    output = str(bin(integer)[2:])
    while len(output)<16:
        output = "0" + output
    return output

#Function that translates a C command into binary instruction
def Ccommand(input):
    output="111"
    dest = ""
    jump = ""
    IsDest = False
    IsJump = False
    if "=" in input: #Checks if there's a destination component
        equals = input.index("=")
        dest = input[:equals]
        input = input[(equals + 1):]
        IsDest = True
    if ";" in input: #Checks if there's a jump component
        semicolon = input.index(";")
        jump = input[(semicolon + 1):]
        input = input[:semicolon]
        IsJump = True
    if "M" in input: #Adds 'a' bit
        output = output + "1"
    else:
        output = output + "0"
    for comp_instruction in comp_table: #Adds comp bits
        if input == comp_instruction:
            output = output + comp_table[comp_instruction]
            break
    if IsDest: #Adds dest bits
        for dest_instruction in dest_table:
            if dest == dest_instruction:
                output = output + dest_table[dest_instruction]
                break
    else:
        output = output + "000"
    if IsJump: #Adds jump bits
        for jump_instruction in jump_table:
            if jump == jump_instruction:
                output = output + jump_table[jump_instruction]
                break
    else:
        output = output + "000"
    return output

# Checks if input is A or C instruction, then calls appropriate function
def Translate(input):
    if "@" in input:
        return Acommand(input)
    else:
        return Ccommand(input)

Now I'm trying to iterate through the lines of a source file and call the Translate() function for each one, writing the result to a destination file. The code looks like this:

import Assembler

source = open("source.txt", "r")
dest = open("destination.txt", "a")

for line in source:
    dest.write(Assembler.Translate(line) + "\n")
    
source.close()
dest.close()

The output is very strange: all the A commands are correctly translated into binary, but the C commands are only between 10 and 13 bits in length! I don't understand what's going on, because if the problem were in the Translate() function, it shouldn't work when individual C commands are plugged in, but if the problem were in the for loop, it shouldn't work for the A commands either. What am I missing?

Re: Help with the Assembler

dolomiti7
Without having tested it - and without knowing the exact contents of your tables:

Could it be that your inner loops are sometimes failing to find a match in the tables?

For example:
for comp_instruction in comp_table: #Adds comp bits
    if input == comp_instruction:
        output = output + comp_table[comp_instruction]
        break

What happens if input never matches comp_instruction? Then no bits will be appended to output. The same applies to the dest and jump loops.
Potential reasons for that could be wrong data in the tables (spelling, uppercase/lowercase mix-up...), control characters in the string (e.g. "JMP" != "JMP\n"), non-compliant source code, etc.
I would probably add some debug statements inside these loops to narrow down the problem.
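
One way to make a failed lookup visible, sketched under assumptions: the table below is an abbreviated, hypothetical stand-in (the actual tables were not posted), and lookup() is an illustrative helper, not part of the original code. It raises on a missing key instead of appending nothing:

```python
# Hypothetical, abbreviated stand-in for the thread's jump_table (not posted).
jump_table = {"JGT": "001", "JEQ": "010", "JMP": "111"}

def lookup(table, key, kind):
    """Return table[key], raising instead of silently appending nothing."""
    if key not in table:
        # %r shows control characters such as a trailing newline
        raise ValueError("no %s entry for %r" % (kind, key))
    return table[key]

print(lookup(jump_table, "JMP", "jump"))   # 111
try:
    lookup(jump_table, "JMP\n", "jump")
except ValueError as err:
    print(err)                             # no jump entry for 'JMP\n'
```

Since the tables appear to be dictionaries, a direct key lookup like this could also replace the for loops, but that is a design choice rather than a required fix.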

Re: Help with the Assembler

Alex93
Thank you! It was indeed the invisible "\n" at the end of every line in the source file that was causing the comparison to fail. I have fixed it with a simple .strip() call.
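
For anyone hitting the same symptom: assuming the tables hold the standard mnemonics, the trailing newline also seems to account for the lengths. For "D=A\n" the comp lookup fails, leaving 4 + 3 + 3 = 10 bits; for "0;JMP\n" the jump lookup fails, leaving 4 + 6 + 3 = 13 bits. The A commands survive because int() ignores surrounding whitespace. A minimal sketch of the corrected loop follows; assemble() and fake_translate() are hypothetical names, and in-memory files stand in for source.txt and destination.txt so the example runs on its own:

```python
import io

def assemble(source, dest, translate):
    """Translate each line of source after stripping surrounding whitespace."""
    for line in source:
        dest.write(translate(line.strip()) + "\n")

def fake_translate(command):
    # Stand-in for Assembler.Translate(): echoes its argument in brackets
    # so the effect of strip() is visible.
    return "[%s]" % command

source = io.StringIO("@2\nD=A\n0;JMP\n")
dest = io.StringIO()
assemble(source, dest, fake_translate)
print(dest.getvalue())
```

Each bracketed command comes out on a single line, which shows that the newline no longer reaches the translation step.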

Re: Help with the Assembler

WBahn
Administrator
Glad you found and fixed the mistake.

A good way to find bugs is to instrument your code. In this case, you know that the code is misbehaving when producing assembler output for C-instructions. So print the C-instruction output to the screen to see what it actually is and confirm that it is wrong. This will tell you whether the problem is in generating the output string, or in writing it to the file. If it is generating it wrong, then put in print statements as each part of the string is produced. This will quickly let you focus on the part of the code that is causing the problem.
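
As a rough illustration of printing each part as it is produced: split_c_command() below is a hypothetical helper that mirrors the splitting step of Ccommand() using str.split rather than index slicing, with an optional debug print. It is a sketch, not the original code.

```python
def split_c_command(command, debug=False):
    """Split 'dest=comp;jump' into its three parts (missing parts are '')."""
    dest, jump = "", ""
    if "=" in command:
        dest, command = command.split("=", 1)
    if ";" in command:
        command, jump = command.split(";", 1)
    if debug:
        # %r makes hidden characters such as "\n" visible
        print("dest=%r comp=%r jump=%r" % (dest, command, jump))
    return dest, command, jump

split_c_command("0;JMP\n", debug=True)   # dest='' comp='0' jump='JMP\n'
```

With the unstripped line, the newline shows up attached to the last part, which is exactly the piece whose table lookup fails.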

When printing out strings for debugging purposes, I find that it is very helpful to include delimiters so that you can more easily see hidden whitespace. For instance, something like:

s = some_function()
print("[%s]" % (s))

Then, if there is a space or tab or newline at either end of the string, it becomes glaringly apparent.
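
A short sketch of the effect, using a line as it would be read from the source file; delimited() is a hypothetical helper name. Python's built-in repr() is a related option that prints the escape sequence itself:

```python
def delimited(s):
    """Wrap s in brackets so leading/trailing whitespace is visible."""
    return "[%s]" % s

line = "0;JMP\n"
print(delimited(line))          # the closing bracket lands on the next line
print(delimited(line.strip()))  # [0;JMP]
print(repr(line))               # '0;JMP\n'
```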

Another way -- a 'better' way -- is to use assertions. You know that various parts of your assembler output are supposed to be certain lengths, so put in assertions that will stop the code if they aren't those lengths.
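
A minimal sketch of such a check, assuming every translated instruction should be a 16-character string of 0s and 1s; checked() is a hypothetical wrapper name, not something from the code above:

```python
def checked(bits):
    """Return bits unchanged if it looks like one 16-bit instruction."""
    assert len(bits) == 16, "expected 16 bits, got %d: %r" % (len(bits), bits)
    assert set(bits) <= {"0", "1"}, "non-binary characters in %r" % bits
    return bits

print(checked("1110110000010000"))   # passes: 16 characters, all 0 or 1
```

Wrapping the return value of the translation function in a check like this would have stopped the program at the first 10-bit result. Note that Python skips assert statements when run with the -O option, so they suit debugging rather than validating user input.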