Hi. I've been trying to write this assembler for weeks and I can't come up with a good solution. How do you actually write it? I've tried basically using many loops and if/else checks, for checking the syntax, and then a "look up table" for converting the letters into bits, but it feels like a very inefficient solution.
Is there a way to avoid all the checks and loops to convert it more directly? Or rather, what's the most efficient way to write such an assembler or parser? Thanks in advance.
|
Administrator
|
One thing to keep in mind is that the authors have explicitly stated that the tools you write can assume that the input files are error-free. So you don't need to check for correctness.
Take things one step at a time and reduce the assembly language input file into a stream of strings that are very standardized.
First, of course, you need to be able to read the file successfully one line at a time.
So write a program that does nothing but read the file one line at a time and output each line to the screen. You'd be surprised at how many people struggle with converting a command to a machine instruction when their root problem is that they are not even reading the line from the file successfully. So don't assume that you are; prove it by outputting the line. I'd recommend printing the line between a pair of delimiters, such as square brackets, so that you can see exactly where your program thinks the line starts and stops.
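As a rough illustration of that first step, here is a minimal Python sketch. The function names (bracket_lines, echo_file) and the file name in the usage comment are made up for this example; any language works the same way.

```python
def bracket_lines(lines):
    """Wrap each line in square brackets so its exact start and end are visible."""
    # Only the trailing newline is stripped; leading/trailing spaces stay visible.
    return ["[" + line.rstrip("\n") + "]" for line in lines]


def echo_file(path):
    """Read the file one line at a time and print each line between brackets."""
    with open(path) as f:
        for shown in bracket_lines(f):
            print(shown)


# Hypothetical usage: echo_file("Add.asm")
```

If a line shows up as something like [  D=A  ] you know right away that the surrounding spaces are still part of the string your program is holding.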
Once you have that, start whittling away at lines. First, look at each string (representing the line from the input file) and remove any comments. Focus on the question of how you will recognize that a string has a comment and then how you will edit that line to remove it from the string. Then leverage the fact that the Hack assembly language ignores whitespace (other than line breaks), so figure out a way to remove all of the whitespace from each line.
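One possible way to do that clean-up in Python is sketched below. It assumes comments start with // and run to the end of the line, and the name clean_line is just something picked for the example.

```python
def clean_line(line):
    """Drop a trailing // comment, then remove every whitespace character."""
    # Keep only the text before the first "//" (the whole line if there is none).
    code = line.split("//", 1)[0]
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces back together removes all of it.
    return "".join(code.split())
```

Lines that contained only a comment or only whitespace come back as an empty string, which makes them easy to skip later.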
Get that far and then see if you can see something you can do to get you one step further. Don't try to tackle it all at once.
|
Administrator
|
Oh, and I completely forgot one other thing.
The authors give you a recommended API to guide you. We can debate whether or not it is the best API that could have been put together, but it is at least a workable API.
So sketch out a program that implements the assembler using that API. Assume that all of the functions described by the API exist and work correctly. What does your top-level code look like? The goal is to use just their functions and to minimize the code you write so that it just calls those functions and uses their return values in order to implement the functionality.
This will force you to think carefully about what each of those functions has to accomplish when you get around to writing it.
You are, of course, free to completely ignore the authors' API recommendation, but unless you have a pretty clear idea about how to structure and develop your own code, I would recommend against that.
Now go ahead and write your top-level code using the API functions and implement those functions as stubs -- functions that have barely enough code in them so that they can be called without hanging or causing the program to crash. In some cases, they can simply return immediately. In other cases, hard code them to do one thing and one thing only. For instance, the comp() function in the Code module returns a 7-character string. So have it return "1010111" and move on. For the hasMoreCommands() function in the Parser module, set up a counter variable that is advanced every time the function is called and have it return True the first 50 times and False on the next call.
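To make the idea concrete, here is a rough Python sketch of what a fully stubbed version might look like. The class names (StubParser, StubCode) and the assemble function are invented for this example, and every hard-coded value other than "1010111" and the count of 50 is an arbitrary placeholder. The only real Hack detail used is that a C-instruction is "111" followed by the comp, dest and jump bits.

```python
class StubParser:
    """Stub Parser: ignores its input and pretends there are 50 commands."""

    def __init__(self, path=None):
        self.calls = 0  # counts calls to hasMoreCommands()

    def hasMoreCommands(self):
        self.calls += 1
        return self.calls <= 50  # True the first 50 times, then False

    def advance(self):
        pass  # nothing to read yet

    def commandType(self):
        return "C_COMMAND"  # every command is a C-command for now

    def dest(self):
        return "D"  # placeholder mnemonic

    def comp(self):
        return "D+1"  # placeholder mnemonic

    def jump(self):
        return ""  # placeholder mnemonic


class StubCode:
    """Stub Code module: returns fixed bit strings whatever the mnemonic is."""

    def dest(self, mnemonic):
        return "000"

    def comp(self, mnemonic):
        return "1010111"

    def jump(self, mnemonic):
        return "000"


def assemble(parser, code):
    """Top-level loop written purely in terms of the API functions."""
    words = []
    while parser.hasMoreCommands():
        parser.advance()
        if parser.commandType() == "C_COMMAND":
            words.append("111" + code.comp(parser.comp())
                         + code.dest(parser.dest())
                         + code.jump(parser.jump()))
    return words
```

Calling assemble(StubParser(), StubCode()) gives a list of 50 identical 16-character strings, which is enough to check that the output side of the program works before any real parsing exists.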
By stubbing out the functions, you can quickly get a program that generates a Hack machine code file and verify that it is doing so correctly (with respect to the stubs). You don't even need to be able to open the input file yet. Your Hack file may start out as just 50 copies of the same C-type command that the stubbed function produced. Then you can upgrade the commandType() function so that each time it is called it randomly decides whether the command is a C-type or an A-type and, if it's an A-type, have the other functions stubbed out to treat the command as if it were @42. Once that's working, have it now randomly pick between all three command types and, when it's an L_COMMAND, have the stub claim that the string is "Fred".
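A stub along those lines might look something like the following Python sketch. The class name RandomStubParser and the optional rng argument are assumptions made for the example; remembering the last choice in self.kind is just one way to keep symbol() consistent with commandType().

```python
import random


class RandomStubParser:
    """Stub that picks a command type at random and fakes the matching symbol."""

    TYPES = ("A_COMMAND", "C_COMMAND", "L_COMMAND")

    def __init__(self, rng=None):
        # Accepting a random.Random instance makes runs repeatable if desired.
        self.rng = rng if rng is not None else random.Random()
        self.kind = None

    def commandType(self):
        # Decide at random on every call and remember the decision.
        self.kind = self.rng.choice(self.TYPES)
        return self.kind

    def symbol(self):
        # Pretend every A-command is @42 and every label is (Fred).
        return "42" if self.kind == "A_COMMAND" else "Fred"
```

The rest of the program can then be exercised on all three command types even though no real line has been parsed yet.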
Once you have that, you can have the program open the actual input file and read it line by line, but ignore what it is actually reading. The goal here is to just update the hasMoreCommands() function so that it reads each line and returns False once the end of the file is reached.
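As a hedged sketch of that step in Python: the class below takes an already-open text stream rather than a file name, which is a choice made for this example and not something the recommended API specifies. The class name is likewise made up.

```python
import io


class LineReadingParser:
    """Parser stub whose hasMoreCommands() really consumes the input."""

    def __init__(self, stream):
        self.stream = stream  # any open text stream, e.g. from open(path)
        self.current = None   # last line read; not interpreted yet

    def hasMoreCommands(self):
        # readline() returns "" only at end of file; a blank line is "\n".
        self.current = self.stream.readline()
        return self.current != ""


# Usage with an in-memory stand-in for a file:
parser = LineReadingParser(io.StringIO("@2\n\nD=A\n"))
count = 0
while parser.hasMoreCommands():
    count += 1
print(count)  # number of lines seen, blank lines included
```

With a real file you would pass in the object returned by open() instead of the StringIO. The number of output lines now follows the length of the input file, even though the content is still ignored.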
Do you see the pattern? Incrementally develop your program, making small changes as you go. Any part that you are not ready to implement, sidestep it by brute forcing the program to make assumptions. As you proceed, remove those assumptions little by little.