Project 6 - The Hello World Assembler

WBahn
Administrator
As you likely saw when you first learned programming, the first program is traditionally one that does nothing but print "Hello World" to the screen.

New programmers seldom grasp the significance of this program and often think that it is pointless. To the contrary, it is one of the most important programs you can write. This gets lost because most people start out with a well-defined programming environment and their Hello World program works without a hitch. So let's take a moment and see some of the things that go into making Hello World run.

First, of course, you need some way to edit and save your program. Perhaps you are doing this within some IDE (Integrated Development Environment), or perhaps you are writing your program in some generic text editor. In either case, you have to know how to create a new file, edit it, and save it in a specific location of your choosing. You also have to be able to use the right character set, such as ASCII, and be sure that you are not saving extraneous formatting information, which would likely be the case if you choose to use something like a word processor to write your programs. Just accomplishing this much can sometimes be a challenge, particularly if you are using an unfamiliar IDE or are new to the operating system you are using.

Next, you need to have the proper tools installed on your machine and they must be accessible. In the case of the Hack assembler, you need to run your program from a command prompt (also known as a terminal window or a shell). Accomplishing this can be a bit involved and challenging if you have never done it before. Not only do you need to know how to open a command prompt window, navigate to the right folder, and pass arguments to your program, but you also need to know how to run your program from that folder in such a way that it can access the files in that folder. How all of this is done varies greatly depending on the programming language and/or operating system you are using.

Then you need to be able to open and read the contents of one file (the file containing the assembly language program) while also opening another file to which you write the Hack machine code. In later projects, this will get more complicated because you will need to be able to identify all of the files of a particular type in a folder and process each of them in turn. Fortunately, the Assembler project starts off very tame in this regard in that you only need to read a single file and the name of that file must be supplied by the user via the command line.

There are numerous issues that can prevent you from successfully getting this far and the real purpose of a Hello World program is to let you focus on identifying and resolving these issues before you start trying to develop your actual program.

In software development (and most other engineering endeavors), incrementalism is the key to rapid success. Do not try to write the finished program all at once. Instead, take baby steps. In the case of your assembler, the first step truly is the traditional Hello World, but make sure that you can run your Hello World program from the folder that your project files are located in.

Once your Hello World program is working, get your program so that it can accept command line arguments. You don't need to do anything more than be able to print each one out to the screen -- doing so will establish that you can access them as needed.
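As a minimal sketch of this step, assuming you are writing your assembler in Python (the post does not prescribe a language), the following prints every command-line argument along with its position. The function name describe_args is made up for illustration.

```python
import sys


def describe_args(argv):
    """Return one display line per command-line argument."""
    # Square brackets make stray spaces or empty arguments easy to spot.
    return [f"argv[{i}] = [{arg}]" for i, arg in enumerate(argv)]


if __name__ == "__main__":
    # sys.argv[0] is the script itself; user-supplied arguments start at index 1.
    for line in describe_args(sys.argv):
        print(line)
```

Running it with a few different arguments, including a file name that contains a space, is a quick way to see how your shell splits the command line before your program ever sees it.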

The next version should do nothing but open up the .asm file that is passed as the command-line argument, read it line-by-line, and close it. One simple way to test that you are being successful is to print each line to the screen, enclosed in square brackets, to visually confirm that you are reading it one line at a time, and after you close the file, print out how many lines the file contained. This simple exercise might reveal some subtleties that you might not have thought about. For instance, each line in a text file ends with some kind of end-of-line delimiter and not all programming languages deal with them the same way. Most will strip them off the string that is read in, but some will not. It will usually be very evident what your program is doing based on whether the closing square bracket that you added is on the same line as the string that was read. Another subtlety that can cause problems is how the last line in the file is handled -- is it the last non-blank line, or is it a blank line that follows the last non-blank one? Spending the time now to get your program to read in the file contents the way you want it to can save an enormous amount of time and effort later.
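Here is one way this version might look in Python, offered as a sketch rather than a required design. The names read_lines and strip_eol and the script name read_asm.py are hypothetical. Python happens to be one of the languages that keeps the end-of-line delimiter on each string it reads, so the closing bracket will normally show up on the following line.

```python
import sys


def read_lines(path):
    """Return the file's lines exactly as read, delimiters included."""
    # newline="" turns off newline translation, so "\r\n" is not hidden.
    with open(path, "r", newline="") as asm_file:
        return list(asm_file)


def strip_eol(line):
    """Remove a trailing end-of-line delimiter, if any."""
    return line.rstrip("\r\n")


def main(argv):
    if len(argv) != 2:
        print("usage: python read_asm.py Prog.asm")
        return
    try:
        lines = read_lines(argv[1])
    except OSError as err:
        print(f"cannot read {argv[1]}: {err}")
        return
    for line in lines:
        # A bracket that lands on the next line means the delimiter was kept.
        print(f"[{line}]")
    print(f"{len(lines)} lines read")


if __name__ == "__main__":
    main(sys.argv)
```

Once you have seen the raw behavior, switching the print to use strip_eol(line) shows what the lines look like after cleanup. Try it on a file whose last line has no trailing newline and on one that does, and compare the reported counts.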

The next version is to add the ability to write the Hack output file. To accomplish this, you need to be able to do a bit of string manipulation on the command line argument since if the user supplies "example.asm" at the command line, you must open the file "example.hack". Don't worry about the actual output that you generate making any sense -- there is no need to examine the contents of each line of assembly in order to decide what to write to the hack file. You can simply write a string of sixteen zeros to the file for each line you read from the .asm file. Then open the .hack file in a text editor and confirm that it looks like what you expected. Common issues include all of the output being on a single line or blank lines appearing between each line of machine code. Also, confirm that the correct number of lines was written to the file -- common programming mistakes can result in there being one-too-many or one-too-few.
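The step just described could be sketched in Python along the following lines. The function names hack_path and write_placeholder and the script name stub_asm.py are made up for illustration, and the check that the input name ends in ".asm" is an added assumption so that the program never opens the same file for reading and writing.

```python
import os
import sys


def hack_path(asm_path):
    """Swap the extension for .hack, e.g. example.asm -> example.hack."""
    root, _ext = os.path.splitext(asm_path)
    return root + ".hack"


def write_placeholder(asm_path):
    """Write sixteen zeros per input line; return the number of lines written."""
    count = 0
    with open(asm_path, "r") as src, open(hack_path(asm_path), "w") as dst:
        for _line in src:
            # The explicit "\n" is what keeps each word on its own line.
            dst.write("0" * 16 + "\n")
            count += 1
    return count


def main(argv):
    if len(argv) != 2 or not argv[1].endswith(".asm"):
        print("usage: python stub_asm.py Prog.asm")
        return
    try:
        count = write_placeholder(argv[1])
    except OSError as err:
        print(f"file error: {err}")
        return
    print(f"{count} lines written to {hack_path(argv[1])}")


if __name__ == "__main__":
    main(sys.argv)
```

Leaving out the "\n" reproduces the everything-on-one-line symptom, and writing the line you read (delimiter and all) followed by another "\n" reproduces the blank-lines symptom, so both are worth trying once on purpose.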

Once you get to this point, you are actually in a very strong starting position to begin developing your actual assembler. But now is not the time to abandon an incremental approach. Generous screen output is your friend. Your first step might be to do nothing but print each line containing actual code or labels, followed by an indication of whether it is an A-command, a C-command, or an L-command. As you add additional processing steps, produce meaningful screen output. For instance, you might output the following:

MD = M - 1; JGT  [C] (MD)(M-1)(JGT) dest=011 comp=1110010 jump=001

Output such as this is invaluable when it comes time to find and fix bugs in your code.
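A first cut at this kind of diagnostic output might look like the Python sketch below. It only classifies each line and splits a C-command into its three fields; it does not yet look up the binary codes. All function names are hypothetical, and the sketch assumes that comments start with // and that whitespace inside a command can be discarded.

```python
def clean(line):
    """Drop a trailing // comment and all whitespace."""
    code = line.split("//", 1)[0]
    return "".join(code.split())


def classify(code):
    """Return 'A', 'L', or 'C' for a cleaned, non-empty line."""
    if code.startswith("@"):
        return "A"
    if code.startswith("(") and code.endswith(")"):
        return "L"
    return "C"


def split_c(code):
    """Split dest=comp;jump into three fields; absent fields become ''."""
    dest, sep, rest = code.partition("=")
    if not sep:  # no '=' means there is no dest field
        dest, rest = "", code
    comp, _, jump = rest.partition(";")
    return dest, comp, jump


def describe(line):
    """Return a diagnostic string, or None for blank or comment-only lines."""
    code = clean(line)
    if not code:
        return None
    kind = classify(code)
    if kind == "C":
        dest, comp, jump = split_c(code)
        return f"{code}  [C] ({dest})({comp})({jump})"
    return f"{code}  [{kind}]"


if __name__ == "__main__":
    for sample in ["// setup", "@17", "(LOOP)", "MD = M - 1; JGT", "0;JMP"]:
        text = describe(sample)
        if text is not None:
            print(text)
```

Printing the fields in parentheses makes an empty dest or jump field obvious at a glance, which helps when you later add the lookups that turn each field into bits.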