How the real assemblers work?

BLOB
Questions...

1) How does the assembler in an actual computer work? I am sure it does not use a high-level language to convert the .asm file to binary... How does it do it then? Using machine code itself?

2) If the assemblers in actual computers are written in machine code itself, can't I implement "project 6" using the Hack assembly language itself? Why use a high-level language?


Re: How the real assemblers work?

milythael
Higher level languages are more suited to developer productivity.  They provide higher level language constructs that manage smaller details for you.  They are generally more concise, more understandable, and easier to write and read.

Most assemblers are written in high level languages.  What makes you believe they are not?

Re: How the real assemblers work?

BLOB
I am talking about a real computer... If a high-level language converts .asm files to binary, who/what will convert the instruction code of the high-level language to binary? I think I have created a chicken-and-egg analogy here...

So, my question is this...

How was the first assembler written? Obviously, there was no high-level language at the time of the first assembler, and thus I assumed it was written in machine code...

Also please tell me if I can complete "Project 6" using the hack language...

Re: How the real assemblers work?

BLOB
Thus,
Am I correct to assume...

- First programs were written in machine code (no assembler or any high level language)
- Then, using machine code and .asm data from a text file (which the keyboard output itself converts to binary code), they created an assembler to convert the data from the text file to binary instructions

?

Re: How the real assemblers work?

milythael
This is akin to asking about the history of tools.  Today, if we want to build something, even at home in the garage, say a go-cart, we use modern tools to do so.  These tools make us more productive and allow us to build almost anything we can imagine.  Writing code in machine language or even assembly language most of the time is like saying, I want to build a go-cart with an internal combustion engine but I can only use stone tools to do it.

Yes, it is possible to complete any programming task in any language.  No, completing the assignment in a language other than directed is probably not the best way to learn about how real computers work.  ENIAC ( http://en.wikipedia.org/wiki/ENIAC ) was one of the first general-purpose, Turing-complete electronic computers.  Programming it originally meant setting switches and plugging cables by hand.  Learning to program ENIAC is unlikely to be valuable in learning how real computers work today.

This course, as written, teaches you real skills, real knowledge about how real, modern computers work.

Writing the exercise in assembler might be fun.  It is not progress toward understanding real computers better.

"if a high level language converts .asm files to binary, who/what will convert the instruction code of the high level language to binary"  In modern computers, translating a high level language into usable binary code is the work of a compiler, linker, assembler tool chain.  A compiler takes high level language and converts it into binary or assembler.

I hope my responses are helpful.  I am a fellow student of the material, but my day job is software developer.

Re: How the real assemblers work?

cadet1620
Administrator
In reply to this post by BLOB
BLOB wrote
2) If the assemblers in actual computers are written in machine code itself, can't I implement "project 6" using the Hack assembly language itself? Why use a high-level language?
The limiting factor in writing your assembler in Hack assembly language is that the Computer Simulator has no I/O facilities to access files on the host computer. For project 6, you need to process .ASM files that are stored on your computer and write .HACK files that will get loaded into the simulator.
BLOB wrote
1) How does the assembler in an actual computer work? I am sure it does not use a high-level language to convert the .asm file to binary... How does it do it then? Using machine code itself?
40 years ago many large computer programs were still written in assembly language because every byte counted. For example, the IBM 370 model 145 was IBM's first mainframe with semiconductor memory and it had a maximum configuration of 512 kilobytes of RAM.  Base configuration was only 112 KB. Memory cycle rate was about 3 MHz.

As memory got bigger and speeds increased, we could stop worrying so much about the size of programs and use higher level languages to make programming faster and less error prone.

The assemblers for modern computers are almost certainly written in a high-level language, very likely to be 'C'. See for example http://en.wikipedia.org/wiki/GNU_Assembler.

--Mark


Re: How the real assemblers work?

BLOB
Thanx milythael and Mark.. I understand it now... Now off I go to write the assembler in C++.. Hopefully I will learn how a compiler works in later chapters..

Re: How the real assemblers work?

cadet1620
Administrator
BLOB wrote
Thanx milythael and Mark.. I understand it now... Now off I go to write the assembler in C++.. Hopefully I will learn how a compiler works in later chapters..
This might be a good time to try a language like Python or Ruby.  They make working with symbol tables a whole lot easier than using the C++ STL map templates.
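
To make that concrete, here is a minimal sketch of a Hack symbol table built on Python's built-in dict. The class and method names (SymbolTable, add_label, resolve) are made up for illustration and are not from the book; the predefined symbols and the rule that variables are allocated from RAM address 16 upward follow the Hack specification.

```python
# Sketch of a Hack symbol table on top of a plain dict.
# SymbolTable, add_label and resolve are hypothetical names.

class SymbolTable:
    def __init__(self):
        # Predefined symbols of the Hack platform.
        self.table = {"SP": 0, "LCL": 1, "ARG": 2, "THIS": 3, "THAT": 4,
                      "SCREEN": 16384, "KBD": 24576}
        for i in range(16):
            self.table[f"R{i}"] = i
        self.next_variable = 16  # variables are allocated from RAM[16] up

    def add_label(self, name, rom_address):
        """Record a (LABEL) declaration found in the source."""
        self.table[name] = rom_address

    def resolve(self, name):
        """Return the address of name, allocating a new variable if unknown."""
        if name not in self.table:
            self.table[name] = self.next_variable
            self.next_variable += 1
        return self.table[name]


symbols = SymbolTable()
symbols.add_label("LOOP", 4)
print(symbols.resolve("R14"), symbols.resolve("LOOP"),
      symbols.resolve("i"), symbols.resolve("sum"), symbols.resolve("i"))
# prints: 14 4 16 17 16
```

The whole lookup-or-allocate step is a few lines because the dict does the hashing and storage for you.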

--Mark

Re: How the real assemblers work?

mysteray
In reply to this post by cadet1620
This kind of got me thinking, also. There appears to be a big gap between chapters 5 and 6. Here the bottom-up methodology is no longer adhered to (not counting chapter 4, which introduced machine language before its actual implementation in the architecture).

When the first computer was built, an assembler for it must have been coded in machine language, as there wasn't anything else to begin with. It was probably an evolutionary software process in which parts that have been machine coded before, have been used again, eventually leading to a fully fledged assembler, as we know it these days.

I understand that this book sketches how a modern computer could be developed today, using the tools we now have. Moreover, it would most certainly have been extremely tedious and probably utterly impractical to develop an assembler from the ground up.  I would have enjoyed it, though, if it had explained a little more of the evolutionary process that led us to where we are today.

I really love this course and I don't intend to give it a bad rap in any way.

-Michael-

Re: How the real assemblers work?

cadet1620
Administrator
mysteray wrote
When the first computer was built, an assembler for it must have been coded in machine language, as there wasn't anything else to begin with. It was probably an evolutionary software process in which parts that have been machine coded before, have been used again, eventually leading to a fully fledged assembler, as we know it these days.
Indeed, we are resting on the shoulders of giants!  For my Assembly Language Programming course in college we had to write an assembler using the system supplied assembler. We had to write all the string handling and symbol table storage and lookup code ourselves. Fortunately, we got to use the system library for I/O. The final was to use our assembler to assemble itself and produce a binary identical output to the system assembler.

"Fully fledged" assemblers can be quite complex, including macro processing (like C's #defines, but more capable).  For a TECS follow-on project I did I added macros to my Hack assembler. Here's an example.

Assembly language source code:
(printString.3)
    D=arg1                // String.charAt (s, ++i)
    PushD
    @printString$i
    DM=M+1
    PushD
    Call2   String.charAt

Sample macro definitions:
MACRO   D=arg1    
    A=&arg1
    D=M
ENDM

MACRO   A=&arg1    
    @ARG
    A=M
ENDM

MACRO   Call2   function
    @R14            // R14 = 2 args
    M=1
    M=M+1
    DoCall  {function}             
ENDM

MACRO   DoCall  function
    @{function}
    D=A
    @R15            // R15 = function
    M=D
    @?RIP           
    D=A             // D = RIP
    @Vm..fnCall
    0;JMP
(?RIP)
    PopD
ENDM

Generated code showing expanded macros:
  267        (printString.3)
                 D=arg1
             +1     A=&arg1
  267     2  +2     @ARG
  268  FC20  +2     A=M
  269  FC10  +1     D=M
                 PushD               // arg1 = s
  270     0  +1     @SP
  271  FDE8  +1     AM=M+1
  272  ECA0  +1     A=A-1
  273  E308  +1     M=D
  274    25      @printString$i      // arg2 = ++i
  275  FDD8      DM=M+1
                 PushD
  276     0  +1     @SP
  277  FDE8  +1     AM=M+1
  278  ECA0  +1     A=A-1
  279  E308  +1     M=D
                 Call2   String.charAt
  280    14  +1     @R14
  281  EFC8  +1     M=1
  282  FDC8  +1     M=M+1
             +1     DoCall  String.charAt
  283  ----  +2     @String.charAt
  284  EC10  +2     D=A
  285    15  +2     @R15
  286  E308  +2     M=D
  287   291  +2     @$21_1_RIP
  288  EC10  +2     D=A
  289  ----  +2     @Vm..fnCall
  290  EA87  +2     0;JMP
  291        +2 ($21_1_RIP)
             +2     PopD
  291     0  +3     @SP
  292  FCA8  +3     AM=M-1
  293  FC10  +3     D=M
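
For anyone curious how such expansion could work, here is a rough Python sketch of a textual macro expander. It is not the code of the assembler above, only an illustration under assumptions: a macro is looked up by the first token on a line, {name} is replaced by the matching argument, and a leading ? in a label is replaced by a prefix that is unique per expansion (the $N_ naming here is invented and differs from the $21_1_ prefix in the listing). The function names parse_macros and expand are hypothetical.

```python
import itertools

def parse_macros(lines):
    """Split source lines into macro definitions and ordinary lines."""
    macros, body, rest = {}, None, []
    for line in lines:
        tokens = line.split("//")[0].split()
        if tokens[:1] == ["MACRO"]:
            body = []
            macros[tokens[1]] = (tokens[2:], body)  # name -> (params, body)
        elif tokens[:1] == ["ENDM"]:
            body = None
        elif body is not None:
            body.append(line)
        else:
            rest.append(line)
    return macros, rest

def expand(line, macros, counter):
    """Return the plain lines produced by expanding one source line."""
    tokens = line.split("//")[0].split()
    if not tokens:
        return []
    if tokens[0] not in macros:
        return [" ".join(tokens)]       # not a macro: pass through
    params, body = macros[tokens[0]]
    uid = next(counter)                 # one unique id per expansion
    result = []
    for body_line in body:
        for name, value in zip(params, tokens[1:]):
            body_line = body_line.replace("{" + name + "}", value)
        body_line = body_line.replace("?", f"${uid}_")  # local labels
        result.extend(expand(body_line, macros, counter))  # nested macros
    return result

SOURCE = """\
MACRO   Call2   function
    @R14            // R14 = 2 args
    M=1
    M=M+1
    DoCall  {function}
ENDM
MACRO   DoCall  function
    @{function}
    D=A
    @R15            // R15 = function
    M=D
    @?RIP
    D=A             // D = RIP
    @Vm..fnCall
    0;JMP
(?RIP)
    PopD
ENDM
    Call2   String.charAt
"""

macros, program = parse_macros(SOURCE.splitlines())
counter = itertools.count(1)
for source_line in program:
    for out_line in expand(source_line, macros, counter):
        print(out_line)
```

PopD is not defined in this sample, so the sketch passes it through unchanged; in the listing above it is itself a macro and expands one level further.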

--Mark

Re: How the real assemblers work?

mysteray
The more low-level stuff you do, the more you realize and appreciate how much work has already been done to pave the road for us. Moreover, with every step down the road, everything gets so much easier with the new tools at hand. It has been an amazing journey.

-Michael-

Re: How the real assemblers work?

MustGoDeeper
In reply to this post by milythael
This is akin to asking about the history of tools.  Today, if we want to build something, even at home in the garage, say a go-cart, we use modern tools to do so.  These tools make us more productive and allow us to build almost anything we can imagine.  Writing code in machine language or even assembly language most of the time is like saying, I want to build a go-cart with an internal combustion engine but I can only use stone tools to do it.
Except modern tools don't need to be broken down to stones in order to work. High level programming languages still need to be broken down and assembled which is where the question derived. If high level languages are used to assemble programs at lower levels, what is assembling the higher level language? Itself? As Mitch Hedberg would say, "Who is the real hero?!"

Re: How the real assemblers work?

Shimon Schocken
Administrator
I enjoyed reading this thread -- all the good questions and answers.  The computer science field, like any other field of science, was built by numerous scientists and practitioners, each contributing a small piece. So, there are many heroes.

Every programming language, be it a low-level assembly language or a high-level modern language, is an abstraction. The abstraction was created by a language designer, a person who had some purpose in mind.

In order to implement the abstraction, i.e. turn the language from a formal specification into a practical tool, you must be able to translate programs written in this language into another language which we already know how to execute.  Depending on the abstraction level of the language that we wish to translate, this translation agent is called "compiler", "VM translator", "assembler", etc.  Whatever the name, it always translates one text file (e.g. containing C code) into another text file (e.g. containing assembly code). Because this translator is essentially a very fancy text-processing program, it can be written in any language of your choice.
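
As a small illustration of "text in, text out", here is a sketch in Python that turns Hack assembly text into the text of 16-bit binary words. It is deliberately incomplete: it assumes numeric A-instructions only (no symbols or labels), and the function names encode, to_hex and translate are invented for this example. The bit tables follow the Hack machine language specification.

```python
# Text-to-text translation of Hack instructions (symbols and labels omitted).
COMP = {"0": "101010", "1": "111111", "-1": "111010", "D": "001100",
        "A": "110000", "!D": "001101", "!A": "110001", "-D": "001111",
        "-A": "110011", "D+1": "011111", "A+1": "110111", "D-1": "001110",
        "A-1": "110010", "D+A": "000010", "D-A": "010011", "A-D": "000111",
        "D&A": "000000", "D|A": "010101"}
JUMP = {"": "000", "JGT": "001", "JEQ": "010", "JGE": "011",
        "JLT": "100", "JNE": "101", "JLE": "110", "JMP": "111"}

def encode(line):
    """Translate one instruction (as text) into 16 binary digits (as text)."""
    text = line.split("//")[0].strip()
    if text.startswith("@"):                      # A-instruction: @number
        return format(int(text[1:]), "016b")
    dest, _, rest = text.rpartition("=")          # dest is "" if there is no "="
    comp, _, jump = rest.partition(";")           # jump is "" if there is no ";"
    a_bit = "1" if "M" in comp else "0"           # M variants set the a-bit
    comp_bits = COMP[comp.replace("M", "A")]
    dest_bits = "".join("1" if reg in dest else "0" for reg in "ADM")
    return "111" + a_bit + comp_bits + dest_bits + JUMP[jump]

def to_hex(bits):
    return format(int(bits, 2), "04X")

def translate(source_text):
    """Translate a whole program: one output line per instruction."""
    lines = [l for l in source_text.splitlines() if l.split("//")[0].strip()]
    return "\n".join(encode(l) for l in lines)

for instruction in ["@2", "A=M", "D=M", "AM=M+1", "0;JMP"]:
    print(instruction, to_hex(encode(instruction)))
# prints:
# @2 0002
# A=M FC20
# D=M FC10
# AM=M+1 FDE8
# 0;JMP EA87
```

The hex values agree with the ones shown for the same instructions in the macro listing earlier in this thread.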

You are right to assume though that the very first assembler was written in machine-language. But once this first step was achieved, the resulting assembler could be used to translate symbolic programs, so from that point onward there was no more need to write in binary code. And the rest is history ...

"Intelligence is the faculty of making artificial objects, especially tools to make tools." (Henri Bergson, 1859-1941)

-- Shimon  

Re: How the real assemblers work?

wsaleem
I am happy to find this thread. I am going through the book in order to teach the course later and am up to Chapter 6 so far. One of the appeals of the course to me was to build a self-contained computer. As one of the earlier commenters pointed out, the assembler is a break from that ideology.

In reality, the same machine performs the task of collapsing the abstraction (down to machine code) and executing it. Chapter 6 necessitates an external machine in the Hack tool chain. Without knowledge of further chapters, I think it would make for a nice project to port the assembler to Hack.

Re: How the real assemblers work?

ivant
One point that wasn't explicitly mentioned in this thread is cross-assembling (and cross-compiling). Once you have a computer with at least an assembler, you can use it to generate binary code for another computer, so you don't need to write in machine code even for a brand new processor. I think it's very unlikely that only one assembler was ever written directly in machine code, but I think it's safe to assume that their number is quite small.

And there are many computers, for which it doesn't make sense to run their own dev tools. E.g. embedded systems, smart phones and game consoles to name a few. In fact HACK is quite similar to cartridge-based game consoles. The games (programs) are on the cartridge (ROM). It has RAM and I/O devices but no storage.

It's impractical to create a HACK assembler in HACK, because you don't have any long-term storage. You'll have to somehow enter the program in RAM, but HACK's only input method is the keyboard. So you'll have to ask the user to enter the assembly program by hand.

Then, when that's done the assembler can translate it to machine code, but it can't store the result anywhere except in the RAM. And then your only output option is to print it to the screen, which means that the user would have to enter it in a file again by hand.

A third problem is that the HACK assembler is a two-pass assembler. The first pass, as described in the book, just constructs the symbol table, and the second pass generates the code. But this would require the poor user to enter the program twice. So instead, the first pass would have to store the program in some form in RAM. Clearly storing it as strings isn't an option, because they use too much space. A better option would be to compile the program on the first pass and store the places it needs to revisit for the second pass.
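
The "compile first, revisit later" idea can be sketched as follows, in Python on the host rather than on HACK itself. This is only an illustration under assumptions: the function name assemble_one_pass and the fixup list are invented here, and C-instructions are encoded by a caller-supplied function so the sketch can focus on addresses. The two C-instruction words in the usage example (EC10 for D=A, EA87 for 0;JMP) are taken from the macro listing earlier in this thread.

```python
def assemble_one_pass(lines, encode_c):
    """Emit code in one pass, then patch the addresses that were unknown."""
    symbols = {f"R{i}": i for i in range(16)}     # predefined symbols
    symbols.update(SP=0, LCL=1, ARG=2, THIS=3, THAT=4, SCREEN=16384, KBD=24576)
    words = []    # generated machine words, one per instruction
    fixups = []   # (index into words, symbol) pairs to revisit later
    for raw in lines:
        text = raw.split("//")[0].strip()
        if not text:
            continue
        if text.startswith("(") and text.endswith(")"):
            symbols[text[1:-1]] = len(words)      # label = next ROM address
        elif text.startswith("@"):
            operand = text[1:]
            if operand.isdigit():
                words.append(int(operand))
            elif operand in symbols:
                words.append(symbols[operand])
            else:
                fixups.append((len(words), operand))
                words.append(0)                   # placeholder, patched below
        else:
            words.append(encode_c(text))
    # The "second pass" only touches the recorded places, not the source text.
    next_variable = 16
    for index, name in fixups:
        if name not in symbols:                   # never declared as a label,
            symbols[name] = next_variable         # so it is a variable
            next_variable += 1
        words[index] = symbols[name]
    return words

C_WORDS = {"D=A": 0xEC10, "0;JMP": 0xEA87}        # just enough for the demo
program = ["@END", "0;JMP", "@i", "D=A", "(END)", "@END", "0;JMP"]
words = assemble_one_pass(program, C_WORDS.__getitem__)
print(" ".join(format(w, "04X") for w in words))
# prints: 0004 EA87 0010 EC10 0004 EA87
```

Only the word list and the short fixup list need to stay in memory, which is the space saving described above.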

The HACK architecture has 32K words of ROM and 24K words of RAM. Of the latter, 8K words are used for memory-mapped I/O. We can use most of it (except the keyboard, but it's just 1 word) to store the symbol table, because the table is only used between passes. But we shouldn't store any of the final program there, because we'll need the video to show the result. And the assembler would need some RAM to work with, so it will only be able to produce programs which are less than 16K words.

Then comes the data structures and I/O operations. The assembler needs to use at least strings, I/O and a symbol table. The first two are implemented in JACK in chapter 12 and previous chapters have access to them in the form of the provided JackOS .vm files. One option would be to further translate them to HACK machine code and to provide them as ready-made code. But HACK doesn't have a standard way to call subroutines, it only has jumps. We'll have to explain the stack and the calling convention used by the VM. And the symbol table is quite complex by itself.

Implementing all this in machine code, or even in assembler, for a machine which lacks push and pop, call and return, and has only two real registers and very limited memory addressing, would be quite challenging.

Re: How the real assemblers work?

Jane Wu
In reply to this post by Shimon Schocken
Hi, Shimon
I am so fortunate to take this course, which gives me a new, higher-level perspective on computer science. I have finished the first part, and I learned that the second one is coming this year. Does it have a definite date? The course helps me a lot, and I want to continue studying it.
Thanks, Shimon, for giving us such an amazing course.
Jane

Re: How the real assemblers work?

ybakos
@Jane Wu: Stay tuned, part 2 will eventually be available.

Re: How the real assemblers work?

63rrit
In reply to this post by Shimon Schocken
Shimon Schocken wrote
...this translator is essentially a very fancy text-processing program, it can be written in any language of your choice.

You are right to assume though that the very first assembler was written in machine-language. But once this first step was achieved, the resulting assembler could be used to translate symbolic programs, so from that point onward there was no more need to write in binary code. And the rest is history ...

"Intelligence is the faculty of making artificial objects, especially tools to make tools." (Henri Bergson, 1859-1941)
Those explanations (a few sentences are enough) in this thread should be incorporated in the next edition of the book!

I am happy I found this thread :)