Direct/Indirect Addressing Question


dgnunch
So I got sidetracked on a later project, then got busy and stopped, and decided to do everything from scratch again. Although, for some reason (again), I got completely confused about something in Ch4. I think I have some sort of deep reading disability, so prepare yourselves: this is going to be dumb. My problem is with the difference between direct and indirect addressing. I genuinely cannot understand the difference.

So, LOAD R1,67 is cited as an example of direct addressing. This makes sense to me. When I pull the binary instruction that represents "LOAD R1,67" from memory, the "67" is literally the binary address that holds the value I want to store in R1.

LOAD R1,bar is cited as a completely identical operation (also performing R1 <-- M[67]); however, I don't get this. It says right before that we should "assume bar refers to memory address 67". When I pull the binary instruction "LOAD R1,bar" from memory, is that trying to say that the "bar" in that instruction is a different memory address that we previously defined to hold the same value as M[67]? So bar itself is not literally referring to M[67] but it's effectively the same? Or are there somehow two different binary codes that you can enter as a [0...13] RAM address that lead you to the exact same location? Because "assuming bar refers to memory address 67", to me, sounds like when I execute "LOAD R1,bar", in order to know that bar "refers to memory address 67" I have to fetch "bar" from memory, and it turns out that it then points to M[67]? If the answer is what I think it is then I have like 2 more paragraphs of text (that felt way too neurotic to include in this single post).

Thank you for your time

Re: Direct/Indirect Addressing Question

ivant
The computer only "understands" so-called machine language. To humans, it looks like just numbers, such as 4A 67 ..., and it is not easy to read or write.

To make it a bit more readable, we use assembly language. It is very close to machine language in the sense that we have exactly the same commands and capabilities, but we write them with mnemonics, like LOAD R1, 67. Assemblers also often provide additional conveniences, like the ability to label specific addresses, either for jumping to them or for accessing the data in them. If "bar" is just a label for address 67, we can write the previous instruction as LOAD R1, bar. Both forms will be translated to exactly the same machine code. It's just that the second one is more meaningful for humans.

How does the assembler know that bar is 67? It depends on the assembler, but commonly, you have some directives like

.data
bar 67

Note that these are instructions for the assembler. No machine code is generated for them.
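
To make the "both forms translate to the same thing" point concrete, here is a minimal Python sketch of the substitution step. It is not a real assembler: the function name resolve, the symbol table, and the instruction format are all assumptions made up for illustration.

```python
# Hypothetical toy step of an assembler: replace a label operand with its
# address. The symbol table and instruction format are illustrative only.

def resolve(instruction, symbols):
    """Return the instruction with any symbolic operand replaced by its number."""
    opcode, operands = instruction.split(maxsplit=1)
    parts = [p.strip() for p in operands.split(",")]
    # Operands found in the symbol table are replaced; others are kept as-is.
    resolved = [str(symbols[p]) if p in symbols else p for p in parts]
    return f"{opcode} {','.join(resolved)}"

symbols = {"bar": 67}   # what the assembler learned from the directive

with_label = resolve("LOAD R1,bar", symbols)
with_number = resolve("LOAD R1,67", symbols)
print(with_label)    # LOAD R1,67
print(with_number)   # LOAD R1,67
```

After this step the name "bar" no longer appears anywhere in the output, which is why the machine code for the two forms would be identical.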

Re: Direct/Indirect Addressing Question

dgnunch
ivant wrote
How does the assembler know that bar is 67? It depends on the assembler, but commonly, you have some directives like

.data
bar 67

Note that these are instructions for the assembler. No machine code is generated for them.
So, in memory, that directive variable has the address of M[67] in it? And before the assembler turns it into machine code, it uses that pointer (the spot in memory with the address) to turn that into the value of the spot it's pointing to? So all variables are pointers at some level? Or am I being stupid and semantic?

Re: Direct/Indirect Addressing Question

ivant
No, the address of the variable is 67. M[67] just means the memory location 67. Every variable has an address in memory where its value is stored. If the value itself is an address, then the variable is a pointer. Otherwise it is not a pointer.

So bar's address is 67. Let's say the value at location 67 is 58 (i.e., M[67] = 58). Now, if your code uses 58 as the address of another variable, then bar is used as a pointer and you have indirect memory access. If you just use 58 as a number, as an ASCII code, or as any other piece of data, then it's just data.
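
As a rough illustration of that distinction, here is a toy Python model of memory (not any real machine). The value placed at address 58 is made up for the example; only M[67] = 58 comes from the explanation above.

```python
# Toy memory model; the contents of M[58] are hypothetical.
M = [0] * 128
M[67] = 58      # the value stored in bar (bar's address is 67)
M[58] = 1234    # made-up value stored at address 58

bar = 67        # known only to the assembler; the CPU just sees 67

direct = M[bar]         # M[67]: the 58 is used as plain data
indirect = M[M[bar]]    # M[M[67]]: the 58 is used as an address
print(direct, indirect)  # 58 1234
```

The same memory cell is read in both cases; what makes bar a pointer is only how the program uses the value it finds there.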

Re: Direct/Indirect Addressing Question

dgnunch
So I'm just not seeing, like, how does the assembler know bar is connected with M[67] if there isn't a place in the computer where the characters "bar" are literally and directly connected to the address M[67]? And even if bar is literally and directly connected to that address, then the characters "b-a-r" stored in the computer still aren't a pointer in some sense (e.g. pointing to the place that bar needs to be replaced with)?

Thanks again, I'm sure I'll get where I'm going wrong if I brute force my way past Ch6.


Also a minor thing:
ADD R1,foo,j (R1 <-- foo+j)
LOAD* R2,R1 (R2 <-- M[R1])
STR R2,x (x <-- R2)

So what is the pointer here? What is being indirectly addressed? Is this a pointer because foo+j (the address holding the value I want) have to first be added.. so technically the first instruction doesn't have the address I want.. but indirectly it does after I add them?

Re: Direct/Indirect Addressing Question

ivant
dgnunch wrote
So I'm just not seeing, like, how does the assembler know bar is connected with M[67] if there isn't a place in the computer where the characters "bar" are literally and directly connected to the address M[67]? And even if bar is literally and directly connected to that address, then the characters "b-a-r" stored in the computer still aren't a pointer in some sense (e.g. pointing to the place that bar needs to be replaced with)?
The assembler is a program. When you run it, it reads a text file and generates a binary file. It knows about bar because bar is defined somewhere in the input program it read. While it runs, it remembers this connection, that bar is an alias for 67. But this is not a pointer; it is called a mapping.

The output of the assembler is a binary program for a specific CPU. It doesn't know anything about bar. It just uses address 67 to store some data there.

When we transform the source code of the program from assembly to machine code, we say that this is "assembly time" (or "compile time") for our program. When we run the produced binary, it's called "run time". The mapping of bar to 67 is known at assembly time, but not at run time.

dgnunch wrote
Also a minor thing:
ADD R1,foo,j (R1 <-- foo+j)
LOAD* R2,R1 (R2 <-- M[R1])
STR R2,x (x <-- R2)

So what is the pointer here? What is being indirectly addressed? Is this a pointer because foo+j (the address holding the value I want) have to first be added.. so technically the first instruction doesn't have the address I want.. but indirectly it does after I add them?
"foo" is the pointer. It contains the first address of a block of memory where the values of the array are stored. We are interested in the value of the j-th element, foo[j], so we first need to calculate its address. We do that by adding j to the address foo (foo is the base address and j is the offset) and storing the result in R1. So after the first instruction, R1 contains the address of foo[j]. The second instruction reads the value stored at this address and saves it in R2.
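
The three instructions from the array example (ADD, LOAD*, STR) can be walked through with a small Python sketch. This is only a simulation under assumed values: the base address 100, the array contents, the offset 3, and the address 200 for x are all made up.

```python
# Toy simulation of the three instructions; addresses and contents are made up.
M = [0] * 256
foo = 100                               # assumed base address of the array
M[foo:foo + 5] = [10, 20, 30, 40, 50]   # hypothetical array contents
j = 3                                   # offset of the element we want
x_addr = 200                            # assumed address of the variable x

R1 = foo + j        # ADD R1,foo,j   (R1 <-- foo+j): the address of foo[j]
R2 = M[R1]          # LOAD* R2,R1    (R2 <-- M[R1]): indirect read through R1
M[x_addr] = R2      # STR R2,x       (x <-- R2)
print(R1, R2, M[x_addr])  # 103 40 40
```

The indirect step is the second one: the address being read is not written in the instruction itself but is taken from whatever R1 holds at that moment.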

Re: Direct/Indirect Addressing Question

WBahn
Administrator
In reply to this post by dgnunch
The word "bar" is a symbol that is only used during the assembly process. It does not survive and become part of the program. It is there purely for human readability. The assembler is responsible for maintaining a symbol table that associates each string with its corresponding value. That table is used to replace the strings in the code with their value before converting the instructions to machine code.

Using the Hack assembly language, let's say I have the following junk program.

@R9
D = A+1
@fred
M = D
@42
D = D+1
@fred
D = M+D
@END
D-1; JLT
@sue
M = D - A
(END)
0; JMP

First, the assembler has to recognize certain predefined symbols, including "R9", so it starts off with a symbol table that has all of these preloaded into it (the only one I'll show is "R9").

Symbol   Value
"R9"     9

Next it makes a pass through the code and determines which address in ROM each instruction will be stored at and associates each label (a string surrounded by parens) with the address of the instruction following the label.

00: @R9
01: D = A+1
02: @fred
03: M = D
04: @42
05: D = D+1
06: @fred
07: D = M+D
08: @END
09: D-1; JLT
10: @sue
11: M = D - A
      (END)
12: 0; JMP

So now our symbol table looks like this

Symbol   Value
"R9"     9
"END"    12

Finally it goes through and assigns any further strings used in A-type instructions sequential values starting at 16. It is assumed that these are variables and that we want to assign them to addresses starting at RAM address 16.

So our final symbol table looks like

Symbol   Value
"R9"     9
"END"    12
"fred"   16
"sue"    17

Now the assembler can go through the code and replace all occurrences of all strings with the corresponding values.

00: @9
01: D = A+1
02: @16
03: M = D
04: @42
05: D = D+1
06: @16
07: D = M+D
08: @12
09: D-1; JLT
10: @17
11: M = D - A
12: 0; JMP

At this point the symbol table can be discarded.

Notice that while we assigned values to the strings based on the expected use of them, once they are in the symbol table they are all equivalent in that they are just a number that is associated with a string of text. Thus we can use any of them however we like, whether it makes sense to do so or not. The instruction on line 11 is an example of that (can you see why?).
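
The symbol handling walked through above can be sketched in a few lines of Python. This is a simplified illustration, not a full Hack assembler: it only resolves symbols and does not translate anything to binary, it preloads only "R9", and it folds the variable assignment and the final replacement into a single second pass. The program text is taken from the numbered listing; the function name resolve_symbols is an assumption.

```python
# Sketch of the symbol resolution only; no binary translation is done.
PREDEFINED = {"R9": 9}   # only the predefined symbol shown in the post

SOURCE = """
@R9
D = A+1
@fred
M = D
@42
D = D+1
@fred
D = M+D
@END
D-1; JLT
@sue
M = D - A
(END)
0; JMP
"""

def resolve_symbols(source, predefined):
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
    symbols = dict(predefined)

    # Pass 1: a label gets the ROM address of the instruction that follows it.
    instructions = []
    for ln in lines:
        if ln.startswith("(") and ln.endswith(")"):
            symbols[ln[1:-1]] = len(instructions)
        else:
            instructions.append(ln)

    # Pass 2: unknown symbols in A-instructions become variables from 16 up,
    # and every symbol is replaced by its number.
    next_var = 16
    resolved = []
    for ins in instructions:
        if ins.startswith("@") and not ins[1:].isdigit():
            name = ins[1:]
            if name not in symbols:
                symbols[name] = next_var
                next_var += 1
            ins = f"@{symbols[name]}"
        resolved.append(ins)
    return symbols, resolved

symbols, resolved = resolve_symbols(SOURCE, PREDEFINED)
print(symbols)   # {'R9': 9, 'END': 12, 'fred': 16, 'sue': 17}
for address, ins in enumerate(resolved):
    print(f"{address:02d}: {ins}")
```

The dictionary exists only while the function runs, which mirrors the point that the symbol table can be discarded once the substitution is done.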

Re: Direct/Indirect Addressing Question

dgnunch
WBahn wrote
11: M = D - A
12: 0; JMP

At this point the symbol table can be discarded.

Notice that while we assigned values to the strings based on the expected use of them, once they are in the symbol table they are all equivalent in that they are just a number that is associated with a string of text. Thus we can use any of them however we like, whether it makes sense to do so or not. The instruction on line 11 is an example of that (can you see why?).
Thanks for the responses (again)! For posterity, I assume that makes no sense because M refers to an address stored in A, but somehow we are trying to do an operation using A at the same time, so after the operation it will try to check where to store M but we will just potentially get nonsense.

Also, I'm still confused about foo. Is foo just created and stored randomly somewhere at compile time when I create an array, so that when I do the foo+j operation it fetches the address stored in it from memory and I can then fetch a value in the array? And is it then discarded? Lastly, I assume it's something I'll figure out when I get to it, but there is this sentence: "(Symbol): This pseudo-command causes the assembler to assign the label Symbol to the memory location in which the next command of the program will be stored. It is called "pseudo-command" since it generates no machine code." At the moment I don't know how dense I am, but I don't get that at all.




Re: Direct/Indirect Addressing Question

dgnunch
So my machine code for mult "works" (it's way longer than it needs to be I'm sure), like by the end of it r2 has the result of r0 and r1 multiplied, but I'm getting a bunch of comparison failures when I do the test. Did I do something wrong, or can I just ignore them?

Basically, what this does is decide which number is larger. If, for example, R0 is larger than R1 (let's say R0 is 8 and R1 is 3), it will do 8 + 8 + 8 rather than 3 + 3 + 3 + 3 + 3 + 3 + 3 + 3, so that it always does the fewest additions. I don't even know if that's an optimization, but that was the only way I could think to do it.

        @2
        M=0
        @largerNum
        M=0
        @smallerNum
        M=0
        @count
        M=0            

        @1
        D=M
        @smallerNum
        M=D
        @0
        D=M
        @largerNum
        M=D

        @0
        D=M
        @1
        D=D-M          // D = R0 - R1 (D-A would subtract the address 1 instead)
        @LOOP
        D;JGT

        @0
        D=M
        @smallerNum
        M=D
        @1
        D=M
        @largerNum
        M=D
(LOOP)
        @smallerNum
        D=M
        @count
        D=D-M
        @END
        D;JLE
        @largerNum
        D=M
        @2
        M=M+D
        @count
        M=M+1
        @LOOP
        0;JMP
(END)
        @END
        0;JMP
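
For comparison, the approach described above (add the larger operand once for each unit of the smaller one) can be written as a short Python sketch. It is only a reference model of the intended arithmetic for non-negative inputs, not a translation of the Hack program, and the function name multiply is made up.

```python
def multiply(r0, r1):
    """Multiply two non-negative integers by repeated addition."""
    larger, smaller = max(r0, r1), min(r0, r1)
    result = 0
    count = 0
    while count < smaller:   # the loop body runs `smaller` times
        result += larger
        count += 1
    return result

print(multiply(8, 3))  # 24
```

A model like this can be handy for working out by hand what each test case should produce before stepping through the assembly.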

Re: Direct/Indirect Addressing Question

WBahn
Administrator
Look at the lines that have comparison failures and determine what you think the correct values should be and see which (if either) file you think is correct. If you think that the comparison file is wrong, that is a different thread and definitely worthy of discussion here.

If you conclude that your file is wrong, then pick the test that you think represents the quickest run of the code and walk through it, either on paper or with the simulator, and determine where it goes off the rails and why.