CompilationEngine - Input stream?

peterxu422
I am confused as to why the constructor of the CompilationEngine module takes an input stream/file argument. I know its input should be the tokens from JackTokenizer, but I do not see why or how I would use an input file for that. It can't be the .jack file itself, right? The JackTokenizer is the one that takes that file and starts parsing it.

Originally, I was planning to send the currToken and tokenType from my JackTokenizer class to the Compilation Engine. However, I did not see any other arguments listed in the API outlined for the CompilationEngine, so I thought maybe this might not be the way to do it.

Alternatively, if I create the JackTokenizer object in the CompilationEngine Module, then it seems that the JackAnalyzer module isn't quite necessary since it's supposed to coordinate the interaction between the tokenizer and compilation engine.

So I'm a little confused about the program flow in this case between the three modules: JackAnalyzer, JackTokenizer, and Compilation Engine. Can someone offer clarity?

Re: CompilationEngine - Input stream?

cadet1620
Administrator
JackAnalyzer is the top level of the compiler.  In my compiler it is, in fact, a function named main().

main() parses the command line arguments and builds a list of input .jack files. For each file in the list, it creates a CompileEngine object with the input file name and corresponding output file name. It then calls CompileEngine.CompileClass().

Each CompileEngine creates a JackTokenizer to read the input file and an XmlWriter (ch 10 version) or a VmWriter (ch 11 version).
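
A minimal sketch of the flow described above, not Mark's actual code. The class bodies are stubs, and the names `analyze`, `jack_files`, `compile_class`, and the `.xml` output naming are assumptions made for illustration. The top level only builds the file list and hands each input/output pair to an engine, which creates its own tokenizer:

```python
from pathlib import Path


class JackTokenizer:
    """Placeholder tokenizer: a real one would split Jack source into tokens."""

    def __init__(self, source_text):
        self.source_text = source_text


class CompilationEngine:
    """Owns its tokenizer, following the style described above."""

    def __init__(self, input_path, output_path):
        # The engine, not the top level, creates the tokenizer.
        self.tokenizer = JackTokenizer(Path(input_path).read_text())
        self.output_path = Path(output_path)

    def compile_class(self):
        # A real engine would walk the tokens; this stub only writes the root tag.
        self.output_path.write_text("<class>\n</class>\n")


def jack_files(target):
    """Return the .jack files named by a file path or contained in a directory."""
    target = Path(target)
    if target.is_dir():
        return sorted(target.glob("*.jack"))
    return [target]


def analyze(target):
    """Top level (the JackAnalyzer role): one engine per input file."""
    outputs = []
    for jack_file in jack_files(target):
        out = jack_file.with_suffix(".xml")
        CompilationEngine(jack_file, out).compile_class()
        outputs.append(out)
    return outputs
```

In the arrangement attributed to the book later in this thread, `analyze` would instead construct the tokenizer and pass it to the engine. Either way the engine ends up pulling tokens from a tokenizer object rather than reading the .jack text itself.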

--Mark

Re: CompilationEngine - Input stream?

peterxu422
In the textbook on page 213 under section 10.3.1 where it talks about the JackAnalyzer Module, it says the JackAnalyzer creates a JackTokenizer object. Is there any difference or preference to create the tokenizer object in CompileEngine as opposed to JackAnalyzer?

Re: CompilationEngine - Input stream?

cadet1620
Administrator
peterxu422 wrote
In the textbook on page 213 under section 10.3.1 where it talks about the JackAnalyzer Module, it says the JackAnalyzer creates a JackTokenizer object. Is there any difference or preference to create the tokenizer object in CompileEngine as opposed to JackAnalyzer?
Just my own programming style.

--Mark

Re: CompilationEngine - Input stream?

Diogo
In reply to this post by cadet1620
cadet1620 wrote
Each CompileEngine creates a JackTokenizer to read the input file and an XmlWriter (ch 10 version) or a VmWriter (ch 11 version).

--Mark
I'm afraid I'm about to ask something really silly here, but since I am quite confused by this project I will go ahead and ask anyway. (Sorry in advance.)

Is the JackTokenizer responsible for wrapping its current token in XML tags? The answer is no, right?

From what you described, it seems that your XmlWriter gets the current token from the JackTokenizer and examines it using the tokenizer methods (such as tokenType(), keyWord(), symbol(), ...) to find out whether the token is a keyword, a symbol, and so on, so that it can wrap it in the correct XML tag. Am I right?

If that is so, what got me confused was the description of stage 1: "the tokenizer should produce a list of tokens, each printed in a separate line along with its classification: symbol, keyword, identifier, integer constant, or string constant. The classification should be recorded using XML tags". This part gave me the idea that the JackTokenizer should provide its current token already wrapped in the XML tags...

Re: CompilationEngine - Input stream?

ybakos
Well, yes and no.

The compiler is broken into multiple components: tokenization, then parsing.

How might we test the tokenization to make sure it is correct before attempting to parse (interpret) the tokens? We let the tokenizer generate an XML-formatted list of tokens, which we can then test for correctness.

But, later, when working on parsing / code generation, you should not necessarily be using the XML the tokenizer generated as the input to the parser. You could, but this is unnecessary (and a poor approach). Once your tokenizer is tested, then you add on functionality to your code, such that, instead of the tokenization process generating xml output, the tokenizer "feeds" the tokens to the parser.

So yes, while working on the tokenizer, your code should be outputting the XML as dictated in the project.
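
A minimal sketch of that test-only XML listing, assuming the tokens are already available as (type, value) pairs. The function names, the exact tag names, and the `<tokens>` wrapper are assumptions for illustration; check them against the project's supplied comparison files. The point is that the wrapping lives in a small output helper, so the same tokenizer can later feed the parser directly:

```python
def escape_xml(text):
    """Escape the characters that would otherwise break the XML markup."""
    # Replace & first so the entities produced below are not escaped again.
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;"))


def token_to_xml(token_type, value):
    """Wrap one already-classified token in a tag named after its type."""
    return f"<{token_type}> {escape_xml(value)} </{token_type}>"


def tokens_to_xml(tokens):
    """Render (type, value) pairs as a one-token-per-line listing."""
    lines = ["<tokens>"]
    lines += [token_to_xml(token_type, value) for token_type, value in tokens]
    lines.append("</tokens>")
    return "\n".join(lines)
```

For example, `tokens_to_xml([("keyword", "class"), ("identifier", "Main"), ("symbol", "{")])` produces one tagged line per token between the opening and closing `tokens` lines.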

Re: CompilationEngine - Input stream?

Diogo
ybakos wrote
Well, yes and no.

The compiler is broken into multiple components: tokenization, then parsing.

How might we test the tokenization to make sure it is correct before attempting to parse (interpret) the tokens? We let the tokenizer generate an XML-formatted list of tokens, which we can then test for correctness.

But, later, when working on parsing / code generation, you should not necessarily be using the XML the tokenizer generated as the input to the parser. You could, but this is unnecessary (and a poor approach). Once your tokenizer is tested, then you add on functionality to your code, such that, instead of the tokenization process generating xml output, the tokenizer "feeds" the tokens to the parser.

So yes, while working on the tokenizer, your code should be outputting the XML as dictated in the project.
I just reread the chapter after your reply and ok, things started to make sense. I actually have practically the whole project up and running, but I still have some conceptual doubts. Thanks!

While we are at it, here's another quick one: the JackTokenizer API states that the keyWord() method returns "the keyword which is the current token". This reads to me as if the method is supposed to return the string that is the current token (once tokenType() has shown that it is indeed a keyword). Yet the API lists its return values as all the keywords in upper case, as if they were an enumeration. Could you or someone else please clarify this small issue?

Thanks in advance!

Re: CompilationEngine - Input stream?

ybakos
I think the important thing to notice about the description of keyWord (and tokenType, symbol, identifier, intVal and stringVal) is the last sentence in the description. "Should be called only when tokenType() is KEYWORD."

keyWord returns a single constant identifying which keyword the current token is (class, method, function, etc.). The token should already have been identified as a keyword by tokenType().
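
A minimal sketch of that contract, assuming Python enums stand in for the upper-case constants. The `CurrentToken` class, the snake_case method names, and the reduced keyword list are hypothetical and only illustrate the idea. Integer and string constants are left out, so anything that is not a keyword or symbol is treated as an identifier here:

```python
from enum import Enum


class TokenType(Enum):
    KEYWORD = "keyword"
    SYMBOL = "symbol"
    IDENTIFIER = "identifier"


class Keyword(Enum):
    # Only a few Jack keywords are listed to keep the sketch short.
    CLASS = "class"
    METHOD = "method"
    FUNCTION = "function"


KEYWORD_TEXTS = {k.value for k in Keyword}
SYMBOLS = set("{}()[].,;+-*/&|<>=~")


class CurrentToken:
    """Hypothetical stand-in for the tokenizer's current-token state."""

    def __init__(self, text):
        self.text = text

    def token_type(self):
        if self.text in KEYWORD_TEXTS:
            return TokenType.KEYWORD
        if self.text in SYMBOLS:
            return TokenType.SYMBOL
        return TokenType.IDENTIFIER

    def key_word(self):
        # Mirrors "Should be called only when tokenType() is KEYWORD."
        if self.token_type() is not TokenType.KEYWORD:
            raise ValueError("key_word() called on a non-keyword token")
        return Keyword(self.text)
```

A caller can then branch on `Keyword.CLASS` and similar constants instead of comparing strings, while `Keyword.CLASS.value` still gives the original text when it has to be written out.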