Add missing compare, logical, and augmentation ops to Jack


cadet1620
Administrator
 
if ((runner ~= null) && runner.isActive()) {
    let score += (runner.speed >= 10) ? fast : slow;
    do runner.update();
}

This is a set of three modifications that all depend on multi-character symbols. The modifications are presented in order from easiest to hardest.

The first is to add the missing comparison operators. Jack currently requires ~(x < y) to express greater than or equal. It would be much clearer to be able to write this as x >= y. The major work for this mod is dealing with multi-character symbols. The parsing and code generation changes are almost trivial.

The second modification is to add augmented assignments to the "let" command. These are assignments that modify the value of a variable, rather than simply replacing it.

let hist[bin] += count;
There is a small parsing change, and generating VM code for the subscripted variable case shown above is a little tricky.

The third mod is to add the short-circuit logical operators, '&&' and '||'. The ternary conditional operator, '? :', can be included since it requires almost identical VM code structure to the logical ops. Parsing changes in compileExpression are minimal; the majority of the work is adding compileLogicalOperator and designing the VM code that it writes.

JackTokenizer changes

These three additions all depend on changing the JackTokenizer to be able to handle multi-character symbols. The suggested design has JackTokenizer.symbol() returning a Char. There are two obvious ways to change symbol() to return multi-character symbols; both require changes everywhere that JackCompiler deals with symbols.

The classic way to deal with multi-character symbols is to have symbol() return an enum (or int) symbol code instead of text. This way the compiler code that used to do something like

    if (token.symbol() == '+')
can simply become
    if (token.symbol() == SYM_PLUS)
(There is also a not-too-horrible kludge that makes this easier: continue to return Char and define the characters returned for multi-character symbols to be characters not in the single-character symbol set, e.g. #define SYM_GE 'g' // >= .)

Another way is to return the symbols in Strings. This works well in languages like Python where characters and strings can be freely mixed, but can be problematic in languages like Java. You can't just change symbol() == '+' to symbol() == "+" because you will be comparing the Strings' references instead of their contents; you need symbol().equals("+") instead.

My tokenizer has a string of characters that are to be identified as tokenType SYMBOL. In addition to the standard Jack symbols, '?' and ':' must be added for the ternary conditional operator. Additionally, there is a list of 2-character strings that are also symbols.

symbols = "{}()[].,;+-*/&|<>=~?:"
symbols2 = ("~=", ">=", "<=", "+=", "-=", "*=", "/=", "&=", "|=", "&&", "||")
After advance() deals with comments, it checks if the next 2 characters are in symbols2. If not, it then checks if the next individual character is in symbols.

(My tokenizer is structured so that advance() does all the parsing work and the various methods like symbol() simply return the latest parsing result.)
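
As a rough illustration of the lookup order described above, here is a minimal Python sketch. The function name match_symbol and its (text, pos) signature are hypothetical and are not taken from my tokenizer; only the two symbol tables come from the post.

```python
# Symbol tables as listed above.
SYMBOLS = "{}()[].,;+-*/&|<>=~?:"
SYMBOLS2 = ("~=", ">=", "<=", "+=", "-=", "*=", "/=", "&=", "|=", "&&", "||")


def match_symbol(text, pos):
    """Return the symbol that starts at text[pos], or None if there is none.

    Hypothetical helper: two-character symbols are tried first so that
    ">=" is not split into ">" and "=".
    """
    pair = text[pos:pos + 2]
    if pair in SYMBOLS2:
        return pair
    if pos < len(text) and text[pos] in SYMBOLS:
        return text[pos]
    return None


print(match_symbol("x>=y", 1))   # two-character symbol
print(match_symbol("x>y", 1))    # single-character symbol
```

Checking the longer symbols first is what keeps the rest of the tokenizer unchanged; advance() would simply skip ahead by the length of whatever is returned.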


Adding '<=', '>=' and '~=' operators to Jack expressions

Grammar changes

There are no syntactic changes to the Jack grammar other than adding the new operators to the op rule.

op:  '+' | '-' | '*' | '/' | '&' | '|' | '<' | '>' | '=' | '<=' | '>=' | '~='

(I chose '~=' instead of '!=' because Jack's not operator is '~'.)

CompilationEngine changes

compileExpression() will need to support the new operators. The VM code for each new operator is simply the complementary comparison followed by not: lt for '>=', gt for '<=', and eq for '~='.
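
A small Python sketch of that mapping, assuming the compiler collects VM commands as a list of strings. The names BASIC_COMPARE, NEGATED_COMPARE and compare_vm are made up for this illustration.

```python
# Comparisons the VM supports directly.
BASIC_COMPARE = {"<": "lt", ">": "gt", "=": "eq"}

# Each new operator is the logical negation of an existing comparison.
NEGATED_COMPARE = {">=": "lt", "<=": "gt", "~=": "eq"}


def compare_vm(op):
    """Return the VM commands for a comparison operator (hypothetical helper)."""
    if op in BASIC_COMPARE:
        return [BASIC_COMPARE[op]]
    # lt/gt/eq leave -1 (true) or 0 (false), so a bitwise not flips the result.
    return [NEGATED_COMPARE[op], "not"]


print(compare_vm(">="))
```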


Adding '+=', '-=', '*=', '/=', '&=' and '|=' operators to Jack 'let' statements

Grammar changes

The Jack grammar needs to be modified to include the new assignment operators.

letStatement: 'let' varName ('[' expression ']')? assignmentOp expression ';'
assignmentOp: '=' | '+=' | '-=' | '*=' | '/=' | '&=' | '|='

CompilationEngine changes

compileLet() needs two additions to handle the augmented assignments.
  1. Before compiling the new value expression:
    if the operator is not '=', push current value of the target variable.
  2. After compiling the new value expression, before popping to the target variable:
    if the operator is not '=', write the VM code for the augmentation operation.
Step 1 is a bit tricky when the target variable is subscripted, but can be done with 3 VM commands after the target address has been computed and is at the top of the stack.
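
One possible VM sequence for the subscripted case, sketched in Python. This is an assumption about how the two steps could be realized, not necessarily the sequence my compiler emits; the helper name let_subscripted_vm, the AUGMENT_VM table and the use of temp 0 are all choices made for this illustration.

```python
# VM command for each augmented assignment operator.
AUGMENT_VM = {
    "+=": "add",
    "-=": "sub",
    "*=": "call Math.multiply 2",
    "/=": "call Math.divide 2",
    "&=": "and",
    "|=": "or",
}


def let_subscripted_vm(array_push, index_vm, assign_op, value_vm):
    """Build VM code for: let arr[index] <assign_op> value;

    array_push is the push command for the array variable; index_vm and
    value_vm are the already compiled index and value expressions.
    """
    vm = [array_push, *index_vm, "add"]      # target address is now on top
    if assign_op != "=":
        # Step 1: push the current value while keeping the address below it.
        vm += ["pop pointer 1", "push pointer 1", "push that 0"]
    vm += value_vm
    if assign_op != "=":
        # Step 2: combine current value and new value.
        vm.append(AUGMENT_VM[assign_op])
    # Usual array store: save the result, set THAT, write the result.
    vm += ["pop temp 0", "pop pointer 1", "push temp 0", "pop that 0"]
    return vm


for line in let_subscripted_vm("push local 0", ["push local 1"], "+=", ["push local 2"]):
    print(line)
```

Because the current value is pushed before the new value expression is compiled, non-commutative operators such as '-=' and '/=' see their operands in the right order.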


Adding '&&', '||' and '? :' operators to Jack expressions

Grammar changes

The '&&' and '||' short-circuit logical operators fit into the current grammar rule for expression, but it is best to break them out to show how they will be handled by the recursive descent compiler.

The '?' conditional operator does not fit into the original expression rule because the ':' is a delimiter, not an operator.

expression:  term (op term  | logicalOp term | '?' expression ':' expression  )*
logicalOp:  '&&' | '||'

Why does the conditional operator use expression in its grammar rule?

Consider the expression

i=1 ? "one" : i=2 ? "two" : "more"
This looks like it should mean "if (i=1) { "one" } else { if (i=2) { ...", but if term had been used this would have been parsed as
i=1 ? "one" : i =2 ? "two" : "more"
and term's left-to-right precedence would have compiled it as
(i=1 ? "one" : i) =2 ? "two" : "more"

By using expression the true and false sub-expressions are parsed up to the ':' or the end of the expression.

i=1 ? "one" : i=2 ? "two" : "more"
This effectively puts automatic parentheses around the sub-expressions.
i=1 ? ("one") : ( i=2 ? ("two") : ("more") )
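
The grouping can be checked with a toy recursive descent parser. This Python sketch is only an illustration: it assumes every term is a single token, supports just a handful of operators, and returns a fully parenthesized string instead of writing VM code. The name parse_expression is hypothetical.

```python
def parse_expression(tokens, pos=0):
    """Parse tokens[pos:] as: term (op term | '?' expression ':' expression)*

    Returns (parenthesized_text, next_pos).
    """
    left, pos = tokens[pos], pos + 1          # a term is one token in this sketch
    while pos < len(tokens) and tokens[pos] in ("=", "<", ">", "+", "-", "?"):
        op = tokens[pos]
        pos += 1
        if op == "?":
            # Both branches are full expressions, so they extend to ':' or the end.
            true_part, pos = parse_expression(tokens, pos)
            if pos >= len(tokens) or tokens[pos] != ":":
                raise SyntaxError("expected ':'")
            false_part, pos = parse_expression(tokens, pos + 1)
            left = f"({left} ? {true_part} : {false_part})"
        else:
            right, pos = tokens[pos], pos + 1
            left = f"({left} {op} {right})"
    return left, pos


tokens = ["i", "=", "1", "?", '"one"', ":", "i", "=", "2", "?", '"two"', ":", '"more"']
text, _ = parse_expression(tokens)
print(text)   # ((i = 1) ? "one" : ((i = 2) ? "two" : "more"))
```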

CompilationEngine changes

The VM code for the short-circuit logical operators and the conditional operator has very similar structure, so compileLogicalOperator(String op) is added to handle them.

compileExpression() can call compileLogicalOperator() when it encounters a logical or conditional operator. The code should look like this:

compileTerm();
while (current token is operator) {
    op = current token
    nextToken();
    
    if (op is '&&' or '||' or '?') {
        compileLogicalOperator(op)      // calls compileTerm() or compileExpression()
    }
    else {
        compileTerm()
        compileOperator(op)
    }
}
The compileLogicalOperator() function generates one of three VM code sequences.

For '&&' and '||', it generates code that tests the left-hand argument with if-goto and jumps around the right-hand argument if it does not need to be evaluated. It calls compileTerm() to compile the right-hand argument.

if (op is '&&') {
    // write some VM code
    compileTerm();
    // write some VM code
}
else if (op is '||') {
    ...

For the '?' operator, compileLogicalOperator(String op) calls compileExpression() to compile the true and false arguments.

else if (op is '?') {
    // write some VM code
    compileExpression();
    expectSymbol(":");
    nextToken();
    // write some VM code
    compileExpression();
    // write some VM code
}
(The expectSymbol() function checks that the current token is the specified symbol and raises an error if it is not.)
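
For readers who want a starting point for the "write some VM code" placeholders, here is one possible design sketched in Python. It is an assumption, not the code my compiler writes: the function name logical_op_vm, the label names and the explicit label number n are invented for this example, and the operands are passed in as already compiled lists of VM commands rather than being compiled in between as a real CompilationEngine would do. The left-hand value is assumed to be on top of the stack when each sequence starts.

```python
def logical_op_vm(op, n, right_vm, false_vm=None):
    """Return VM code for '&&', '||' or '?' (hypothetical helper).

    n makes the labels unique; right_vm is the right-hand term for '&&' and
    '||', or the true branch for '?'; false_vm is the false branch for '?'.
    """
    if op == "&&":
        return [
            f"if-goto AND_RIGHT{n}",     # left is true: the result is the right term
            "push constant 0",           # left is false: result is false, skip right
            f"goto AND_END{n}",
            f"label AND_RIGHT{n}",
            *right_vm,
            f"label AND_END{n}",
        ]
    if op == "||":
        return [
            f"if-goto OR_TRUE{n}",       # left is true: skip the right term
            *right_vm,
            f"goto OR_END{n}",
            f"label OR_TRUE{n}",
            "push constant 0",
            "not",                       # push true (-1)
            f"label OR_END{n}",
        ]
    if op == "?":
        return [
            f"if-goto COND_TRUE{n}",
            f"goto COND_FALSE{n}",
            f"label COND_TRUE{n}",
            *right_vm,
            f"goto COND_END{n}",
            f"label COND_FALSE{n}",
            *false_vm,
            f"label COND_END{n}",
        ]
    raise ValueError(f"not a logical operator: {op}")


for line in logical_op_vm("&&", 0, ["push local 1"]):
    print(line)
```

All three sequences test the left-hand value with a single if-goto and leave exactly one value on the stack, which is why one function can handle them. Jumping on the value directly, rather than on its negation, treats any nonzero left-hand value as true.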