How often have you needed to bring up an ASCII table to print characters or test
keycodes?
I've written this sort of code way too many times:
if ((key = 88) | (key = 120)) {
    do Output.printInt(x);
    do Output.printChar(44);
    do Output.printInt(y);
}
Jack needs character constants that can be used in place of these obscure
numbers.
if ((key = 'X') | (key = 'x')) {
    do Output.printInt(x);
    do Output.printChar(',');
    do Output.printInt(y);
}
It would also be convenient to be able to use hexadecimal constants for bit masks.
let comp = c_instr & 0x1FC0;
is much more readable than
let comp = c_instr & 8128;
Syntax changes
The book's syntax for expressions is
expression:       term (op term)*
term:             integerConstant | stringConstant | keywordConstant | ...
integerConstant:  A decimal number in the range 0 .. 32767
To support character constants, "term" will need to include a new syntax element, "characterConstant".
To support hexadecimal constants, "integerConstant" will need to be expanded to include both decimal and hexadecimal syntax.
term:                 integerConstant | characterConstant | stringConstant | ...
integerConstant:      decimalConstant | hexadecimalConstant
decimalConstant:      A decimal number in the range 0 .. 32767
hexadecimalConstant:  ('0x' | '0X') (digit | hexDigit)*
                      Valid range 0x0000 .. 0xFFFF
hexDigit:             'A' through 'F', upper or lower case
characterConstant:    ''' Unicode character '''
Notes:
For simplicity, this syntax allows the degenerate hexadecimal constant "0x"
which should be interpreted as 0.
The Unicode character in characterConstant can be a single quote:
''' is correct syntax for a single quote character;
'' is illegal syntax.
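The grammar above can be sketched as a set of regular expressions. This is only an illustration, assuming a Python tokenizer; the names TOKEN_RE and classify are hypothetical and not part of the book's API, and only the three constant forms are covered (keywords, symbols, identifiers, and strings would be handled elsewhere).

```python
import re

# Hypothetical patterns for the extended constants (names are illustrative).
# Hex is tried before decimal so "0x1F" is not split into "0" and "x1F".
TOKEN_RE = re.compile(r"""
    (?P<hex>0[xX][0-9A-Fa-f]*)   # hexadecimalConstant, including the degenerate 0x
  | (?P<dec>[0-9]+)              # decimalConstant
  | (?P<char>'[^\n]')            # characterConstant: quote, one character, quote
""", re.VERBOSE)


def classify(text):
    """Return (kind, lexeme) for a constant at the start of text, or None."""
    m = TOKEN_RE.match(text)
    if m is None:
        return None
    return m.lastgroup, m.group()
```

Because the character pattern demands exactly one character between the quotes, ''' is accepted as the single quote character while '' is rejected, which matches the notes above.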
Code changes
Changes are required to both the JackTokenizer and CompilationEngine
modules.
Hexadecimal and character constants are just new ways to specify integer
constants. Rather than adding new tokenType values and a separate
method to return their values, JackTokenizer.tokenType() will
return INT_CONST for hexadecimal and character constants.
For character constants, JackTokenizer.intVal() will return the numeric
value of the Unicode character.
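As a rough sketch of what intVal() has to do for the three kinds of INT_CONST lexeme, assuming a Python implementation: the function name int_val and its error handling are assumptions for illustration, not the author's code.

```python
def int_val(lexeme):
    """Convert a decimal, hexadecimal, or character constant lexeme to an int.

    Hypothetical helper: the name and the ValueError checks are assumptions.
    """
    if lexeme.startswith("'"):
        # characterConstant: exactly one character between the two quotes.
        if len(lexeme) != 3 or not lexeme.endswith("'"):
            raise ValueError(f"malformed character constant: {lexeme}")
        return ord(lexeme[1])

    if lexeme[:2] in ("0x", "0X"):
        digits = lexeme[2:]
        # The degenerate constant "0x" has no digits and is interpreted as 0.
        value = int(digits, 16) if digits else 0
        if value > 0xFFFF:
            raise ValueError(f"hexadecimal constant out of range: {lexeme}")
        return value

    value = int(lexeme)
    if value > 32767:
        raise ValueError(f"decimal constant out of range: {lexeme}")
    return value
```

For example, int_val("'X'") gives 88 and int_val("0x1FC0") gives 8128, the same numbers used in the opening examples.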
Because JackTokenizer.intVal() can now return values greater than 32767, which
cannot be handled by the "push constant N" VM command,
CompilationEngine.compileExpression() will need to write special VM
code when it encounters an INT_CONST token with a value greater than 32767.
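For values 0x8000 ≤ N ≤ 0xFFFF, the 16-bit complement satisfies
0x7FFF ≥ ~N ≥ 0x0000, so the compiler can compute ~N, push it, and invert it
at run time. The required VM code is
push constant ~N
not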
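The rule above can be sketched as a small helper that returns the VM commands for any 16-bit constant. This assumes a Python compiler; push_int_const is a hypothetical name, and a real CompilationEngine would write these lines through its own VM writer.

```python
def push_int_const(n):
    """Return the VM commands that push the 16-bit constant n (0 .. 0xFFFF).

    Hypothetical helper, shown only to illustrate the complement trick.
    """
    if not 0 <= n <= 0xFFFF:
        raise ValueError(f"constant out of 16-bit range: {n}")
    if n <= 0x7FFF:
        # Small enough for a plain push.
        return [f"push constant {n}"]
    # 16-bit bitwise complement; for n >= 0x8000 this is in 0 .. 0x7FFF.
    complement = ~n & 0xFFFF
    return [f"push constant {complement}", "not"]
```

For 0xFF00 this yields "push constant 255" followed by "not", which agrees with the test code's comparison of 0xFF00 against ~255.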
Test code
/** Test code for hexadecimal and character constants compiler modification.
 *
 * Should print:
 *   Character constants 'OK'
 *   0xAbC OK
 *   0X1DeF OK
 *   0xFF00 OK
 *   0x8000 OK
 */
class Main {
    function void main() {
        do Output.printString("Character constants ");
        do Output.printChar(''');
        do Output.printChar('O');
        do Output.printChar('K');
        do Output.printChar(''');
        do Output.println();
        if (0xAbC = 2748) { // note mixed case in hex constant
            do Output.printString("0xAbC OK");
        } else {
            do Output.printString("0xAbC FAIL");
        }
        do Output.println();
        if (0X1DeF = 7663) { // note mixed case in hex constant
            do Output.printString("0X1DeF OK");
        } else {
            do Output.printString("0X1DeF FAIL");
        }
        do Output.println();
        if (0xFF00 = ~255) {
            do Output.printString("0xFF00 OK");
        } else {
            do Output.printString("0xFF00 FAIL");
        }
        do Output.println();
        if (0x8000 = ~32767) {
            do Output.printString("0x8000 OK");
        } else {
            do Output.printString("0x8000 FAIL");
        }
        return;
    }
}