Jack Language Grammar

This grammar specification is based on the following conventions:
'xxx': Quoted boldface is used for tokens that appear verbatim ("terminals");
xxx: Regular typeface is used for names of language constructs ("non-terminals");
( ): Parentheses are used for grouping of language constructs;
x | y: Indicates that either x or y can appear;
x?: Indicates that x appears 0 or 1 times;
x*: Indicates that x appears 0 or more times.

Lexical elements: The Jack language includes five types of terminal elements (tokens), and two comment sequences [1]:
comment:  '/*' A sequence of characters not including */ '*/'
eolComment:  '//' A sequence of characters not including newline '\n'
keyword:  'class' | 'constructor' | 'function' | 'method' | 'field' | 'static' | 'var' | 'int' | 'char' | 'boolean' | 'void' | 'true' | 'false' | 'null' | 'this' | 'let' | 'do' | 'if' | 'else' | 'while' | 'return'
symbol:  '{' | '}' | '(' | ')' | '[' | ']' | '.' | ',' | ';' | '+' | '-' | '*' | '/' | '&' | '|' | '<' | '>' | '=' | '~'
integerConstant:  A decimal number in the range 0..32767
stringConstant:  '"' A sequence of characters, possibly including /* and //, but not including double quote or newline '"'
identifier:  A sequence of letters, digits, and underscore ('_') not starting with a digit
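
The lexical rules above can be sketched as a small regex-based tokenizer. This is only an illustration, not the course's reference implementation: the names KEYWORDS, TOKEN_RE, and tokenize are hypothetical, and the choice to drop comments and whitespace inside the tokenizer is an assumption. Comments and string constants are tried before symbols, so a '/' that begins a comment is never emitted as a division symbol, and a // inside a string stays part of the string.

```python
import re

KEYWORDS = {
    "class", "constructor", "function", "method", "field", "static", "var",
    "int", "char", "boolean", "void", "true", "false", "null", "this",
    "let", "do", "if", "else", "while", "return",
}

# Alternatives are tried left to right: comments and strings come before
# symbols so that "//" and "/*" are not split into '/' and '*' symbols.
TOKEN_RE = re.compile(r"""
    (?P<comment>/\*.*?\*/)
  | (?P<eolComment>//[^\n]*)
  | (?P<stringConstant>"[^"\n]*")
  | (?P<integerConstant>\d+)
  | (?P<identifier>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<symbol>[{}()\[\].,;+\-*/&|<>=~])
  | (?P<space>\s+)
""", re.VERBOSE | re.DOTALL)


def tokenize(source):
    """Return a list of (kind, text) tuples; comments and whitespace are dropped."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = m.lastgroup, m.group()
        pos = m.end()
        if kind in ("comment", "eolComment", "space"):
            continue
        if kind == "identifier" and text in KEYWORDS:
            kind = "keyword"  # keywords look like identifiers lexically
        elif kind == "integerConstant" and int(text) > 32767:
            raise SyntaxError(f"integer constant out of range: {text}")
        elif kind == "stringConstant":
            text = text[1:-1]  # strip the surrounding double quotes
        tokens.append((kind, text))
    return tokens
```

For example, tokenize('let x = 5; // set x') yields the keyword let, the identifier x, the symbol =, the integerConstant 5, and the symbol ;, with the trailing comment discarded.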
Program structure: A Jack program is a collection of classes, each appearing in a separate file. The compilation unit is a class. A class is a sequence of tokens structured according to the following context-free syntax:
class:  'class' className '{' classVarDec* subroutineDec* '}'
classVarDec:  ('static' | 'field') type varName (',' varName)* ';'
type:  'int' | 'char' | 'boolean' | className
subroutineDec:  ('constructor' | 'function' | 'method') ('void' | type) subroutineName '(' parameterList ')' subroutineBody
parameterList:  ( (type varName) (',' type varName)* )?
subroutineBody:  '{' varDec* statements '}'
varDec:  'var' type varName (',' varName)* ';'
className:  identifier
subroutineName:  identifier
varName:  identifier
Statements:
statements:  statement*
statement:  letStatement | ifStatement | whileStatement | doStatement | returnStatement
letStatement:  'let' varName ('[' expression ']')? '=' expression ';'
ifStatement:  'if' '(' expression ')' '{' statements '}' ('else' '{' statements '}')?
whileStatement:  'while' '(' expression ')' '{' statements '}'
doStatement:  'do' subroutineCall ';'
returnStatement:  'return' expression? ';'
Expressions:
expression:  term (op term)*
term:  integerConstant | stringConstant | keywordConstant | varName | varName '[' expression ']' | subroutineCall | '(' expression ')' | unaryOp term
subroutineCall:  subroutineName '(' expressionList ')' | (className | varName) '.' subroutineName '(' expressionList ')'
expressionList:  (expression (',' expression)* )?
op:  '+' | '-' | '*' | '/' | '&' | '|' | '<' | '>' | '='
unaryOp:  '-' | '~'
keywordConstant:  'true' | 'false' | 'null' | 'this'
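
The expression rules map naturally onto a recursive-descent parser, with one method per non-terminal. The sketch below is illustrative only: the class ExpressionParser, its method names, and the tuple-shaped output are hypothetical, and it assumes the input is already a list of (kind, text) tuples. Note that the term rule needs one token of lookahead after an identifier to distinguish varName, varName '[' ... ']', and the two forms of subroutineCall. The grammar defines no operator precedence, so folding binary operators left to right is an assumption made here.

```python
OPS = set("+-*/&|<>=")
UNARY_OPS = set("-~")
KEYWORD_CONSTANTS = {"true", "false", "null", "this"}


class ExpressionParser:
    """Recursive-descent parser for the expression rules only (a sketch)."""

    def __init__(self, tokens):
        self.tokens = tokens  # list of (kind, text) tuples
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return (None, None)  # end of input

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expect(self, symbol):
        kind, text = self.advance()
        if (kind, text) != ("symbol", symbol):
            raise SyntaxError(f"expected {symbol!r}, got {text!r}")

    def expression(self):
        # expression: term (op term)*
        node = self.term()
        while self.peek()[0] == "symbol" and self.peek()[1] in OPS:
            op = self.advance()[1]
            node = (op, node, self.term())  # assumption: fold left to right
        return node

    def term(self):
        kind, text = self.advance()
        if kind == "integerConstant":
            return ("int", int(text))
        if kind == "stringConstant":
            return ("str", text)
        if kind == "keyword" and text in KEYWORD_CONSTANTS:
            return ("kw", text)
        if kind == "symbol" and text == "(":
            node = self.expression()
            self.expect(")")
            return node
        if kind == "symbol" and text in UNARY_OPS:
            return ("unary", text, self.term())
        if kind == "identifier":
            nxt = self.peek()  # one token of lookahead decides the alternative
            if nxt == ("symbol", "["):
                self.advance()
                index = self.expression()
                self.expect("]")
                return ("index", text, index)
            if nxt == ("symbol", "("):
                self.advance()
                return ("call", text, self.expression_list())
            if nxt == ("symbol", "."):
                self.advance()
                kind2, name = self.advance()
                if kind2 != "identifier":
                    raise SyntaxError(f"expected subroutine name, got {name!r}")
                self.expect("(")
                return ("call", f"{text}.{name}", self.expression_list())
            return ("var", text)
        raise SyntaxError(f"unexpected token {text!r}")

    def expression_list(self):
        # expressionList: (expression (',' expression)*)?  then the closing ')'
        args = []
        if self.peek() != ("symbol", ")"):
            args.append(self.expression())
            while self.peek() == ("symbol", ","):
                self.advance()
                args.append(self.expression())
        self.expect(")")
        return args
```

Calling ExpressionParser(tokens).expression() on the tokens of a[i] + Math.max(x, -1) produces a nested tuple whose root is the '+' operator, with an array-index node on the left and a call node on the right.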

[1] The course treats comment sequences as external to the grammar, but they are much easier to handle within the grammar. Consider this statement:
let str = "Comments can start with //.";
You must know that you are inside a stringConstant element so that you do not mistake //."; for an end-of-line comment and remove it.
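
The point of this footnote can be sketched as a comment-stripping pass that tracks whether it is inside a string constant. This is a minimal illustration under stated assumptions: the function name strip_comments is hypothetical, a block comment is replaced by a single space so that neighboring tokens stay separated, and string constants are not checked for embedded newlines.

```python
def strip_comments(source):
    """Remove /* */ and // comments, leaving string constants untouched."""
    out = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch == '"':
            # Copy the whole string constant verbatim, including any
            # '//' or '/*' it happens to contain.
            end = source.find('"', i + 1)
            if end == -1:
                raise SyntaxError("unterminated string constant")
            out.append(source[i:end + 1])
            i = end + 1
        elif source.startswith("//", i):
            # Skip up to the newline, but keep the newline itself.
            end = source.find("\n", i)
            i = n if end == -1 else end
        elif source.startswith("/*", i):
            end = source.find("*/", i + 2)
            if end == -1:
                raise SyntaxError("unterminated block comment")
            out.append(" ")  # assumption: keep adjacent tokens separated
            i = end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Applied to the statement above followed by a genuine end-of-line comment, this pass removes only the genuine comment and leaves the text inside the quotes intact.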