riverfish wrote
Hey Cadet,
I have something similar to CompileStatementBlock.
My problem lies in trying to determine when to return CompileStatementBlock.
This is the part that I'm struggling with - how does CompileStatementBlock know it has seen its closing '}'?
def _CompileStatementBlock(self):
"""
Compiles <statement-block> := '{' <statement-list> '}'
Nested Scope option inserts <var-dec>* after '{'
ENTRY: Tokenizer positioned on '{'.
EXIT: Tokenizer positioned after '}'.
"""
self._ExpectSymbol('{')
self._WriteXml('symbol', self.tokenizer.Symbol())
self._NextToken()
if self.options['scopeExt']:
# enter the subscope
self.symbolTable = self.symbolTable.Subscope()
self._CompileScopeDec(zero=True)
self._CompileStatementList()
self._ExpectSymbol('}')
if self.options['scopeExt']:
# exit the subscope
self._WarnUnrefs('subscope')
self.symbolTable = self.symbolTable.Superscope()
self._WriteXml('symbol', self.tokenizer.Symbol())
self._NextToken()
def _ExpectSymbol(self, symbols):
"""
Parse the current token. It is expected to be one of 'symbols'.
'symbols' is a string of one or more legal symbols.
Returns the symbol parsed or raises an error.
"""
if not self.tokenizer.TokenType() == TK_SYMBOL:
self._RaiseError('Expected '+self._SymbolStr(symbols)+', got '+
self.tokenizer.TokenTypeStr())
if self.tokenizer.Symbol() in symbols:
return self.tokenizer.Symbol()
self._RaiseError('Expected '+self._SymbolStr(symbols)+', got '+
self._SymbolStr(self.tokenizer.Symbol()))
The ExpectSymbol calls are where CompileStatementBlock handles '{' and '}'. If the current token is not the expected token, it raises an exception which prints an error message and aborts the compile.
(I have similar ExpectKeyword and ExpectIdentifier routines, also IsSymbol, etc. routines that return true/fasle instead of raising errors.)
After CompileIf has called CompileExpression, the current token will be ')'. It verifies this with ExpectSymbol, advances the tokenizer and verifies that the new token is '{' and then calls CompileStatementBlock.
CompileStatementBlock verifies the '{', advances the tokenizer and calls CompileStaementList.
CompileStatementList process all the statements in the block, recursively calling CompileXxx routines, potentially including CompileStatementList itself, that parse everything including matching '{' and '}'.
When CompileStatementList returns to CompileStatementBlock, the tokenizer will be on the '}' that matches the '{' that CompileStatementBlock started with. CompileStatementBlock verifies that it is there, advances the tokenizer and returns.
This is what makes recursive parsing work. It "magically" handles nested structures.
Perhaps it will help you understand recursive parsing by turning the problem inside out. Consider recursive printing. Here is a program that prints nested "hello world" messages. (Python 3; end='' suppresses new-lines.)
def hello_world(n):
print('hello ', end='')
if n > 1:
print('{ ', end='')
hello_world(n-1)
print('} ', end='')
print('world! ', end='')
hello_world(1)
print()
hello_world(3)
# Prints
# hello world!
# hello { hello { hello world! } world! } world!
I think it might be better to continue this by email. I can share more code, you can send me your code to look at, etc.
--Mark