Thanks for all the replies.
I have been able to identify the bug and fix it. The compiled Square.asm and Pong.asm code now runs on the supplied CPUEmulator.
The bug was caused by the LCL segment not being set up properly for functions with local variables.
On entering a function, I should just push N zeros onto the stack, where N is the number of local variables. However, I was popping the zeros back off after the push and assigning them to the local variables. So I did something like the following:
for i = 0 to n-1
    pop local i
I thought I was "initializing the local variables to zero", but the pops move SP back down to where LCL points, so the working stack overlaps the local segment and they overwrite each other.
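To make the overlap concrete, here is a minimal Python sketch that models only the SP and LCL bookkeeping of the function prologue. It is an illustration, not my translator's code: the helper name enter_function, the pop_after_push flag, and the starting address 261 are all made up for the example, and it assumes the call code has already set LCL = SP before jumping to the function.

```python
def enter_function(sp, n_locals, pop_after_push):
    """Model the `function f n` prologue on a tiny picture of the RAM.

    Hypothetical helper for illustration only. Assumes the `call` code
    has already set LCL = SP before control reaches the function.
    Returns (sp, lcl) as they stand once the prologue has finished.
    """
    ram = {}
    lcl = sp                      # call leaves LCL pointing at the current SP
    for _ in range(n_locals):     # push constant 0, repeated n times
        ram[sp] = 0
        sp += 1
    if pop_after_push:            # the extra, buggy step: pop local i
        for i in range(n_locals):
            sp -= 1               # each pop moves SP back down
            ram[lcl + i] = ram[sp]
    return sp, lcl


# 261 is an arbitrary starting address chosen for the example.
ok_sp, ok_lcl = enter_function(sp=261, n_locals=2, pop_after_push=False)
bad_sp, bad_lcl = enter_function(sp=261, n_locals=2, pop_after_push=True)

print(ok_sp - ok_lcl)    # 2: SP sits just above the two locals
print(bad_sp - bad_lcl)  # 0: SP is back on top of local 0
```

With the pushes alone, SP ends up N words above LCL, so the locals are protected. With the extra pops, SP equals LCL again, and the next push lands on local 0.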
None of the functions in chapter 08's tests use any local variables. That is why my VM translator passed all the tests. So I would like to suggest adding at least one test for a function with local variables.