cadet1620 wrote
Congratulations on an interesting insight; I'll need to try implementing this and seeing how much it saves.
I finally got a chance to analyze this in detail. Here's what I find.
The proposed calling convention change is:
    For all calls to functions with no arguments (call function 0 VM commands), increment the stack pointer before pushing the return address.
This, in effect, turns the call into a call function 1 with a random argument. This is benign
because the called function will not reference the argument that it does not have. Since the return code
does not depend on the number of arguments, this fake argument has no effect on the function's return.
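The return code is simplified because the return address no longer needs to be cached in a temporary
variable. With a standard call function 0, ARG points at the stack slot that holds the return address,
so *ARG = pop() would overwrite it. With the fake argument there is always a separate slot (ARG[0])
to receive the return value before the return address is needed.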
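To make the overwrite concrete, here is a minimal Python sketch of the stack, not my actual translator. The helper names (push, pop, call, simplified_return, run) and the starting pointer values are made up for the illustration; only the call and return sequences follow the VM specification and the simplified return described above.

```python
SP, LCL, ARG, THIS, THAT = 0, 1, 2, 3, 4   # Hack RAM addresses of the pointers

def push(ram, value):
    ram[ram[SP]] = value
    ram[SP] += 1

def pop(ram):
    ram[SP] -= 1
    return ram[ram[SP]]

def call(ram, n_args, ret_addr, fake_arg):
    """Standard call sequence; fake_arg applies the proposed change when n_args == 0."""
    if fake_arg and n_args == 0:
        ram[SP] += 1                  # reserve one slot; its content is never read
        n_args = 1
    push(ram, ret_addr)
    for reg in (LCL, ARG, THIS, THAT):
        push(ram, ram[reg])
    ram[ARG] = ram[SP] - n_args - 5
    ram[LCL] = ram[SP]

def simplified_return(ram):
    """Return sequence that does not cache the return address first."""
    ram[ram[ARG]] = pop(ram)          # *ARG = pop()
    ram[SP] = ram[ARG] + 1            # SP = ARG+1
    frame = ram[LCL]                  # FRAME = LCL
    ram[THAT] = ram[frame - 1]
    ram[THIS] = ram[frame - 2]
    ram[ARG] = ram[frame - 3]
    ram[LCL] = ram[frame - 4]
    return ram[frame - 5]             # goto *(FRAME-5)

def run(fake_arg):
    ram = [0] * 64
    ram[SP], ram[LCL], ram[ARG], ram[THIS], ram[THAT] = 20, 10, 11, 12, 13  # arbitrary caller state
    call(ram, 0, ret_addr=1234, fake_arg=fake_arg)
    push(ram, 77)                     # the callee's return value
    target = simplified_return(ram)
    return target, ram[ram[SP] - 1], ram[SP]   # jump target, value on top of stack, SP

print(run(fake_arg=False))   # jump target is the return value: the address was overwritten
print(run(fake_arg=True))    # jump target is the real return address
```

In both runs the caller ends up with the return value on top of its stack and the same SP; the only difference is whether the jump target survived.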
| Original return code | Simplified return code |
|----------------------|------------------------|
| FRAME [1] = LCL      |                        |
| RET = *(FRAME-5)     |                        |
| *ARG = pop()         | *ARG = pop()           |
| SP = ARG+1           | SP = ARG+1             |
|                      | FRAME = LCL            |
| THAT = *(FRAME-1)    | THAT = *(FRAME-1)      |
| THIS = *(FRAME-2)    | THIS = *(FRAME-2)      |
| ARG = *(FRAME-3)     | ARG = *(FRAME-3)       |
| LCL = *(FRAME-4)     | LCL = *(FRAME-4)       |
| goto RET             | goto *(FRAME-5)        |

[1] FRAME is a temporary variable, R15 for example.
Recognizing that the run of *(FRAME-n) accesses are pops in disguise (using FRAME instead of SP
as the stack pointer) lets me shorten this code significantly. Getting rid of the out-of-sequence FRAME-5
access and the RET store and recall reduces my return code by 6 instructions (13%).
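As an illustration of the "pops in disguise" idea, here is a small Python sketch that emits Hack assembly for the frame-restore part of the simplified return. It is not the code my translator produces, so its instruction count will not match the numbers above; the function name restore_from_frame is hypothetical, and R15 is just the example temporary from the footnote. It leaves out the *ARG = pop() and SP = ARG+1 steps.

```python
def restore_from_frame(frame="R15"):
    """Emit Hack assembly that restores THAT, THIS, ARG, LCL and then jumps
    to the return address by popping downward through `frame`."""
    asm = ["@LCL", "D=M", f"@{frame}", "M=D"]          # FRAME = LCL
    for seg in ("THAT", "THIS", "ARG", "LCL"):
        asm += [
            f"@{frame}",
            "AM=M-1",      # FRAME = FRAME-1, and A now addresses the saved value
            "D=M",
            f"@{seg}",
            "M=D",         # seg = *FRAME
        ]
    asm += [
        f"@{frame}",
        "AM=M-1",          # one more pop reaches the return address slot
        "A=M",
        "0;JMP",           # goto *FRAME, no RET temporary needed
    ]
    return asm

print("\n".join(restore_from_frame()))
```

Because the return address is the last thing popped, it is read in sequence with the other saved values instead of being fetched first and parked in a temporary.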
Sounds promising...
But this may be a false optimization because every call to a function with no arguments requires at least one
extra instruction to increment the stack pointer for the fake argument. There are 20 such calls in
the OS .vm files.
My code writer only writes the return instructions once--the first time it translates a return VM
command. All remaining returns require only 2 instructions: @$RETURN and 0;JMP ($RETURN is the label
at the start of the return code).
I also only write most of the call code once. The actual calls
load target address, number of arguments and return address into registers and jump to the common
code. I can add a $CALL0 entry point to $CALL that increments SP and falls into $CALL.
This requires 2 instructions.
The net effect will be to save 4 words of ROM (6 saved in the return code, 2 added for $CALL0), with a
small speed improvement. Every no-argument call()/return pair will save 4 cycles; every call(...)/return
pair with arguments will save 6 cycles. Stack usage will increase by a small amount, one word for each
nested no-argument call.
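The bookkeeping behind these numbers can be written out as a short Python sketch. The constants come from the figures in this post; the function names are made up, and the inline case assumes the minimum of one extra instruction per call site and counts only the 20 OS calls.

```python
RETURN_SAVED = 6    # instructions removed from the shared return code
CALL0_ENTRY = 2     # instructions added for the shared $CALL0 entry point
NO_ARG_CALLS = 20   # no-argument calls in the OS .vm files

def net_rom_saved_inline(extra_per_call=1):
    """ROM words saved if every no-argument call site increments SP itself."""
    return RETURN_SAVED - NO_ARG_CALLS * extra_per_call

def net_rom_saved_shared():
    """ROM words saved when the increment lives once in the $CALL0 entry point."""
    return RETURN_SAVED - CALL0_ENTRY

def cycles_saved_per_pair(no_args):
    """Cycles saved per call/return pair, at one cycle per instruction."""
    return RETURN_SAVED - (CALL0_ENTRY if no_args else 0)

print(net_rom_saved_inline())         # negative: inlining costs more than it saves
print(net_rom_saved_shared())
print(cycles_saved_per_pair(True), cycles_saved_per_pair(False))
```

The inline figure is why the optimization only pays off at all when the call code is shared.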
Additionally, because of the calling convention change, all the test files that contain calls will
need to be adjusted.
CONCLUSION
This is too much work for too little gain, so I am not going to write a test implementation.
--Mark