Project 7: gt and lt behavior not clearly specified for signed operands

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Project 7: gt and lt behavior not clearly specified for signed operands

Voltara
I'm working my way through the projects in the second edition of the book, and encountered an inconsistency in how the gt and lt commands are specified.

Section 7.3, page 133: "There is only one data type: a signed 16-bit integer." This suggests gt and lt should treat the operands as signed.

In Figure 7.5 on page 134, the only commands that specify "(two's complement)" are: add, sub, neg. This suggests gt and lt should treat the operands as unsigned.

In SimulatorsPackageSource/Hack/VMEmulator/Calculator.java of the source code distribution, the implementation is signed:

            case GREATER_THAN:
                result = (short)(input0 > input1 ? -1 : 0); break;
            case LESS_THAN:
                result = (short)(input0 < input1 ? -1 : 0); break;

In projects/06/pong/Pong.asm (which I assume is output from the author's VM translator), the GT implementation is unsigned: subtract x-y and test the sign of the result.

@R15
M=D
@SP
AM=M-1
D=M
A=A-1
D=M-D
M=0
@END_GT
D;JLE
@SP
A=M-1
M=-1
(END_GT)
@R15
A=M
0;JMP

The VM program in projects/07/StackArithmetic/StackTest/StackTest.vm only tests cases where both operands are positive, so it doesn't help to disambiguate.

One example where signed and unsigned implementations will give different results is the comparison 32767 > -1 (the difference 32767 - (-1) rolls over to -32768.)

push constant 32767
push constant 1
neg
gt
A correct signed implementation would handle the separate cases of same sign bit (subtract and test the sign of the result) and opposite sign bit (the negative operand is always lesser.)
Reply | Threaded
Open this post in threaded view
|

Re: Project 7: gt and lt behavior not clearly specified for signed operands

WBahn
Administrator
Voltara wrote
I'm working my way through the projects in the second edition of the book, and encountered an inconsistency in how the <tt>gt</tt> and <tt>lt</tt> commands are specified.
<p>
Section 7.3, page 133: "There is only one data type: a signed 16-bit integer."  This suggests <tt>gt</tt> and <tt>lt</tt> should treat the operands as signed.
<p>
Since there is only one data type, all arithmetic operations have to use that data type. The relational operations are arithmetic, so the the operands are signed. The comment regarding specific operations is in the way of a reminder -- it does NOT imply that any operation that doesn't have that reminder is now a data type that doesn't exist.

In <tt>projects/06/pong/Pong.asm</tt> (which I assume is output from the author's VM translator), the GT implementation is unsigned: subtract <tt>x-y</tt> and test the sign of the result.
This makes no sense. What does it mean to test the sign of the result? If the data type is unsigned, then the result is unsigned and is always positive.

For instance, what if x = 50,000 and y = 40,000? Then x-y = 10,000 and the "sign bit" is zero. But what if x = 50,000 and y = 10,000? Now x-y = 40,000 and the "sign bit" will be one. Yet in both cases x>y is true.

Testing if x>y by testing if (x-y) > 0 works for both positive and negative values of x and y provided you don't get into overflow conditions. Overflow conditions create caveats for a lot of functions and N2T does have a tendency to ignore them in the interest of keeping the implementations clean and simple. You can argue whether or not this is a reasonable compromise.

The VM program in <tt>projects/07/StackArithmetic/StackTest/StackTest.vm</tt> only tests cases where both operands are positive, so it doesn't help to disambiguate.
No test suite is going to be able to test everything, but there are a number of test suites in the projects that fail to test major cases. I'm hoping that these will get an overhaul when (and if) the new tools come out.

One example where signed and unsigned implementations will give different results is the comparison <tt>32767 > -1</tt> (the difference <tt>32767 - (-1)</tt> rolls over to <tt>-32768</tt>.)
This is an example of the overflow issue I mentioned above.

Keep in mind that ANY operation that uses or results in -32768 is an oddball case because this is the "troublesome" value in 16-bit two's comp. But your point is valid and can be demonstrated with other values that don't involve the pathological one.

A correct signed implementation would handle the separate cases of same sign bit (subtract and test the sign of the result) and opposite sign bit (the negative operand is always lesser.)
This is definitely a better implementation, but it does add quite a bit of additional overhead doing the sign tests.

Also note that it still misbehaves for the pathological case.

Is 0 > -32768?
Is -32768 > 0?

You get the same answer for both since -(-32768) = -32768.
Reply | Threaded
Open this post in threaded view
|

Re: Project 7: gt and lt behavior not clearly specified for signed operands

Voltara
WBahn wrote
Voltara wrote
In projects/06/pong/Pong.asm (which I assume is output from the author's VM translator), the GT implementation is unsigned: subtract x-y and test the sign of the result.
This makes no sense. What does it mean to test the sign of the result? If the data type is unsigned, then the result is unsigned and is always positive.
I could have written this better.  When I wrote "test the sign of the result", I was just trying to summarize the snippet of Hack assembly.  (The Hack jump operations treat the value as signed, regardless of what's happening in the VM layer and up.)

After reading your reply and thinking it through a bit more, I realized that I made a mistake in framing it as a signed vs unsigned issue.  The trap I fell into was "it works when they're both positive, and it works when they're both negative; therefore it must be a sign problem," when the real problem was overflow.  Signed or not, overflow is going to be an issue for any x-y >= 32768.

So please ignore everything I said about sign :-)

Testing if x>y by testing if (x-y) > 0 works for both positive and negative values of x and y provided you don't get into overflow conditions. Overflow conditions create caveats for a lot of functions and N2T does have a tendency to ignore them in the interest of keeping the implementations clean and simple. You can argue whether or not this is a reasonable compromise.
No, I agree wholeheartedly that simplicity is a virtue here.  My goal when working through the book and problems has been: What is the simplest possible way to implement this, and is that way simple enough to teach a beginner?  It has been apparent so far that much care and attention has gone into making the material accessible (especially with the second edition changes.)

This was the first instance I encountered so far where a 100% correct implementation ended up a bit more hairy than I would have liked.  I do feel it's worth pointing out briefly in the project notes, "You might have overflow problems when x and y are far apart.  For simplicity, you can safely assume that won't happen here or anywhere else in the projects."

A correct signed implementation would handle the separate cases of same sign bit (subtract and test the sign of the result) and opposite sign bit (the negative operand is always lesser.)
This is definitely a better implementation, but it does add quite a bit of additional overhead doing the sign tests.

Also note that it still misbehaves for the pathological case.

Is 0 > -32768?
Is -32768 > 0?

You get the same answer for both since -(-32768) = -32768.
That would fall into the "opposite sign bit" category, which would correctly identify -32768 (0x8000) as less than 0 (0x0000).

Thanks for taking the time to reply.  Overall the quality of the materials has been excellent, and it has been a pleasure working through them.  I'm looking forward to the new tools when they come out.
Reply | Threaded
Open this post in threaded view
|

Re: Project 7: gt and lt behavior not clearly specified for signed operands

WBahn
Administrator
This post was updated on .
Voltara wrote
This was the first instance I encountered so far where a 100% correct implementation ended up a bit more hairy than I would have liked.  I do feel it's worth pointing out briefly in the project notes, "You might have overflow problems when x and y are far apart.  For simplicity, you can safely assume that won't happen here or anywhere else in the projects."
I've felt that way about a number of points in the text. In fact, there are so many places where you could point out how life beyond N2T isn't as nicely behaved that you run the risk of making the text considerably longer than it is with all the asides.

Personally, I LOVE optional asides that go beyond what is presented to raise awareness of some of the finer points and hidden gotchas. But a lot of the target audience for the book are already pretty overwhelmed by what appears to be a daunting task that laying on the additional stuff, even if marked optional, might scare them away.

That might be something that supplemental material, either on the N2T site or perhaps even as a subforum here, might be a good route to go. Then you could have a paragraph in the first chapter that acknowledges that things are intentionally kept simple for these projects to make them tractable and that many important issues have been overlooked in the pursuit of that goal, but that many of those issues are explored in varying degrees of depth in the supplemental material for the interested reader. Then, in a few selected places where it seems particularly relevant to point that out again, a single sentence should suffice that again directs interested readers to the supplemental material.

Of course, it's too late to put this into the second edition, but the next time I converse with the authors I might see if they think it's a good idea and we can talk about how to implement it. Another thing I want to do here is to have an Errata page for each edition that collects and brings all the bugs and errors (of which there are amazingly few) into a single, organized place.

The problem is that I don't have the time to do it right now.

A correct signed implementation would handle the separate cases of same sign bit (subtract and test the sign of the result) and opposite sign bit (the negative operand is always lesser.)

This is definitely a better implementation, but it does add quite a bit of additional overhead doing the sign tests.

Also note that it still misbehaves for the pathological case.
Is 0 > -32768?
Is -32768 > 0?

You get the same answer for both since -(-32768) = -32768.
That would fall into the "opposite sign bit" category, which would correctly identify -32768 (0x8000) as less than 0 (0x0000).
Yep, you are correct. My bad.