I just recompiled DFF.class to behave properly, and it works as expected in the simulator:
| time | in | out |
|------|----|-----|
| 0+   | 0  | 0   |
| 1    | 0  | 0   |
| 1+   | 1  | 0   |
| 2    | 1  | 1   |
| 2+   | 0  | 1   |
| 3    | 0  | 0   |
| 3+   | 1  | 0   |
| 4    | 1  | 1   |
| 4+   | 1  | 0   |
| 5    | 0  | 0   |
| 5+   | 0  | 0   |
| 6    | 1  | 1   |
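For what it's worth, here is a minimal sketch of what that recompiled behaviour amounts to; I haven't seen the simulator's actual source, so the class and method names here are hypothetical:

```java
// Hypothetical sketch of a falling-edge-triggered DFF matching the
// table above: on the falling edge (t -> t+1), out is copied from the
// *current* in, rather than from a value latched at the rising edge.
public class FallingEdgeDFF {
    private int in;   // current input pin value
    private int out;  // current output pin value

    public void setIn(int value) { in = value; }
    public int getOut() { return out; }

    // Rising edge (time t -> t+): a conventional DFF would latch
    // 'in' here; this variant latches nothing.
    public void tick() { }

    // Falling edge (time t+ -> t+1): copy whatever 'in' is right now.
    public void tock() { out = in; }
}
```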
So the output now gets copied from the input at exactly the falling edge, rather than taking whatever the input was at the previous rising edge. Interestingly, when I use this DFF in my implementation of Bit, it behaves exactly as it did before:
| time | in | load | out |
|------|----|------|-----|
| 0+   | 0  | 0    | 0   |
| 1    | 0  | 0    | 0   |
| 1+   | 1  | 1    | 0   |
| 2    | 1  | 1    | 1   |
| 2+   | 0  | 1    | 1   |
| 3    | 1  | 1    | 0   |
| 3+   | 1  | 1    | 0   |
| 4    | 0  | 1    | 1   |
So the output is still being set at the falling edge to what the input was at the rising edge. The new DFF doesn't seem to affect any of the higher-order chips that depend on it. My first guess (without diving into the code) is that this has something to do with the order of calculations done by the simulator, i.e. the DFF's calculation is performed before its internal input pin, fed by the Mux, is updated.
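If that guess is right, the effect can be reproduced with a toy model of one clock step, where the only thing that changes is which part is evaluated first (all names here are made up for illustration, not the simulator's actual code):

```java
// Toy model of the evaluation-order hypothesis: if the simulator
// evaluates the DFF before the Mux that feeds it, the DFF clocks in
// the Mux's *stale* output from the previous step.
public class EvalOrderDemo {
    // Standard 2-way mux: sel == 0 selects a, sel == 1 selects b.
    static int mux(int a, int b, int sel) { return sel == 0 ? a : b; }

    public static void main(String[] args) {
        int muxOut = 0;  // internal pin: Mux output, feeds the DFF
        int dffOut = 0;
        int in = 1, load = 1;

        // One clock step with the DFF evaluated FIRST:
        dffOut = muxOut;                 // DFF sees the stale pin value (0)
        muxOut = mux(dffOut, in, load);  // pin is updated too late
        System.out.println(dffOut);      // prints 0, not 1

        // Same step with the Mux evaluated first:
        muxOut = mux(dffOut, in, load);  // pin updated before the DFF runs
        dffOut = muxOut;
        System.out.println(dffOut);      // prints 1
    }
}
```

In the first ordering the new input only reaches the output one step late, which is exactly the "set at the falling edge to what the input was at the rising edge" behaviour in the Bit table.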
I know I've been a bit of a nitpicker about all of this, but it's driven by my desire to deeply understand how all this stuff works. Eventually I'd like to build the HACK computer (or something similar) in real hardware. For now, I'm happy to accept the current behaviour as a quirk of the simulator.
Thanks for all your help and insight!!