Most of the tests are fairly simple and only check for gross errors.
For example, the ALU test checks that each of the documented functions generates correct results for two different pairs of values. It assumes that the underlying parts are correct. It does not test that your implementation handles the control bits exactly as specified in all combinations.
Some of the tests, particularly the Xxx16 parts where the HDL has many copies of the same line but with different bus indices, could do a better job of testing that every one of the indices is correct.
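To make the index point concrete, here is a rough Python sketch (not HDL, and not one of the supplied test scripts). It models a Not16 with a made-up copy-paste slip where out[9] is driven from in[8], and compares a sparse pair of test vectors against a walking-bit set. The names not16_buggy, sparse and walking are only illustrative assumptions.

```python
MASK = 0xFFFF  # 16-bit bus

def not16_good(x):
    """Reference behaviour: invert all 16 bits."""
    return ~x & MASK

def not16_buggy(x):
    """Hypothetical slip: out[9] = Not(in[8]) instead of Not(in[9])."""
    out = ~x & MASK
    bit8 = (x >> 8) & 1
    out &= ~(1 << 9) & MASK      # clear out[9]
    out |= (bit8 ^ 1) << 9       # drive out[9] from the wrong input bit
    return out

def passes(chip, vectors):
    """True if the chip matches the reference on every vector."""
    return all(chip(v) == not16_good(v) for v in vectors)

# Sparse vectors: every input bit has the same value, so a swapped index is invisible.
sparse = [0x0000, 0xFFFF]

# Walking one and walking zero: each bit differs from its neighbours in turn.
walking = [1 << i for i in range(16)] + [MASK ^ (1 << i) for i in range(16)]

print(passes(not16_buggy, sparse))   # True
print(passes(not16_buggy, walking))  # False
```

A walking-bit pattern costs 32 vectors per 16-bit input, which is why it catches a single wrong index that uniform patterns cannot.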
So is it possible to write an HDL program that passes the tests but is not a correct implementation of the chip?
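It is easy to do this intentionally. It is less likely to happen accidentally if all the underlying parts are good, with one exception: the tests do not check that zr takes every out bit into account. As an intentional example, a line like this in the ALU leaves the upper eight bits of the Not16 input unconnected: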
Not16(in[0..7]=x[0..7], out=notx);
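A quick Python model of why a line like that can slip through. It rests on two assumptions that are worth checking against your own setup: that unconnected input pins read as 0 in the simulator, and that the test only uses x values whose upper byte is zero (0 and 17 are used here as stand-ins).

```python
MASK = 0xFFFF  # 16-bit bus

def not16(x):
    """Correct wiring: all 16 bits of x reach the inverter."""
    return ~x & MASK

def not16_low_byte_only(x):
    """Models Not16(in[0..7]=x[0..7], out=notx).

    ASSUMPTION: in[8..15] are unconnected and read as 0.
    """
    return ~(x & 0x00FF) & MASK

# 0 and 17 have an all-zero upper byte; 0x0100 (256) does not.
for x in (0, 17, 0x0100):
    print(x, not16_low_byte_only(x) == not16(x))
# 0 True
# 17 True
# 256 False
```

Any x with a bit set in positions 8 to 15 exposes the difference, so adding a single such value to the test would be enough to catch it.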
--Mark