It's not particularly obvious how the ALU computes its various functions.
ybakos created a great worksheet to help understand the ALU.
As you observe, the ALU operation for x - y has nx, f, and no set, so the actual computation is
    out = NOT( (NOT x) + y)
You can use the definition of two's complement:
    -n = (NOT n) + 1
to algebraically prove that the ALU's computation is equivalent to x - y.
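If you would rather check the identity numerically before doing the algebra, here is a minimal Python sketch. It assumes the usual 16-bit Hack ALU with control bits zx, nx, zy, ny, f, and no, and it leaves out the zr and ng status outputs. The names alu, sub_via_alu, and check_sample are made up for this illustration and do not come from the book or the worksheet.

```python
MASK = 0xFFFF  # assumption: 16-bit words, as in the Hack ALU


def alu(x, y, zx, nx, zy, ny, f, no):
    """Sketch of the ALU data path on 16-bit words (zr/ng outputs omitted)."""
    if zx:
        x = 0
    if nx:
        x = ~x & MASK          # bitwise NOT, kept to 16 bits
    if zy:
        y = 0
    if ny:
        y = ~y & MASK
    out = (x + y) & MASK if f else (x & y)  # f selects add vs. AND
    if no:
        out = ~out & MASK
    return out


def sub_via_alu(x, y):
    """x - y with nx, f, and no set: out = NOT((NOT x) + y)."""
    return alu(x, y, zx=0, nx=1, zy=0, ny=0, f=1, no=1)


def check_sample(step=257):
    """Compare against ordinary subtraction modulo 2**16 on a sample of inputs."""
    for x in range(0, 1 << 16, step):
        for y in range(0, 1 << 16, step):
            if sub_via_alu(x, y) != (x - y) & MASK:
                return False
    return True


# Why it works: -n = (NOT n) + 1 means NOT n = -n - 1, so
#   NOT((NOT x) + y) = -((-x - 1) + y) - 1 = x - y
print(check_sample())
```

The sample check only steps through a subset of the 16-bit inputs to keep it quick; the algebra in the comment is what covers every case.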
It is quite elegant that such a simple structure as the TECS ALU can compute all the required functions.  A brute-force implementation of these functions would have been considerably more complex.
--Mark