The clock period has to be long enough that all of the signals can propagate through the system and settle well before the next rising clock edge.
Part of your problem might be hinted at here:
1st Clock Phase (START)
@3 binary representation (0000000000000011) enters the in-pin of A Register
Since @3 is an A-Instruction, A register will be loaded with the input but will yield the output during the start of the next clock phase
The A register (like all of the memory elements) loads the data at its input on the rising clock edge and, once loaded, it is immediately transferred (subject to propagation delay through the device) to the output.
So during the 1st clock period the value 3 is applied to the input of the A-register, but it is NOT loaded at this time because that requires a rising clock edge. Similarly, the instruction decode logic asserts the Load input on the A-register, but nothing happens yet.
On the next rising clock edge the value at the input gets loaded and appears almost instantly at the output. That same clock edge also advances the program counter, which changes the instruction address so that the next instruction becomes available, and that in turn may change the values at the inputs of the A-register. As long as the old values were held valid long enough for the load operation to complete, we don't care. This is why we need the propagation delay through the sequential elements to be longer than the hold time.
1st Clock Phase (END) == 2nd Clock Phase (START)
1. A-Instruction goes out of the A-Register
2. addressM pin will be loaded with the A-Instruction
3. C-Instruction (M=M+1 == 1111110111001000) enters A-Register in-pin
(from my understanding these three processes above happen simultaneously)
The C-instruction includes the control signal that configures the Mux that's ahead of the A-register to change so that the input to the A-register comes from the data ram and NOT the instruction.
See if you can use this new knowledge to work out the rest of it. If you still have problems, post an updated description as you currently see it and we'll go from there.
First, I did mess up part of my description. The instruction M=M+1 doesn't actually specify how the Mux ahead of the A register is configured since it doesn't matter (the A register's Load is LO). Instead it configures the Mux ahead of the Y input to the ALU to take the data from the data RAM and also enables the writeM bit.
The biggest problem with your description is the notion that the outputs from one clock cycle don't reach the input pins of the destination chips until the beginning of the next clock cycle. The outputs are connected to the inputs by wires, and signals propagate down those wires at close to the speed of light (in a typical wire, this can be taken to be between half and two-thirds of the speed of light in a vacuum). So figure that the signal will go from the output of one chip to the input of another chip in about one nanosecond if the wire connecting them is roughly six to eight inches long. Within the CPU the distances are a thousand times smaller than that (or more). So unless we are talking about multi-gigahertz clock frequencies, the signals arrive at the inputs of the destination chip very early in the clock period. But it doesn't matter, because the rising clock edge has already occurred (it's what started the process by which the signal is changing to begin with), and so the signals just sit there at the chip inputs with nothing happening until the next rising clock edge.
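As a quick sanity check on the "six to eight inches in about one nanosecond" figure, here is a small sketch. The function name inches_per_ns is just something made up for this illustration; the only inputs are the speed of light in a vacuum and the half to two-thirds range mentioned above.

```python
# Rough check of the "one nanosecond for six to eight inches" estimate.
C_VACUUM_M_PER_S = 299_792_458   # speed of light in a vacuum
METERS_PER_INCH = 0.0254


def inches_per_ns(velocity_factor):
    """Distance (inches) a signal covers in 1 ns at the given fraction of c."""
    meters = C_VACUUM_M_PER_S * velocity_factor * 1e-9
    return meters / METERS_PER_INCH


# Half and two-thirds of c, the range assumed for a typical wire.
for factor in (0.5, 2 / 3):
    print(f"{factor:.2f} c -> {inches_per_ns(factor):.1f} inches per ns")
```

Both ends of the range land close to the six to eight inch estimate, so a wire of that length costs on the order of a nanosecond.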
It might help you to draw a diagram in which your various points are spread out horizontally instead of trying to depict them climbing vertically up the clock edge. To make it simple, draw all of your edges such that they take, say, four time units to change (so they are steep, slanted lines instead of vertical ones). Then assume that the output of a gate starts changing six time units after the input causing the change started changing. Sketch out a few situations and things will likely become a lot clearer.
Do that and then we can talk about setup and hold times and how they relate to maximum clock frequency.
Just before the rising clock edge, everything in the CPU is being controlled by the current instruction; but the outputs of all memory elements are what was stored in those memory elements on the prior rising clock edge. The new inputs for those memory elements are sitting at the input pins, but they can't be stored until the rising clock edge actually happens.
The key is that the instruction bits only configure the CPU elements -- they do NOT cause the contents of any memory element to change. That ONLY happens on the rising clock edge. That includes advancing to the next instruction -- that happens on the rising clock edge and it takes a small amount of time for the values on the instruction pins of the CPU to actually change once the clock rising edge occurs. During that time, the CPU is still being controlled by the old instruction and so the values actually being applied to the data inputs of the memory elements are still the values determined by the old instruction and these are being stored into any selected memory elements at the same time that the new instruction is making its way through the read process from the instruction ROM. By the time that instruction is available at the CPU pins, the write process is complete for all of the other memory elements and they are no longer sensitive to the values at any of their input pins until the next rising clock edge.
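If it helps, here is a toy model of that idea. It is only a sketch: the Register class and its field names are invented for this illustration and are not part of the nand2tetris tools, and all propagation delays are ignored. The point is simply that setting the inputs and the Load bit changes nothing until rising_edge() is called.

```python
class Register:
    """Toy edge-triggered register: inputs are only sampled in rising_edge()."""

    def __init__(self, value=0):
        self.out = value      # what the rest of the CPU sees
        self.data_in = 0      # value sitting at the input pins
        self.load = False     # Load bit, set by the instruction decode logic

    def rising_edge(self):
        # The only place where the stored value can change.
        if self.load:
            self.out = self.data_in


a_reg = Register()

# During the first clock period, @3 drives the input and asserts Load.
a_reg.data_in = 3
a_reg.load = True
before_edge = a_reg.out        # still the old contents

a_reg.rising_edge()            # the value is stored here
after_edge = a_reg.out

# M=M+1 is now the current instruction: Load is LO, so whatever happens
# to be on the input pins (arbitrary value here) is ignored at the edge.
a_reg.data_in = 12345
a_reg.load = False
a_reg.rising_edge()
after_second_edge = a_reg.out

print(before_edge, after_edge, after_second_edge)  # prints: 0 3 3
```

Note that the instruction only ever touches data_in and load; the stored value moves only inside rising_edge().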
Understanding setup and hold times and propagation delay will really help cement this concept in place. So let's discuss them.
The setup time of a flip flop is the amount of time prior to the rising clock edge that the data on all of the input pins (data pins and control pins) has to be stable and unchanging in order to guarantee that the value stored will be correct.
The hold time of a flip flop is the amount of time after the rising clock edge that the data on all of the input pins (data pins and control pins) has to be stable and unchanging in order to guarantee that the value stored will be correct.
In essence, there is a window of time that spans the rising clock edge in which none of the chip's inputs can be changing. If any of them change during that time, then there is no guarantee what value(s) will actually get stored in the flip flop.
So let's put some numbers to this. Let's say that our flip flops have a setup time of 30 ns and a hold time of 20 ns.
Propagation delay is the amount of time it takes, from the clock rising edge, for the new data to appear at the output of the flip flop. Let's say that this is 40 ns.
Finally, let's pick a clock frequency of 10 MHz, meaning that the period of time from one rising clock edge to the next is 100 ns.
So let's align the first rising clock edge with t = 0 on our timeline. Since we don't know what the prior instruction was, we don't know what data is going to get stored into the FFs on that first rising clock edge (at t = 0). Nor do we know what the values on the instruction bus will be for the first 40 ns because that data is making its way through the instruction ROM logic. After the instruction finally appears on the instruction bus, the various multiplexers and other combinatorial logic respond. Let's say that this takes 10 ns. So at t = 50 ns everything is stable and nothing further will happen until the next rising clock edge. Since that will happen at t = 100 ns and since our setup time is 30 ns, we are good because things can change all the way up to t = 70 ns without violating the setup time requirement. What's particularly important to recognize is that the outputs of all of the memory elements are going to reflect what was stored as a result of the prior instruction, which is what we want.
Now the second rising clock edge occurs at t = 100 ns. At this point we store new values in all of the memory elements and, in order for that to happen reliably, the inputs need to remain unchanging for at least 20 ns (the hold time requirement). But since the propagation delay of our flip flops (40 ns) is longer than that, we are guaranteed that this will be the case.
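The arithmetic for this example can be laid out as a short sketch, using the numbers introduced above (setup 30 ns, hold 20 ns, propagation delay 40 ns, 10 ns for the combinatorial logic, 100 ns period). The variable names are made up for the illustration, and the maximum-frequency line goes one step beyond what has been worked through here: it assumes this one path is the slowest in the whole design.

```python
SETUP_NS = 30       # setup time of the flip flops
HOLD_NS = 20        # hold time of the flip flops
CLK_TO_Q_NS = 40    # propagation delay from clock edge to new output
LOGIC_NS = 10       # time for the muxes and other logic to settle
PERIOD_NS = 100     # 10 MHz clock

# Setup check: inputs must be stable SETUP_NS before the next edge.
stable_at = CLK_TO_Q_NS + LOGIC_NS          # 50 ns after the edge
setup_deadline = PERIOD_NS - SETUP_NS       # 70 ns after the edge
setup_slack = setup_deadline - stable_at    # 20 ns to spare

# Hold check: nothing can start changing before the propagation delay has
# elapsed, so the inputs stay put for at least CLK_TO_Q_NS after the edge.
hold_margin = CLK_TO_Q_NS - HOLD_NS         # 20 ns to spare

# Shortest period that still meets setup (assumes this is the slowest path).
min_period = stable_at + SETUP_NS           # 80 ns
max_freq_mhz = 1000 / min_period            # 12.5 MHz

print(f"setup slack: {setup_slack} ns, hold margin: {hold_margin} ns")
print(f"min period: {min_period} ns, max clock: {max_freq_mhz} MHz")
```

Both margins come out positive, which is just another way of saying that a 100 ns period is comfortable for these made-up numbers.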