Modern computers are hybrid architectures that have built-in, hardware-assisted stack operations as well as the more common register/register, register/memory, and memory/memory operations.
The VM's stack machine is just an abstraction that simplifies the design of the compiler and supplies a common point to select between code generators for different target processors.
There is no requirement for the generated code to adhere to the stack model. Here are some examples from my optimizing VM translator. The comments show the Jack source code and the VM code that was generated by my Jack compiler. The generated ASM for these source lines completely ignores the stack.
/// 38: let iterations = 31;
// push constant 31
// pop static 0 // iterations
@31
D=A
@Cordic.0
M=D
/// 151: let i = i+1;
// push local 0 // i
// push constant 1
// add
// pop local 0 // i
@LCL
A=M
M=M+1
/// 461: let msw = msw|MSB;
// push this 0 // msw
// push static 0 // MSB
// or
// pop this 0 // msw
@Float.0
D=M
@THIS
A=M
M=M|D
My translator does this sort of optimization by preprocessing the VM code into a somewhat smarter VM code that includes commands like "move constant 31 static 0".
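To make the idea concrete, here is a minimal sketch of what such a preprocessing pass could look like. It is not the actual translator: only the "move" command name comes from the description above, while the function names, the two-line matching window, and the assumption that comments have already been stripped from the VM lines are all made up for illustration.

```python
# Hypothetical peephole pass, not the real translator.
# Assumes each VM line is a bare command with comments already removed.

def fuse_push_pop(vm_lines):
    """Fuse each adjacent 'push seg i' / 'pop seg j' pair into one 'move'."""
    out = []
    i = 0
    while i < len(vm_lines):
        cur = vm_lines[i].split()
        nxt = vm_lines[i + 1].split() if i + 1 < len(vm_lines) else []
        if len(cur) == 3 and len(nxt) == 3 and cur[0] == "push" and nxt[0] == "pop":
            # e.g. "move constant 31 static 0"
            out.append(f"move {cur[1]} {cur[2]} {nxt[1]} {nxt[2]}")
            i += 2
        else:
            out.append(vm_lines[i])
            i += 1
    return out


def emit_move_constant_to_static(command, class_name):
    """Translate 'move constant N static I' into Hack assembly lines."""
    op, src_seg, value, dst_seg, index = command.split()
    if (op, src_seg, dst_seg) != ("move", "constant", "static"):
        raise ValueError(f"unsupported command: {command}")
    # Load the constant into D, then store it directly; no stack traffic.
    return [f"@{value}", "D=A", f"@{class_name}.{index}", "M=D"]


fused = fuse_push_pop(["push constant 31", "pop static 0"])
asm = emit_move_constant_to_static(fused[0], "Cordic")
```

A two-line window only catches the simplest case. Patterns like "let i = i+1" span four VM commands, so a real pass would need a wider window or a small expression matcher.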
As to why we use stack-based programming, it naturally fits the concept of breaking larger jobs down into smaller and smaller subtasks, and keeping track of the data required for all the subtasks.
Part of the confusion is the overloading of the term VM. It's used as this intermediate abstraction that doesn't exist outside a compiler, as a cross-platform portability aid as in the JVM, as a way to run working but no longer supported MS operating systems on modern broken MS OSes (ahem!), as one of many protected environments running on a host server...
--Mark