I don't know that much about the architectural differences between the 040 and the 060EC. As for your question, "which would run programs faster", it depends entirely on what sort of programs you intend to run, and in particular on whether they lean more heavily on integer performance or on floating-point math.
In general, I think an FPU is needed for many of the things average users ask of their hardware, so from that point of view I'd initially be more inclined to go for the 040 with FPU.
However, the 060 is generally acknowledged to be very much better than the 040. Whether that still holds for an 060 without an FPU, I can't say. Two definite differences are that the 060 is generally clocked 10 or 20MHz higher than the 040, and that it doesn't require a heatsink/fan, whereas the 040 does (not a very big one, but still).