Why keep it simple when it can be done the complicated way...
Because the mul&div64 routines are used a lot:
1/ A direct replacement will be faster than going through MuRedox; every cycle counts in this case
2/ The current integer replacement for mul&div64 is a bit slow; an FPU version seems even faster
3/ Another goal, secret for the moment, maybe for v0.5
Again, it's not a patch/hack, but a complete compilation with gcc 2.95.3-4
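
To give an idea of what an integer replacement has to do, here is a rough sketch in C of a 32x32 -> 64 bit unsigned multiply built only from 32-bit operations. It assumes that "mul&div64" means the 64-bit multiply/divide forms, and it is not the code used in this build: the names `u64_parts` and `mulu64_int` are made up for the example.

```c
#include <stdint.h>

/* Hypothetical result type: a 64-bit value as two 32-bit halves. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} u64_parts;

/* Hypothetical helper: 32x32 -> 64 unsigned multiply using only
   32-bit arithmetic (schoolbook method on 16-bit halves). */
static u64_parts mulu64_int(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t p0 = a_lo * b_lo;   /* contributes to bits  0..31 */
    uint32_t p1 = a_lo * b_hi;   /* contributes to bits 16..47 */
    uint32_t p2 = a_hi * b_lo;   /* contributes to bits 16..47 */
    uint32_t p3 = a_hi * b_hi;   /* contributes to bits 32..63 */

    /* Middle column: at most 0xFFFE0001 + 0xFFFF + 0xFFFF = 0xFFFFFFFF,
       so this sum cannot overflow 32 bits. */
    uint32_t mid = (p0 >> 16) + (p1 & 0xFFFFu) + p2;

    u64_parts r;
    r.lo = (mid << 16) | (p0 & 0xFFFFu);
    r.hi = p3 + (mid >> 16) + (p1 >> 16);
    return r;
}
```

That is four multiplies plus a handful of shifts and adds for a single 64-bit product, which gives a feel for why the integer path costs cycles and why an FPU version could be worth trying.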