CopyMemQuick() only pays off when you are copying small buffers (16 bytes or less), but if you want really good performance for such small copies, it is better to inline the copy.
Large buffers... But actually, yes, compilers do that. If you run SAS/C with a high optimization setting, it will inline memory copies if the number of bytes to be copied is small enough. Otherwise, it performs its own copy. gcc does the same. It shouldn't be overly hard to tell the compiler to use CopyMem() if applicable, which in turn calls CopyMemQuick() when applicable.
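To make the compiler-side copy concrete, here is a minimal C sketch. The structure and function names are made up for illustration, and whether the copy actually gets inlined depends on the compiler and its optimization setting.

```c
#include <string.h>

/* Hypothetical small structure: 16 bytes on a 32-bit target such as the 68k. */
struct Point4 {
    long x, y, z, w;
};

/* Plain structure assignment. The size is known at compile time, so the
   compiler is free to emit a few inline moves instead of a function call. */
void copy_by_assignment(struct Point4 *dst, const struct Point4 *src)
{
    *dst = *src;
}

/* memcpy() with a constant size gives the compiler the same information. */
void copy_by_memcpy(struct Point4 *dst, const struct Point4 *src)
{
    memcpy(dst, src, sizeof(*dst));
}
```

Both functions produce the same result. Which instructions the compiler emits for them is something you would have to check in the generated assembly.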
For the "generic" type of memory copy you need - copy a structure from A to B - the compiler approach is quite fine. There's not much to be gained to begin with. If your program copies a lot of data, or copies quite frequently (e.g. you're writing a device and need to copy the user data to a DMA buffer), then you're most likely better off calling CopyMemQuick() manually (if you can, i.e. proper alignment given) instead of depending on the compiler. Then again, if you *know the size* of the block in advance, you're probably *even better off* writing your own memory copy in a small assembler stub and calling that instead.
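The "if you can" part can be sketched roughly as follows. CopyMemQuick() requires longword-aligned source and destination pointers and a size that is a multiple of four. The helper names below are hypothetical, and the two stand-in functions only exist so the sketch builds outside AmigaOS. On the Amiga you would call CopyMemQuick(src, dst, size) and CopyMem(src, dst, size) from exec.library instead, and an older compiler without <stdint.h> would need a ULONG cast in place of uintptr_t.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if both pointers are longword (4-byte) aligned and the size is a
   multiple of 4, i.e. the preconditions of CopyMemQuick() are met. */
int can_copy_quick(const void *src, const void *dst, size_t size)
{
    return (((uintptr_t)src | (uintptr_t)dst | (uintptr_t)size) & 3u) == 0;
}

/* Stand-ins (assumption): replace with CopyMemQuick() and CopyMem() on AmigaOS. */
static void quick_copy_standin(const void *src, void *dst, size_t size)
{
    memcpy(dst, src, size);
}

static void general_copy_standin(const void *src, void *dst, size_t size)
{
    memcpy(dst, src, size);
}

/* Hypothetical helper: copies size bytes and returns 1 if the quick path
   was taken, 0 if it had to fall back to the general copy. */
int copy_to_dma_buffer(const void *src, void *dst, size_t size)
{
    if (can_copy_quick(src, dst, size)) {
        quick_copy_standin(src, dst, size);
        return 1;
    }
    general_copy_standin(src, dst, size);
    return 0;
}
```

In a real device you would usually arrange for the buffers to be aligned up front, so the check becomes a one-time sanity test rather than a per-copy branch.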
Thus, after all, CopyMem() and CopyMemQuick() are unfortunately only half as useful as they may seem. If you need a really good memory copy, you're probably writing your own for your specific application anyhow, and if you're happy with the "Ford Escort" of memory copy, just use the one supplied by the compiler.