Most of the reason to write C code isn't performance but maintainability.
I've rewritten the hash function of my AmigaE hash table class in Assembly. That cut out a lot of cruft, but most of the cruft was there only because E doesn't support bit rotations.
To run it on a non-Classic Amiga, such as an AROS system, I also had to write the code in PortablE. It generated some hacky-looking C++ code, but the GCC compiler knows how to convert a couple of shifts and an OR into a rotate internally. Now suddenly I don't have to worry about writing new Assembly code for my x86 AROS hosted environment on the Mac, nor for PPC AROS, nor for anything else.
It's a tradeoff that's becoming increasingly biased against hand-optimized code beyond what C can offer.
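
For anyone curious, the shifts-and-OR idiom mentioned above looks roughly like this in plain C. This is only a sketch, not my actual hash function or the code PortablE generated: the name `rotl32` and the 32-bit width are assumptions for illustration. With optimization enabled, GCC typically recognizes this pattern and emits a single rotate instruction on CPUs that have one.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original hash table code):
 * rotate a 32-bit value left by n bits using two shifts and an OR. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;                                  /* keep the count in 0..31 */
    /* Masking the right-shift count avoids an undefined shift by 32
     * when n is 0; in that case both halves are just x. */
    return (x << n) | (x >> ((32 - n) & 31));
}
```

Calling `rotl32(0x12345678u, 8)` gives `0x34567812`: the top byte wraps around to the bottom. The same source builds unchanged for 68k, x86 or PPC, which is the whole point.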