What's the difference between doing that and the TCP/IP hardware offloading provided by some server hardware?
From the hardware point of view, there are chips that do exactly that for you, in case you want to easily adapt an FPGA to an Ethernet connection (like the WIZnet family of Ethernet chips).
My experience here is that you lose in speed, but gain in ease of use (as FPGAs don't have CPUs, in most cases).
Where you *really* can gain in TCP/IP speed is checksum checks and calculations, as those operations are tailored for serial data streams in hardware, and CPUs have no chance to keep up.
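To make the checksum point concrete, here is a minimal sketch of the 16-bit one's-complement checksum used by IP, TCP and UDP (as described in RFC 1071). The function name `internet_checksum` is my own choice for illustration and is not taken from any particular stack. In software this is a loop over every 16-bit word of the packet; in hardware the same sum can be accumulated while the data streams through.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over data (RFC 1071 style)."""
    # Pad odd-length input with a zero byte so it splits into 16-bit words.
    if len(data) % 2:
        data += b"\x00"

    total = 0
    for i in range(0, len(data), 2):
        # Combine two bytes into one big-endian 16-bit word and accumulate.
        total += (data[i] << 8) | data[i + 1]

    # Fold any carries above bit 15 back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    # The checksum is the one's complement of the folded sum.
    return ~total & 0xFFFF


if __name__ == "__main__":
    payload = bytes.fromhex("0001f203f4f5f6f7")
    print(hex(internet_checksum(payload)))
```

A receiver can verify a packet by running the same function over the data with the checksum field included; a result of 0 means the checksum matches.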
But these things are already done by the DENEB USB controller chip. In fact, there are solutions like the one you want to build (e.g. from FTDI), but they are limited to USB 1.1 and offer no real performance. On top of that, the restricted RAM and ROM of the uC mean you cannot support all classes at once, and you are bound to the USB stack supplied by the IC manufacturer.
Michael