It's actually ~3.1:1 — the numbers below work out to that. If you download the zip file, you'll see for yourself.
original data size =
samples * sizeof(word) =
256 * 2 =
512 bytes
compressed data size =
samples * sizeof(4bit) + table * sizeof(word) + startsample * sizeof(word) + header =
(256 * 0.5) + (16 * 2) + 2 + 2 =
164 bytes
=> ratio = 512/164 = 3.1219...
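The arithmetic above can be double-checked with a few lines of Python (sizes in bytes, matching the breakdown given):

```python
samples = 256
original = samples * 2                        # 256 16-bit words
compressed = samples // 2 + 16 * 2 + 2 + 2    # 4-bit indices + 16-entry word table
                                              # + start sample + header
ratio = original / compressed
print(original, compressed, round(ratio, 4))  # 512 164 3.122
```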
It isn't massively complex and it could certainly be implemented more simply than I did.
Complexity in the sense of information theory, not implementation-wise. Implementing VQ (at high complexity) is easy.
That's not actually true. In the stereo case, there is still only one delta table derived from both channels. The independent variation of left and right will produce a similar spread of delta values.
Yes (phase) and no (correlation).
If you encoded both independently, you would need two tables. But you use only one, which does not resolve the tiny differences between left and right. Actually, it is very likely that the stereo difference is killed completely, because it has to compete with the other deltas. So you are actually worse off than with two independent channels and two tables, which in turn is worse than two joint-stereo channels.
> the correlation of delta value spread isn't really affected by phase differences between the channels.
This is cool, didn't think of the phase difference. You got me here ;-)
(but only if the phase is shifted consistently across all sine waves, which is almost never the case in reality)
Experimentation with mid and side band encoding did not produce any real difference from a QSNR perspective.
Yes, not with this algorithm, because the side band suffers too much from competing against the mid channel. You would need two tables (they can be used together, but "trained" separately).
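A minimal sketch of that idea. The names `to_mid_side` and `train_table` are illustrative, not from the actual codec, and the "training" here is just a most-common-delta count; the point is only that the small side deltas get their own table instead of competing with the large mid deltas:

```python
from collections import Counter

def to_mid_side(left, right):
    # Joint-stereo transform: mid carries the common signal,
    # side carries the (usually small) stereo difference.
    mid  = [(l + r) // 2 for l, r in zip(left, right)]
    side = [(l - r) // 2 for l, r in zip(left, right)]
    return mid, side

def train_table(deltas, size=16):
    # Hypothetical stand-in for real table training:
    # just the `size` most common delta values.
    return [v for v, _ in Counter(deltas).most_common(size)]

left  = [0, 10, 20, 30, 40]
right = [0,  8, 18, 28, 38]
mid, side = to_mid_side(left, right)
mid_deltas  = [b - a for a, b in zip(mid, mid[1:])]
side_deltas = [b - a for a, b in zip(side, side[1:])]
# Two tables, trained separately, so the tiny side deltas
# are not crowded out of a single shared table.
mid_table  = train_table(mid_deltas)
side_table = train_table(side_deltas)
```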
My experience with ADPCM decode was that corrupt data in the compressed stream can (but won't necessarily) knock the decode out permanently from that point onwards. Explicit audio frames mean that at most only the remaining samples in the current frame will be corrupted.
Nobody stops you from putting ADPCM into chunks of N samples. If this is supposed to be a stream, you need to do that anyway, because otherwise you cannot join the stream at any position you want, which is the main point of being a "stream".
But putting ADPCM into chunks has zero overhead. You just reset the adaptation factor to something average every N samples. Of course, setting the factor to the best value is better and can be achieved by adding one extra byte per frame.
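A sketch of the chunked-decode idea, assuming a toy 4-bit delta scheme rather than any real ADPCM adaptation rule: resetting predictor and step to an average state at each frame boundary means a corrupt nibble can damage at most the rest of its own frame.

```python
FRAME = 64  # hypothetical chunk size N

def decode_chunked(nibbles, frame=FRAME):
    # Toy 4-bit delta decoder with per-frame state reset.
    out = []
    for start in range(0, len(nibbles), frame):
        pred, step = 0, 8  # reset adaptation to an average state
        for code in nibbles[start:start + frame]:
            delta = (code - 8) * step  # signed 4-bit code scaled by step
            pred += delta
            # crude adaptation: grow step on large codes, shrink on small
            step = max(1, step * 2 if abs(code - 8) >= 6 else step * 3 // 4)
            out.append(pred)
    return out
```

Corrupting a nibble in one frame leaves every later frame bit-identical, which is exactly the containment the explicit frames buy you.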