motorollin wrote:
Karlos wrote:
When doing a full transform, the decomposition produces a spectrum of frequencies. When you reproduce the sine waves with the appropriate intensity, frequency and phase shift, the complex constructive/destructive interaction between them (re)produces the time-varying properties you heard in the original sound.
That seems to contradict what you said earlier, unless I misunderstand this:
Karlos wrote:
However, complex sounds obviously change with time. Translation into the frequency domain produces time-invariant data, so what you have to do is split your sound into tiny windows and perform a discrete FFT on each one. Each frame of sound will have a different spectrum that you can turn back into an impulse of sound.
It sure does sound that way, but that's because I wasn't being entirely clear. You have the mathematical ideal on one hand and computational reality on the other.
In the pure mathematical case, a finite duration signal of arbitrary complexity (that varies across time) can be broken down into a spectrum composed of (up to) an infinite number of discrete sine waves, each having a distinct frequency, amplitude and phase shift. These sine waves do not vary over time (beyond their regular oscillation), i.e., they don't have an envelope and they don't slowly change phase or frequency. Therefore, this spectrum is not a function of time in the same way the original signal was.
Now, if you take this (potentially infinite) number of sine waves and mix them back together, the interference between them reproduces the original signal for the duration of that original signal. Mathematically, you cannot do this for an infinitely long random signal, since you'd require an infinitely large spectrum, which is incalculable.
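To make the "break it down, then mix it back together" idea concrete, here is a minimal sketch in plain Python. It uses the discrete transform (N samples in, N frequency bins out) rather than the continuous mathematical ideal, and the decaying test tone, the function names and the sizes are all my own assumptions for illustration.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform: one complex coefficient per bin."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def inverse_dft(spectrum):
    """Mix the fixed sinusoids back together to rebuild the samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)).real / n
        for t in range(n)
    ]

# Hypothetical test signal: a tone whose loudness decays over time.
n = 64
signal = [math.exp(-t / 16) * math.sin(2 * math.pi * 5 * t / n) for t in range(n)]

spectrum = dft(signal)          # fixed amplitudes and phases, no envelope
rebuilt = inverse_dft(spectrum)

# Largest difference between the original and the rebuilt samples.
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
print(max_error)
```

None of the coefficients in `spectrum` changes over time, yet summing the corresponding sinusoids brings the decaying envelope back, up to floating point rounding.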
In computational reality, you simply cannot do this for a long sound in one go. The longer your complex sound, the larger your resulting spectrum is going to get and the longer it's going to take to calculate (you're looking at O(N log N) for an FFT or O(N*N) for a 'simple' FT).
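As a rough feel for those two growth rates, the sketch below just evaluates N*N and N log2 N for a few lengths. These are order-of-magnitude estimates that ignore constant factors, not measured timings, and the sample counts are arbitrary picks of mine.

```python
import math

def naive_dft_operations(n):
    """Rough operation count for the 'simple' transform: O(N*N)."""
    return n * n

def fft_operations(n):
    """Rough operation count for an FFT: O(N log N), using log base 2."""
    return n * math.log2(n)

# Arbitrary power-of-two signal lengths, chosen only for illustration.
for n in (1024, 65536, 1048576):
    naive = naive_dft_operations(n)
    fast = fft_operations(n)
    print(n, naive, int(fast), round(naive / fast, 1))
```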
Instead, you slice up your sound into frames, each of which is much shorter, and perform the (F)FT on each one individually. That way you get a much more reasonable (and hence manageable) spectrum per frame than you would for the entire original signal.
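The framing step can be sketched as follows. This assumes non-overlapping frames and no window function, which keeps the example short; practical implementations usually overlap the frames and apply a window to each. The test signal, which jumps from one pitch to another halfway through, and the helper names are hypothetical.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of the lower half of the naive DFT bins for one frame."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]

def peak_bin_per_frame(signal, frame_size):
    """Split the signal into consecutive frames and find each frame's loudest bin."""
    peaks = []
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        mags = dft_magnitudes(signal[start:start + frame_size])
        peaks.append(max(range(len(mags)), key=mags.__getitem__))
    return peaks

frame_size = 32
# Two frames of a tone with 4 cycles per frame, then two frames with 8 cycles.
low = [math.sin(2 * math.pi * 4 * t / frame_size) for t in range(2 * frame_size)]
high = [math.sin(2 * math.pi * 8 * t / frame_size) for t in range(2 * frame_size)]
signal = low + high

peaks = peak_bin_per_frame(signal, frame_size)
print(peaks)  # [4, 4, 8, 8]
```

Each frame's spectrum is still time-invariant on its own; the change over time shows up only as the difference between one frame's spectrum and the next.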
The only time you can reasonably encode a long duration sound in one FT is when that sound is basically time-invariant (or varies in a cyclic way), since you get a manageable spectrum out of it.
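A quick way to see this is to count how many bins carry any real energy. The sketch below compares a steady tone with a decaying one of the same pitch; the signals, the threshold and the function name are assumptions of mine, and the naive transform is used only to keep the example dependency-free.

```python
import cmath
import math

def significant_bins(signal, threshold=1e-6):
    """Count DFT bins whose normalised magnitude exceeds the threshold."""
    n = len(signal)
    count = 0
    for k in range(n):
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        if abs(coeff) / n > threshold:
            count += 1
    return count

n = 128
# A steady tone that fits exactly 8 cycles into the whole signal.
steady = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
# The same tone with a decaying envelope, so it changes over time.
changing = [math.exp(-t / 16) * math.sin(2 * math.pi * 8 * t / n) for t in range(n)]

steady_count = significant_bins(steady)      # 2: the tone and its mirror bin
changing_count = significant_bins(changing)  # energy spread over many bins
print(steady_count, changing_count)
```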
Is that any clearer?