Fast Fourier Transform (FFT)
The FFT is a core algorithm in music ML pipelines because it converts waveform samples into frequency-domain structure efficiently.
Discrete Fourier Transform
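For a sequence $x[n]$ of length $N$:
$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}, \quad k = 0, \dots, N-1$$
Inverse transform:
$$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{j 2\pi k n / N}, \quad n = 0, \dots, N-1$$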
FFT algorithms compute the same result in $O(N \log N)$ operations instead of the $O(N^2)$ required by direct evaluation.
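The gap between direct evaluation and an FFT can be illustrated with a minimal sketch. It assumes a power-of-two input length and uses a recursive radix-2 Cooley-Tukey split, which is only one of several FFT variants. The function names `dft_naive`, `fft_radix2`, and `ifft_radix2` are illustrative and do not come from any library.

```python
import cmath


def dft_naive(x):
    """Direct evaluation of the DFT sum: O(N^2) operations."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


def fft_radix2(x):
    """Recursive radix-2 FFT: O(N log N). Assumes len(x) is a power of two."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of two")
    if N == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])  # DFT of even-indexed samples
    odd = fft_radix2(x[1::2])   # DFT of odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + N // 2] = even[k] - t
    return out


def ifft_radix2(X):
    """Inverse transform via conjugation: x = conj(FFT(conj(X))) / N."""
    N = len(X)
    y = fft_radix2([complex(v).conjugate() for v in X])
    return [v.conjugate() / N for v in y]


signal = [0.0, 1.0, 0.0, -1.0, 0.5, 0.25, -0.5, -0.25]
slow = dft_naive(signal)
fast = fft_radix2(signal)
max_err = max(abs(a - b) for a, b in zip(slow, fast))
roundtrip = ifft_radix2(fast)
```

Both functions return the same coefficients up to floating-point rounding, and the inverse recovers the input. In practice a pipeline would call an optimized library routine rather than a recursive Python implementation.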
STFT for Non-Stationary Music
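Music changes over time, so systems use the short-time Fourier transform:
$$X[m,k] = \sum_{n=0}^{N-1} x[n + mH]\, w[n]\, e^{-j 2\pi k n / N}$$
where $m$ is the frame index, $H$ is the hop size, and $w[n]$ is a length-$N$ window (for example Hann).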
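A framed, windowed transform of this kind might be sketched with NumPy as follows. The frame length `n_fft`, the hop `hop`, the 16 kHz sample rate, and the 440 Hz test tone are assumptions chosen for illustration. The sketch does no padding or centering, so trailing samples that do not fill a whole frame are dropped.

```python
import numpy as np


def stft(x, n_fft=1024, hop=256):
    """Return X[m, k] with frame index m and frequency bin k in 0..n_fft//2."""
    x = np.asarray(x, dtype=float)
    if len(x) < n_fft:
        raise ValueError("signal is shorter than one frame")
    # Periodic Hann window: w[n] = 0.5 - 0.5 * cos(2*pi*n / N)
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n_fft) / n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    # Frame m covers samples m*hop .. m*hop + n_fft - 1
    frames = np.stack([x[m * hop : m * hop + n_fft] for m in range(n_frames)])
    # rfft keeps only the non-negative frequencies, enough for real input
    return np.fft.rfft(frames * window, axis=-1)


sr = 16000                                 # assumed sample rate in Hz
t = np.arange(sr) / sr                     # one second of audio
tone = np.sin(2 * np.pi * 440.0 * t)       # 440 Hz test tone
X = stft(tone, n_fft=1024, hop=256)
peak_bin = int(np.argmax(np.abs(X[0])))    # strongest bin in the first frame
peak_hz = peak_bin * sr / 1024             # bin index converted to Hz
```

The strongest bin lands within one bin width (`sr / n_fft`, about 15.6 Hz here) of the tone's frequency. A smaller hop gives finer time resolution at the cost of more frames.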
Spectral Features Used by Models
Power spectrogram, with $X[m,k]$ the STFT value at frame $m$ and frequency bin $k$:
$$P[m,k] = |X[m,k]|^2$$
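Log-magnitude representation (of the STFT magnitude $|X[m,k]|$):
$$L[m,k] = \log\left(|X[m,k]| + \epsilon\right)$$
where a small constant $\epsilon > 0$ avoids $\log 0$ in silent bins.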
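Both features follow directly from the complex STFT values, as this short NumPy sketch suggests. The helper name `spectral_features`, the stabilizing constant `eps`, and the random-noise input are assumptions for illustration. Real systems often use a different floor or a decibel scale.

```python
import numpy as np


def spectral_features(X, eps=1e-10):
    """Return (power, log_magnitude) for complex STFT values X[m, k]."""
    mag = np.abs(X)
    power = mag ** 2                 # squared magnitude
    log_mag = np.log(mag + eps)      # eps keeps the log finite at zero magnitude
    return power, log_mag


# Hypothetical input: four frames of white noise, Hann-windowed
rng = np.random.default_rng(0)
frames = rng.standard_normal((4, 512))
X = np.fft.rfft(frames * np.hanning(512), axis=-1)
power, log_mag = spectral_features(X)
```

The log compresses the wide dynamic range of audio spectra, which usually makes the feature easier for a model to fit than raw power.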
These features are widely used for:
- spectrogram diffusion training
- embedding encoders
- pitch and onset conditioning signals
- quality metrics based on spectral distance
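
As one illustration of the last item, a spectral distance between a reference and an estimated magnitude spectrogram might be computed as below. The sketch uses log-spectral distance, which is only one possible metric and is not necessarily the one a given system uses. The function name, the `eps` floor, and the random test data are assumptions.

```python
import numpy as np


def log_spectral_distance(mag_ref, mag_est, eps=1e-10):
    """Mean over frames of the RMS difference between log-power spectra, in dB."""
    log_ref = 10.0 * np.log10(mag_ref ** 2 + eps)
    log_est = 10.0 * np.log10(mag_est ** 2 + eps)
    per_frame = np.sqrt(np.mean((log_ref - log_est) ** 2, axis=-1))
    return float(np.mean(per_frame))


# Hypothetical magnitude spectrograms with shape (frames, bins)
rng = np.random.default_rng(0)
ref = np.abs(rng.standard_normal((5, 257))) + 0.1
same = log_spectral_distance(ref, ref)           # identical inputs
scaled = log_spectral_distance(ref, 2.0 * ref)   # estimate with doubled amplitude
```

Identical spectrograms give a distance of zero, while doubling every magnitude gives roughly 6.02 dB, the decibel equivalent of a factor of two in amplitude.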