Human Speech Spectrum, Frequency Range, Formants
Long-term average speech spectrum (talking over a period of one minute). The maximum energy lies in the 250Hz and 500Hz bands. These lower-frequency bands correspond to vowel sounds, while the higher-frequency bands in the 2kHz and 4kHz region correspond to consonant sounds.
|typical human male speech spectrum|
Frequency Range and Speech Formants
|frequency range of human voice, speech formants and typical loudspeaker crossover frequencies|
The speaking and singing voice is characterized by its formants. Speech formants are areas within the frequency spectrum that carry more energy than the average. Each individual voice is characterized by these boosted areas in its frequency response for every single tone, and the areas are typical of the different vowels in human speech. There are four formants (f1, f2, f3, f4), but only the first two (f1 and f2) are really important for vowel recognition. An additional formant is typical of singing voices. This high-energy frequency area is the reason why singing voices can be heard easily within, or rather on top of, an orchestra. The lower speech formant f1 has a total range of about 300Hz to 750Hz, and the higher speech formant f2 has a total range of about 900Hz up to over 3000Hz. Each single spoken or sung tone, however, has a much narrower range for both formants. The singing formant has a range of about 2kHz to over 3kHz.
|the formants f1 and f2 for different vowels as boosted areas in the speech spectrum|
|singing formant - high energy at 2.5 to 3.0kHz - to stand out even over a loud orchestra|
|main two formants f1 and f2 for different vowels (each f1 left, f2 right)|
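The ranges quoted above can be written down as a small lookup, which makes the overlap between f2 and the singing formant explicit. This is only a sketch: the name `formant_regions` is hypothetical, and because the text gives the upper limits of f2 and the singing formant only as "over 3000Hz", the value 3000.0 is used here as a nominal bound, which is an assumption.

```python
# Approximate total ranges quoted in the text, in Hz.
# ASSUMPTION: the text says "over 3000Hz" for the upper limits of f2 and
# the singing formant; 3000.0 is used as a nominal upper bound.
FORMANT_RANGES_HZ = {
    "f1": (300.0, 750.0),
    "f2": (900.0, 3000.0),
    "singing formant": (2000.0, 3000.0),
}


def formant_regions(frequency_hz):
    """Return the names of all quoted ranges that contain frequency_hz."""
    return [
        name
        for name, (low, high) in FORMANT_RANGES_HZ.items()
        if low <= frequency_hz <= high
    ]


for frequency in (500.0, 800.0, 1200.0, 2800.0):
    print(frequency, formant_regions(frequency))
```

A frequency such as 800Hz falls between the quoted f1 and f2 ranges and returns an empty list, while 2800Hz lies in both the f2 range and the singing formant range.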
Vowels carry the power of the voice and consonants provide intelligibility. If only a limited frequency range is reproduced, full intelligibility cannot be achieved.
|Single band ||Provided intelligibility|
|250Hz ||very poor|
|500Hz ||about 12%|
|2kHz and 4kHz ||together about 57%|
|about 15 seconds of recorded speech|
The crest factor of speech is always high, simply because of the level difference between the spoken words and the pauses between them.
See: Crest Factor.
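The effect of pauses on the crest factor can be illustrated numerically. The crest factor is the ratio of peak value to RMS value, usually stated in dB. The sketch below uses a plain sine tone as a stand-in for a spoken word (an assumption, since real speech is far more complex) and appends silence to mimic a pause. The function name `crest_factor_db` and the chosen durations are illustrative only.

```python
import math


def crest_factor_db(samples):
    """Crest factor in dB: 20 * log10(peak / RMS).

    Assumes the signal is not entirely silent (RMS must be non-zero).
    """
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)


SAMPLE_RATE = 8000  # samples per second (illustrative value)

# One second of a 200 Hz sine tone, standing in for a spoken word.
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

# The same tone followed by three seconds of silence, standing in for a pause.
tone_with_pause = tone + [0.0] * (3 * SAMPLE_RATE)

print(round(crest_factor_db(tone), 2))             # about 3.01 dB
print(round(crest_factor_db(tone_with_pause), 2))  # about 9.03 dB
```

The peak stays the same while the pause lowers the RMS value, so the crest factor rises by about 6 dB in this example. Real speech adds its own strong level variation within the words on top of this.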
Relationship between Distance and received Speech Level
|Source: Klark-Teknik, Audio System Designer|
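As a rough sketch of how the received level depends on distance, the following assumes free-field propagation from a small source, where the level falls by about 6 dB for every doubling of distance (inverse-square law). This assumption is not taken from the cited source, and rooms with reflections behave differently. The function name `level_at_distance` and the 60 dB reference level at 1 m are illustrative values, not measured data.

```python
import math


def level_at_distance(level_ref_db, distance_m, ref_distance_m=1.0):
    """Free-field level at distance_m, given the level at ref_distance_m.

    ASSUMPTION: point source in a free field (inverse-square law),
    i.e. a drop of 20 * log10(distance / reference distance) dB.
    """
    return level_ref_db - 20.0 * math.log10(distance_m / ref_distance_m)


# Illustrative reference level of 60 dB at 1 m (an assumed value).
for distance in (1.0, 2.0, 4.0, 10.0):
    print(distance, "m:", round(level_at_distance(60.0, distance), 1), "dB")
```

Under these assumptions, the level at 2 m is roughly 6 dB below the level at 1 m, and the level at 10 m is 20 dB below it.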