SPEECH SIGNAL
Author:
John Field
Source:
Psycholinguistics
Part and page:
P286
2025-10-14
The physical signal from which a listener constructs a message. Also referred to as the acoustic-phonetic signal: ‘acoustic’ refers to the sound waves which reach the listener’s ear, while ‘phonetic’ refers to the speaker’s perception of these waves in terms of features (+/− voiced etc.). The term percept is sometimes used when the signal is discussed from the point of view of a listener.
It is important to be clear whether the signal is being discussed from an acoustic point of view (in terms of what is physically present) or an auditory one (in terms of what the listener perceives). The volume of a sound (the degree of force behind it) is referred to as intensity (or amplitude) in an acoustic context but as loudness in an auditory one. Intensity is measured in decibels (dB), whereas loudness is measured on a phon scale. The two are not the same: if two sounds have the same intensity but different frequencies, listeners may perceive them as differing in loudness.
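The decibel measurement mentioned above can be illustrated with a short sketch. The entry does not give a formula; the code assumes the conventional definition of sound intensity level, 10 × log10(I / I0), with the usual reference intensity of 1e-12 W/m^2. The function name `intensity_level_db` is hypothetical and chosen for illustration only.

```python
import math

# Conventional reference intensity (approximate threshold of hearing).
# This value is an assumption; it does not appear in the entry.
REFERENCE_INTENSITY = 1e-12  # W/m^2


def intensity_level_db(intensity_w_per_m2: float) -> float:
    """Return the sound intensity level in dB relative to REFERENCE_INTENSITY."""
    if intensity_w_per_m2 <= 0:
        raise ValueError("intensity must be positive")
    return 10.0 * math.log10(intensity_w_per_m2 / REFERENCE_INTENSITY)


if __name__ == "__main__":
    # Print the level for the reference intensity and two larger values.
    for intensity in (1e-12, 1e-6, 2e-6):
        print(f"{intensity:.1e} W/m^2 -> {intensity_level_db(intensity):.1f} dB")
```

Because the scale is logarithmic, doubling the physical intensity adds only about 3 dB. The calculation is purely acoustic: it says nothing about how loud a listener would judge the sound to be.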
In an acoustic context, the term frequency is used for the high-low continuum (the effect of the tension and speed of vibration of the vocal cords), and is measured in vibratory cycles per second or Hertz (Hz). The term pitch is used for perceived frequency. However, again, the listener’s perception of frequency does not relate in a simple way to a Hertz measurement: a listener would not find a sound at 500 Hz to be twice as high as one at 250 Hz. Pitch is measured in mels, on a scale that is logarithmic rather than linear. This reflects the fact that listeners are much more sensitive to frequency changes at lower levels. An increase in frequency from 20 Hz to 160 Hz corresponds to an increase in mels from 0 to 250, whereas the much larger step from 3120 Hz to 4000 Hz produces the same difference of 250 mels.
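The non-linear relation between Hertz and mels can be sketched in a few lines. The entry does not say which version of the mel scale its figures come from, so the code below assumes one widely used approximation, mel = 2595 × log10(1 + f / 700). It will not reproduce the entry's figures exactly, but it shows the same pattern. The function names are hypothetical.

```python
import math


def hz_to_mel(frequency_hz: float) -> float:
    """Convert Hz to mels with a common log-based approximation (an assumption)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


def mel_interval(low_hz: float, high_hz: float) -> float:
    """Return the perceived pitch distance in mels between two frequencies."""
    return hz_to_mel(high_hz) - hz_to_mel(low_hz)


if __name__ == "__main__":
    # A 140 Hz step low in the range and an 880 Hz step high in the range.
    print(f"20 -> 160 Hz:    {mel_interval(20, 160):.0f} mels")
    print(f"3120 -> 4000 Hz: {mel_interval(3120, 4000):.0f} mels")
    # Doubling the frequency does not double the mel value.
    print(f"mel(500) / mel(250) = {hz_to_mel(500) / hz_to_mel(250):.2f}")
```

Under this approximation the two steps come out at a broadly similar size in mels even though the second spans more than six times as many Hertz, and a 500 Hz tone is less than twice as high in mels as a 250 Hz tone.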
Human speech can be displayed on a computer in the form of a spectrogram. This shows the intensity of the signal at different levels of frequency and plots it against time. A simpler display is a waveform, which shows the amplitude of the signal over time. It is important to remember that this information is purely acoustic: a spectrogram shows the information that physically reaches the listener’s ear; it gives us no indication as to which parts of this information the listener heeds.
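To make the idea of a spectrogram concrete, the sketch below builds one from a synthetic signal using only the Python standard library. It assumes a simple short-time Fourier analysis: the signal is cut into frames, each frame is windowed, and a discrete Fourier transform gives the magnitude at each frequency. Real speech tools use faster and more refined methods; all function names and parameter values here are hypothetical.

```python
import cmath
import math


def make_tone(frequency_hz, sample_rate, n_samples):
    """Return a sine wave as a list of samples (this list is the waveform)."""
    return [math.sin(2 * math.pi * frequency_hz * n / sample_rate)
            for n in range(n_samples)]


def frame_magnitudes(frame):
    """Return DFT magnitudes for the non-negative frequency bins of one frame."""
    size = len(frame)
    # A Hann window tapers the frame edges to reduce spectral leakage.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1)))
                for n, s in enumerate(frame)]
    magnitudes = []
    for k in range(size // 2 + 1):
        total = sum(w * cmath.exp(-2j * math.pi * k * n / size)
                    for n, w in enumerate(windowed))
        magnitudes.append(abs(total))
    return magnitudes


def spectrogram(samples, frame_size, hop):
    """Return one row of magnitudes per time frame: time by frequency."""
    return [frame_magnitudes(samples[start:start + frame_size])
            for start in range(0, len(samples) - frame_size + 1, hop)]


def peak_frequency(magnitudes, sample_rate, frame_size):
    """Return the frequency in Hz of the strongest bin in one frame."""
    k = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return k * sample_rate / frame_size


if __name__ == "__main__":
    rate, size = 8000, 64
    # A 250 Hz tone followed by a 1000 Hz tone.
    signal = make_tone(250, rate, 256) + make_tone(1000, rate, 256)
    for i, row in enumerate(spectrogram(signal, size, size)):
        print(f"frame {i}: strongest at {peak_frequency(row, rate, size):.0f} Hz")
```

Each row of the result corresponds to one slice of time and each column to one frequency band, which is the information a spectrogram display plots. As the entry stresses, this is a record of what is physically present, not of what a listener attends to.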
Different parts of the signal may also vary in perceptual prominence or saliency. In general, more prominent syllables involve greater muscular effort on the part of the speaker. Stressed syllables in lexical words are perceptually more salient than unstressed function words. Important factors contributing to saliency are pitch movement, duration and loudness. Individual phonemes also vary in saliency, depending upon their sonority (vowels > nasals > fricatives > stops) and their duration.
Note that while a phone has a target frequency and amplitude, its resonant quality varies according to the size and shape of each speaker’s vocal tract. Factors include not simply the variation in thickness of the vocal folds (especially between men and women) but also the size, shape and position of the articulators: teeth, tongue, palate, nasal cavity etc.
See also: Noise, Phonological representation, Speech perception: phoneme variation
Further reading: Ball and Rahilly (1999); Fry (1979); Laver (1994); Pickett (1999)