Bantoid, 23 March 2022
Universität Hamburg
Matthew Faytak (Univ. at Buffalo)
Katie Franich (Univ. of Delaware)
Part 1:
Part 2:
Part 3:
Regular breaks will be interspersed
These slides are a web page
The slides are hosted here on GitHub
Nearly all references mentioned during the workshop are linked at the end of the slides
Africa has long been regarded as central to phonetic description (Ladefoged 1968; Maddieson & Sands 2019)
But relatively little of this scholarship involves African scholars
We presume that you, the participants, are:
If you are not in this group, we ask that you prioritize those in this group for questions and feedback
Let’s assume that we are all committed to:
This workshop is an introduction to instrumental phonetics
“Phonetic” work may also refer to non-contrastive subphonemic detail
Several practical advantages over impressionistic approaches
In impressionistic phonological description, all presentation of data is filtered through the researcher's theoretical analysis
Phonetic recordings allow better testing of hypotheses about phonological structure
Even a trained phonetic ear is prone to making occasional mistakes based on perceptual bias
Not all contrasts can be easily described by the analyst's ear, especially when they must be transcribed quickly in the moment
Recordings allow for careful listening later
Recordings are required for instrumental phonetic work: many incidental benefits
The aim is not to displace impression-based methods, but to complement them
We always want acoustic speech data to be:
Certain details of format are also important:
Here is an example of a good recording
The following slides contain recordings which fail on one of the points above
Recordings should not contain excessive background noise
Any noise, however quiet it seems to your ears in the moment, will be much more prominent in the recording later
Listen carefully to your surroundings, and avoid:
Speaker should also minimize non-speech noise:
If echo is strong, speech ends up overlapping itself; a problem for listening and analysis later
How to improve: listen for echo and choose surroundings which have less
If the speaker is too loud and/or too close to the microphone, the device cannot register the full amplitude of the signal; clipping results
How to improve: make test recordings after you position your microphones
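If you can transfer a test recording to a computer, a rough clipping check can also be automated: count the samples sitting at digital full scale. A Python sketch (standard library only); it fabricates a deliberately over-driven sine wave in place of a real recording:

```python
import math

# Simulate an over-driven 16-bit recording: a sine wave whose peaks
# exceed full scale, hard-limited the way an overloaded device would be.
FULL_SCALE = 32767   # maximum value for 16-bit audio
RATE = 16000         # samples per second
gain = 1.4           # simulated too-hot input level

samples = [max(-FULL_SCALE,
               min(FULL_SCALE,
                   int(gain * FULL_SCALE * math.sin(2 * math.pi * 220 * n / RATE))))
           for n in range(RATE)]   # one second of audio

# Any sample pinned at full scale was clipped.
clipped = sum(1 for s in samples if abs(s) >= FULL_SCALE)
print(f"{clipped} of {len(samples)} samples clipped "
      f"({100 * clipped / len(samples):.1f}%)")
```

A clean recording should report zero (or very nearly zero) clipped samples; anything more means lowering the gain or moving the microphone and re-recording.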
At the most basic, you only need something to make recordings on. One of the below:
A way of transferring files off of the device and backing up for future analysis:
Headphones, to check recording quality
External microphones can increase the quality of acoustic data by recording less echo and background noise
An acoustic baffle can reduce echo
Do not use computer noise reduction/filtering in general
Recording over Zoom works surprisingly well, if all else fails (Ge, Xiong, & Mok 2021; Sanker et al. 2021)
Download Praat from fon.hum.uva.nl or praat.org
Both the object window (shown below) and picture window appear when you open Praat (we’ll ignore the picture window for now)
Basic issues can often be solved using the manual
Let’s record ourselves saying a “good day, good afternoon” greeting in whichever language you would like
You should now have a Sound in your objects window
We’ll end this by saving our work as a .WAV file (the standard format for phonetics work)
We might also import sounds which have already been recorded
We’ll start by viewing Babanki “hills of bushes”
Viewer window has button and keyboard controls
We may wish to know where the words and segments are, but we lack useful landmarks at this point
TextGrids are one of the most useful features of Praat: annotate and organize your audio files
Let’s use the interval tier “sentence” and use it to transcribe the utterance
Let’s use the point tier “stop” to mark off where each [t] release happens
We might be dissatisfied with how the tone marks are displaying; we could make a new tier for tones (autosegmental style)
Amending the TextGrid using the new tier:
You must save your TextGrid when you are done, using CTRL+S or the menu shown below
We may wish to provide further details in our TextGrids, but we encounter another problem here: how to interpret the data?
Waveforms show sound pressure (the pressure that sound waves make on the microphone) versus time
Sounds produced with a more open mouth are louder and more sonorous (Parker 2008); these are thicker on the waveform
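For the programmatically inclined: the values a waveform displays are just numbers stored in the file. This Python sketch (standard library only) synthesizes a sine wave, writes it in WAV format to an in-memory buffer standing in for a file on disk, and reads the sound-pressure samples back:

```python
import io
import math
import struct
import wave

# Synthesize 0.1 s of a 100 Hz sine wave: 16 kHz sample rate, 16-bit mono.
RATE = 16000
samples = [int(0.8 * 32767 * math.sin(2 * math.pi * 100 * n / RATE))
           for n in range(int(0.1 * RATE))]

buf = io.BytesIO()            # in-memory stand-in for a .wav file
with wave.open(buf, "wb") as w:
    w.setnchannels(1)         # mono
    w.setsampwidth(2)         # 16-bit samples
    w.setframerate(RATE)
    w.writeframes(struct.pack("<%dh" % len(samples), *samples))

# Reading the file back recovers the sound-pressure values that a
# waveform display plots against time.
buf.seek(0)
with wave.open(buf, "rb") as w:
    frames = struct.unpack("<%dh" % w.getnframes(),
                           w.readframes(w.getnframes()))
print(len(frames))  # 1600 samples = 0.1 s at 16 kHz
```

When Praat draws a waveform, it is plotting exactly this sequence of pressure values against time.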
We can also show the spectrogram for our recording
Spectrograms are three-dimensional, and show time vs. frequency vs. sound pressure (darkness)
Vowels, semivowels, and approximants have characteristic striping, horizontally and vertically
Nasals look somewhat like vowels, with “smudged” formants and less energy (darkness)
Fricatives show aperiodic noise, concentrated at higher or lower frequencies depending on the place of articulation
Stops show an absence of (most) energy followed by a burst across the whole spectrum
Affricates look similar, but as if they were followed by a fricative
Prenasalized stops and affricates often have a long nasal closure followed by a short oral closure, which we can see clearly in the spectrogram
Figures are an easy way to present a small amount of phonetic data in scientific papers
This section: professional-looking and informative data displays
Waveforms and spectrograms can be “drawn” or “painted” (respectively) in the picture window
“Draw” (in the object window) is for any line-based drawings, including waveforms
The result: a waveform drawn within the plot area
TextGrids can be drawn as well, using the same menu as Sounds
A Sound and TextGrid can be drawn together very easily: simply select both and choose the Draw menu as before
“Paint” is for drawing spectrograms and other objects, but: we need the right object to do this
This sends a Spectrogram object to the object list; when we select this we get a “Paint” option under Draw
The result: as expected, but a bit too tall for its width (we might make the plot area wider/shorter)
Drawing a spectrogram and a TextGrid at the same time is a bit more complicated
The result: TextGrid annotations on top of the spectrogram
As with everything else in Praat, you must save before closing the picture window or you will lose your work
Now we’ll turn to taking numerical measurements in Praat
Rather than showing an entire sound file as a waveform or spectrogram, it can be useful to focus on a specific phonetic property
Measurement also lets us quantify many utterances and summarize them, which helps us handle phonetic variation
Because of this it’s best to collect many observations and average or model the data to remove noise and variation
One of the simplest measures: duration of segments or words
An example from Babanki (bbk-prenas.wav): proportion of prenasalized consonants which is nasal (Faytak & Akumbu 2021)
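The arithmetic is simple once the boundaries are marked in a TextGrid. A sketch with invented boundary times (not the actual Babanki measurements):

```python
# Proportion of a prenasalized stop [mb] taken up by the nasal portion,
# from three hypothetical TextGrid boundary times in seconds.
nasal_onset = 0.512   # start of [m] closure
oral_onset = 0.581    # start of [b] closure
release = 0.604       # stop release

nasal_dur = oral_onset - nasal_onset   # duration of the nasal portion
total_dur = release - nasal_onset      # duration of the whole consonant
print(round(nasal_dur / total_dur, 2))  # → 0.75
```

Repeating this for many tokens gives a distribution of nasal proportions that can then be summarized or compared across consonants.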
Other phenomena studied with duration (not exhaustive):
Pitch (like all of the other measurements we'll discuss) has a dedicated menu
Resulting pitch track:
Settings for pitch are important
A quick change to pitch range fixes the jump issue
All tonal phenomena involve f0
Babanki example in :
Sub-phonemic effects of intonation on tonemes can be examined (figure from Rialland & Aborobongui 2016)
Other tone topics (not exhaustive):
Intensity is a measure of loudness, expressed in decibels (dB)
Intensity is useful for measuring degree of constriction
Babanki example: stressed (stems) and unstressed (prefixes) don’t seem to be differentiated by intensity
Implosive vs. non-implosive voiced stops have different intensity profiles (figure from Nagano-Madsen & Thornell 2012)
f0 (pitch) and intensity tracks can easily be Drawn to figures as seen above
Formant frequencies cue vowel quality and a number of other contrasts
Turn on “show formants”, and formant tracks for the first three formants (F1, F2, F3) appear
Estimating formant frequencies requires calibration for every individual speaker: low-pitched voices need different settings than high-pitched voices
The default settings work well for higher-pitched voices
The default settings (5 formants below 5500 Hz, suited to higher-pitched voices) don't work well for the very low-pitched voice of the Babanki speaker:
Better: changed to 4.5 formants, 4200 Hz; much lower frequency range
Formant data is best presented in a plot of F1 against F2 (which is quite hard to make in Praat), but formant tracks can be drawn like any other measure
F1-F2 scatterplots have F2 on the x axis, F1 on the y axis, with both axes reversed
Better scatterplot figures can be made using tabular data, which we'll discuss later (figure from Faytak & Akumbu 2021)
Formant data is useful in figuring out harmony systems; ATR or otherwise
Other uses for formant measures (non-exhaustive):
Praat also provides measures relating to voicing: these are grouped under the unintuitive name “Pulses”
The Pulses menu contains the same “showing”, measure-getting, and drawing functions as other menus
The result: each detected “voice pulse”, shown over waveform (not spectrogram)
The pulses themselves can be plotted with a TextGrid like any other similar object
If voicing has a predictable timing but varies in extent, the voicing report may be useful (access in the Pulses menu)
Voicing in unexpected places is common for labial-velars; this can be confirmed by looking for pulses (figure from Connell 1994)
Other voicing-related topics:
Praat is useful, but it has important limitations
Because of this, we often need to export data from Praat into other programs
The most effective way to export numerical measurements: tabular data, that is, spreadsheets
Google Drive or Excel both work well (.xls, .txt, or .csv format)
We’ll use duration to demonstrate the most basic spreadsheet construction
Repeat this process for each of your observations, starting a new row each time
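The same spreadsheet can also be built programmatically. A minimal sketch using Python's standard library (the column names and all values are invented for illustration):

```python
import csv
import io

# One row per observation: speaker, word, segment, and its duration.
rows = [
    {"speaker": "S1", "word": "kebete", "segment": "t", "duration_ms": 84},
    {"speaker": "S1", "word": "tapa",   "segment": "t", "duration_ms": 96},
    {"speaker": "S2", "word": "kebete", "segment": "t", "duration_ms": 78},
]

buf = io.StringIO()   # in-memory stand-in for a .csv file on disk
writer = csv.DictWriter(buf,
                        fieldnames=["speaker", "word", "segment", "duration_ms"])
writer.writeheader()   # first row: the column names
writer.writerows(rows)
print(buf.getvalue())
```

The resulting .csv file opens directly in Excel or Google Sheets, and keeping one observation per row makes later averaging and modeling straightforward.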
Text output is generated using the menus above the data display
For any measure other than duration, menus behave slightly differently depending on whether your cursor has selected an interval or not
A shortcut exists to get the midpoint of any interval
“Get…” provides a single measurement, or a single mean measurement across an interval
“Listing…” provides every single measure within an interval
Exceptions:
Mean, max, and min f0 measures (from “Get…”) can be copy-pasted into their own cell just like our duration example
Listings (f0 over time) are trickier to paste:
We’ll take pitch as an example, but these instructions will work for any listing
Continued:
Unlike before, we need to contextualize the time values and normalize them
The best way to do this:
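One common normalization is to express each time as a proportion of the way through the annotated interval, so tokens of different durations can be compared or averaged. A sketch with invented listing values:

```python
# Interval boundaries taken from the TextGrid (seconds).
interval_start, interval_end = 0.500, 0.700

# A pitch listing: (time_s, f0_Hz) pairs, invented for illustration.
listing = [(0.500, 182.1), (0.550, 190.4), (0.600, 201.0),
           (0.650, 197.7), (0.700, 188.2)]

# Rescale each time so 0.0 = interval start and 1.0 = interval end.
normalized = [(round((t - interval_start) / (interval_end - interval_start), 3), f0)
              for t, f0 in listing]
print(normalized[2])  # midpoint of the interval → (0.5, 201.0)
```

With times expressed this way, f0 contours from many tokens line up on a shared 0-to-1 axis regardless of how long each token was.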
Same as pitch listing, but note there are four columns for F1-F4, plus one for times
Much like above, but with more measurement columns:
Working with listings is more difficult than working with mean measures, but having time series of measurements is valuable
Beyond the scope of this tutorial, but more efficient in a number of areas: basic coding
While there is a steeper learning curve, the improvement to the process may pay off in the long run
Acoustics gives us an indirect idea of the movements of the articulators
Sometimes, though, we need to look directly at the articulators
We point out here that characterizing and measuring some articulations is much easier than is usually expected
Lips are easily seen moving since they are on the exterior of the face
Example: Babanki lip articulation during vowels (Faytak & Akumbu 2021)
Lip activity during Medumba [ʉ]: compressed in the direction of a bilabial stop (Olson & Meynadier 2015)
Subtle lip posture differences are also known to occur on various fricatives and approximants
For faster articulations, short videos can also be useful
A “tongue-print” on the roof of the mouth which shows where articulatory activity takes place (Anderson 2008)
The method is more involved than simple photography:
Kom exhibits a contrast between two vowels which I will discuss further in my regular talk
Palatography reveals aspects of the lingual articulation of these two vowels
More canonically used for clear evidence of lingual place contrasts
Everything that we’ve talked about in this section involves minimal equipment
Optional:
The value of photographic evidence as a figure should be obvious: simply include the image
If you have larger numbers of images, you might convert these to tabular data for further analysis
It should be mentioned that certain articulations produced further back are not discussed here
However, ultrasound technology is gradually making it easier to image these articulations (Gick et al. 2006; Miller, Namaseb, & Iskarous 2007; Allen, Pulleyblank, & Ajíbóyè 2013; Hudu 2014)
Name your recording files according to the same logical pattern
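A naming pattern can even be generated programmatically, which guarantees consistency. A trivial sketch (the fields and values here are invented examples, not a prescribed standard):

```python
# One consistent scheme: speaker_language_word_repetition.wav
def recording_name(speaker: str, language: str, word: str, rep: int) -> str:
    """Build a predictable file name from the recording's metadata."""
    return f"{speaker}_{language}_{word}_rep{rep}.wav"

print(recording_name("S03", "bbk", "hills", 2))  # → S03_bbk_hills_rep2.wav
```

Predictable names like this also make it easy to recover the speaker, language, and word later by splitting the file name on underscores.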
To document further identifying details, speak them aloud during the recording itself
Back up every recording in multiple locations if possible, to avoid technological problems or theft destroying your work
Tabular data, of whatever sort, needs to be averaged, summarized (mean/standard deviation), or submitted to a statistical model
Simple statistical analysis and modeling are standard in phonetics for analysis of numerical data (t-testing, linear models, curve fitting, etc.)
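As a taste of what a t-test involves, here is a sketch of Welch's two-sample t statistic computed with Python's standard library; the duration values are invented for illustration. In practice you would use dedicated statistical software, which also reports p-values and degrees of freedom:

```python
import math
import statistics

# Invented duration measurements (ms) for two hypothetical conditions.
stressed   = [96, 104, 99, 110, 101, 95, 108]
unstressed = [84, 90, 78, 88, 92, 81, 86]

def welch_t(a, b):
    """Welch's two-sample t statistic (does not assume equal variances)."""
    va, vb = statistics.variance(a), statistics.variance(b)   # sample variances
    se = math.sqrt(va / len(a) + vb / len(b))                 # standard error of the difference
    return (statistics.mean(a) - statistics.mean(b)) / se

print(round(welch_t(stressed, unstressed), 2))
```

A large t statistic (relative to its degrees of freedom) suggests the two conditions genuinely differ in duration rather than varying by chance.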
If you want to try stats yourself, it’s best to find software for handling tabular data:
If you plan to model the data, it’s important to have enough statistical power
For the average phonetics study, good power is obtained at 15-20 speakers (rule of thumb)
We’ve covered:
What remains to be seen:
In compiling these slides and the references they contain (as many papers which present phonetic evidence in African languages as possible), I reflected upon:
In the references provided here, it is clear that African linguists are underrepresented, regardless of how “African linguist” is defined
Some potential topics for discussion:
Akumbu, P. (2019). A featural analysis of mid and downstepped high tone in Babanki. In Clem, E., Jenks, P., & Sande, H., eds., Theory and description in African linguistics: Selected papers from the 47th Annual Conference on African Linguistics, 3–20. PDF
Allen, B., Pulleyblank, D., & Ajíbóyè, Ọ. (2013). Articulatory mapping of Yoruba vowels: an ultrasound study. Phonology, 30(2), 183-210. Abstract & PDF
Anderson, V. (2008). Static palatography for language fieldwork. Language Documentation & Conservation, 2(1), 1-27. PDF
Bendjaballah, S. & Le Gac, D. (2021). The acoustics of word-initial and word-internal voiced stops in Somali. Journal of the International Phonetic Association, first view. Abstract
Boyer, O., & Zsiga, E. (2013). Phonological devoicing and phonetic voicing in Setswana. In Ọla Orie, Ọ. and Sanders, K., eds., Selected Proceedings of the Annual Conference on African Linguistics, 43, 82-89. PDF
Connell, B. (2007). Mambila fricative vowels and Bantu spirantisation. Africana Linguistica, 13(1), 7-31. Article
Connell, B. (1994). The structure of labial-velar stops. Journal of Phonetics, 22(4), 441-476. Abstract
Danis, N. (2020). Yorùbá vowel deletion involves compensatory lengthening: Evidence from phonetics. Stellenbosch Papers in Linguistics Plus, 60(1), 1-12. Abstract & PDF
Faytak, M., & Akumbu, P. W. (2021). Kejom (Babanki). Journal of the International Phonetic Association, 51(2), 333-354. Article
Fransen, M. (1995). A Grammar of Limbum: A Grassfields Bantu Language Spoken in the North-West Province of Cameroon. PhD dissertation, Vrije Universiteit Amsterdam.
Gahl, S. (2008). Time and Thyme Are not Homophones: The Effect of Lemma Frequency on Word Durations in Spontaneous Speech. Language 84(3), 474-496. Abstract
Ge, C., Xiong, Y., & Mok, P. (2021). How reliable are phonetic data collected remotely? Comparison of recording devices and environments on acoustic measurements. In Proc Interspeech 2021, 1683-1687. PDF
Genzel, S. & Kügler, F. (2011). Phonetic realization of automatic (downdrift) and non-automatic downstep in Akan. Proceedings of ICPhS 17, Hong Kong. PDF
Gick, B., Pulleyblank, D., Campbell, F., & Mutaka, N. (2006). Low vowels and transparency in Kinande vowel harmony. Phonology, 23(1), 1-20. PDF
Gjersøe, S., Nformi, J., & Paschen, L. (2019). Hybrid falling tones in Limbum. In Clem, E., Jenks, P. & Sande, H., eds., Theory and Description in African Linguistics: Selected Papers from the 47th Annual Conference on African Linguistics, 95-118. PDF
Hamlaoui, F. & Makasso, E. (2019). Downstep and recursive phonological phrases in Bàsàá (Bantu A43). In Clem, E., Jenks, P. & Sande, H., eds., Theory and Description in African Linguistics: Selected Papers from the 47th Annual Conference on African Linguistics, 155-175. PDF.
Hudu, F. (2014). [ATR] feature involves a distinct tongue root articulation: Evidence from ultrasound imaging. Lingua, 143, 36-51. Abstract
Hudu, F. (2016). A phonetic inquiry into Dagbani vowel neutralisations. Journal of African Languages and Linguistics, (37)1, 59-89. Abstract
Hyman, L. (2014). How to study a tone language. Language Documentation & Conservation, 8, 525-562. PDF
Koffi, E. (2018). The acoustic vowel space of Anyi in light of the cardinal vowel system and the Dispersion Focalization Theory. In Kandybowicz, J., Major, T., Torrence, H., & Duncan, P., eds., African linguistics on the prairie: Selected papers from the 45th Annual Conference on African Linguistics. PDF
Ladefoged, P. (1990). What do we symbolize? Thoughts prompted by bilabial and labiodental fricatives. Journal of the International Phonetic Association, 20(2), 32-36. Abstract Preprint PDF
Ladefoged, P. (1968). A phonetic study of West African languages: An auditory-instrumental survey. Cambridge University Press.
Lewis, D., & Shittu, S. (2014). Phonemic Status of Len Fricative-Vowels. Ibadan Journal of Humanities Studies, 24, 27-45. PDF
Lotven, S. & Berkson, K. (2019). The phonetics and phonology of depressor consonants in Gengbe. In Clem, E., Jenks, P. & Sande, H., eds., Theory and Description in African Linguistics: Selected Papers from the 47th Annual Conference on African Linguistics, 249-268. PDF
Maddieson, I. & Sands, B. (2019). The sounds of the Bantu languages. In Van de Velde, M., Bostoen, K., Nurse, D., & Philippson, G., eds., The Bantu Languages: Second Edition, 79-127. Routledge. Preprint PDF
Mathes, T. & Chebanne, A. (2018). High tone lowering and raising in Tsua. Stellenbosch Papers in Linguistics Plus, 54, 1-16. Abstract & PDF
McCollum, A. & Essegbey, J. (2020). Initial prominence and progressive vowel harmony in Tutrugbu. Phonological Data and Analysis, 2(3), 1-37. Abstract & PDF
McKinney, N. (1990). Temporal characteristics of fortis stops and affricates in Tyap and Jju. Journal of Phonetics, Abstract
McPherson, L. (2020). Seenku. Journal of the International Phonetic Association, 50(2), 220-239. Abstract
Miller, A., Namaseb, L., & Iskarous, K. (2007). Tongue body constriction differences in click types. Laboratory Phonology, 9, 643-656. PDF
Monaka, K. (2005). Shekgalagari stops and theories of phonological representation. Lwati: A Journal of Contemporary Research, 2, 24-42. Abstract & PDF
Myers, S., Namyalo, S., & Kiriggwajjo, A. (2019). F0 timing and tone contrasts in Luganda. Phonetica, 76(1), 55-81. Abstract
Nabirye, M., de Schryver, G., & Verhoeven, J. (2016). Lusoga (Lutenga). Journal of the International Phonetic Association, 46(2), 219-228. Abstract & PDF
Nagano-Madsen, Y. & Thornell, C. (2012). Acoustic properties of implosives in Bantu Mpiemo. In Eriksson, A. & Abelin, Å., eds., Proceedings of FONETIK 2012, Gothenburg, 73-76. PDF
Naidoo, S. (2012). A re-evaluation of the Zulu implosive [ɓ]. South African Journal of African Languages, 30(1), 1-10. Abstract
Nakagawa, H. (2008). Aspects of the phonetic and phonological structure of the G|ui language (Doctoral dissertation). Synopsis
Naumann, C. (2016). The phoneme inventory of Taa (West !Xoon dialect). In Vossen, R. & Haacke, W., eds., Lone Tree: Scholarship in Service of the Koon. Essays in memory of Anthony Traill. Köln: Rüdiger Köppe Verlag. PDF
Nforgwei, S. (2004). A study of the phonological and syntactic processes in the standardisation of Limbum. PhD dissertation, Université de Yaoundé. PDF
Olson, K. & Meynadier, Y. (2015) On Medumba bilabial trills and vowels. Proceedings of ICPhS 18, Glasgow. PDF
Oppong, O. (2021). Pitch reset in Asante Twi, a dialect of Akan. MA Thesis, University of Helsinki. Abstract & PDF
Parker, S. (2008). Sound level protrusions as physical correlates of sonority. Journal of Phonetics, 36(1), 55-90. Abstract
Rialland, A. & Aborobongui, M. (2016). How intonations interact with tones in Embosi (Bantu C25), a two-tone language without downdrift. In Downing, L. & Rialland, A., eds., Intonation in African Tone Languages, 195-xxx. Berlin: Mouton de Gruyter. Abstract PDF
Ritchart, A. & Rose, S. (2015). Schwas in Moro Vowel Harmony. In Kramer, R., Zsiga, E., & Tlale Boyer, O., eds., Selected Proceedings of the 44th Annual Conference on African Linguistics, 231-242. PDF
Sanker, C., Babinski, S., Burns, R., Evans, M., Johns, J., Kim, J., Smith, S., Weber, N., & Bowern, C. (2021). (Don’t) try this at home! The effects of recording devices and software on phonetic analysis. Language, 97(4), e360-e382. PDF
Solé, M. J., Hyman, L. M., & Monaka, K. C. (2010). More on post-nasal devoicing: The case of Shekgalagari. Journal of Phonetics, 38(4), 604-615. Abstract Preprint
Starwalt, C. (2008). The acoustic correlates of ATR harmony in seven-and nine-vowel African languages: A phonetic inquiry into phonological structure. PhD dissertation, The University of Texas at Arlington. PDF
Traill, A., Khumalo, J., & Fridjhon, P. (1987). Depressing facts about Zulu. African Studies 46(2), 255-274. Abstract & PDF
Utman, J., & Blumstein, S. (1994). The influence of language on the acoustic properties of phonetic features: A study of the feature [strident] in Ewe and English. Phonetica, 51(4), 221-238. Abstract
Whalen, D. H., DiCanio, C., & Dockum, R. (2020). Phonetic documentation in three collections: Topics and evolution. Journal of the International Phonetic Association, 52(1), 1-27. Abstract
Zee, E. (1981). Effect of vowel quality on perception of post–vocalic nasal consonants in noise. Journal of Phonetics, 9(1), 35-48. Abstract