Controlling an articulatory speech synthesizer reads a lot like a very technical choreographic notation.
From the example here: Articulatory synthesis
- To set the glottis to a position suitable for phonation, use the ArtwordEditor to set the Interarytenoid activity to 0.5 throughout the utterance. You set two targets: 0.5 at a time of 0 seconds, and 0.5 at a time of 0.5 seconds.
- To prevent air escaping from the nose, close the nasopharyngeal port by setting the LevatorPalatini activity to 1.0 throughout the utterance.
- To generate the lung pressure needed for phonation, you set the Lungs activity at 0 seconds to 0.2, and at 0.1 seconds to 0.
- To force a jaw movement that closes the lips, set the Masseter activity at 0.25 seconds to 0.7, and the OrbicularisOris activity at 0.25 seconds to 0.2.
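To make the "choreography" idea concrete, the targets above can be written down as a small score: one list of (time, activity) pairs per muscle. The Python sketch below is not the synthesizer's own scripting interface. The names `TARGETS`, `with_endpoints`, and `activity` are made up for illustration. It assumes activity is linearly interpolated between targets, and that a muscle with no target at the start or end of the 0.5-second utterance rests at 0 there. It also writes the "throughout the utterance" setting for LevatorPalatini as two targets, following the Interarytenoid pattern.

```python
from bisect import bisect_right

DURATION = 0.5  # seconds; the utterance length used in the quoted example

# (time in seconds, activity) targets copied from the quoted steps.
TARGETS = {
    "Interarytenoid": [(0.0, 0.5), (0.5, 0.5)],
    "LevatorPalatini": [(0.0, 1.0), (0.5, 1.0)],
    "Lungs": [(0.0, 0.2), (0.1, 0.0)],
    "Masseter": [(0.25, 0.7)],
    "OrbicularisOris": [(0.25, 0.2)],
}


def with_endpoints(targets, duration=DURATION):
    """Pad a target list so it spans the whole utterance.

    Assumption (not from the quoted text): a muscle with no target at
    t = 0 or t = duration is at rest (activity 0) at that endpoint.
    """
    points = sorted(targets)
    if points[0][0] > 0.0:
        points.insert(0, (0.0, 0.0))
    if points[-1][0] < duration:
        points.append((duration, 0.0))
    return points


def activity(muscle, t):
    """Activity of `muscle` at time `t`, assuming linear interpolation."""
    points = with_endpoints(TARGETS[muscle])
    times = [time for time, _ in points]
    if t <= times[0]:
        return points[0][1]
    if t >= times[-1]:
        return points[-1][1]
    i = bisect_right(times, t)  # index of the first target after t
    (t0, v0), (t1, v1) = points[i - 1], points[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


if __name__ == "__main__":
    # Print the "score" at a few moments in the utterance.
    for t in (0.0, 0.125, 0.25, 0.375, 0.5):
        row = ", ".join(f"{m}={activity(m, t):.2f}" for m in TARGETS)
        print(f"t={t:.3f}s  {row}")
```

Read this way, the jaw and lip closure is a single peak at 0.25 seconds that ramps up from rest and back down again, while the glottis and nasopharyngeal port are simply held in place for the whole utterance.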
Run through the synthesizer, these settings make this sound:
The weird thing is that if you mimic the sound, you’re following the same program!