This is a treatise on the origin of speech and communication. It is a study of basic human vocalizations and how they developed into speech.
Empirical recordings of animal and basic human sounds provide the following observations:
(1) All vowel sounds are heard in cries of sexual excitement and pain. Consonant sounds are not heard in cries of sexual excitement unless words are uttered.
(2) Animal sounds are compound sounds that combine consonant and long vowel sounds.
(3) Vowel sounds made by family pets express emotion and requests. A cat will meow loudly and noisily, as if complaining and crying, when it is being taken for euthanasia. Neighborhood dogs will wail when someone in the neighborhood dies.
(4) Consonants are artificially created to enable humans to imitate animal sounds, sounds of nature, sounds made by moving objects, and sounds of human activity. These include sounds caused by wind, rain, thunder, dripping water, etc. Examples include the cow’s “moo”, the cat’s “meow”, the snake’s “hiss”, the bee’s “buzz” and “hum”, the pig’s “oink”, the cuckoo’s “coo”, the owl’s “hoot”, the wolf’s “howl”, and the dog’s “bark” and “wail”, etc. Tree leaves blown by the wind make the “f” and the “sh” sounds. Clapping and slapping produce the “d”, “t” and “p” sounds.
(5) Special consonant vocalizations include the African “click”, the Arabic (‘a), and the Russian (ы) in the Russian word for “we” (мы). The click in African languages may have come from man’s imitation of natural clicking sounds made by animals such as some snakes. It may also have been an imitation of animal clicking sounds to lure prey when hunting. The grunt of the camel may have been the origin of the Arabic (‘a) sound. The Russian (ы) may have originated from the lingering elongated “m” when uttered in freezing weather.
(6) Some spoken dialects lack certain vocalizations. Those who speak Taiwanese (Fukienese provincial dialect) cannot make the “r” and the “f” sounds, and the Cantonese have difficulty distinguishing between the “l” and the “r” sounds. Those who speak Pekinese (Mandarin), the original dialect of Beijing, can enunciate all of them as in the Mandarin “er” (son), “fei” (to fly), “la” (to pull), and “re” (hot). Taiwanese speakers substitute the “f” sound with a “wh” sound. Thus an airplane in Mandarin, “fei ji” comes out as “whui ghi” when a Taiwanese pronounces the term.
(7) The combination of consonants and vowels by humans made it possible for speech to develop.
(8) The primitive human desire for expression, description and narration was the primary motive for speech. When humans were hunter-gatherers living in small family groups in caves, the desire of the hunter to express and describe his hunting adventures, to relate what he had seen, what he had heard and what he did, and the dangers he faced, and his desire to warn others together constituted the primary motive for speech.
(9) This primal desire for expression also prompted the cave dweller to draw what he saw on cave walls without any cognition of aestheticism, artistic expression, and symbolism.
(10) Consonants convey meaning. Vowels convey emotion and feeling. The “s” represents the “hissing” sound of the snake. “S” is also represented by the hieroglyph of the snake. In Hawaiian, pure vowel sounds are combined without consonants to form meaningful words.
(11) The combinations of consonants and vowels became a phonetic vocabulary which the hunter-gatherer used to relate his adventures. One example is the “s” and the “sh” sounds in the words “to shoot”, “schiessen” in German, “she” in Mandarin Chinese and “se” in Cantonese. Although this may be a representation of the sound of a burning fuse, the “sh” sound predates the advent of the fuse since it is a “letter” in the Phoenician and Sanskrit alphabets. It is more likely that the “sh” sound was a human imitation of the sound of an arrow flying through the air. The Chinese character meaning “to shoot” is a pictograph of a human body and an ancient pictograph of “a bow and arrow at the ready”. This ancient pictograph is the modern character “cun” (inch), thus indicating in ancient times that the “human body was an inch away from shooting or letting go of the arrow already at the ready”. In Chinese, a common man-made sound of two swords clashing is “ka tssa”. The Chinese also make the “sa sa sa” sound to describe a blade slashing through the air, leaves blown by the wind, shuffling on sand or on salt on a wooden floor.
(12) The “th” sound heard in English, Spanish of Spain, and Arabic may have been from a human imitation of the sounds of water boiling, food simmering, meat sizzling and smelting iron.
(13) The repeated use of this phonetic vocabulary enabled the hunter-gatherer to tell his story again and again. Thus, narration and storytelling became most likely the foundation of complex spoken language.
(14) For a narrative to be meaningful to the listener, it had to be accompanied by gestures and body movement. These became standardized and formed the basis of commonly understood cross-cultural gestures (with a few exceptions of reversed meaning).
(15) Narration also had to retain its particular arrangement of sounds and vocalizations to retain meaning. This formed the basis for grammar. As the narration was repeated in speech, context became important to retaining the original meaning. Grammar in both spoken and written forms preserved syntax.
(16) Comprehension of expressions and narrations is based on mutual understanding. We all know how difficult it is to communicate with each other when we do not speak the same language. For example, a hunter describing his encounter with a snake would “hiss” and move his arm and hand to and fro to demonstrate the movements of the snake. This “snake movement” is common in all Hindi, Cambodian, Thai, Chinese and Arabic belly dances. The narration accompanied by hand gestures and body movements would be immediately understood by the listener even if he or she does not understand the spoken language.
(17) When those who had heard a narration repeat it using the same utterances and vocalizations, the original speech, the original story and the original meaning of the sounds are perpetuated and preserved.
(18) As human activities expand, imitations of newly created sounds are added to the narrative, and these additions make the narrative more descriptive. Before the advent of automatic transmissions in automobiles, children would imitate the sound of acceleration and shifting of the gears while pushing a toy car along the floor.
(19) For vocal utterance to carry meaning, such as naming of an object, conveying intention, desire, command and simple response, the same utterance must be used in similar circumstances in a similar way and repeatedly. These utterances that carry particular meanings are often accompanied by finger and hand gestures. It was only long after spoken and written language had developed fully that sounds without gestures could convey full meaning. Poetry reading is such an example.
(20) Simple speech may have begun with monosyllabic utterances that were used to convey commands and simple responses. These would include simple commands like “Come!” and “Go!”, “Komm!” in German, and “Ven!” in Spanish. The Spanish and English “no”, the German “nein”, and the Russian “nyet” may have come from the original vocalization “nnnn….” indicating “doubt.” The Greek “oxi” (Ohee) (meaning no) may have come from the primitive sound of “ooo… hhh” expressing doubt. The Arabic “laa” (no) and the “ei” in Finnish may also have come from utterances expressing “doubt.” The Finnish “ei” may have sounded like “e…i…” (ay). The Arabic “laa” may have come about with the practice of adding the article “l'” in front of definite expressions, the “l” in the Arabic “laa”. Italian does the same when a noun is considered definite, as in “L’Italia.”
(21) Dialects are regional variations in enunciation. Some are mutually understandable. Some are not. Szechuanese and Mandarin are pretty much mutually understandable, but Shanghainese, Cantonese, Taiwanese (Fukienese), Foochownese (the dialect of Foochow, the provincial capital of Fukien province) and Mandarin are mutually incomprehensible. Hakka is another dialect that is rarely understood outside of the Hakka community.
(22) Here is a table of basic sounds represented by the letters of the Latin alphabet.
A: vowel, pain and sexual pleasure, “baa” of goats
B: “b” in “baa” of goats and sheep
C: C originates from the snake, “s” as in the Russian “c”, and the “c” in Spanish. The hard “c” (ca, co, cu) in English is foreign in German. Greek has no letter “c”. In Russian and German, the hard “c” is the “k”, and in Italian, the hard “c” is the “ch”.
D: tapping, water dripping
E: vowel, pain and sexual pleasure
F: sound of leaves blowing in the wind
G: “gobble” of geese and turkeys, “grunt” of camels
H: “hoot” of the owl, breathing sound
I: vowel, pain and sexual pleasure
J: extension of the “ee” (tee, tea) sound as in Dutch, a voiced substitute for the silent “h” in Spanish.
K: “clucking” of chickens, “quack” of ducks, “caw” of crows, “coo” of cuckoos, “croak” of frogs
L: the human “la” sound, substitute for lyrics
M: “moo” of cows, “meow” of cats, “hum” of bees
N: human sound of doubt “nnnnn…..”
O: vowel, pain and sexual pleasure
P: human hand clapping, tapping, slapping; the Chinese “pao” (cannon), “pai” (to slap), “puo” (break), “pu” (to plunge and to lunge forward)
Q: “quack” of ducks, the “k” in Arabic
R: “grrr..” of bears, “purrrr” of cats, the Spanish “rr” as in “growl” of rabid dogs about to attack
S: “hiss” of snakes, the Russian “c”
T: human tapping
U: the “woo” in “woof” of canines and in the English word “wolf”, vowel sound of pain and pleasure
V: “v” as “f” in German and Dutch
W: “vv” as elongated “v” in such German words as “Wasser” and “Wetter”
X: as Webster’s II New Riverside University Dictionary states, a speech sound represented by X, thus a speech sound representing the “eks” sound
Y: “yelp” of coyotes, “yell” of wolves
Z: “buzz” of bees
(23) Tonal communication
A subset of vocalized communication is tonal communication. According to my canine expert, pets react not to the particular language you use to talk to them but to your tone of voice. In human conversation, the tone of voice conveys meaning and betrays one’s true feelings. Answering “Yeah! Yeah! Yeah!” betrays one’s annoyance. Pets understand the owner’s commands by recognizing a combination of the tone of voice, gestures, body language and facial expressions. Every dog owner knows that the dog’s eyes are very telling. They can express sadness, shyness, coyness, regret, anticipation, “I know I’m wrong” and “I just did a no-no, sorry!”, embarrassment, etc.
(24) Tonal communication is nonverbal vocalized communication.
(25) Dogs and cats also use location to indicate what they want. A dog will bark in front of its food plate when hungry, and bark behind the front door to indicate it wants to go out.
(26) Cats do the same thing. The cat also uses vocalizations and tone to convey different meanings. It “meows” for food, hisses at intruding cats and makes a low pitch “meowish growl” with its long tail curved between its hind legs when ready to fight.
(27) Dogs make threatening barks, barks asking to play, barks begging and waiting for food, and rabid growls, and they whine and wail.
(28) Music and rhythm
Music is tonal. It imparts emotion. A prime example of this is the instrumental piece “Dueling Banjos.” The movie “Close Encounters of the Third Kind” tried to portray music as the “language” of communication. However, only emotion and emotional responses were communicated, not meaning. Therefore, even though only humans can make music, music by itself, without lyrics, does not convey objective meaning. It conveys only emotion. And we all respond to music emotionally.
(29) Communication requires a template, i.e., we communicate fully only with someone who speaks the same language as we do. A Spanish speaking person can pretty much understand Portuguese but not so much Italian. China has so many dialects that up to the 1960s, the only common language between two Chinese was English. Mutual understanding thus relies on speaking and writing in the same language, or on verbal and written translations into each other’s language. A good translator needs to know both languages well enough to “interpret” what was said in each language.
Dance may have been man’s most primitive form of using movement and gestures to convey complex meaning. Dance may have also preceded man’s ability to use meaningful speech without gestures.
Communication requires a sender and a receiver. Miscommunication occurs when the sender’s message is not understood or is rejected by the targeted receiver.
Miscommunication based on a lack of understanding is obvious when two people who speak different languages face each other. Communication is then reduced mostly to hand gestures or to finding a common third language. In the modern world, hand gestures are pretty much universal except for a few gestures that have different significance in different cultures.
Miscommunication characterized by a rejection of a sender’s message occurs very frequently during soccer games. A fallen American women’s soccer player rudely waved her hand, gesturing “Don’t touch me!”, at a Brazilian player who approached sympathetically; an Egyptian men’s soccer player was rudely waved away with a similar gesture by an American goalkeeper when the Egyptian player approached in sympathy; a Mexican soccer player was rudely waved away by a fallen American soccer player; and a Mexican goalkeeper was rudely pushed to the ground by a tall Polish player when the Mexican player placed his right hand on the back of the Polish player’s head in a gesture of friendship. These are examples of rejection of a sender’s message of sympathy and friendship. In each case, the sender acted in sympathy but the intended receiver reacted with hostility.
Propaganda is the intentional propagation or communication of falsehoods and rumors and is not miscommunication. Miscommunication resulting from propaganda is often perpetuated and reinforced by the propagandists. Miscommunication occurs because of the ignorance of the audience, a failure to understand the real facts, a failure to realize the falsehood of the propaganda.
Intentional propagandistic miscommunication is used in politics and in religion. Some examples are obvious: the Nazi propaganda against the Jews, and the campaign leading to the Chinese Exclusion Act of 1882. Others include Islamic prohibitions, such as the claim that pork is dirty, and prohibitions against women in Islam. Propaganda is thus the intentional creation of bias.
The power of slogans
Popular revolutions and movements, like the French Revolution, the Socialist and Communist revolutions in Russia and China, and Nazism, were spread mostly with anti-establishment slogans. Slogans usually call for change and for the overthrow of the old establishment and tradition, and are often mistaken for good intention. However, slogans can rally the ignorant or the idealist, and the common factor shared by the ignorant and the idealist is discontent with the present, with the intransigence of the establishment, and with the subsequent violent crackdown. The expression of such discontent led by slogans is usually misguided. Misguided idealism has generally and historically resulted in violent revolutions that legitimized evil governance. Iran’s 1979 revolution was one such example. On the other hand, we have also recently seen many spontaneous “color revolutions” under idealistic slogans that have brought about relatively non-violent and peaceful change.