Kevin Bjorke



Animated Speaker, Part One


It's a hard simple fact for travelers above the 40th parallel: in the depth of winter, the indoor temperature is sure to descend quickly if the furnace isn't running.

This sort of sentence has been historically difficult for even the best computerized language bots and GPT variations: both the kind of bots that attempt to understand human language and those that try to emulate it. Computers struggle because this single sentence has a range of different embodiments that map physical attributes like vertical position to completely different physical or non-physical attributes like temperature. Why does a furnace “run,” and what defines the “hardness” of a fact? Are seasons shallow as well as deep?

How did our language get to be like this, and yet why is it so easy for humans like you or me to understand it?

To get a grasp on a possible chain of answers, let your mind wander back to the earliest imagined days of language and gesture, millennia ago.

There are many non-human animals that communicate with gesture: say, fish that raise their fins and spines when challenging one another over territory. A credible evolutionary narrative: the fish that can best scare off rivals without actually having to risk combat is more likely to enjoy the benefits of the wider territory, and over time more likely to win the survival and gene pool games.

Hominids like us have complex, social communications. This trait is shared with some non-primates, like a prairie dog chirping a warning to the rest of the colony: predators may be nearby. Or at a higher degree of descriptiveness, a honeybee’s shaking dance that replays its successful hunt for food, so that other bees can extract more nectar from the same source.

What the bee’s dance shares with much human communication is a symbolic mapping: the physical experience of flying to the right twenty meters then turning left 15 degrees and continuing for fifty meters more can be re-expressed by a bee as wiggle-wobble-vibrate-wobble-wobble and that information is directly equivalent to repeating the experience. Other bees who see and understand the dance can then also make the same flight without further guidance.

Early hominid communication probably shared this idea of connecting experience with description, with even more emphasis on cooperation: hunting groups, whether wolves or orcas or people, use communication to coordinate themselves in a speculative manner. The huntmaster may gesture “go that way” not because they have already gone there and found prey but because they’re speculating that it will lead to the group having a successful hunt, and their companions understand what to do. Ancient hunters may have had a learned playbook, not unlike a bee, but their patterns could also be deployed more flexibly.

Evolution also gave us a brand-new tool for expression: the hand.

A Symbol Machine

Other mammals use their forepaws for a few simple interactions, mostly in fights or hugs. Some have sophisticated dexterity, such as raccoons. Hominids go a lot farther. As our ancestors were forced by their environment to stand upright, not only did their musculature and skeletons evolve to the new circumstances, but the entire layout of shoulders and hands underwent changes that have left us very different, clavicle to fingertips, from any remaining ape cousins.

Evolutionary changes in arm anatomy are some of the earliest signs that our bodies have been shaped not only by our environment, but also by our technologies. In particular, the human arm is great at throwing. No ape has a shoulder that can throw overhand (chimps can barely roll a tennis ball in a lurching underhand). Overhand throws, on the open plains or at a softball match, are simply a lot more powerful. A thrown rock can drive off a leopard or take down an antelope, to name only two animals that early humans had no chance against bare-handed. And just like the threatening fish, those rock-tossing hominids who could eat more and avoid being eaten longer were more likely winners of the evolutionary game.

It’s not hard to see a connection between the end pose of a throwing gesture and one of our core communicative ones: the outstretched arm saying “over there.” A communication that follows out through the fingertips and beyond, past the boundary of the self and into the larger world.

In tall grass or when crouching, arm and hand signalling can be both more complex than that of a prey-animal herd, and nearly silent. New signals and tactics could be learned and refined over months, short-cutting the speed of evolutionary change in other creatures. Our species’ communications dominance has never since been challenged.

Hunting wasn’t the only activity of early people, and even as human limbs were changing so too were our ears and faces – especially after the advent of fire, which made everyday foods easier to chew. Jaws and face muscles could favor expressivity over force. Hominid vocalization machinery improved too, topped by a growing brain.

Long-jumping forward through time for a moment, in the past decade it’s been shown that speech and communicative gestures are both processed by the same brain areas. For the proto-humans, their improving hands and tongues were evolving in close tandem with more human-like speech centers in their brains. So similar are gesture and speech that, as the NIH scientists would discover millennia later, the interior neurological response for seeing a gesture like a finger over the lips to indicate quiet was essentially identical to that from hearing a whispered “shhhh.”

Among the cavemen and proto-cavemen, this ability to connect ideas, gestures, and speech gave the pre-humans a deeply different way to communicate from any other creature: communication could follow multiple modes. Gestures could stand in for words, or the two could be mixed together seamlessly. Inside their hominid heads, a little dictionary could operate, explaining gestures as sounds and sounds as gestures and both as information to be acted upon.

In either case, to speak or to gesture is to engage your muscles in a symbolic way, connected to some level of idea, and to use these symbols, alone or in sequence, to engender and build a parallel copy of that same idea in the mind of someone else. This is a very different task than compiling text statistics.

The gesture “over there” maps a motion to lived-experience senses of direction and travel. The timing of the motion and the shape of the arc can indicate the difference between “right there” and “waaaay over there.” The abstract and concrete are joined in the expression.

Timing is Everything

The temporality of information, the development of meaning over time as we unwind it from word to word and gesture to gesture, is also often forgotten, even though in fact there’s a whole brain area dedicated to the processing of prosody: of how pitch, tone, and timing affect language (a classic English example of how meaning can change: the difference between “we already ate, Grandma” versus a version lacking the comma).

No surprise that the joke and the pratfall depend on a common sense of timing.

Muscles, motion, music and memory all connected in new ways within those early human-like brains. The neurons of our ancestors, like our own, often stored new memories about places and things laid out along sheets of memory cells that, from a modern viewpoint, often look a little like a cartoony map. Memories about neighboring places are often stored in cells near one another, while places farther away are separate. The time it takes to connect ideas that are connected to different locations in the brain can be analogous to distance travelled between those external places.

We experience the connection between memory and location all the time. Can’t remember the name of someone you see rarely? Recalling the location where you last saw them can often give your memory a jolt. Think of an old friend. Does the memory come along with the recollection of a location?

It’s in this connection between idea and gesture that the idea of “change” probably came to be expressed as motion. Just as change in position over time is expressed as direction – “we walked over that way” – changes in other more abstract, maybe invisible qualities could be expressed as movement – “the warmth of the air went up.” A computer might recognize the pattern in a large body of text, but current algorithms lack the why of that correspondence – they only register that the words are often placed together, and are unlikely to understand the relation between rising temperatures and a hot stock pick.

Path Markers

At some point in prehistory, gesture became drawing. Maps appear to have been some of the earliest markings. The gestures an arm uses to describe a long journey over a tall hill are not that different from those used to draw a simple diagram of that journey. From there, the abstractions multiply: A circling gesture to indicate family. A line to indicate direction, waving hands to indicate water – they all have close parallels in drawing gestures, of following the energy of an idea and re-animating it.

A big difference is that a drawn gesture could now be recorded as a persistent object. The listener(s) may actually receive the story on a different day (or year), without the person who made that object being present. A listener can become a viewer, a reader – a viewer who can read or rephrase or copy or refine an object-message. Color it, edit it, re-express it in wood or stone or sand. If a line on the cave wall could represent a journey, maybe it could show what kinds of animals were found at the end of it. What might happen with them. Who went on the trip.

Utilla Ma
13,000 years ago, someone made marks on this stone in what's now Spain.
Centuries later archeologists realized that it presents a map
of the valley outside the mountain cave where it was found.

The ever more ornate permutations of symbolic expression were paralleled in sound – in song and then through musical instruments (there are ancient flutes nearly as old as the cave paintings of Lascaux). Gesture when mixed with music became dance, while speech mixed with drawing became writing.

All of these art forms – for that’s what they became – connect physicality to both the expression and comprehension of ideas. Even the most inert concertgoer or reader is moving mentally, ideas building and bridging between paragraphs or passionato. Watch someone listening to a favorite piece of music, or listening to a story or instructions on how to assemble a Lego set. The connections between physical motion, thought, and expression are clear.

That is, until photography.

You Press the Shutter, We Do the Rest

Photography removes the hand from visual expression. The same little squeeze of a button is used to record ducks at the pond, a royal portrait, or the rubble left by an airstrike. To the viewer, many of the same image idioms still apply: color, line, composition in classical terms. But the mechanism of creation has been disconnected, or moved to more-abstract levels: lighting, costumes, wind machines, Photoshop.

In the aftermath of the “Daguerromania” of 1850 or so, there were some reactions from painters who would prefer to say that photography “freed” them from strict mimesis-of-nature, rather than relegating them to the margins: Impressionists, Abstract Expressionists, Dadaists, or Conceptual Artists of every stripe (many of whom gave in to using photography, typically as a medium to record themselves creating gestures to drive the images’ ideas). Only in recent years has painting seemed to be coming back from its century of fitful dreams to reassert its ability to connect seeing and time and gesture onto a canvas again, to let the painter tell a story through poetic means without feeling the need to either meticulously recreate the static appearance of a thing or to run away from representation as somehow “untrue.”

As with photography, the arrival of moveable type for uniform printing began a similar slide away from the personal in the written word, which was often no longer written at all. Printing had previously been expensive, regulated by church and state, and that lent its official dullness an element of authority. It became preferred to an expressive pen.

Not long after photography came the typewriter, banishing pens from newsrooms and garrets, setting a widespread standard of rigid presentation that in turn led to the laser writer (no “type” to be moved) and the screen (no object to carry - no ink to page, nor pen to paper).

The banishment of the hand was nearly complete.

(to be continued)

