The Secondary Never-Asked Question
Written by: Bruce Balentine
Eventually it comes down to the NAQ. You’ll recall that the NAQ is, “What problem are we solving and how important is it to solve it?” But there is a corollary to it. When someone makes a visionary statement that appears to have no answer to an NAQ, the new NAQ is, “Why?” This is the secondary NAQ. For example, someone says, “I want to solve the problem of a stupid bus - I want to make sure that the welcoming message only plays when it’s appropriate. My goal is to make the bus more intelligent and therefore friendlier.” Sooner or later, someone has to ask, “Why?” And the question often stumps the speaker.
One of the first signs of a discontinuity is that people who were once freely allowed to make declarations about things that are “obvious” and “inevitable” are increasingly challenged to defend their views. This is because their audience is just beginning to sense that the views are flawed - no longer “unquestionable” as they once were.
Before the current discontinuity, for example, people could say things like, “Speech is the most natural and intuitive way to interact with machines. If you don’t buy speech, then your customers will go to your competitors, all of whom are considering speech now.” The response was predictable hand-wringing and pants-wetting - “Omigosh, I have to do something.” Even if an outright action was not in the cards, there was still belief in the inevitability of the statement.
Now that the discontinuity is here, on the other hand, we are seeing more skeptical reactions: “Show me. Prove it.” Or, most deadly, “Why?” As in, “Why do I have to replace touch-tone with speech? Why is natural language better than directed dialogue? Why will my customers go to my competitors? Why shouldn’t I wait until next year? What’s wrong with the simplest solution? Where is the urgency? Who are you, anyway? And what are you doing in my office?”
In other words, more and more people nowadays are asking the secondary NAQ. So as a start on processing the standard answers to that set of questions, here are a few that have been debunked - if you ask the NAQ, do not allow anyone to give you these answers without some real follow-on explanation.
The Just Because argument
This is one of the most common, and is similar to the one used by parents or children when they want to terminate a discussion quickly. The idea is to make a statement that seems on the surface to be self-evident, and then quickly veer the conversation to exploiting rather than backing up the statement.
The method is simple:
- Generate a string of sentences with positive-sounding words;
- Assume that the sentences are true and proceed immediately to greed; and,
- In the event of dissent, simply repeat the sentences with additional phrases aimed at making the dissent sound petulant, arrogant, non-visionary, or unfaithful.
Here’s an example conversation:
Believer: Speech is the most natural and intuitive way to communicate. Speech is uniquely human. We are born already predisposed for spoken language. Speech is the thing that separates us from the animals. Speech is so intuitive, you don’t have to learn it - by the time we’re grown, all of us can use speech. Speech goes back eons to the dawn of mankind.
Listener: Wow. Sounds cool.
Believer: Speech will therefore become the most common and important user interface medium ever. Not only for telephones, but for everything. If you get in on the ground floor then you can become rich beyond your wildest expectations. It’s only a matter of time. (sound of propellers revving)
Dissenter: But all you’re doing is saying lots of positive-sounding words. You’re not explaining why speech is better than, say, eye-hand coordination for certain tasks. You’re not telling me what you mean by “natural,” or “human,” or “intuitive” - nor are you telling me what makes speech exhibit more of these properties than (for example) opposing thumbs, binocular vision, or male-pattern baldness.
Believer: Well some people are negative even about things that are obviously true, but the rest of us know what’s holding back technology products - it’s making them easy to use. And nothing is easier than speech. It’s simply the most natural and effortless human activity.
The technique reminds me of the Texas Governor in The Best Little Whorehouse in Texas, who - when asked a question by a reporter - smiles, puts his hat on sideways, turns suddenly into the hat, and then vanishes. “Dance a little sidestep” is the country phrase and song.
There are standard variants on the argument:
- The religious faith argument;
- “That’s my story and I’m stickin’ to it;” and,
- “Gotta Build Bypasses”
The Big Vision argument
I had a colleague once who was working with me on a simple interactive device - we called it a “speech button” for lack of a better metaphor. The idea was to build a simple and robust branching device - a 2-way or 3-way kind of dialogue component - that was resistant to background noise and to turn-taking errors. The device had utility in a small footprint low-end product. The simplicity was important because it was 1991. Hardware was limited in power.
His friends and colleagues in the speech lab (there was one in every big company back then) - specialists whom my friend referred to as “the cognoscenti” - were uninterested in helping. “Why are you down there fooling with speech buttons?” they asked. “Why not come up here to where the big boys are playing - fully fluent natural conversational speech?” They went on to predict, “It’s a little ways out, but I’m sure we’ll have it nailed by 1995. Certainly by 1998 at the very latest.”
I came to view this as the Big Vision argument. It’s a very male-primate thing, and can be summarized with the declaration, “My Vision Is Bigger than Your Vision.” There’s always a lot of hissing and displaying of brightly-colored bottoms (metaphorically) that goes along with the statement. The basic idea is that something that is practical and useful is inherently uninteresting - thinking big is the way to become alpha male.
The argument is basically ingrained early by the mentality of the boy’s locker room. If you want to produce something practical and useful, you’re probably a dork and you’ll get popped with a towel. But if you have a Big Vision, then - well, umh - wow!
The Big Vision was one of the early contributors to the selling of the future that set in just about that time in the speech industry and has been going on ever since.
The Fear of Being Wrong argument
New technologies that have “obvious” value often have champions who are not interested in rational discourse. So they stake out their turf at the Hyde Park of media - usually conferences and sometimes publications related to conferences - and then make their declarations. They are usually able to simply shout down the dissenters, who tend to tire more easily because of their lack of faith (read: “ability to benefit personally from the claims”).
These midway barkers pull out the “fear of being wrong” arrow from their quiver at first criticism - and usually don’t need to dip further into their armory. The argument goes something like this:
Believer: This is the FUTURE!
Dissenter: But it doesn’t make sense.
Believer: Ladies and gentlemen. Let me just read to you the words from some of the dissenters of the past.
“This new-fangled telephone is good for government and business, but there’s no use for it in everyday life.”
“Americans don’t have time to watch television - it’s just a fad that will pass.”
“I can’t think of a single reason why anyone would ever want to own a computer.”
“Internet? World Wide Web? It’ll never catch on!”
Dissenter: Okay, I GIVE!

A VISIT TO A SPEECH CONFERENCE
Of course these arguments are all red herrings. There have been as many wrong technology predictions as right ones through the years, and the success or failure of any given past prediction is simply irrelevant to the question of value in a new product or service. We could talk just as easily, for example, about 8-track tapes, Betamax versus VHS, quadraphonic sound systems, and the Wankel engine.
If something has value, then we should be able to describe exactly what that value is. I challenge anyone involved with speech recognition to deliver a new answer - one that goes beyond those that have been debunked here - that gives a compelling reason for applying ASR as a user interface technology. An ergonomic reason.
Why a Discontinuity? Why Now?
Written by: Bruce Balentine
Whenever I mention these thoughts to others, I often get a cynical but credible reply. “Bruce,” I hear, “you have to remember that this is a business. There are always inefficiencies and misunderstandings in the business world. Why do you assume that this one is any different or that it will ever change? How can you be arrogant enough to think that you can affect an entire industry?”
Well, of course, my reply is that I’m not pretending to affect any industry. What I’m saying is much less egotistical. I’m saying that these changes are happening right now - due to forces beyond my or anyone else’s control - and there’s nothing that anyone can do to stop them. I don’t believe that I’m causing them. I’m just describing them.
But let’s play out the argument. Why not more of the same? After all, the best predictor of the immediate future is the immediate past. So forecasting in business is like forecasting the weather. You can do quite well by simply declaring that tomorrow will be very much like today. Most of the time you’ll be right. In fact, the only time you’ll be wrong is when the weather changes - that is, when there is a discontinuity. So there is personal advantage to simply exploiting the current climate. Even blindly.
Exploiting the current climate is, in fact, what I predict will continue to happen - as it has for the past decade - until it cannot any longer. This is because staying the same is simply the path of least resistance. Why not business as usual? Why not just sell one-off solutions into the high-end IVR market and then tout the future whenever there are obstacles? Why not give lip service to speech as a future best-seller to close immediate opportunities?
Well, of course that is the correct behavior when change is unlikely. But it’s a disaster when change is imminent. Because in business - like the weather - the discontinuities are where the information resides. The main thing you want to know about the weather is “when is change likely to occur?” You miss all of the information when you look at the equilibrium; it’s the discontinuities that represent business opportunity.
The first and most important reason for this particular discontinuity is the simple fact that the enterprises’ business problem is not yet solved. Corporations by the thousands have installed IVR systems at great expense, gone through iteration after iteration, tuned until their hearts exploded, and incorporated every fashion and fad that has come their way for the past two decades - in some cases even longer. And they still are not enjoying the cost-saving benefits, the stability of a manageable system, or the customer loyalty that they sought. In many cases, these corporations are on their fourth or fifth iteration. And the call center and IVR costs just keep rising.
Other reasons for the timing of the discontinuity include:
- Credibility for the speech industry is at an all-time low, because predictions have been repeatedly wrong;
- We are on the far side of all excuses - Y2K, 9/11, the “dot com” bubble, impending war. All of the political, judicial, economic, or technical reasons for holding back decisions are now in the past;
- Economic pressures are strong and getting stronger; and,
- The buyers have become users.
This last one is especially interesting. As speech and touch-tone IVR systems have permeated all enterprises, the people making decisions about those systems are developing personal histories as well. Buyers leave their offices and use flight arrival and departure systems, call an IVR for traffic information, check on bank accounts by calling, and register their children for school with the telephone. Every one of these buyers has now become a user. And many of them are fed up. And that means that - fuming as users - they are prepared to change their behavior as buyers.
There’s never been a worse time to be on the wrong side of this particular chasm.
Simplicity: The First Step
Written by: Bruce Balentine
So let’s work a simple and concrete example of “being a machine.” For years now, I have watched speech vendors go overboard on yes-no questions. Variety in speaking yes or no responses to questions of all kinds is commonly touted as a feature of speech and an intrinsically “natural” aspect of a spoken user interface. I’ve seen poster board signs at trade shows listing scores of yes and no synonyms - from simple and socially-common transformations like, “yep” and “sure,” to more exotic and frivolous constructions like, “okie dokey,” and “are you kidding?”
The argument for including these words in the yes-no grammar is that this kind of “variety” is the very spice of language that makes a VUI or a TUI come alive. You’ll recall that Jack used the same argument when he incorporated “on” and “off” variants in his Say It to See lighting product - phrases like, “I can’t see,” and “g’night John-boy.” And, as we’ve learned, ASR technologies today do a very good job of discriminating between these variants. So when users use them, they usually work well.
The problem, of course, is false acceptance. If you observe users in real life interacting with these systems, you rarely observe the use of any but a few common yes-no synonyms. But at the trade shows, you see demonstrators delivering every exotic example of “yes” and “no” with reckless abandon. And it’s not uncommon to see speech champions play the role of provocateur when “testing” other people’s applications, answering yes-no questions with, “sure, why not?” and “well, ah reckon ah caint disagree with ya own thaht won.” Just to see what it would do. In other words, these synonyms are here to protect us from monkey butts! And as we’ve seen, real users aren’t monkey butts, because the set and setting of real users is more task-centric. Real users aren’t calling to play with the IVR; they’re calling to get something done.
So all of these yes-no synonyms hide like cockroaches in the yes-no grammar, just waiting for unplanned circumstances. And the light sends them scurrying when they are hit with traffic noise, side conversation, and disfluencies. In other words, we get false acceptance on the occasional yes-no question and it throws the whole system into a death spiral.
A good machine knows that risk. A good machine also knows that “yes” and “no” answers are the primary recovery mechanism for errors. So a yes-no question must be exceptionally robust, accurate, fast, and simple. And when it’s not, it will amplify errors. A good machine therefore asks a question with the expectation of serving a user - not entertaining a monkey butt. A good machine is perfectly within its rights expecting such a user to give clear and standardized answers.
And how does this play out with the users? Well it’s just no problem for most. The words “yes” and “no” are very simple to say, and people find the expectation that such words are appropriate in a formal and businesslike interaction quite understandable. Especially if it makes everything else go faster. So - although people do spontaneously deliver synonyms, and there are, of course, monkey butts - it is very easy for an IVR to ask users for “yes” and “no” and to get a sensible response. This is provided the IVR can detect and reject the synonyms.
IVR: Is that what you want?
User: Yes, that’s it.
IVR: Please answer yes or no.
User: Yes.
After the interaction, the user infers that this is a "rule" (answering yes or no) and most often finds it unobjectionable. The IVR, on the other hand, can adjust its thresholds for best performance on "yes" and "no" without having to worry as much about false acceptance and false rejection. The entire application is more stable, AND the user reports this yes-no expectation as a predictable and easily-learned "feature" of the IVR. We rarely see users become upset about it.
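The exchange above can be sketched in code. What follows is a minimal Python illustration, not any vendor's API: the `Recognition` record, the `listen` and `say` callbacks, and the 0.6 confidence threshold are all assumptions invented for this example. It shows a strict two-word grammar that rejects synonyms and reprompts instead of trying to accept them.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Recognition:
    """Hypothetical recognizer result: what was heard, and how confidently."""
    text: str
    confidence: float  # assumed to range from 0.0 to 1.0


# The whole grammar: two words, no synonyms.
STRICT_GRAMMAR = {"yes": True, "no": False}

# Assumed value for illustration; a real system would tune this on live traffic.
ACCEPT_THRESHOLD = 0.6


def interpret(result: Recognition) -> Optional[bool]:
    """Return True/False for a clean "yes"/"no"; return None to reject."""
    word = result.text.strip().lower().rstrip(".!?")
    if word in STRICT_GRAMMAR and result.confidence >= ACCEPT_THRESHOLD:
        return STRICT_GRAMMAR[word]
    return None  # synonyms, extra words, and low-confidence input are rejected


def ask_yes_no(
    question: str,
    listen: Callable[[], Recognition],
    say: Callable[[str], None],
    max_attempts: int = 3,
) -> Optional[bool]:
    """Ask once, then reprompt with the rule until a clean answer arrives."""
    say(question)
    for attempt in range(max_attempts):
        answer = interpret(listen())
        if answer is not None:
            return answer
        if attempt < max_attempts - 1:
            say("Please answer yes or no.")
    return None  # caller decides what to do, e.g. hand off to an agent
```

Fed the dialogue above ("Yes, that's it." followed by "Yes."), `ask_yes_no` speaks the reprompt once and then returns `True`. Because only two words can ever be accepted, the threshold has a single, narrow job, which is the stability argument made in the text.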
So here is the first of a whole sequence of design methods that begin to specify what constitutes a "good machine." There’s less variety, and no entertainment - but it does its job better than when it attempts to be "more human." The tradeoff between user and machine is minor, but the benefits are tangible:
- All yes-no questions are the same, so user learning is fastest;
- Yes-no questions use the same software, so testing is more thorough;
- No exceptions to yes-no behavior means less software to write;
- No exceptions to yes-no behavior means fewer rules for the user to learn;
- Lower false acceptance rate reduces potentially fatal state errors; and,
- Lower false rejection rate on yes-no questions means less user frustration and faster task completion.
The bullets stand in opposition to the reason FOR using yes-no synonyms:
"That’s not how a person would do it."
The sentence is cast in various ways, using words like "less natural," or "it’s not very friendly," but they all boil down to this one argument. And I say that the tradeoff is therefore worth it. All I have to do is accept that my design will lead to a good machine rather than a bad person. What I get in return is an effective IVR that delivers business value in less time and at lower cost.
I’ll take that tradeoff any day.
Written by: Bruce Balentine
You can make the connection that Ryan and his colleague are exactly like Droog and Ayla - using tools to do their work, and using spoken human language to collaborate concurrently with each other. Two channels of interaction - each uncontaminated by the other. In both cases, it is theory of mind that makes the difference.
Now that we have touched on theory of mind (ToM), I would like to drill down a bit and explore it more thoroughly. First, let’s define the term.
ToM is the ability of one human being to empathize with another by imagining what that other is experiencing subjectively. The ability depends on the construction of a mental model - a theory - of the other person’s mind.
One of the better-known ToM tests is called the Sally-and-Anne test. In this experiment, one or more children are presented with two dolls or puppets, Sally and Anne. They are playing with a marble. Sally puts the marble into a basket and then leaves the room. Anne then takes the marble out of the basket and puts it into a box. This all happens in full view of the spectators - children, animals, or other experimental participants. Shortly after, Sally returns.
At this point, we intervene, stop the show, and ask one of the spectators, “Where will Sally look for the marble?” The expected answer is, of course, that Sally will look in the basket. We all know that Anne moved the marble to the box, but we also know that Sally can’t know this. Sally was not here when the marble was moved, so Sally thinks the marble is in the basket. We use this theory of Sally’s mind to predict Sally’s behavior - she’ll look in the basket.
Most higher primates, extremely young children, and people of all ages with autism do not answer this way. They all say, instead, “Sally will look in the box.” Why? “Because that’s where the marble IS.” These spectators do not, it is concluded, possess a ToM ability. And the implication is absolutely mesmerizing.
Those without ToM (theory of mind) have an internal model of external reality. The world is a certain way. And that model is singular - the world is only a certain way. The marble is in the box. Period. So that’s where Sally will look for it.
Those with ToM have the same internal model of reality. They know just as well where the marble is. But they also have the ability to construct a second-order reality and to project it onto Sally’s mind. “Even though the marble is not in the basket, Sally thinks it is in the basket, and so that is where she’ll look for it.” They have constructed a theory of Sally’s mind that does a better job of predicting Sally’s behavior than the single model held by those without ToM. This second-order reality does not interfere with the first-order representation of “true” reality - the spectator still knows that the marble is in the box. Sally will fail to find it and will then become confused (“Hey - who moved my marble?”). But Sally has a different reality. And it is real to her, although not to the spectator.
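One rough way to picture the difference is as two prediction functions over the same scene. The Python sketch below is only an illustration under simplifying assumptions: the `Agent` and `Scene` classes and both `predict_...` functions are hypothetical names, and a belief is reduced to a single remembered location that updates only while the agent is in the room.

```python
class Agent:
    """A puppet with its own (possibly outdated) belief about the marble."""

    def __init__(self, name):
        self.name = name
        self.present = True            # is the agent in the room?
        self.believed_location = None  # where the agent thinks the marble is


class Scene:
    """First-order reality plus each agent's private view of it."""

    def __init__(self, agents):
        self.agents = {agent.name: agent for agent in agents}
        self.marble = None  # where the marble actually is

    def move_marble(self, place):
        self.marble = place
        # Only agents who witness the move update their belief.
        for agent in self.agents.values():
            if agent.present:
                agent.believed_location = place


def predict_without_tom(scene, name):
    """Single model of the world: everyone looks where the marble IS."""
    return scene.marble


def predict_with_tom(scene, name):
    """Second-order model: the agent looks where SHE believes it is."""
    return scene.agents[name].believed_location


sally, anne = Agent("Sally"), Agent("Anne")
scene = Scene([sally, anne])

scene.move_marble("basket")  # Sally puts the marble into the basket
sally.present = False        # Sally leaves the room
scene.move_marble("box")     # Anne moves the marble to the box
sally.present = True         # Sally returns

print(predict_without_tom(scene, "Sally"))  # box
print(predict_with_tom(scene, "Sally"))     # basket
```

Both predictors share the same first-order fact (`scene.marble` is the box). Only the second keeps a separate record for Sally, and that record is what lets it predict the basket.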
This principle, of course, extends to possible third-order representations and beyond. What if the spectators saw Sally peeking through a window - unbeknownst to Anne - while Anne was moving the marble? In such a case, Sally may very well look into the basket, knowing that the marble is actually in the box, in order to deceive Anne into thinking she doesn’t know something that she in fact does know. In other words, the spectators construct a third-order representation of Sally’s mind in which Sally knows where the marble is, but deceives Anne by pretending to look elsewhere. Anne in turn now has a second-order representation of reality that is correct, but a second-order ToM of Sally that is wrong. Sally did this play-acting, perhaps, to cause Anne to form an incorrect theory of Sally’s mind - possibly to Sally’s future advantage.
Whew! It gets rich quickly.
But this kind of ToM projection happens all the time, and is the key to pretending, joke-telling, lying, and other “he thinks, she thinks” knots.
All normal children above a certain age pass this ToM test with flying colors. Almost all autistic people fail the Sally-and-Anne test, and indeed, the test is used in part to diagnose and to study autism. Most of us cannot imagine a world in which our ToMs are compromised. But the symptoms of autism derive from the lack of ToM.
Symptoms of autism include:
- Inability to pretend;
- Literal interpretation of environmental cause and effect;
- Inability to understand jokes;
- Poor social skills with other children;
- Lack of an awareness of the existence or feelings of others;
- Impaired ability to imitate; and,
- Impaired use of language and poor comprehension of language.
There are many more, but you can see that our ToM abilities gave us some of the more important traits that we value as humans. And what’s more, I’m going to argue that ToM is essential for language understanding.
Harrison Ford Views More Images
Written by: Bruce Balentine
OK, so now let's compare the Blade Runner sequence with a similar scene in Patriot Games, a film of the early nineties. This time, Harrison Ford plays Jack Ryan, a CIA analyst who, by happenstance, interferes with an IRA assassination. Unfortunately, a renegade faction of the IRA targets him and his family because of Ryan's intrusion into their affairs.
Ryan: "Bring up camp 18 again. Tighten on the camp here."
(points to screen and then studies it for a while)
"Let's look up in here ..." (gesturing with a pen at the computer monitor) "...see what we can see."
Viewer: [a blue box surrounds area and enlarges ... just like in Blade Runner; audio a bit more subdued - background music more dramatic]
Ryan: "Show me this group in here."
Viewer: [people standing in the desert, long shadows behind them ... it's obvious that the sun is not directly overhead. Another blue box]
Ryan: "hmm ... nah, that's nothing ... go back to the other, umh, bigger picture. What do we see out here?" (gesturing to another area to the lower left of the screen).
Viewer: [operator zooms in, manipulating keyboard and mouse]
Ryan: "What's that? ... tighten up on that, will you? Can you enhance that? Sharpen it up?" (pause)
(a grim smile on his face) "That's ..." (long pause as he realizes what he's looking at)
What's remarkable about this parallel dialogue is the shift - only in the course of a few years - in the popular view of how one might go about using imaging software. In the Blade Runner case, we have a film - released in 1982, but written some two years earlier - that predates the broad popularization of personal computers.
Of course the history of the PC is well-known, and certainly aficionados of the time were aware of these machines. From the CP/M systems of the mid-seventies, through Apple II and other highly influential hobbyist products, to the 1981 release of IBM's Billion-Dollar Baby, the personal computer would have been known to the authors of this Future Noir movie. But everyday experience with manipulating images - so accepted by movie viewers today - would not have been known at the time. The "futuristic" technology of the early 80's was speech.
So Harrison Ford in BR sits on his living room carpet and talks to his imaging system. His interface is a command and control interface - he tells the machine directly what to do. And the compelling reason for using speech recognition rather than some kind of manual control is, apparently, to allow Deckard to drink whiskey.
Eyes busy, hands busy, gullet busy.
Very 1940's. And also very Jetsonian.
In Patriot Games, we have a more modern view of image manipulation. The data are visual, so they appear on a large color display - just as with BR. In this case, the images are from satellites. A human operator manipulates the images using a powerful eye-hand coordination interface. He moves the mouse to outline areas on the screen. The film was made almost exactly ten years later than BR.
Just as Droog and Ayla used their hands to manipulate a man-made tool (hammerstone and flint) while using speech to communicate with each other, the Patriot Games model of collaborative working shows an effective eye-hand coordination interface for image processing, coupled with a human-to-human channel that supports ToM and planning discourses.
Now stick with me on this. A human being is manipulating an image processing system. He is using state-of-the-art user interface technologies to do that. He has a large color display, a keyboard plus a pointing device, and his very high speed computer is incredibly responsive as he manipulates the satellite images to locate what he is looking for.
Meanwhile, his colleague is watching the same display. The colleague knows better than the operator exactly what he is looking for. So they are collaborating. The image processing system provides a direct manipulation interface, while human speech provides a meta-channel that supports the human collaboration. The two channels together represent a much more powerful and symbiotic way of accomplishing work. I believe that this is at least part of what Ben Shneiderman means when he suggests that, "speech is reserved for humans."
Now suppose Harrison Ford were able to talk directly to the image processing system - just as he had in the future ten years earlier in Blade Runner. He would have commanded the system while observing the display - requiring that he know more about the capabilities of the system. But he also would have devoted much of his attention to the needs of the interface itself. He would have alternated between task and viewer operation.
But instead, Ryan has a sentient colleague - one who knows how to operate the viewer and whose job is to focus only on that. Ryan is freed to think about what he is looking at. The two together get more done than either would be able to achieve separately - a very powerful use of spoken human language.
Learning To Manipulate Artifacts
Written by: Bruce Balentine
Underlying all practical philosophies, including design philosophies, is a basic position on knowledge transfer. Droog is the clan’s master toolmaker, and - according to the ruthless logic of aging and group survival - every master must inevitably become a teacher. I enjoy the pleasant fantasy of imagining Droog on a crisp autumn morning some 30,000 years ago. The glaciers were then just receding from the steppes of Central Asia, and researchers have recovered many artifacts from the indigenous people of that exciting prehistoric time.
Droog finds a spot near the river. Here he has water for rinsing his tools, and an ample supply of local stones. He sits on his heels in a squatting position and unwraps his tool bundle. The hide from a giant hamster serves as a working surface, and Droog spreads it across the ground in front of him. He arranges the collection of stones, shell fragments, teeth, bones, and sticks neatly on top of it. Whispering a prayer to focus his mind, Droog takes a deep breath and begins work.
Two hours later, Droog is lost in concentration. Lying beside him are several newly-made flint spear tips, as well as a hand-axe. It’s been a productive tool-making session, and Droog is pleased with the results. He sets down the half-finished adze and pauses briefly. The sun is overhead and sweat drips from his forehead into his eyes. “It’s gotten hot,” he remarks. Leaning forward he cups his hands to collect river water and splashes his face several times.
This is the point where Ayla approaches. Her shadow falls across his workplace and he looks up and smiles at her. It’s been a good tool-making summer, and these last two hours have been especially prolific. Droog is still a little introspective but alert from his intense industry, and is in the mood for sharing. Ayla in turn is a quick learner and hungry for new skills.
It’s a perfect time for some tutoring.
The illustrations below show the interaction between Ayla and Droog. Look closely at the sequence and then focus on empathizing with Droog. He knows what it feels like to knap flint. But such knowledge is difficult to convey directly through spoken language because it has to get into the body - it has to feel right. So Droog gives Ayla a hint and then watches her strike a blow. He then looks into her mind and imagines what that blow felt like to her as she performed it.

NO, DON’T JUST HIT THE TWO TOGETHER. LOOK ...

THIS IS THE HAMMER STONE THAT MOVES.

THIS IS THE STILL-STONE THAT WILL BIRTH YOUR TOOLS.

LOOK AT THE STILL-STONE AND IMAGINE THE SPEAR TIP SEPARATING.
This last observation is an incredibly important one. Droog has memories that Ayla hasn’t. One memory is what it felt like both before and after he successfully learned the craft of flint knapping. But Droog has another feature - one unique to conscious and sentient human beings. Droog can form a theory of another’s mind. Droog can literally look into Ayla’s mind and feel empathically what she is feeling. So he observes Ayla’s toolmaking blow, empathizes with her internal mental state, and then speaks to her in an attempt to modify that mental state - to describe the delta between what Ayla is feeling now and what she should feel.
This is the perfect use of language. Not as a mechanism for manipulating artifacts, but as a metachannel for connecting two sentient minds as they work together toward a common goal.
When Droog says, “Your arm should feel loose, not stiff,” he is describing his body-feeling when he delivers the blow correctly. Droog uses introspection to remember this feeling, and uses empathy to infer Ayla’s different - and wrong - body-feeling. He then uses language to describe an alternate feeling. He speaks with the explicit goal of influencing her internal mental state.
Now step into Ayla’s mind. She is trying to find a feeling that she has not yet experienced. Her natural tendency is to concentrate on the flint that she is striking - trying to place the hammerstone onto the precise spot. But she is now empathizing with Droog, trying to “feel” something she has never felt. She listens to Droog’s spoken language as he describes these feelings. She tries to “visualize” the flake coming away, thinking about the flake rather than the stone itself. Then she tries to imagine “pushing the hand through” the seam between flake and stone rather than “hitting the stone” directly and brutally.
It works. A flake - the perfect size for a spear tip - gently lifts itself from the flint stone and peels smoothly away with a “tink.” Ayla has successfully knapped her first tool. And Droog has successfully imprinted his own mind onto Ayla’s in an ancient and vital human ritual. As a result of this interaction, the young will receive the knowledge of the old before the old inevitably die. The tribe will survive. Tools will outlive toolmakers. Civilization will emerge from nature. The future will be better than the present.
Magic! And wonderful.
YOUR ARM SHOULD FEEL LOOSE, NOT STIFF. DON’T HIT THE STONE - PUSH THE HAMMER THROUGH THE STONE.