Episode 21: Yorick Wilks Discusses the History and Future of Natural Language Processing

Dr. Yorick Wilks // Sep 27, 2016

In this episode of STEM-Talk, we talk to one of our own senior research scientists, Dr. Yorick Wilks, renowned for his work in natural language processing. Wilks is also a professor of artificial intelligence at the University of Sheffield in England, and a senior research fellow at the Oxford Internet Institute at Balliol College.

A “war baby” born in London in the midst of the Second World War, Yorick was sent away to school due to the bombings. He excelled and went to Cambridge, where he studied with Margaret Masterman, a protégé of philosopher Ludwig Wittgenstein.

Yorick first came to America—L.A. in the 1960s—on a one-year Air Force Research Grant. Years later, he moved to Stanford University’s AI Lab, where he worked with John McCarthy, one of the founders of Artificial Intelligence.

Yorick’s research interests have been vast and rich, including machine translation; understanding and extracting meaning from language; belief representation; and human and machine communication. He has authored 14 books and many more papers, and has been the recipient of numerous awards, including, in 2008, the Association for Computational Linguistics (ACL) Lifetime Achievement Award.

Yorick also speaks several languages, including Swahili and Japanese.

Yorick is a senior research scientist at IHMC’s Ocala, Florida facility, where he was interviewed for this podcast. STEM-Talk Host Dawn Kernagis and IHMC Associate Director and senior research scientist Bonnie Dorr—who is also a leading expert in natural language processing—conduct this rich interview, full of both historical insight and wisdom about the future of AI. Yorick also spends much of his time in Oxford, England, where he lives with his wife and two beloved dogs, an Italian greyhound and a German Shepherd.

1:07: Ken mentions that Yorick was an easy selection by “a unanimous vote by the double secret selection committee.” He calls Yorick a pioneering researcher, mentor and a raconteur of the first order.

1:31: Ken continues: “Yorick was on the ground floor when AI and the Internet were in nascent stages of development.”

2:30: Dawn reads an iTunes 5-star review of STEM-Talk from “Love the ocean”: “I just listened to Joan Vernikos’ STEM-Talk, and I am convinced that I am on my way to living a healthier life from the changes I’ve made incorporating what she said in her talk. What an inspiration she is, and how proud I am to have met her at NASA, where I currently work, and know that even after her NASA days, she continues to research and publish. STEM-Talk truly finds those brilliant and interesting people and encourages in-depth discussions. Continuous five-stars.”

4:30: Dawn welcomes Yorick and Bonnie.

4:58: Yorick describes his upbringing: “I was a war baby, from a poor, working class family.” His parents worked in aircraft factories and sent him to school outside of London because of the bombings.

5:48: He got a scholarship to a good school, and another scholarship to attend Cambridge. “In some ways, I escaped my upbringing completely.”

6:00: Yorick won a school prize at age 16, and asked for Aristotle’s Metaphysics. That marked his first interest in philosophy. At Cambridge, he studied math and physics; he changed to philosophy after a year.

6:50: He considers himself in “apostolic succession from Wittgenstein” via Margaret Masterman, his philosophy tutor at college. “She wasn’t good at teaching; but she was a genius, a guru.”

7:56: Wittgenstein didn’t like women in his classes; he didn’t like ugly people, Yorick says. “But she hung in there, and Wittgenstein was the biggest influence in her life.”

8:22: Wittgenstein thought understanding the world meant understanding language…But he wasn’t anti-science at all. He was an engineer by background. He thought how we saw the world was determined by language.

9:10: Masterman thought she was carrying out a Wittgenstein philosophy, but with new technology (computers).

9:20: Yorick talks about spending the 1960s in L.A., the era of sex, drugs and rock ’n’ roll. He had a one-year Air Force Research Grant and was attached to an offshoot of the Rand Corporation, which was Bob Simmons’ group. He worked on an IBM 360, and started programming (in Lisp) his thesis ideas in L.A.

11:15: Yorick moved to Northern California at the end of the 1960s.

12:09: He took a job at John McCarthy’s AI Lab at Stanford.

12:25: Yorick recalls some of the earliest days of the Internet at Stanford.

13:54: “Margaret Masterman had the idea that you could code the meaning of language with a small number of semantic primitives (features).”
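To make the idea concrete, here is a toy sketch of encoding word meanings as small sets of semantic primitives, in the spirit Masterman described. The primitive names (HUMAN, MOVE, etc.) and the similarity measure are invented for this illustration, not her actual inventory or method:

```python
# Toy lexicon: each word's meaning is coded as a small set of semantic
# primitives. The primitives here are hypothetical, purely for illustration.
LEXICON = {
    "man":   {"HUMAN", "MALE", "ADULT"},
    "woman": {"HUMAN", "FEMALE", "ADULT"},
    "boy":   {"HUMAN", "MALE", "YOUNG"},
    "run":   {"ACT", "MOVE", "FAST"},
    "walk":  {"ACT", "MOVE", "SLOW"},
}

def similarity(w1: str, w2: str) -> float:
    """Jaccard overlap of primitive sets: shared primitives over all primitives."""
    a, b = LEXICON[w1], LEXICON[w2]
    return len(a & b) / len(a | b)

# Words that share more primitives count as closer in meaning.
print(similarity("man", "boy"))   # 0.5 -- share HUMAN and MALE
print(similarity("man", "run"))   # 0.0 -- no shared primitives
```

The appeal of the approach is that a small fixed vocabulary of primitives lets a program compare and manipulate meanings directly, without any grammar.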

15:11: Yorick’s thesis involved building a representation of English in another language of semantic primitives. It was ahead of its time. Back then, Noam Chomsky was popular: “Syntax and grammar were what mattered…We were dead set against that. We thought it was completely wrong.”

16:30: Yorick created the first semantically-based machine translation system from English to French. “It was no good as a translation; it was just the idea of doing it that way.”

17:13: Yorick coined the term “semantic parsing,” which today is commonplace, but back then was novel.

17:49: Yorick discusses his appreciation for the perspective now often referred to as “human-centered computing.” He was influenced by Martin Kay, a computational linguist who thought that translation was too sophisticated for machines alone, but rather that human/machine teamwork was necessary. “For laws, constitutions, poetry, the human must be in the loop. Machine translation is just a tool for the human.”

22:00: Yorick talks about the two broad approaches—symbolic and statistical—to computerized language processing.

23:10: “The biggest shift in language processing in the past fifty years has been the advent of massive hardware—more so than theoretical advances.”

25:06: All the new Google translations are basically statistical now. “Sometimes they work, sometimes they don’t, but they are easily produced given big data, and they are workhorses that deliver. You can get a decent translation and understand almost anything now.”

25:25: But, “language cannot be, at base, statistical. We cannot be statistical engines generating English. A novel isn’t just a very long Markov Chain… novels have structure… novels are about stuff.”
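Wilks’s point can be seen in a minimal sketch of the kind of model he means: a bigram Markov chain only records which word tends to follow which, so it produces locally plausible strings with no global structure. This toy corpus and code are invented for illustration:

```python
# A minimal bigram Markov chain over words: the model knows only which word
# follows which, so generated text has local plausibility but no global plot.
import random
from collections import defaultdict

def train(words):
    """Map each word to the list of words observed to follow it."""
    chain = defaultdict(list)
    for cur, nxt in zip(words, words[1:]):
        chain[cur].append(nxt)
    return chain

def generate(chain, start, length, seed=0):
    """Walk the chain from `start`, picking a random observed follower each step."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "the dog saw the cat and the cat saw the dog".split()
chain = train(corpus)
print(generate(chain, "the", 8))
```

Each output word depends only on the one before it; nothing in the model can make the whole text be “about” anything, which is exactly the limitation Wilks is pointing at.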

25:55:  Currently most people seem to agree that some kind of cooperation between statistical and symbolic methods will be most efficacious in AI.

26:25: Commercial break: STEM-Talk is an educational service of the Florida Institute for Human and Machine Cognition, a not-for-profit research lab pioneering ground-breaking technologies aimed at leveraging human cognition, perception, locomotion and resilience.

27:00: Yorick talks about the current popularity of network-based methods such as “deep understanding” and “deep learning.”

28:56: “We all know when we say one thing but mean another. Statistics won’t express that, but linguists have spent decades expressing it.”

29:10: “Deep learning is a bigger matter,” he says, calling it an “absurd misnomer.” Yorick continued that it has “produced good results in facial recognition, speech recognition. It hasn’t yet produced striking results in language understanding…I don’t think it’s that different than what went before.”

30:19: “We are living in a world where funding and hype and real science are mixed up in a strange way; to get somewhere and flush out real funding, you have to sound as if you are the new Messiah.”

31:15: “I inherited machine translation because it was the prime task that Masterman’s research center was set up to do. I parlayed that into a representation for language and meaning.”

32:50: “In the 1970s and 80s, I got very interested in the representation of human beliefs…that connected back to my early work on semantic representation. I began to think we couldn’t understand language unless we could understand what the other person believes about language and the world.”

34:00: Yorick has continued to work in belief representation at IHMC, specifically on work funded by the Office of Naval Research that models dialogue to try to determine who is the main influencer and has the leadership role.

34:30: In a chance happening, Yorick met David Levy, author of Love and Sex with Robots: The Evolution of Human-Robot Relationships. Levy got interested in human conversation and wanted to create the best conversational machine. Levy wanted to build a machine to win the Turing Test competition called the Loebner Prize.

35:44: Yorick led the team in 1998 at Sheffield that won for Levy, who funded it.

36:50: Bonnie talks about all the areas of research Yorick has covered: understanding language, translating language, extracting meaning from language, detecting correct word sense from ambiguous words. She asks him about the future of the fields he has worked in.

40:30: “Dialogue systems are very hot.” Human-machine cooperation is “the hottest topic in AI right now.” How are we going to control automated cars? “They’ve got to talk like us; otherwise it’s hopeless.”

41:20: Yorick talks about working on the EU’s largest project on artificial companions, or conversational companions. “Just as people tell dogs their secrets, people would have computer companions that would live with them. It would be a hand-bag sized type thing. It would be your interface with the web.”

44:12:  Yorick talks about AI systems as “cognitive orthoses.”

44:25: Yorick predicts people will warm to the idea of artificial companions. “A computer program would have photographs and talk to you about them. It would debrief you on your memories and keep your memories straight; and help to keep you mentally alive. It’s going to be a lot better than no companion at all.”

45:05: “People will have emotional relationships with anything. The bar isn’t that high.”

45:35: Dawn asks Yorick to talk about Ken Ford’s observation that after decades of pundits and philosophers arguing that AI is provably impossible, suddenly that argument has been replaced with the assertion that not only is it possible, but superhuman AI is so inevitable that it is the greatest danger ever faced by the human race. In only about a decade, the conversation has shifted from “you can’t do it” to “you shouldn’t do it!”

46:42: Yorick says that the media stokes these fears irresponsibly.

46:56: “Stephen Hawking knows no more about AI than anybody who reads the newspapers. His mind is full of cosmology … which is no help.”

47:20: “Automated weapons could do horrible things; but all weapons can do terrible things.” The problem is not what people think it is; AI itself is not the problem.

50:14: Yorick enjoys mentoring Ph.D. students; their most common problem is that their writing is so compressed, he says. The only way out of that is to have them write just one paragraph that is completely clear; and to let that grow.

51:38: He describes a different territory for research with respect to fifty years ago, when there was a “virgin territory in research.” His advice to researchers: “If you think you have anything original to say, say it and see where it takes you.”

54:10: Yorick speaks French, German, Italian, Spanish, Japanese and Swahili (which has 16 genders, meaning noun classes, not sex).

55:05: Knowing languages, you see how badly some are designed.

56:36: “IHMC is quite like John McCarthy’s lab at Stanford. You’re left to do what you want as long as someone is attracted to it; there’s no party line.”

57:58: “I have a very high view of dogs. They remember you. They have a lot of attractive features.” He has an Italian greyhound and a German Shepherd.

1:00:38: Ken muses on the fascinating interview, which covered a broad range of AI topics, Yorick’s rich educational experiences, and his participation in the early days of the Internet.

1:01:20: Ken and Dawn sign off.