Human-Machine Communication & Language Processing


IHMC researchers are working on natural language computer interfaces and digital companions for conversational understanding, with the aim of developing computerized assistants that accommodate a range of abilities, desires, and needs of users through adaptable natural language processing. Handling a variety of spoken and written natural language phenomena and adapting to conversational conventions are central foci of research led by James Allen, Bonnie Dorr, and the late Yorick Wilks, together with a team of internationally recognized scientists: Archna Bhatia, Adam Dalton, Greg Dubbin, Lucian Galescu, Kristy Hollingshead, Ian Perera, Choh Man Teng, and Brent Venable. The systems they are creating must process a torrent of words, often in informal dialogue, whether from a conversation in a meeting or from human-computer communications in an ambient intelligent environment.

An effort is under way with the Tampa V.A. hospital to build a companion agent, Calonis, that can hold meaningful conversations with wounded warriors, some of whom may have dementia. The aim is to engage those who may have lost all will to remember, think, or act. For example, the companion avatar can use family photos coupled with information gleaned from conversations and Internet searches to interact with the patient and try to reawaken access to his past life. Additional ways are being devised that allow these companions to think ahead and reason about preferences, that is, which actions and responses are most important for different types of patients.

In a related effort with the Tampa V.A. hospital, DESIPHER, researchers are examining improved speech recognition for agents, focusing on speech divergences in the context of neurological disorders such as Amyotrophic Lateral Sclerosis (ALS). This work involves the detection of speech signals to alert caregivers and medical professionals to underlying problems, and the development of improved recognition of and adaptation to impaired speech more generally. The speech recognition software now under development in the INSPiRE project combines not only the sound of the speech but also visual data from the voice signature’s waveform and spectrogram, as well as pitch variability. These features will be used for understanding impaired speech and for adapting to changes in speech patterns over time.
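As a rough illustration of what a pitch-variability feature could look like, the sketch below computes the spread of a fundamental-frequency (F0) contour in semitones. It is not taken from the INSPiRE software: the function name, the convention that unvoiced frames are marked with 0, and the choice of the median as the reference pitch are all assumptions made for this example.

```python
import math
import statistics

def pitch_variability_semitones(f0_hz):
    """Spread of a pitch contour, in semitones around its median.

    f0_hz: per-frame F0 estimates in Hz; frames with a value <= 0 are
    treated as unvoiced and ignored (an assumption of this sketch).
    """
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        return 0.0
    reference = statistics.median(voiced)
    # Convert to semitones so the measure does not depend on a speaker's base pitch.
    semitones = [12 * math.log2(f / reference) for f in voiced]
    return statistics.pstdev(semitones)

# A flat contour has no variability; a contour that spans an octave has a lot.
flat = pitch_variability_semitones([120, 0, 120, 120])
wide = pitch_variability_semitones([100, 0, 200])
```

Tracking a value like this across recording sessions is one simple way a system might notice that a speaker's pitch range is changing over time.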

IHMC’s language research also focuses on the many ways that language (surprisingly) provides a sensitive measure of a person’s cognitive state and mental health status. Many mental and neurological health conditions present with changes in language and behavior, such as a switch in the types of topics discussed, a shift in word usage or syntax, or variations in speech acoustics. IHMC is researching the use of language as an assessment tool for a variety of conditions, including Alzheimer’s, ALS, Parkinson’s, and mild traumatic brain injuries.

Deep language understanding and recognition of dialogue partners’ intentions and beliefs are also central foci for IHMC researchers, both in determining the beliefs and sentiment of participants in DARPA’s DEFT program and in deep reading in DARPA’s Big Mechanism program.

DEFT researchers on IHMC’s CUBISM (Conversation Understanding through Belief Interpretation and Sociolinguistic Modeling) project are developing technologies that can judge the roles of event participants in persuading others to adopt their beliefs or to take actions, or in being persuaded themselves. The ultimate goal is to show that a computer can read, understand, and filter informal inputs (e.g., blogs) to identify those that would be of the greatest interest to human analysts.

In a related effort (Viewgen), the beliefs of individuals participating in a dialogue are modeled. Beliefs and plans of dialogue agents may differ, especially in cases where agents are conspiring to influence or deceive other agents. Understanding a conversation and what is really going on among its participants, which not all of them may realize, requires cognitive computational structures of this sort.

Big Mechanism researchers on IHMC’s DRUM (Deep Reader for Understanding Mechanisms) project are developing a system that uses the TRIPS parser to read papers and combine the research results of individual studies into a comprehensive explanatory model of a complex mechanism. Complex mechanisms consist of many highly interconnected components, yet they are often described in disconnected, fragmentary ways. Examples include ecosystems, social dynamics, and signaling networks in biology.

The study of these complex systems is often focused on a small portion of a mechanism at a time. In addition, the huge volume of scientific literature makes it difficult to track the fast developments in the field to achieve a comprehensive understanding of the often distant and convoluted interactions in the system.  The DRUM system will automatically read scientific papers, extract relevant new model fragments, and compose them into larger models that will expose the interactions and relationships between disparate elements in the mechanism.
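The idea of composing fragments into a larger model can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: it does not use the TRIPS parser or DRUM's actual representation, the triple format `(source, relation, target)` is hypothetical, and the two "papers" are hand-written stand-ins for extracted fragments.

```python
from collections import defaultdict

def compose_fragments(fragments):
    """Merge (source, relation, target) fragments into one adjacency map."""
    model = defaultdict(set)
    for source, relation, target in fragments:
        model[source].add((relation, target))
    return model

def downstream(model, start):
    """Return every entity reachable from `start` by following relations."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _, target in model.get(node, ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

# Hypothetical fragments, each standing in for what one paper reports.
paper_a = [("EGFR", "activates", "RAS")]
paper_b = [("RAS", "activates", "RAF"), ("RAF", "phosphorylates", "MEK")]

model = compose_fragments(paper_a + paper_b)
print(sorted(downstream(model, "EGFR")))  # ['MEK', 'RAF', 'RAS']
```

Neither hypothetical paper alone links EGFR to MEK; the connection only appears once the fragments are merged, which is the kind of distant interaction a composed model is meant to expose.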

Multi-modal interaction is also a central focus of IHMC researchers. Much of human language revolves around physical experiences, particularly visual ones, in the world. For an AI system to interact with a person, it therefore needs to share in those experiences in some way, and use them to develop more sophisticated, abstract concepts while learning tasks through interaction with a human.

To address this challenge, IHMC has developed SALL-E (Situated Agent for Language Learning), a system that uses pragmatic inference and child language learning strategies to learn physical attributes and names of objects in real time from a human describing objects in front of a video camera. IHMC is currently extending this system, coupling it with the dialogue management and semantic representations of the TRIPS system, so that it can learn and communicate with a human to complete joint tasks and build more complex semantic representations from primitive, grounded concepts.

IHMC has been collaborating with the University of Albany on an additional architecture for multimodal event detection and interpretation based on a 4-dimensional space-time representation that also integrates information derived from language sources such as associated texts (tweets, captions, dialogue, demonstration posters inside images, etc.). This work is central to other applications as well, including work by IHMC researchers on situational understanding of events for detection and possibly prediction of cyber-related activities.

Event detection is also a central focus of the DISCERN project, where the goal is to enable automatic understanding of written language well enough to identify emerging disaster events. This research problem requires machine processing of various pieces of information, such as parses, semantic role labels, event ontologies, and coreference information, as well as various techniques to make use of all this information. For example, parses of utterances provide information about predicates and their arguments, semantic role information suggests what role each argument is playing, and a mapping between the event ontology and semantic role labels provides additional information about events as they are emerging.
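A minimal sketch of how these pieces might fit together is shown below. It is not DISCERN's implementation: the `Predicate` class, the tiny `EVENT_ONTOLOGY` table, the slot names, and the PropBank-style role labels (`ARG1`, `ARGM-TMP`, and so on) are all assumptions chosen for illustration, and the parse and role labels are supplied by hand rather than produced by a parser.

```python
from dataclasses import dataclass

@dataclass
class Predicate:
    lemma: str   # predicate found by the parser
    roles: dict  # semantic role label -> argument text

# Hypothetical ontology: predicate lemma -> (event type, role -> ontology slot).
EVENT_ONTOLOGY = {
    "flood": ("Flood", {"ARG1": "affected_area", "ARGM-TMP": "time"}),
    "collapse": ("StructuralCollapse", {"ARG1": "structure", "ARGM-LOC": "location"}),
}

def to_event(pred):
    """Map a role-labeled predicate onto an ontology event, or None if unknown."""
    entry = EVENT_ONTOLOGY.get(pred.lemma)
    if entry is None:
        return None
    event_type, slot_map = entry
    # Keep only the arguments whose roles the ontology knows how to interpret.
    slots = {slot_map[role]: text for role, text in pred.roles.items() if role in slot_map}
    return {"type": event_type, **slots}

# Hand-labeled stand-in for "The river flooded the town overnight."
pred = Predicate("flood", {"ARG0": "the river", "ARG1": "the town", "ARGM-TMP": "overnight"})
print(to_event(pred))
# {'type': 'Flood', 'affected_area': 'the town', 'time': 'overnight'}
```

Predicates with no ontology entry return `None`, so text that does not describe a known event type is simply filtered out in this sketch.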

A low-resource language effort is underway to adapt DISCERN technologies to a range of languages, including those for which an online presence is rare or non-existent. This effort leverages linguistic knowledge about typological similarities and differences across languages, as well as various machine learning techniques, including active learning.