Artificial human-like characters, whether in the physical world as humanoid robots or in the virtual world as virtual agents or avatars, have always fascinated people. Building characters that can communicate effectively with humans and emote in believable ways has long been a major milestone for human-computer interaction. While many of the key enabling technologies are already in place, there is still much distance to cover.
Because of their human-like form, people expect to interact with such characters in much more natural ways than they do with conventional computers or devices: richer spoken interaction and embodied behaviors. To achieve that, these systems need not only to recognize and understand verbal and non-verbal social communicative signals and affective human behaviors expressed through speech, facial expressions, gestures, gaze and body language, but also to employ such patterns in their own responses. Moreover, they need to do so effectively and convincingly if they are to create a bond with their users. Such bonding can be crucial in important application areas such as education (e.g. learning with a robot), health and assistive technologies (e.g. companions and assisted living for the elderly), and culture and tourism (e.g. guides).
Such characters should be able to engage in rich, dynamic conversations and dialogues that extend well beyond the shallow, canned dialogue templates currently employed in traditional dialogue systems. Beyond their ability to recognize and imitate embodied multimodal behaviors, they also need higher-level abilities that enable them to follow and participate in the conversation; formulate responses that are appropriate in timing and manner, and convey them through properly synchronized embodied behaviors; maintain adequate real-time models that keep track of the state of the conversation and of the other participants; and craft real-time strategies to control the interaction flow.
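As a minimal illustration of the kind of real-time model described above, the sketch below tracks the state of a conversation and of its participants and derives a coarse interaction-control strategy from that state. All names, thresholds, and the engagement signal are hypothetical, standing in for whatever the actual system would provide:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a dialogue-state model: it records each turn,
# keeps a per-participant engagement estimate, and derives a simple
# interaction-control strategy from the current state.

@dataclass
class ParticipantState:
    name: str
    engagement: float = 1.0  # 0.0 (disengaged) .. 1.0 (fully engaged)

@dataclass
class DialogueState:
    turns: list = field(default_factory=list)        # (speaker, utterance) pairs
    participants: dict = field(default_factory=dict)  # name -> ParticipantState

    def observe_turn(self, speaker: str, utterance: str, engagement: float) -> None:
        """Record a turn and update the speaker's engagement estimate."""
        self.turns.append((speaker, utterance))
        p = self.participants.setdefault(speaker, ParticipantState(speaker))
        # Exponential smoothing keeps the estimate stable across noisy signals.
        p.engagement = 0.7 * p.engagement + 0.3 * engagement

    def strategy(self, speaker: str) -> str:
        """Choose a coarse interaction-control strategy for a participant."""
        p = self.participants.get(speaker)
        if p is None or not self.turns:
            return "initiate"      # nobody has spoken yet: open the dialogue
        if p.engagement < 0.4:
            return "re-engage"     # low engagement: try to draw the user back in
        return "continue"          # otherwise keep the conversation flowing

state = DialogueState()
state.observe_turn("user", "Hello there!", engagement=0.9)
print(state.strategy("user"))  # an engaged user, so the strategy is "continue"
```

In a real system the engagement value would come from fused multimodal cues (gaze, posture, speech prosody) rather than being passed in directly, and the strategy set would be far richer; the point here is only that a compact, continuously updated state suffices to drive interaction-flow decisions.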
Studying such advanced embodied characters at the HUBIC Lab is supported by high-end equipment and software, including: capturing devices that acquire live, multimodal data from the humans participating in the dialogue; processing modules that analyze the different signals, fuse them and extract higher-level information about the participants and the dialogue; a dialogue management framework that keeps track of the state of the dialogue and its participants and formulates appropriate responses; and embodied characters (robots or virtual characters) that render the responses and provide the overall interface for the system.
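The four components listed above form a processing loop. A schematic sketch might wire them together as below; every function is a stub with a hypothetical name standing in for the actual capture, analysis, dialogue-management, and rendering components:

```python
# Illustrative pipeline sketch: capture -> analysis/fusion -> dialogue
# management -> embodied rendering. All names and values are hypothetical
# placeholders for real multimodal components.

def capture():
    """Acquire one multimodal observation from the live sensors (stubbed)."""
    return {"speech": "hello", "gaze": "toward-agent", "gesture": "wave"}

def analyze_and_fuse(observation):
    """Fuse the separate signals into higher-level information."""
    greeting = observation["speech"] == "hello" and observation["gesture"] == "wave"
    attending = observation["gaze"] == "toward-agent"
    return {"intent": "greet" if greeting else "unknown", "attending": attending}

def manage_dialogue(fused, state):
    """Update the dialogue state and formulate an appropriate response."""
    state.append(fused["intent"])
    if fused["intent"] == "greet":
        return {"speech": "Hello!", "behavior": "smile-and-wave"}
    return {"speech": "Could you repeat that?", "behavior": "lean-forward"}

def render(response):
    """Hand the response to the embodied character (robot or virtual)."""
    return f"say '{response['speech']}' while performing {response['behavior']}"

dialogue_state = []
fused = analyze_and_fuse(capture())
print(render(manage_dialogue(fused, dialogue_state)))
```

Keeping the stages decoupled in this way means the same dialogue management framework can drive either a physical robot or a virtual character simply by swapping the rendering stage.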