IATSL develops assistive technology that is adaptive, flexible, and intelligent, enabling users to participate fully in their daily lives.

Multimodal COACH System to Assist Older Adults with Dementia in Performing ADLs

Keywords: cognitive device, cognitive orthosis, assisted cognition, ADL prompting, dementia, multimodal, ADL guidance, socially assistive robot

Overview of Research

Many older adults with dementia experience a gradual loss of the skills needed to perform basic activities of daily living (ADLs). They become highly dependent on formal or informal caregivers for constant assistance and direct care, which erodes their confidence, independence, and autonomy. Informal caregivers, in turn, are prone to increased stress and burden.

COACH (Cognitive Orthosis for Assisting with aCtivities in the Home) is a prototype intelligent supportive environment developed to help people with dementia complete ADLs with less dependence on a caregiver, and it is one of the first clinically tested supportive devices to use artificial intelligence techniques. To date, COACH prototypes have completed clinical trials on the ADL of handwashing with participants who had moderate-to-severe dementia. The current system uses automated hand tracking to monitor the user and plays recorded prompts when an error is detected. COACH provides up to four prompts with increasing levels of support: a low-guidance verbal prompt, a high-guidance verbal prompt, a prompt with a video demonstration, and a call to the caregiver.

However, this assistive technology has certain limitations. The hand tracker relies on skin colour, making it unreliable for users with rolled-up sleeves or darker skin tones. The system also tracks objects at predefined positions, so tracking fails if objects are moved outside these regions. A speech-based (dialogue-based) variant of COACH uses speech as its primary input instead of a hand-tracking sensor. However, individuals with dementia often experience communication difficulties, and a speech-only interface may not suit those with hearing or speech impairments. Users may also feel uncomfortable holding a constant conversation with the system to confirm every single step of the task.
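The escalating four-level prompting described above can be sketched as a simple policy: each time the user fails to complete a step, the system moves to a more supportive prompt, and it resets once the step succeeds. This is a minimal illustrative sketch; the `Prompter` class and prompt names are assumptions, not the actual COACH implementation.

```python
from typing import Optional

# Four prompt levels with increasing support, as described for COACH.
PROMPT_LEVELS = [
    "verbal_low_guidance",
    "verbal_high_guidance",
    "video_demonstration",
    "call_caregiver",
]


class Prompter:
    """Hypothetical escalating prompt policy for a single ADL step."""

    def __init__(self) -> None:
        self.level = 0  # index into PROMPT_LEVELS

    def next_prompt(self, step_completed: bool) -> Optional[str]:
        """Return the next prompt to deliver, or None if no prompt is needed.

        Escalates one level after each error; saturates at the caregiver
        call; resets to the lowest level once the user completes the step.
        """
        if step_completed:
            self.level = 0
            return None
        prompt = PROMPT_LEVELS[self.level]
        self.level = min(self.level + 1, len(PROMPT_LEVELS) - 1)
        return prompt
```

In use, repeated errors walk through the four levels and stop at the caregiver call, while a successful step clears the escalation state.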

This study aims to develop machine learning models for a multimodal interface, consisting of a visual tracker and a conversational agent in a socially assistive robot (SAR), to assist older adults with dementia in completing the handwashing task. The SAR is designed to provide cognitive and physical assistance while facilitating social interaction. Integrating both modalities may improve the accuracy and robustness of activity monitoring by allowing the inputs to complement each other, overcoming limitations of the previous COACH systems. The goal is to encourage the independence, autonomy, and well-being of individuals with dementia, and ultimately to improve technology acceptance by having users interact directly with the robot during the activity.

The prototype builds on the traditional COACH system and the speech-based COACH system by incorporating multiple modalities. The overall architecture of the proposed prototype for the handwashing activity is depicted in Figure 1. An overhead camera mounted above the sink captures images and video of the hands and objects, which are processed in real time by an improved visual tracker based on YOLO object-detection models, as shown in Figure 2. The belief monitor receives the tracking results to determine the completion status of each step. If the tracking system fails to identify a completed step or detects an incomplete one, a prompt message is sent to the user through the RabbitMQ server and the Pepper robot. The robot, equipped with speech and gesture capabilities, delivers the prompt verbally or through a video demonstration. The user can respond to confirm the step, and their responses are converted to text and sent back to the system to update its current state. Final testing will evaluate the performance of the visual tracking model, the overall system, and user performance. A future pilot study with older adults with dementia will explore their performance with, and perception of, the SAR for ADL assistance.

Figure 1. The overall architecture of the Multimodal COACH System

Figure 2. COACH GUI with YOLOv5 prediction model

Research Team

Alex Mihailidis, Ph.D., P.Eng. (University of Toronto)

Raisul Alam, Ph.D.

Christina Jean, MSc, BME (University of Toronto)