Project Details

Advanced eye-tracking systems are expensive and require dedicated hardware, making them inaccessible for everyday use despite their considerable potential for attention research and for educating children with learning disabilities. The aim of this project is to develop an accessible, simple, and low-cost system based on a standard webcam, capable of real-time gaze estimation on a computer screen, operating in a non-specialized setting. The primary emphasis is on creating a tool that helps improve children’s reading skills and can be used in therapeutic and educational contexts. The system we developed builds on insights from previous projects that used Convolutional Neural Networks (CNNs) and facial landmark detection to estimate gaze direction. In this project, we focused on comparing different CNN architectures and explored transfer learning techniques using several specialized datasets. We found that combining facial-landmark data with images of the eye region enables gaze prediction along the horizontal axis, whereas predicting gaze along the vertical axis remains challenging. In addition, we compared different architectures in terms of both runtime and prediction accuracy.
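
Because the horizontal and vertical axes behave so differently, it can help to report the prediction error for each axis separately instead of as a single combined distance. The following is a minimal sketch of such a per-axis evaluation, not the project's actual evaluation code. The function name `per_axis_mae`, the choice of mean absolute error as the metric, the normalized screen coordinates, and the sample points are all assumptions made up for illustration; they are not results from the project.

```python
from statistics import mean


def per_axis_mae(predicted, actual):
    """Return (horizontal, vertical) mean absolute error for paired (x, y) gaze points.

    Hypothetical helper: both arguments are sequences of (x, y) tuples,
    assumed here to be screen coordinates normalized to the range [0, 1].
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    # Horizontal error uses only the x component of each pair.
    err_x = mean(abs(p[0] - a[0]) for p, a in zip(predicted, actual))
    # Vertical error uses only the y component of each pair.
    err_y = mean(abs(p[1] - a[1]) for p, a in zip(predicted, actual))
    return err_x, err_y


# Made-up sample points for illustration only (not measured data).
actual = [(0.25, 0.25), (0.5, 0.5), (0.75, 0.75)]
predicted = [(0.25, 0.5), (0.5, 0.5), (0.5, 0.5)]

mae_x, mae_y = per_axis_mae(predicted, actual)
print(f"horizontal MAE: {mae_x:.3f}, vertical MAE: {mae_y:.3f}")
```

Keeping the two errors separate makes it easier to see whether a given architecture improves one axis at the expense of the other when models are compared.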
