GazeGNN: A Gaze-Guided Graph Neural Network for Chest X-ray Classification

Bin Wang1, Hongyi Pan1, Armstrong Aboah2, Zheyuan Zhang1, Elif Keles3, Drew Torigian3, Baris Turkbey4, Elizabeth Krupinski2, Jayaram Udupa3, Ulas Bagci1
1Northwestern University   2University of Arizona   3University of Pennsylvania   4National Institutes of Health
WACV 2024

GazeGNN integrates raw eye-gaze data with image features through a unified representation graph, enabling real-time chest X-ray disease classification without costly visual attention map preprocessing.

Abstract

Eye tracking research is important in computer vision because it can help us understand how humans interact with the visual world. Specifically for high-risk applications, such as in medical imaging, eye tracking can help us to comprehend how radiologists and other medical professionals search, analyze, and interpret images for diagnostic and clinical purposes. Hence, the application of eye-tracking techniques in disease classification has become increasingly popular in recent years. Contemporary works usually transform gaze information collected by eye-tracking devices into visual attention maps (VAMs) to supervise the learning process. However, this is a time-consuming preprocessing step, which prevents us from applying eye tracking to radiologists’ daily work. To solve this problem, we propose a novel gaze-guided graph neural network (GNN), GazeGNN, to leverage raw eye-gaze data without converting it into VAMs. In GazeGNN, to directly integrate eye gaze into image classification, we create a unified representation graph that models both images and gaze pattern information. With this benefit, we develop a real-time, real-world, end-to-end disease classification algorithm for the first time in the literature. This achievement demonstrates the practicality and feasibility of integrating real-time eye-tracking techniques into the daily work of radiologists. To the best of our knowledge, GazeGNN is the first work that adopts a GNN to integrate image and eye-gaze data. Our experiments on the public chest X-ray dataset show that our proposed method exhibits the best classification performance compared to existing methods.

Motivation

Transforming gaze data into dense visual attention maps (VAMs) incurs heavy preprocessing and discards temporal cues. By representing sparse fixations as nodes in a graph and reasoning jointly over image and gaze nodes, GazeGNN:

  • Eliminates VAM generation, cutting inference latency from 9.2 s to 0.35 s.
  • Retains rich temporal‑spatial gaze patterns that correlate with expert diagnostic reasoning.
  • Enables end‑to‑end training on limited annotated scans via weight sharing across graph nodes.

Method

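As described above, GazeGNN represents a chest X-ray and the radiologist's raw gaze as a single graph: image patches and individual fixations become nodes, edges connect spatially related nodes, and a GNN whose weights are shared across all nodes predicts the disease label end to end, with no attention-map rendering in between. The snippet below is a minimal PyTorch sketch of that idea, not the authors' released implementation; the 32-pixel patches, kNN edges, layer widths, and the (x, y, duration) fixation features are illustrative assumptions.

  # Minimal sketch: image patches + raw gaze fixations as one graph, classified by a small GNN.
  # Illustration only; patch size, feature choices, and the kNN edge rule are assumptions.
  import torch
  import torch.nn as nn
  import torch.nn.functional as F

  def build_graph(image, fixations, patch=32, k=4):
      """image: (3, H, W) tensor; fixations: (F, 3) rows of (x, y, duration)."""
      C, H, W = image.shape
      # Image nodes: non-overlapping patches, flattened to raw pixel vectors.
      patches = image.unfold(1, patch, patch).unfold(2, patch, patch)
      patches = patches.permute(1, 2, 0, 3, 4).reshape(-1, C * patch * patch)
      gh, gw = H // patch, W // patch
      ys, xs = torch.meshgrid(torch.arange(gh), torch.arange(gw), indexing="ij")
      patch_pos = torch.stack([(xs + 0.5) * patch, (ys + 0.5) * patch], dim=-1).reshape(-1, 2).float()
      # Gaze nodes: raw fixation coordinates and duration (no attention-map rendering).
      gaze_pos = fixations[:, :2].float()
      pos = torch.cat([patch_pos, gaze_pos], dim=0)
      # Edges: connect every node to its k nearest neighbours in image space.
      dist = torch.cdist(pos, pos)
      knn = dist.topk(k + 1, largest=False).indices[:, 1:]   # drop self-match
      n = pos.shape[0]
      adj = torch.zeros(n, n)
      adj.scatter_(1, knn, 1.0)
      adj = ((adj + adj.t()) > 0).float() + torch.eye(n)     # symmetric, with self-loops
      return patches, fixations.float(), adj

  class GazeGraphNet(nn.Module):
      """Two GCN-style message-passing layers shared by image and gaze nodes."""
      def __init__(self, patch_dim, gaze_dim=3, hidden=128, num_classes=3):
          super().__init__()
          self.embed_patch = nn.Linear(patch_dim, hidden)
          self.embed_gaze = nn.Linear(gaze_dim, hidden)
          self.w1 = nn.Linear(hidden, hidden)   # weights shared across all graph nodes
          self.w2 = nn.Linear(hidden, hidden)
          self.head = nn.Linear(hidden, num_classes)

      def forward(self, patches, fixations, adj):
          h = torch.cat([self.embed_patch(patches), self.embed_gaze(fixations)], dim=0)
          a_norm = adj / adj.sum(1, keepdim=True)   # row-normalised adjacency
          h = F.relu(self.w1(a_norm @ h))           # message passing, layer 1
          h = F.relu(self.w2(a_norm @ h))           # message passing, layer 2
          return self.head(h.mean(dim=0))           # mean-pool all nodes -> class logits

  # Example: one 224x224 image with five recorded fixations (x, y, duration).
  img = torch.rand(3, 224, 224)
  fix = torch.tensor([[60., 80., 0.3], [100., 90., 0.2], [150., 120., 0.5],
                      [170., 60., 0.1], [30., 200., 0.4]])
  patches, gaze, adj = build_graph(img, fix)
  logits = GazeGraphNet(patch_dim=patches.shape[1])(patches, gaze, adj)

Because the same message-passing weights are applied to every patch and fixation node, the model stays compact and can be trained end to end on limited annotated scans, which is the weight-sharing property noted in the motivation above.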
Results

GazeGNN surpasses prior gaze‑aware and gaze‑free methods on MIMIC‑CXR three‑class disease classification while maintaining real‑time speed.


BibTeX

@inproceedings{wang2024gazegnn,
  title={Gazegnn: A gaze-guided graph neural network for chest x-ray classification},
  author={Wang, Bin and Pan, Hongyi and Aboah, Armstrong and Zhang, Zheyuan and Keles, Elif and Torigian, Drew and Turkbey, Baris and Krupinski, Elizabeth and Udupa, Jayaram and Bagci, Ulas},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={2194--2203},
  year={2024}
}