Discovering a Domain Knowledge Representation for Image Grouping: Multimodal Data Modeling, Fusion, and Interactive Learning
In visually oriented, specialized medical domains such as dermatology and radiology, physicians explore interesting image cases from medical image repositories for comparative case studies to aid clinical diagnoses, educate medical trainees, and support medical research. However, general image classification and retrieval approaches fail to group medical images from the physicians' viewpoint. This is because, in the absence of expert knowledge, fully automated learning techniques cannot yet bridge the gap between image features and domain-specific content. Understanding how experts extract information from medical images is therefore an important research topic.
In a prior study, we conducted data elicitation experiments in which physicians were instructed to inspect each medical image and work towards a diagnosis while describing the image content to a student seated nearby. Experts' eye movements and verbal descriptions of the image content were recorded to capture various aspects of expert image understanding. This dissertation takes an intuitive approach to extracting expert knowledge: finding patterns in the expert data elicited from image-based diagnoses. These patterns help in understanding both the characteristics of the medical images and the experts' cognitive reasoning processes.
Transforming the raw image features that experts view into interpretations as domain-specific concepts requires experts' domain knowledge and cognitive reasoning. This dissertation also approximates this transformation using a matrix factorization-based framework, which helps project multiple expert-derived data modalities onto high-level abstractions.
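As a rough illustration of how a matrix factorization can project several data modalities onto shared high-level abstractions, the sketch below factors two nonnegative matrices with a common per-image factor. It is not the dissertation's framework: the joint nonnegative factorization with multiplicative updates is a generic technique swapped in for illustration, the data are synthetic, and the names `joint_nmf`, `eye`, and `verbal` are hypothetical.

```python
import numpy as np

def joint_nmf(views, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor each nonnegative view X_v (n_images x d_v) as W @ H_v.

    W (n_images x rank) is shared by all views; each H_v is view-specific.
    Generic multiplicative updates for the summed squared error; this is an
    illustrative assumption, not the method used in the dissertation.
    """
    rng = np.random.default_rng(seed)
    n_images = views[0].shape[0]
    W = rng.random((n_images, rank))
    Hs = [rng.random((rank, X.shape[1])) for X in views]
    for _ in range(n_iter):
        # Update each view-specific factor with the shared W held fixed.
        for v, X in enumerate(views):
            Hs[v] *= (W.T @ X) / (W.T @ W @ Hs[v] + eps)
        # Update the shared factor using evidence from every view.
        numerator = sum(X @ H.T for X, H in zip(views, Hs))
        denominator = sum(W @ (H @ H.T) for H in Hs) + eps
        W *= numerator / denominator
    return W, Hs

def reconstruction_error(views, W, Hs):
    """Summed squared Frobenius error over all views."""
    return sum(np.linalg.norm(X - W @ H) ** 2 for X, H in zip(views, Hs))

# Synthetic stand-ins for two expert-derived modalities over the same 12 images
# (hypothetical data: 8 gaze-based features and 5 verbal-description features).
rng = np.random.default_rng(1)
W_true = rng.random((12, 3))
eye = W_true @ rng.random((3, 8))
verbal = W_true @ rng.random((3, 5))

W, Hs = joint_nmf([eye, verbal], rank=3)
groups = W.argmax(axis=1)  # assign each image to its dominant latent factor
```

Each row of `W` is a low-dimensional description of one image that both modalities must agree on, so taking the dominant factor per row gives one simple way to form image groups.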
To combine additional expert interventions with computational processing capabilities, an interactive machine learning paradigm is developed that treats experts as an integral part of the learning process. Specifically, experts locally refine the medical image groups presented by the learned model, and the model is then incrementally re-learned globally. This paradigm avoids onerous expert annotation for model training while aligning the learned model with experts' sense-making.
Library of Congress Subject Headings
Multimodal user interfaces (Computer systems); Diagnostic imaging--Data processing; Multisensor data fusion
Computing and Information Sciences (Ph.D.)
Department, Program, or Center
PhD Program in Computing and Information Sciences
Cecilia Ovesdotter Alm
Guo, Xuan, "Discovering a Domain Knowledge Representation for Image Grouping: Multimodal Data Modeling, Fusion, and Interactive Learning" (2017). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus
Physical copy available from RIT's Wallace Library at QA76.9.U83 G86 2017