While the availability of captioned television programming has increased, the quality of this captioning is not always acceptable to Deaf and Hard of Hearing (DHH) viewers, particularly for live or unscripted content broadcast by local television stations, especially those in smaller markets. Formal metrics for evaluating captioning quality are needed to enable audits and quality assurance. Although some current caption metrics focus on textual accuracy (comparing the caption text with an accurate transcription of what was spoken), other properties of captions may also affect judgments of quality or usability. We propose to conduct experiments in which DHH participants evaluate videos with captions of various quality levels, both to learn which features correlate with user judgments and to gather a dataset of videos with accompanying quality judgments by DHH participants, which could be used to evaluate potential metrics. Important features identified in these user studies will be incorporated into the design of our caption evaluation metric.
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Department, Program, or Center
Computer Science (GCCIS)
Amin, Akhter Al, "Audio-Visual Caption Evaluation Metric for People who are Deaf and Hard of Hearing" (2020). Accessed from
RIT – Main Campus
Available for download on Sunday, March 23, 2025