Social media and social networking play a major role in billions of lives. Publicly available posts on websites such as Twitter, Reddit, Tumblr, and Facebook can contain deeply personal accounts of users' lives and the crises they face. Health woes, family concerns, accounts of bullying, and any number of other everyday issues are detailed online on a massive scale. Using natural language processing and machine learning techniques, researchers can analyze these data to understand societal and public health issues. Automatic understanding of social media data can remove the need for expensive surveys, allowing faster, more cost-effective data collection and analysis that can shed light on sociologically important problems.

In this thesis, discussions of domestic abuse in social media are analyzed. The efficacy of classifiers that detect text discussing abuse is examined, and computationally extracted characteristics of these texts are analyzed for a comprehensive view of the dynamics of abusive relationships. Analysis reveals micro-narratives in reasons for staying in versus leaving abusive relationships, as well as the stakeholders and actions in these relationships. Findings are consistent across various methods, correspond to observations in clinical literature, and affirm the relevance of natural language processing techniques for exploring issues of social importance in social media.

Library of Congress Subject Headings

Family violence--Research; Social media--Psychological aspects; Natural language processing (Computer science)

Publication Date


Document Type


Student Type


Degree Name

Computer Engineering (MS)

Department, Program, or Center

Computer Engineering (KGCOE)


Raymond Ptucha

Advisor/Committee Member

Cecilia Ovesdotter Alm

Advisor/Committee Member

Christopher Homan


Physical copy available from RIT's Wallace Library at HV6626 .S37 2015


RIT – Main Campus

Plan Codes