Over the past several decades, there has been increasing research into factors that may influence the sex ratio of parents' offspring at birth. These range from nutrition and hormone levels to psychological factors. The National Health and Nutrition Examination Survey (NHANES) is a broad governmental survey that captures some of these aspects, making it a rich and readily mined data set for this purpose. In this study we use custom Perl scripts to extract this information and attempt to find correlations using a genetic algorithm. Mothers are first identified through inferred relationships within the database. Variables are then analyzed to find any significant difference between groups of women who have more male or more female offspring. Lastly, the identified variables are passed to a genetic algorithm, which attempts to find any correlation between the variables and the birth sex ratio.
While our analysis did not produce any conclusive results, there were some interesting findings regarding which variables were automatically selected in the primary analysis. Ultimately, the tools developed for this project may help answer other questions about the NHANES data set, and they could potentially be applied to problems outside of NHANES.
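
The final step described above, a genetic algorithm searching for variables that correlate with the birth sex ratio, can be illustrated with a minimal sketch. This is not the thesis's Perl implementation: it is written in Python, runs on synthetic data rather than NHANES, and every name in it (`pearson`, `fitness`, `evolve`, the parameter defaults) is hypothetical. The fitness function, the absolute Pearson correlation between the sum of the selected variables and the ratio, is an assumption chosen for simplicity, as are truncation selection, single-point crossover, and elitism.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


def fitness(mask, rows, ratios):
    """Assumed fitness: |correlation| between the sum of selected variables and the ratio."""
    if not any(mask):
        return 0.0
    scores = [sum(v for v, keep in zip(row, mask) if keep) for row in rows]
    return abs(pearson(scores, ratios))


def evolve(rows, ratios, pop_size=30, generations=40, mutation_rate=0.05, seed=0):
    """Evolve boolean masks over the variables; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    n_vars = len(rows[0])
    pop = [[rng.random() < 0.5 for _ in range(n_vars)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(m, rows, ratios), reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0]]                 # elitism: carry the best mask forward
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)     # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each gene independently with probability mutation_rate.
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
    best = max(pop, key=lambda m: fitness(m, rows, ratios))
    return best, fitness(best, rows, ratios)


# Synthetic stand-in data: 200 "mothers", 6 variables.
# By construction, only variable 2 is related to the outcome.
data_rng = random.Random(1)
rows = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
ratios = [row[2] + data_rng.gauss(0, 0.1) for row in rows]

best_mask, best_fit = evolve(rows, ratios)
print("selected variables:", [i for i, keep in enumerate(best_mask) if keep])
print("fitness:", round(best_fit, 3))
```

Because the synthetic outcome depends on a single variable, the search should settle on masks that include it. With real survey data the signal would be far weaker, and any selected variables would still need independent statistical validation.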
Library of Congress Subject Headings
Sex ratio--Research; Childbirth--Research; Genetic algorithms; Data mining
Department, Program, or Center
Thomas H. Gosnell School of Life Sciences (COS)
Leslie Kate Wright
Fucile, Christopher F., "Data mining NHANES: Utilizing a Genetic Algorithm to Detect Correlation Between Birth Sex Ratio and Epidemiological Factors" (2015). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus