Cyber attacks infiltrating enterprise computer networks continue to grow in number, severity, and complexity as our reliance on such networks grows. Despite this, proactive cyber security remains an open challenge as cyber alert data is often not available for study.
Furthermore, the data that is available is stochastically distributed and imbalanced, lacks homogeneity, and relies on complex interactions with latent aspects of the network structure. Currently, there is no commonly accepted way to model and generate synthetic alert data for further study, nor are there metrics to quantify the fidelity of synthetically generated alerts or to identify critical attributes within the data.
This work proposes solutions both for modeling cyber alerts and for scoring the fidelity of such models. Generative Adversarial Networks are trained on cyber alert data from two collegiate penetration testing competitions and used to generate synthetic alerts. A list of criteria defining desirable attributes for cyber alert data metrics is provided. Several statistical and information-theoretic metrics, such as histogram intersection and conditional entropy, meet these criteria and are used for analysis. Using these metrics, critical relationships within synthetically generated alerts may be identified and compared to data from the ground truth distribution. Finally, through these metrics, we show that adding a mutual information constraint to the model’s generation increases the quality of outputs and successfully captures alerts that occur with low probability.
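As a rough illustration of the two metrics named above, the following sketch computes histogram intersection and conditional entropy over categorical alert attributes. It is not the thesis's implementation: the function names, the choice of normalized histograms, and the sample attribute values are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def histogram_intersection(real, synthetic):
    """Overlap of two normalized categorical histograms, in [0, 1].

    1.0 means the two samples have identical value frequencies;
    0.0 means they share no values. Both inputs must be non-empty.
    """
    p, q = Counter(real), Counter(synthetic)
    n_p, n_q = len(real), len(synthetic)
    # A Counter returns 0 for a missing key, so values seen in only
    # one of the samples contribute nothing to the overlap.
    return sum(min(p[k] / n_p, q[k] / n_q) for k in p.keys() | q.keys())


def conditional_entropy(pairs):
    """H(Y | X) in bits for a non-empty list of (x, y) observations.

    Low values mean attribute Y is nearly determined by attribute X.
    """
    joint = Counter(pairs)
    marginal_x = Counter(x for x, _ in pairs)
    n = len(pairs)
    # H(Y|X) = -sum over (x, y) of p(x, y) * log2(p(y | x))
    return -sum(
        (count / n) * log2(count / marginal_x[x])
        for (x, _), count in joint.items()
    )


# Hypothetical alert attribute values, for illustration only.
real_signatures = ["scan", "scan", "exploit", "exfil"]
fake_signatures = ["scan", "exploit", "exploit", "exploit"]
print(histogram_intersection(real_signatures, fake_signatures))  # 0.5

# Hypothetical (destination service, alert signature) pairs.
alerts = [("http", "scan"), ("http", "exploit"), ("ssh", "scan"), ("ssh", "scan")]
print(conditional_entropy(alerts))  # 0.5
```

In this reading, histogram intersection compares one attribute's distribution between real and generated alerts, while conditional entropy measures how strongly one attribute depends on another within a single data set, so it can be computed on both sets and compared.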
Computer Engineering (MS)
Department, Program, or Center
Computer Engineering (KGCOE)
Sweet, Christopher R., "Synthesizing Cyber Intrusion Alerts using Generative Adversarial Networks" (2019). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus