Abstract

Website fingerprinting (WF) enables a local eavesdropper to determine which websites a user is visiting over an encrypted connection. State-of-the-art WF attacks have been shown to be effective even against Tor. Recently, lightweight WF defenses for Tor have been proposed that substantially degrade existing attacks: WTF-PAD and Walkie-Talkie. In this work, we explore the impact of recent advances in deep learning on WF attacks and defenses. We first present Deep Fingerprinting (DF), a new WF attack based on deep learning, and we evaluate this attack against WTF-PAD and Walkie-Talkie. The DF attack attains over 98% accuracy on Tor traffic without defenses, making it the state-of-the-art WF attack at the time of publishing this work. DF is the only attack that is effective against WTF-PAD with over 90% accuracy, and against Walkie-Talkie, DF achieves a top-2 accuracy of 98%. In the more realistic open-world setting, our attack remains effective. These findings highlight the need for defenses that protect against attacks like DF that use advanced deep learning techniques.
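The top-2 accuracy reported against Walkie-Talkie counts a prediction as correct when the true site is among the classifier's two highest-scoring guesses. The following is a minimal sketch of that metric in plain Python; the function name `top_k_accuracy` and the toy scores are illustrative assumptions, not taken from the dissertation.

```python
def top_k_accuracy(scores, labels, k=2):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-class score lists, one per sample (hypothetical input format).
    labels: list of true class indices, one per sample.
    """
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and the same length")
    hits = 0
    for row, label in zip(scores, labels):
        # Rank class indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy data (made up for illustration): four traces, three candidate sites.
scores = [
    [0.6, 0.3, 0.1],  # true site 0: ranked first
    [0.5, 0.4, 0.1],  # true site 1: ranked second
    [0.5, 0.4, 0.1],  # true site 2: ranked third
    [0.1, 0.2, 0.7],  # true site 2: ranked first
]
labels = [0, 1, 2, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

With `k=1` the same function gives ordinary accuracy, so the two figures can be compared directly on the same predictions.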

Since DF requires large amounts of training data that is regularly updated, some may argue that it is not practical for the weaker attacker model typically assumed in WF. Additionally, most WF attacks make the strong assumptions that the testing and training data have similar distributions and are collected from the same type of network at about the same time. Thus, we next examine ways that an attacker could reduce the difficulty of performing an attack by leveraging N-shot learning, in which just a few training samples are needed to identify a given class. In particular, we propose a new WF attack called Triplet Fingerprinting (TF) that uses triplet networks for N-shot learning. We evaluate this attack in challenging settings, such as where the training and testing data are collected multiple years apart and on different networks, and we find that the TF attack remains effective in such settings with 85% accuracy or better. We also show that the TF attack is effective in the open world and outperforms transfer learning.
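The idea behind N-shot learning with triplet networks can be sketched roughly as follows: an embedding model is trained so that traces of the same site lie close together and traces of different sites lie far apart, after which a new site can be recognized from only N embedded examples. The snippet below is a minimal illustration in plain Python. It assumes Euclidean distance, a margin of 1.0, and a nearest-mean classifier, and it operates on hand-written embedding vectors; all of these are assumptions made for the example and may differ from the configuration used in the dissertation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss on embeddings.

    The loss is zero once the anchor is closer to the positive (same site)
    than to the negative (different site) by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def n_shot_classify(query, support):
    """Assign `query` to the site whose mean support embedding is nearest.

    support: dict mapping a site name to its N example embeddings
             (hypothetical format chosen for this sketch).
    """
    def mean(vectors):
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    return min(support, key=lambda site: euclidean(query, mean(support[site])))


# Toy 2-shot example with made-up 2-D embeddings.
support = {
    "site_a": [[0.0, 0.1], [0.2, 0.0]],
    "site_b": [[1.0, 1.0], [0.9, 1.1]],
}
predicted = n_shot_classify([0.1, 0.2], support)
```

In this sketch only the `support` dictionary depends on the sites being monitored, which is why a few fresh samples per site are enough once an embedding model is available.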

Finally, in response to the DF and TF attacks, we propose the CAM-Pad defense: a novel WF defense utilizing the Grad-CAM visual explanation technique. Grad-CAM can be used to identify regions of particular sensitivity in the data and provide insight into the features that the model has learned, which offers a better understanding of how the DF attack makes its predictions. CAM-Pad is based on dynamic flow padding, making it practical for deployment in Tor. With a packet overhead of approximately 80%, the defense reduces the attacker's accuracy using the DF attack from 98% to 67%, a much larger reduction than the WTF-PAD defense achieves.

Library of Congress Subject Headings

Web sites--Security measures; Machine learning; Computer networks--Security measures; Cyberterrorism--Prevention

Publication Date

4-16-2019

Document Type

Dissertation

Student Type

Graduate

Degree Name

Computing and Information Sciences (Ph.D.)

Department, Program, or Center

Computer Science (GCCIS)

Advisor

Matthew Wright

Advisor/Committee Member

Leonid Reznik

Advisor/Committee Member

Sumita Mishra

Campus

RIT – Main Campus

Plan Codes

COMPIS-PHD
