Reading Thieves' Cant: Automatically Identifying and Understanding Dark Jargons from Cybercrime Marketplaces

Authors: 

Kan Yuan, Haoran Lu, Xiaojing Liao, and XiaoFeng Wang, Indiana University Bloomington

Abstract: 

Underground communication is invaluable for understanding cybercrimes. However, it is often obfuscated by the extensive use of dark jargons, innocently-looking terms like “popcorn” that serves sinister purposes (buying/selling drug, seeking crimeware, etc.). Discovery and understanding of these jargons have so far relied on manual effort, which is error-prone and cannot catch up with the fast evolving underground ecosystem. In this paper, we present the first technique, called Cantreader, to automatically detect and understand dark jargon. Our approach employs a neural-network based embedding technique to analyze the semantics of words, detecting those whose contexts in legitimate documents are significantly different from those in underground communication. For this purpose, we enhance the existing word embedding model to support semantic comparison across good and bad corpora, which leads to the detection of dark jargons. To further understand them, our approach utilizes projection learning to identify a jargon’s hypernym that sheds light on its true meaning when used in underground communication. Running Cantreader over one million traces collected from four underground forums, our approach automatically reported 3,462 dark jargons and their hypernyms, including 2,491 never known before. The study further reveals how these jargons are used (by 25% of the traces) and evolve and how they help cybercriminals communicate on legitimate forums.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {217509,
author = {Kan Yuan and Haoran Lu and Xiaojing Liao and XiaoFeng Wang},
title = {Reading Thieves{\textquoteright} Cant: Automatically Identifying and Understanding Dark Jargons from Cybercrime Marketplaces},
booktitle = {27th USENIX Security Symposium (USENIX Security 18)},
year = {2018},
isbn = {978-1-939133-04-5},
address = {Baltimore, MD},
pages = {1027--1041},
url = {https://www.usenix.org/conference/usenixsecurity18/presentation/yuan-kan},
publisher = {USENIX Association},
month = aug
}

Presentation Video 

Presentation Audio