- Security '12 Home
- Registration Information
- Registration Discounts
- Organizers
- At a Glance
- Calendar
- Technical Sessions
- Workshops
- Hotel & Travel Information
- Poster Session
- Rump Session
- Birds-of-a-Feather Sessions
- Sponsors
- Activities
- Students
- Questions?
- For Participants
- Help Promote
- Call for Papers
- Past Proceedings
sponsors
usenix conference policies
PUBCRAWL: Protecting Users and Businesses from CRAWLers
Gregoire Jacob, University of California, Santa Barbara/Telecom SudParis; Engin Kirda, Northeastern University; Christopher Kruegel and Giovanni Vigna, University of California, Santa Barbara
Web crawlers are automated tools that browse the web to retrieve and analyze information. Although crawlers are useful tools that help users to find content on the web, they may also be malicious. Unfortunately, unauthorized (malicious) crawlers are increasingly becoming a threat for service providers because they typically collect information that attackers can abuse for spamming, phishing, or targeted attacks. In particular, social networking sites are frequent targets of malicious crawling, and there were recent cases of scraped data sold on the black market and used for blackmailing.
In this paper, we introduce PUBCRAWL, a novel approach for the detection and containment of crawlers. Our detection is based on the observation that crawler traffic significantly differs from user traffic, even when many users are hidden behind a single proxy. Moreover, we present the first technique for crawler campaign attribution that discovers synchronized traffic coming from multiple hosts. Finally, we introduce a containment strategy that leverages our detection results to efficiently block crawlers while minimizing the impact on legitimate users. Our experimental results in a large, well-known social networking site (receiving tens of millions of requests per day) demonstrate that PUBCRAWL can distinguish between crawlers and users with high accuracy. We have completed our technology transfer, and the social networking site is currently running PUBCRAWL in production.
Open Access Media
USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.
author = {Gregoire Jacob and Engin Kirda and Christopher Kruegel and Giovanni Vigna},
title = {{PUBCRAWL}: Protecting Users and Businesses from {CRAWLers}},
booktitle = {21st USENIX Security Symposium (USENIX Security 12)},
year = {2012},
isbn = {978-931971-95-9},
address = {Bellevue, WA},
pages = {507--522},
url = {https://www.usenix.org/conference/usenixsecurity12/technical-sessions/presentation/jacob},
publisher = {USENIX Association},
month = aug
}
connect with us