Utility-Based Control Feedback in a Digital Library Search Engine: Cases in CiteSeerX
Jian Wu, Alexander Ororbia, Kyle Williams, Madian Khabsa, Zhaohui Wu, and C. Lee Giles, The Pennsylvania State University
We describe a utility-based feedback control model and its applications within the open access digital library search engine CiteSeerX, the new version of CiteSeer. CiteSeerX leverages user feedback to correct metadata and reformulate the citation graph. New documents are automatically crawled for indexing by a focused crawler. The URLs of ingested documents are automatically inspected to provide feedback to a whitelist filter, which selects high-quality crawl seed URLs. Changes in a paper's citation count, together with its download history, indicate ill-conditioned metadata that needs correction. We believe that these feedback mechanisms effectively improve overall metadata quality and save computational resources. Although these mechanisms are used in the context of CiteSeerX, we believe they can be readily transferred to other similar systems.
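The whitelist feedback loop described in the abstract can be illustrated with a minimal sketch. This is not the CiteSeerX implementation: the function name `update_whitelist`, the utility measure (ingested documents divided by crawled documents per host), and the thresholds `min_ratio` and `min_crawled` are all assumptions chosen for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def update_whitelist(crawled_urls, ingested_urls, min_ratio=0.5, min_crawled=2):
    """Return the hosts whose ingestion ratio meets a threshold.

    Hypothetical utility: ingested documents / crawled documents per host.
    The thresholds are illustrative assumptions, not values from the paper.
    """
    # Count crawled and ingested documents per host.
    crawled = Counter(urlparse(u).netloc for u in crawled_urls)
    ingested = Counter(urlparse(u).netloc for u in ingested_urls)

    whitelist = set()
    for host, n_crawled in crawled.items():
        if n_crawled < min_crawled:
            continue  # too little evidence to judge this host
        if ingested[host] / n_crawled >= min_ratio:
            whitelist.add(host)  # host yields enough ingestible documents
    return whitelist


if __name__ == "__main__":
    crawled = [
        "http://a.edu/p1.pdf", "http://a.edu/p2.pdf",
        "http://b.com/x.pdf", "http://b.com/y.pdf", "http://b.com/z.pdf",
        "http://c.org/q.pdf",
    ]
    ingested = ["http://a.edu/p1.pdf", "http://a.edu/p2.pdf", "http://b.com/x.pdf"]
    print(update_whitelist(crawled, ingested))
```

In this sketch, hosts that repeatedly yield ingestible documents stay on the whitelist and can be reused as crawl seeds, while hosts with a low yield are dropped, which is one way such feedback could save crawl resources.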
author = {Jian Wu and Alexander Ororbia and Kyle Williams and Madian Khabsa and Zhaohui Wu and C. Lee Giles},
title = {{Utility-Based} Control Feedback in a Digital Library Search Engine: Cases in {CiteSeerX}},
booktitle = {9th International Workshop on Feedback Computing (Feedback Computing 14)},
year = {2014},
address = {Philadelphia, PA},
url = {https://www.usenix.org/conference/feedbackcomputing14/workshop-program/presentation/wu},
publisher = {USENIX Association},
month = jun
}