Identifying Earmarks in Congressional Bills
Conference proceeding

Marianne J Mullane, Timothy C. Barnett, Jeffrey W. Cannon, Jonathan R. Carapetis, R. Christophers, Juli Coffin, M. A. Jones, Julie A. Marsh, F. McLoughlin, V. O'Donnell, …
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 303-311
ACM Conferences
KDD '16: The 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (San Francisco, CA, 13/08/2016–17/08/2016)
13/08/2016

Abstract

Computing methodologies -- Artificial intelligence -- Natural language processing
Computing methodologies -- Machine learning -- Learning paradigms -- Supervised learning
Information systems -- Information retrieval -- Retrieval tasks and goals -- Information extraction

Earmarks are legislative provisions that direct federal funds to specific projects, circumventing the competitive grant-making process of federal agencies. Identifying and cataloging earmarks is a tedious, time-consuming process carried out by experts from public interest groups. In this paper, we present a machine learning system for automatically extracting earmarks from congressional bills and reports. We first describe a table-parsing algorithm for extracting budget allocations from appropriations tables in congressional bills. We then use machine learning classifiers to identify budget allocations as earmarked objects with an out-of-sample ROC AUC score of 0.89. Using this system, we construct the first publicly available database of earmarks dating back to 1995. Our machine learning approach adds transparency, accuracy, and speed to the congressional appropriations process.

Details

UN Sustainable Development Goals (SDGs)

This output has contributed to the advancement of the following goals:

#10 Reduced Inequalities

Source: InCites
