Data for Good: Tracking Legislative Influence - InformationWeek


Commentary
11/30/2015
02:00 PM
Ariella Brown

Data for Good: Tracking Legislative Influence

Analytics tools can help citizens identify the real sponsors of pending legislative initiatives, saving people many hours of reading and research.

If you want to learn about the process of getting a proposed bill passed, you can read the official explanation on a state senate site. It’s remarkably similar to the steps involved for federal legislation, according to the explanation offered to the protagonist of Mr. Smith Goes to Washington. What the explanations don’t reveal, however, are the entities behind the proposed legislation.

The actual authors of proposed legislation don’t sign their names, but they do leave signatures of a sort, the signals of individual style that can be found throughout their written work. All it takes is reading through thousands of proposed bills to find the textual clues that link bills to the same source. The only drawback is coming up with the time it takes for humans to read through it all. But this is one problem that technology can solve.

One of the presentations featured at Bloomberg's Data for Good Exchange described an approach to mining the text of bills to identify the sources behind them. Applying technology to sift through masses of documents that would take humans thousands of hours to read is the project that a group of five has been working on together at the University of Chicago's Data Science for Social Good Program.

Sifting through each piece of legislation by hand to find matches is far too time-consuming, and relying on Google doesn’t cut it because its results are not confined to legislation and do not bring up complete documents. State legislation calls for a more specialized tool, which the group calls the Legislative Influence Detector (LID). In just seconds, it can search through complete documents, and it reports only matches within the legislation category.

Explaining their approach, the researchers pointed out that running the Smith-Waterman local-alignment algorithm against every text was too slow. So they start with Elasticsearch, using its Lucene scores to narrow the texts to work with down to 100. Only those are compared to the document in question through the local-alignment algorithm. Because that algorithm maintains the sequence of words, it is much more precise than a bag-of-words model.
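The two-stage idea can be sketched in a few lines of Python. This is an illustration only, not the LID group's code: the `overlap_score` function is a crude stand-in for the Elasticsearch/Lucene relevance score, the `match`, `mismatch`, and `gap` values are assumed, and all function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query_tokens, doc_tokens):
    """Bag-of-words overlap; an assumed stand-in for a Lucene score."""
    q, d = Counter(query_tokens), Counter(doc_tokens)
    return sum(min(q[w], d[w]) for w in q)


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local-alignment score between two token lists.

    Scoring values are illustrative assumptions, not LID's settings.
    """
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(
                0,                    # start a fresh alignment here
                h[i - 1][j - 1] + step,  # align the two words
                h[i - 1][j] + gap,    # skip a word in a
                h[i][j - 1] + gap,    # skip a word in b
            )
            best = max(best, h[i][j])
    return best


def find_matches(query, corpus, k=100):
    """Cheap filter to k candidates, then slower alignment on those only."""
    q = tokenize(query)
    docs = {name: tokenize(text) for name, text in corpus.items()}
    candidates = sorted(
        docs, key=lambda name: overlap_score(q, docs[name]), reverse=True
    )[:k]
    scored = [(name, smith_waterman(q, docs[name])) for name in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

The difference from a bag-of-words comparison shows up when word order changes: "the dog bit the man" and "the man bit the dog" contain identical words, so `overlap_score` rates them as a perfect match, while `smith_waterman` gives the pair a lower score than it gives two identical sentences.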

On the basis of the matches uncovered by LID, reporters or other interested parties can track special interest influence by following the trail of matching text. To illustrate the point, the group shows a screenshot of LID finding similarities between Wisconsin Senate Bill 179 (2015), restricting abortions after 19 weeks of gestation, and Louisiana Senate Bill 593 (2012). The wording of the two is almost the same. Both bills would reflect a conservative agenda, though the group doesn’t point out which particular special interest is behind them.

The LID group admits certain shortcomings of the solution, such as the fact that it’s limited to the bills collected by the Sunlight Foundation, although those alone top half a million bills. But what they don’t point to is the possibility of political bias in the selection of bills they focus on. They say they use “2,400 pieces of model legislation written by lobbyists,” a set that is largely based on the collection of ALEC Exposed, an organization devoted to depicting the American Legislative Exchange Council (ALEC) as a bastion of corrupt corporate influence on legislation supported by the GOP.

ALEC Exposed is part of The Center for Media and Democracy (CMD), which calls itself “a national media group that conducts in-depth investigations into corruption and the undue influence of corporations on media and democracy.” While that makes it sound completely objective, it is generally characterized as “liberal,” even “uber-liberal,” “left-wing,” and “anti-capitalist.” So it’s not at all surprising that it would target the conservative ALEC. That is not to say that the data is incorrect, but transparency should really be free of political party influence. Tools like LID are truly “data for good” only if they apply the same standards to all parties.

 
