In conjunction with Internet Safety Day, the Wikimedia Foundation has released two new public data sets of online harassment in Wikipedia edits. The Foundation leveraged machine learning to detect harassment.

Jessica Davis, Senior Editor

February 7, 2017

3 Min Read
(Image: Pixabay)

The internet can spread ideas, connect communities, and serve as the foundation for thriving businesses. But as many people who use the internet or social media know, it is also home to trolls, harassers, "doxers," and others with less noble intentions, as we are reminded today on Internet Safety Day.

For example, here's one of the things an anonymous poster said to a woman editor on Wikipedia in March 2015: "What you need to understand as you are doing the ironing is that Wikipedia is no place for a woman."

That post was left on one of Wikipedia's "talk pages," which are pages attached to every Wikipedia article and user page on the platform. It demonstrates that these discussions are not always good-faith collaboration and exchange of ideas.

To call attention to this problem and support defenses against it, the Wikimedia Foundation has released two large data sets to the public. The first is a collection of over one million annotations of Wikipedia talk page edits, gathered from 4,000 crowd workers, indicating whether each edit was a personal attack and who the target of each attack was. Each edit was rated by 10 judges whose opinions were aggregated and used to train the model. The Wikimedia Foundation said it believes this is the largest public annotated data set of personal attacks available today.
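To illustrate what "aggregating" the judges' opinions can look like in practice, here is a minimal sketch in Python. It is not the Foundation's actual pipeline, and the column names ("rev_id", "attack") and the majority-vote rule are assumptions for demonstration only.

```python
# Minimal sketch: turning per-judge annotations into one training label per edit.
# In the real data set each edit was rated by 10 crowd workers; the tiny
# inline sample below is illustrative only.
import pandas as pd

annotations = pd.DataFrame({
    "rev_id": [101, 101, 101, 102, 102, 102],  # edit identifier
    "attack": [1, 1, 0, 0, 0, 1],              # 1 = judged a personal attack
})

# Majority vote across judges: an edit is labeled an attack if more than
# half of its annotators flagged it as one.
labels = (
    annotations.groupby("rev_id")["attack"]
    .mean()
    .gt(0.5)
    .astype(int)
)
print(labels)
```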

The second data set is all 95 million user and article talk comments made between 2001 and 2015. Both data sets have been made publicly available to support further research.

The Wikimedia Foundation said the model was inspired by research at Yahoo on detecting abusive language: fragments of text from Wikipedia edits are fed into a simple machine learning algorithm, logistic regression.

"The model this produces gives a probability estimate of whether an edit is a personal attack," the Wikimedia Foundation said in a statement announcing the data set availability. "What surprised us was the effectiveness of this model: a fully trained model achieves better performance than the combined average of three human crowd workers."

The research also revealed the following insights about online harassment of Wikipedia editors:

  • Only 18% of attacks that the algorithm discovered were followed by a warning or block of the offending user.

  • While anonymous users are responsible for a disproportionate number of attacks, registered users still account for almost 67% of the attacks on Wikipedia.

  • While half of all attacks come from editors who make fewer than five edits per year, a full one-third of attacks come from registered users with more than 100 edits per year.

There's still plenty of work to be done. While the researchers now understand more about this kind of behavior, they have yet to determine the best ways to mitigate it. Also, the data is currently only in English, so the model only understands English. The Wikimedia Foundation acknowledges that the model is still not very good at identifying threats.

The Wikimedia Foundation worked with Jigsaw, Alphabet's technology incubator, on this research, and invites others to join the future research efforts by getting in touch via the project's wiki page.

About the Author(s)

Jessica Davis

Senior Editor

Jessica Davis is a Senior Editor at InformationWeek. She covers enterprise IT leadership, careers, artificial intelligence, data and analytics, and enterprise software. She has spent a career covering the intersection of business and technology. Follow her on Twitter: @jessicadavis.
