Columbia International Affairs Online: Working Papers

CIAO DATE: 02/2013

New Method, Different War? Evaluating Supervised Machine Learning by Coding Armed Conflict

Christian Ickler, John Wiesel

September 2012

Research Center (SFB) 700

Abstract

The internet promises ad hoc availability of any kind of information. Conflict researchers seem to be bound only by the effort needed to find and extract the necessary information from international news sources. This raises the question of whether the sheer number of accessible news sources and the speed of the news cycle dictate an automated coding approach in order to keep up. Will the initial costs of implementing such a system outweigh the possible loss of information on violent conflict? We answer these questions in relation to the Event Data on Armed Conflict and Security project (EDACS), where we carry out both human and machine-assisted coding to generate spatiotemporal conflict event data. We use spatiotemporal comparability measures for quantitative and qualitative comparison of the two datasets. While the quality of human coding exceeds that of a purely automated approach, a supervised, semi-automated machine learning approach offers a compromise between efficiency and quality. We conclude by critically reflecting on the possible discrepancies in the analysis of the resulting datasets.