Rethinking the Data Wheel: Automating Open-Access, Public Data on Cyber Conflict
Abstract: To date, researchers studying cyber conflict through publicly available
information sources have either selected on the actor or selected on the intrusion
method when coding events. Both approaches lead to distinct challenges when it
comes to result validation and the avoidance of selection bias. This article describes
prospects for open-source, public data collection for cyber security events. We
present an initial data collection and analysis effort of interstate cyber conflict
incidents involving the United States as a pilot study. Using a tailored collection of
more than 155,000 documents from print-only media sources, we describe a method
to process data, parse document elements, and populate an event dataset. Human
coders are then tasked with validation of incident information, after which the search
code is updated to ensure greater accuracy in subsequent runs. In the study, the data
produced are compared with previously available data on cyber conflict involving the
United States. We demonstrate that the method can effectively capture and describe
cyber conflict incidents for researchers to study in a broad range of research efforts.
Moreover, this method captures greater granularity within cyber conflict episodes,
which are inherently multi-faceted. This approach to cyber conflict analysis carries
with it several distinct advantages over alternative research designs, in that it promises
to produce significantly larger amounts of pertinent metadata than might otherwise be
possible.
Christopher Whyte
Assistant Professor
Virginia Commonwealth University
Benjamin Jensen
Associate Professor
Marine Corps University
Brandon Valeriano
Donald Bren Chair of Armed Politics
Marine Corps University
Ryan Maness
Assistant Professor
Naval Postgraduate School
2018 10th International Conference on Cyber Conflict
CyCon X: Maximising Effects
T. Minárik, R. Jakschis, L. Lindström (Eds.)
2018 © NATO CCD COE Publications, Tallinn
Permission to make digital or hard copies of this publication for internal
use within NATO and for personal or educational use when for non-profit or
non-commercial purposes is granted providing that copies bear this notice
and a full citation on the first page. Any other reproduction or transmission
requires prior written permission by NATO CCD COE.