Article
Retrospective IP Address Geolocation for Geography-Aware
Internet Services
Dan Komosny
Citation: Komosny, D. Retrospective IP Address Geolocation for Geography-Aware Internet Services.
Sensors 2021, 21, 4975. https://doi.org/10.3390/s21154975
Academic Editors: Alexios Mylonas
and Nikolaos Pitropakis
Received: 26 May 2021
Accepted: 19 July 2021
Published: 22 July 2021
Publisher’s Note: MDPI stays neutral
with regard to jurisdictional claims in
published maps and institutional affiliations.
Copyright: © 2021 by the author.
Licensee MDPI, Basel, Switzerland.
This article is an open access article
distributed under the terms and
conditions of the Creative Commons
Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Faculty of Electrical Engineering and Communication, Department of Telecommunications, Brno University of
Technology, 61600 Brno, Czech Republic; komosny@vut.cz; Tel.: +420-54114-6973
Abstract:
The paper deals with the locations of IP addresses that were used in the past. This retrospective
geolocation suffers from continuous changes in the Internet space and the limited availability
of past IP location databases. I analyse the retrospective geolocation of IPv4 and IPv6 addresses over
five years. An approach is also introduced to handle missing past IP geolocation databases. The
results show that it is safe to locate IP addresses retrospectively up to a couple of years back, but there are
differences between IPv4 and IPv6. The described parametric model of location lifetime allows us to
estimate the time when the address location changed in the past. The retrospective geolocation of IP
addresses has a broad range of applications, including social studies, system analyses, and security
investigations. Two longitudinal use cases with the applied results are discussed. The first deals with
geotargeted online content. The second deals with identity theft prevention in e-commerce.
Keywords: database; location; geotargeted content; cybercrime; RIPE Atlas; MaxMind
1. Introduction
IP geolocation is a fundamental part of many Internet services and applications. It
delivers the geographical location of any Internet device, independent of its use, installation,
software, and hardware. Any of these locations may be needed retrospectively when the
reason to locate the device was not known at the time, or when the locations were obtained
but not archived. Such uses include the evaluation of longitudinal studies, observation of
long-term location patterns, replication of past system states, study of the long-term evolution
of the global Internet, and investigation of crime incidents. In theory, an unlimited
history of all IP address locations could be available. In reality, only fragments of
historical data are available, which makes retrospective location a challenge.
This work presents results of retrospective location. It also introduces an approach to
handle missing historical data. The use of the results is demonstrated by two longitudinal
use cases. The first deals with geotargeted online content, in which some pages whose
content is generated dynamically from the viewers’ locations fail to load for unknown
reasons. The past viewers’ locations are used to investigate the reasons for the page loading
errors. The second use case discusses the application in identity theft prevention. The
history of address locations is used to estimate the confidence that the user actually travelled
between the places of subsequent logins. In a secured system, such as an e-shop with stored
credit card details for one-click payments, a confident suspicion of ID theft can be used to
block the payment and so minimize fraud losses and chargebacks.
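To make the second use case concrete, the following minimal sketch checks whether travel between the locations of two subsequent logins is physically plausible. It is an illustration under stated assumptions, not the method evaluated in this paper: the `Login` record, the function names, the example coordinates, and the 900 km/h speed limit are all hypothetical, and the sketch ignores the confidence of the underlying IP locations.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class Login:
    """Hypothetical login record: time plus the geolocated IP position."""
    when: datetime
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against rounding errors near antipodal points.
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))


def required_speed_kmh(prev, curr):
    """Average speed needed to get from the previous login to the current one."""
    dist = haversine_km(prev.lat, prev.lon, curr.lat, curr.lon)
    hours = (curr.when - prev.when).total_seconds() / 3600
    if hours <= 0:
        return float("inf") if dist > 0 else 0.0
    return dist / hours


def travel_is_plausible(prev, curr, max_speed_kmh=900.0):
    """Assumed rule: flag travel faster than roughly airliner speed."""
    return required_speed_kmh(prev, curr) <= max_speed_kmh


# Illustrative coordinates (approximate city centres).
brno = Login(datetime(2021, 5, 26, 10, 0), 49.19, 16.61)
prague = Login(datetime(2021, 5, 26, 13, 0), 50.08, 14.44)
new_york = Login(datetime(2021, 5, 26, 11, 0), 40.71, -74.01)
```

With these values, `travel_is_plausible(brno, prague)` holds, while `travel_is_plausible(brno, new_york)` does not, so the second login pair would be a candidate for further checks. A real deployment would also have to account for how reliable each past IP location is.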
Historical IP geolocation databases are used to obtain past locations. Such databases
are populated by various techniques, which include location self-reporting [1,2], network
measurements [3,4], mining web content [5], host and domain-name analyses [6,7], and
custom submissions [8]. The stored locations in a database are shared by a range of
addresses to maximize the Internet space coverage.
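As a rough illustration of how a location is shared by a range of addresses, the sketch below looks up an address in a small range-based table. The records, coordinates, and function names are hypothetical, the prefixes are documentation-reserved blocks, and the ranges are assumed not to overlap. Real geolocation databases use their own formats and access libraries.

```python
import bisect
import ipaddress

# Hypothetical snapshot: (prefix, latitude, longitude).
RECORDS = [
    ("192.0.2.0/24", 49.19, 16.61),
    ("198.51.100.0/24", 50.08, 14.44),
    ("2001:db8::/32", 48.15, 17.11),
]


def build_index(records):
    """Turn prefixes into sorted (version, first, last, location) entries."""
    index = []
    for cidr, lat, lon in records:
        net = ipaddress.ip_network(cidr)
        index.append((net.version, int(net[0]), int(net[-1]), (lat, lon)))
    index.sort(key=lambda entry: (entry[0], entry[1]))
    return index


def locate(index, address):
    """Return the location shared by the range containing the address, or None."""
    ip = ipaddress.ip_address(address)
    key = (ip.version, int(ip))
    # Range starts are rebuilt on each call for brevity; cache them in practice.
    starts = [(entry[0], entry[1]) for entry in index]
    pos = bisect.bisect_right(starts, key) - 1
    if pos < 0:
        return None
    version, first, last, location = index[pos]
    if version == ip.version and first <= int(ip) <= last:
        return location
    return None


INDEX = build_index(RECORDS)
```

Every address inside `192.0.2.0/24` resolves to the same coordinates here, which shows how one stored location covers a whole range. An address outside all ranges, such as `203.0.113.5`, returns `None`.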
I work with the historical ground truth that includes past IPv4 and IPv6 addresses
and their locations over five years. There are approx. 51 k IPv4 and 17 k IPv6 addresses
in the ground truth. This ground truth was linked to historical geolocation databases to