sensors
Article
An Analysis of Android Malware Classification Services
Mohammed Rashed 1,* and Guillermo Suarez-Tangil 2,†
Citation: Rashed, M.; Suárez-Tangil, G. An Analysis of Android Malware Classification Services. Sensors 2021, 21, 5671. https://doi.org/10.3390/s21165671
Academic Editors: Alexios Mylonas and Nikolaos Pitropakis
Received: 9 July 2021
Accepted: 17 August 2021
Published: 23 August 2021
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1 Computer Science and Engineering Department, Universidad Carlos III de Madrid, Avda. de la Universidad 30, 28911 Leganés, Spain
2 IMDEA Networks Institute, Avda. del Mar Mediterraneo, 22, 28918 Leganes, Spain; guillermo.suarez-tangil@imdea.org
* Correspondence: mrashed@inf.uc3m.es
† Former address: Department of Informatics, King’s College London, Bush House, 30 Aldwych, London WC2B 4BG, UK.
Abstract: The increasing number of Android malware samples has forced antivirus (AV) companies to rely on automated classification techniques to determine the family and class of suspicious samples. The research community relies heavily on such labels to carry out prevalence studies of the threat ecosystem and to build datasets that are used to validate and benchmark novel detection and classification methods. In this work, we carry out an extensive study of the Android malware ecosystem by surveying white papers and reports from 6 key players in the industry, as well as 81 papers from 8 top security conferences, to understand how malware datasets are used by both. We then explore the limitations associated with the use of available malware classification services, namely VirusTotal (VT) engines, for determining the family of an Android sample. Using a dataset of 2.47 M Android malware samples, we find that the detection coverage of VT’s AVs is generally very low, that the percentage of samples flagged by any 2 AV engines does not go beyond 52%, and that the share of families common to any pair of AV engines is at best 29%. We rely on clustering to determine the extent to which different AV engine pairs agree on which samples belong to the same family (regardless of the actual family name) and find discrepancies that can introduce noise in automatic label unification schemes. We also observe the usage of generic labels and inconsistencies within the labels of top AV engines, suggesting that their efforts are directed towards accurate detection rather than classification. Our results contribute to a better understanding of the limitations of using Android malware family labels as supplied by common AV engines.
Keywords: Android; malware; classification; family; VirusTotal; antivirus; clustering; labels
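The pairwise measurements mentioned in the abstract (the share of samples flagged by two engines, and the families two engines have in common) can be illustrated with a minimal Python sketch. The engine names, sample identifiers, and labels below are hypothetical toy data, and computing family overlap as a Jaccard ratio of label sets is an assumption made for illustration; it is not necessarily the definition used in this study.

```python
from itertools import combinations

# Hypothetical toy labels: sample id -> {engine name: family label}.
# An engine that did not flag a sample is absent from the inner dict.
LABELS = {
    "s1": {"EngineA": "fakeinst", "EngineB": "fakeinst"},
    "s2": {"EngineA": "opfake"},
    "s3": {"EngineA": "fakeinst", "EngineB": "smsreg", "EngineC": "smsreg"},
    "s4": {"EngineC": "generic"},
}


def pairwise_stats(labels):
    """Return co-detection and family-overlap ratios for every engine pair."""
    engines = sorted({e for per_sample in labels.values() for e in per_sample})
    stats = {}
    for a, b in combinations(engines, 2):
        # Samples that both engines flagged, as a fraction of all samples.
        both = sum(1 for v in labels.values() if a in v and b in v)
        # Family names each engine uses anywhere in the dataset.
        fam_a = {v[a] for v in labels.values() if a in v}
        fam_b = {v[b] for v in labels.values() if b in v}
        union = fam_a | fam_b
        stats[(a, b)] = {
            "co_flagged": both / len(labels),
            # Assumption: overlap measured as a Jaccard ratio of name sets.
            "common_families": len(fam_a & fam_b) / len(union) if union else 0.0,
        }
    return stats


if __name__ == "__main__":
    for pair, values in pairwise_stats(LABELS).items():
        print(pair, values)
```

Comparing raw family names in this way only captures agreement on naming. Agreement on which samples are grouped together, regardless of the name, requires comparing the groupings themselves, which is why the study relies on clustering for that question.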
1. Introduction
With more than 2.8B active users worldwide, Android is now the most used OS on mobile devices [1]. Similarly, Android has become the top target OS for smartphone malware. In the early days of the platform, between October 2010 and October 2012, Kaspersky reported an increase of incoming Android malware from less than 1 K to more than 40 K [2]. By March 2020, the influx of new malware reached 480 K [3]. Thus, since the beginnings of the platform, antivirus companies (AVs hereafter) have developed threat intelligence solutions to protect Android users from malware [4–6]. Because of the limited number of detected malware samples early on, human analysts were able to study samples, identify their behavior, and label them following an internal scheme of the AV company, most likely including the platform, type, and family of the sample (see Section 5.2). However, such a surge made it inevitable for AVs to use automation techniques in both detection and family classification, because it became impossible to manually handle the influx of samples arriving at AVs [7]. Gheorghescu, a researcher at Microsoft’s Security unit (as indicated in the affiliation), introduced his automatic family classification system and indicated, in 2005, that his technique was not generally adopted