CDGDroid: Android Malware Detection Based on Deep Learning Using CFG and DFG

Zhiwu Xu, Kerong Ren, Shengchao Qin, Florin Craciun

Research output: Contribution to journal › Conference article › Research › peer-review

Abstract

Android malware has become a serious threat in our daily digital life, and thus there is a pressing need to effectively detect or defend against it. Recent techniques have relied on the extraction of lightweight syntactic features that are suitable for machine learning classification, but despite their promising results, the features they extract are often too simple to characterise Android applications, and thus may be insufficient when used to detect Android malware. In this paper, we propose CDGDroid, an effective approach for Android malware detection based on deep learning. We use semantic graph representations, that is, the control flow graph (CFG), the data flow graph (DFG), and their possible combinations, as the features to characterise Android applications. We encode the graphs into matrices and use them to train the classification model via a Convolutional Neural Network (CNN). We have conducted experiments on the Marvin, Drebin, VirusShare and ContagioDump datasets to evaluate our approach and have found that the classification model taking the horizontal combination of CFG and DFG as features offers the best accuracy among all combinations. We have also conducted experiments to compare our approach against Yeganeh Safaei et al.’s approach, Allix et al.’s approach, Drebin and many anti-virus tools gathered in VirusTotal, and the experimental results have confirmed that our classification model performs better than the others.
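
The abstract mentions encoding graphs into matrices and combining the control flow graph and data flow graph horizontally. The following is only a minimal sketch of one way such an encoding could look, assuming a plain 0/1 adjacency matrix over a shared node ordering and side-by-side concatenation of rows. The node names, edges, and helper functions are hypothetical and are not taken from the paper, whose actual encoding may differ.

```python
from typing import Iterable, List, Sequence, Tuple

Matrix = List[List[int]]
Edge = Tuple[str, str]


def adjacency_matrix(nodes: Sequence[str], edges: Iterable[Edge]) -> Matrix:
    """Encode a directed graph as a 0/1 adjacency matrix over a fixed node order."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for src, dst in edges:
        matrix[index[src]][index[dst]] = 1
    return matrix


def horizontal_combination(left: Matrix, right: Matrix) -> Matrix:
    """Place two matrices with the same number of rows side by side."""
    if len(left) != len(right):
        raise ValueError("matrices must have the same number of rows")
    return [l_row + r_row for l_row, r_row in zip(left, right)]


# Hypothetical node vocabulary (assumption, for illustration only).
NODES = ["invoke", "move", "return"]
# Toy control-flow and data-flow edges, made up for this sketch.
CFG_EDGES = [("invoke", "move"), ("move", "return")]
DFG_EDGES = [("invoke", "return")]

cfg = adjacency_matrix(NODES, CFG_EDGES)
dfg = adjacency_matrix(NODES, DFG_EDGES)
# Each row now holds the CFG columns followed by the DFG columns (3 x 6).
features = horizontal_combination(cfg, dfg)
```

A matrix of this shape could then be fed to a convolutional classifier as a single-channel input. Using a shared node ordering for both graphs keeps the rows aligned, which is what makes the side-by-side concatenation meaningful in this sketch.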
Original language: English
Pages (from-to): 177-193
Journal: Formal Methods and Software Engineering
Volume: 11232
DOI: 10.1007/978-3-030-02450-5_11
Publication status: Published - 11 Oct 2018

Fingerprint

Data flow graphs
Flow graphs
Syntactics
Viruses
Learning systems
Experiments
Semantics
Neural networks
Deep learning
Malware

Cite this

@article{c9fbaf3ad4404ba6961d10377475375b,
title = "CDGDroid: Android Malware Detection Based on Deep Learning Using CFG and DFG",
abstract = "Android malware has become a serious threat in our daily digital life, and thus there is a pressing need to effectively detect or defend against it. Recent techniques have relied on the extraction of lightweight syntactic features that are suitable for machine learning classification, but despite their promising results, the features they extract are often too simple to characterise Android applications, and thus may be insufficient when used to detect Android malware. In this paper, we propose CDGDroid, an effective approach for Android malware detection based on deep learning. We use semantic graph representations, that is, the control flow graph (CFG), the data flow graph (DFG), and their possible combinations, as the features to characterise Android applications. We encode the graphs into matrices and use them to train the classification model via a Convolutional Neural Network (CNN). We have conducted experiments on the Marvin, Drebin, VirusShare and ContagioDump datasets to evaluate our approach and have found that the classification model taking the horizontal combination of CFG and DFG as features offers the best accuracy among all combinations. We have also conducted experiments to compare our approach against Yeganeh Safaei et al.’s approach, Allix et al.’s approach, Drebin and many anti-virus tools gathered in VirusTotal, and the experimental results have confirmed that our classification model performs better than the others.",
author = "Zhiwu Xu and Kerong Ren and Shengchao Qin and Florin Craciun",
year = "2018",
month = "10",
day = "11",
doi = "10.1007/978-3-030-02450-5_11",
language = "English",
volume = "11232",
pages = "177--193",
journal = "Lecture Notes in Computer Science",
issn = "0302-9743",
publisher = "Springer Verlag",
}

CDGDroid: Android Malware Detection Based on Deep Learning Using CFG and DFG. / Xu, Zhiwu; Ren, Kerong; Qin, Shengchao; Craciun, Florin.

In: Formal Methods and Software Engineering, Vol. 11232, 11.10.2018, p. 177-193.

TY - JOUR

T1 - CDGDroid: Android Malware Detection Based on Deep Learning Using CFG and DFG

AU - Xu, Zhiwu

AU - Ren, Kerong

AU - Qin, Shengchao

AU - Craciun, Florin

PY - 2018/10/11

Y1 - 2018/10/11

AB - Android malware has become a serious threat in our daily digital life, and thus there is a pressing need to effectively detect or defend against it. Recent techniques have relied on the extraction of lightweight syntactic features that are suitable for machine learning classification, but despite their promising results, the features they extract are often too simple to characterise Android applications, and thus may be insufficient when used to detect Android malware. In this paper, we propose CDGDroid, an effective approach for Android malware detection based on deep learning. We use semantic graph representations, that is, the control flow graph (CFG), the data flow graph (DFG), and their possible combinations, as the features to characterise Android applications. We encode the graphs into matrices and use them to train the classification model via a Convolutional Neural Network (CNN). We have conducted experiments on the Marvin, Drebin, VirusShare and ContagioDump datasets to evaluate our approach and have found that the classification model taking the horizontal combination of CFG and DFG as features offers the best accuracy among all combinations. We have also conducted experiments to compare our approach against Yeganeh Safaei et al.’s approach, Allix et al.’s approach, Drebin and many anti-virus tools gathered in VirusTotal, and the experimental results have confirmed that our classification model performs better than the others.

U2 - 10.1007/978-3-030-02450-5_11

DO - 10.1007/978-3-030-02450-5_11

M3 - Conference article

VL - 11232

SP - 177

EP - 193

JO - Lecture Notes in Computer Science

JF - Lecture Notes in Computer Science

SN - 0302-9743

ER -