TY - JOUR
T1 - CDGDroid:
T2 - Android Malware Detection Based on Deep Learning Using CFG and DFG
AU - Xu, Zhiwu
AU - Ren, Kerong
AU - Qin, Shengchao
AU - Craciun, Florin
PY - 2018/10/11
Y1 - 2018/10/11
N2 - Android malware has become a serious threat in our daily digital life, and thus there is a pressing need to effectively detect or defend against them. Recent techniques have relied on the extraction of lightweight syntactic features that are suitable for machine learning classification, but despite of their promising results, the features they extract are often too simple to characterise Android applications, and thus may be insufficient when used to detect Android malware. In this paper, we propose CDGDroid, an effective approach for Android malware detection based on deep learning. We use the semantics graph representations, that is, control flow graph, data flow graph, and their possible combinations, as the features to characterise Android applications. We encode the graphs into matrices, and use them to train the classification model via Convolutional Neural Network (CNN). We have conducted some experiments on Marvin, Drebin, VirusShare and ContagioDump datasets to evaluate our approach and have identified that the classification model taking the horizontal combination of CFG and DFG as features offers the best performance in terms of accuracy among all combinations. We have also conducted experiments to compare our approach against Yeganeh Safaei et al.’s approach, Allix et al.’s approach, Drebin and many anti-virus tools gathered in VirusTotal, and the experimental results have confirmed that our classification model gives a better performance than the others.
AB - Android malware has become a serious threat in our daily digital life, and thus there is a pressing need to effectively detect or defend against them. Recent techniques have relied on the extraction of lightweight syntactic features that are suitable for machine learning classification, but despite of their promising results, the features they extract are often too simple to characterise Android applications, and thus may be insufficient when used to detect Android malware. In this paper, we propose CDGDroid, an effective approach for Android malware detection based on deep learning. We use the semantics graph representations, that is, control flow graph, data flow graph, and their possible combinations, as the features to characterise Android applications. We encode the graphs into matrices, and use them to train the classification model via Convolutional Neural Network (CNN). We have conducted some experiments on Marvin, Drebin, VirusShare and ContagioDump datasets to evaluate our approach and have identified that the classification model taking the horizontal combination of CFG and DFG as features offers the best performance in terms of accuracy among all combinations. We have also conducted experiments to compare our approach against Yeganeh Safaei et al.’s approach, Allix et al.’s approach, Drebin and many anti-virus tools gathered in VirusTotal, and the experimental results have confirmed that our classification model gives a better performance than the others.
U2 - 10.1007/978-3-030-02450-5_11
DO - 10.1007/978-3-030-02450-5_11
M3 - Conference article
SN - 0302-9743
VL - 11232
SP - 177
EP - 193
JO - Formal Methods and Software Engineering
JF - Formal Methods and Software Engineering
ER -