Verimag

Seminar details

Room A. Turing CE4
10 December 2015 - 10h00
Behavior-based malware classification using online machine learning (PhD Defense)
by PEKTAŞ Abdurrahman from TUBITAK



Abstract: Malware (short for malicious software) has evolved considerably in
recent years and has become a major threat to home users, enterprises, and
even governments. Despite the widespread availability and use of
anti-malware tools such as antivirus software, intrusion detection systems,
and firewalls, malware authors can readily evade these defenses by using
obfuscation techniques. To mitigate this problem, malware researchers have
proposed various data mining and machine learning approaches for detecting
and classifying malware samples according to their static or dynamic
feature sets. Although the proposed methods are effective on small sample
sets, their scalability to large datasets is still under investigation and
remains an open problem.
Moreover, it is well known that the majority of malware samples are variants
of previously known samples. Consequently, the volume of new variants created
far outpaces the current capacity of malware analysis. Developing a malware
classification system that can cope with the increasing number of samples is
therefore essential for the security community. The key challenge in
identifying the family of a malware sample is to maintain classification
accuracy as the number of samples grows. To address this challenge, and
unlike existing classification schemes, which apply machine learning
algorithms to stored data (i.e., they are offline algorithms), we propose a
new malware classification system that employs online machine learning
algorithms, which update the classification model immediately after each new
malware sample is introduced to the classification scheme.
To achieve our goal, we first developed a portable, scalable, and
transparent malware analysis system called VirMon for dynamic analysis of
malware targeting the Windows OS. VirMon collects the behavioral activities
of analyzed samples at the kernel level through a purpose-built mini-filter
driver. Second, we set up a cluster of three machines for our online
learning framework module (Jubatus), which makes it possible to handle
large-scale data. This configuration allows each analysis machine to perform
its tasks and deliver the obtained results to the cluster manager.
Essentially, the proposed framework consists of three major stages. The
first stage consists of extracting the behavior of the sample file under
scrutiny and observing its interactions with OS resources. At this
stage, the sample file is run in a sandboxed environment. Our framework
supports two sandbox environments: VirMon and Cuckoo. During the second
stage, we apply feature extraction to the analysis report. The label of each
sample is determined by using VirusTotal, an online multi-engine antivirus
scanning service comprising 46 engines. At the final stage, the
malware dataset is partitioned into training and testing sets. The training
set is used to obtain a classification model, and the testing set is used for
evaluation purposes.
To validate the effectiveness and scalability of our method, we
evaluated it on 18,000 recent malicious files, including
viruses, trojans, backdoors, worms, etc., obtained from VirusShare. Our
experimental results show that our method performs malware classification
with 92% accuracy.
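
To illustrate the difference between offline and online learning described in the abstract, here is a minimal sketch in Python of an online multiclass perceptron that updates its model one labeled sample at a time instead of retraining on stored data. It is an illustration only: it does not use the Jubatus API and is not the classifier evaluated in the thesis, and the class name, the behavioral feature names (file_write, reg_set, net_connect), and the sample values are all hypothetical.

```python
from collections import defaultdict


class OnlinePerceptron:
    """Multiclass perceptron that learns from one labeled sample at a time."""

    def __init__(self):
        # weights[label][feature] -> float, created lazily as labels appear
        self.weights = defaultdict(lambda: defaultdict(float))

    def _score(self, label, features):
        w = self.weights[label]
        return sum(w.get(name, 0.0) * value for name, value in features.items())

    def predict(self, features):
        """Return the best-scoring known label, or None if nothing was learned."""
        if not self.weights:
            return None
        # sorted() makes tie-breaking deterministic
        return max(sorted(self.weights), key=lambda lb: self._score(lb, features))

    def update(self, features, true_label):
        """Predict first, then correct the model if the prediction was wrong."""
        predicted = self.predict(features)
        self.weights[true_label]  # register the label even if no update is needed
        if predicted != true_label:
            for name, value in features.items():
                self.weights[true_label][name] += value
                if predicted is not None:
                    self.weights[predicted][name] -= value
        return predicted


# Hypothetical behavior counts, as might come from a sandbox analysis report.
stream = [
    ({"file_write": 3, "reg_set": 1}, "trojan"),
    ({"net_connect": 4, "file_write": 1}, "worm"),
    ({"file_write": 2, "reg_set": 2}, "trojan"),
    ({"net_connect": 3}, "worm"),
]

model = OnlinePerceptron()
for features, label in stream:
    model.update(features, label)  # the model changes immediately, sample by sample
```

Because each call to update touches only the weights of the features present in one sample, the cost of adding a new sample does not depend on how many samples were seen before, which is the property that makes online algorithms attractive for large, growing malware collections.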

Keywords: Malware classification, dynamic analysis, online machine learning,
behavior modeling


Jury:
- Prof. Bernard Levrat
University of Angers, France, Reviewer
- Prof. Jean-Yves Marion
Ecole des Mines de Nancy, France, Reviewer
- Prof. Sylvain Hallé
Université du Québec à Chicoutimi, Canada, Examiner
- Nicolas Halbwachs, Research Director (DR), CNRS
University Grenoble Alpes, France, Examiner
- Prof. Jean Claude Fernandez
University Grenoble Alpes, France, Thesis supervisor
- Prof. Tankut Acarman
Galatasaray University, Turkey, Thesis co-supervisor
