Malware Detection -
A Machine Learning Perspective
C.K.Chen
2014.06.05
Outline
• A Large Wave of Malware Is Coming
• Is Machine Learning the Savior
• You Can't Make Something out of Nothing
• A Garbage In, Garbage Out Game?
• Model, Model, It’s All About The Model
• Every Evaluation in Every Paper is ‘Perfect’
• Democracy World in Machine Learning
• WYSIWYG
• Know Where Your Enemy Is
A Large Wave of Malware Is Coming
• Millions of new malware samples are created every year
Source: McAfee Labs Threat Report, Fourth Quarter 2013
Your Anti-Virus Will Not Tell You
• Although the overall detection rate looks good
Attack Windows in Anti-Virus
[Figure: Attack Windows; panels: Anti-Virus Lifecycle, Malware Life Cycle]
Is Machine Learning the Savior
• The problem is that
• Signature generation is manual work and time-consuming
• Most malware is not brand new, but modified or rewritten from older samples
• Automatic malware creation tool chains
• Mutation techniques
• These may leave some clues for us
• Machine learning sheds light on automatically constructing models to detect malware
How Does Machine Learning Work?
• Training
• Feature Extraction -> Learning Algorithm -> Generated Classifier
• Testing
• Feature Extraction -> Classifier -> Classification Result
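The two phases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four byte values used as features, the toy byte strings, and the nearest-centroid learner are all hypothetical stand-ins, not the features or algorithm used in the talk.

```python
from collections import Counter

def extract_features(sample):
    """Toy feature extraction: relative frequency of four hypothetical byte values."""
    counts = Counter(sample)
    total = max(len(sample), 1)
    return [counts[b] / total for b in (0x00, 0x90, 0xCC, 0xFF)]

def train(samples, labels):
    """Learning algorithm (assumed): average the feature vectors of each class."""
    per_class = Counter(labels)
    sums = {}
    for sample, label in zip(samples, labels):
        vec = extract_features(sample)
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
    # The "generated classifier" is one centroid per class.
    return {label: [v / per_class[label] for v in acc] for label, acc in sums.items()}

def classify(model, sample):
    """Testing: extract features, then pick the class with the nearest centroid."""
    vec = extract_features(sample)
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))

# Made-up training data, purely for illustration.
train_samples = [b"\x90\x90\x90\xcc\x90", b"\x90\xcc\xcc\x90",
                 b"\x00\x00\xff\x00", b"\x00\xff\x00\x00"]
train_labels = ["malware", "malware", "benign", "benign"]

model = train(train_samples, train_labels)
prediction = classify(model, b"\x90\x90\xcc\x90\x90\x90")
```

The same extract_features function is used in both phases, which mirrors the slide: training and testing share the feature extraction step and differ only in what follows it.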
Categories of Machine Learning Approaches
• Categorized by representation, feature selection, and classification algorithm
You Can't Make Something out of Nothing
• A data set is the first step for ML
• Without data, ML can do nothing
• Where to collect samples
• Web, honeypots, user uploads
• Balanced vs. imbalanced data
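To make the balanced vs. imbalanced point concrete, here is a minimal sketch. The 90/10 split is invented for illustration, and inverse-frequency class weighting is one common remedy assumed here, not something the slides prescribe.

```python
from collections import Counter

def class_weights(labels):
    """Weight each class inversely to its frequency: n / (k * count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}

# Hypothetical data set: 90 benign files and only 10 malware samples.
labels = ["benign"] * 90 + ["malware"] * 10

counts = Counter(labels)
# A detector that always answers "benign" already reaches this accuracy,
# which is why raw accuracy is misleading on imbalanced data.
majority_baseline = max(counts.values()) / len(labels)

weights = class_weights(labels)
```

With these numbers the always-benign baseline scores 0.9 accuracy while catching no malware, and the rare malware class receives a weight of 5.0 so that its errors count more during training.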
A Garbage In, Garbage Out Game
• There are many features to choose from
• The quality of the features determines the precision of machine learning
• Features
• Static / Dynamic / PE Structure
• N-gram
• Feature selection is needed
• ReliefF
• Chi-squared
• F-statistic
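As a rough sketch of how N-gram features and chi-squared selection fit together: the code below extracts byte bigrams from toy samples and scores each one with the standard 2x2 chi-squared statistic, N(ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d)). The byte strings and labels are made up, and the choice of bigrams and of keeping the top two features is an assumption for illustration.

```python
def byte_ngrams(data, n=2):
    """Return the set of distinct byte n-grams in one sample."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def chi_squared(present_flags, labels):
    """Chi-squared statistic of the 2x2 table (n-gram present?) x (malware?)."""
    a = b = c = d = 0
    for present, label in zip(present_flags, labels):
        if present and label == 1:
            a += 1          # present in malware
        elif present:
            b += 1          # present in benign
        elif label == 1:
            c += 1          # absent in malware
        else:
            d += 1          # absent in benign
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

# Hypothetical samples: label 1 = malware, label 0 = benign.
samples = [b"\x90\x90\xcc\x00\x01", b"\x90\x90\xeb",
           b"\x55\x8b\x00\x01", b"\x55\x8b\xec"]
labels = [1, 1, 0, 0]

grams = [byte_ngrams(s) for s in samples]
vocabulary = set().union(*grams)
scores = {g: chi_squared([g in gs for gs in grams], labels) for g in vocabulary}

# Feature selection: keep only the highest-scoring n-grams.
top_features = sorted(scores, key=scores.get, reverse=True)[:2]
```

An n-gram that appears equally often in both classes (here the bigram 00 01) scores zero and is dropped, while n-grams that appear in only one class score highest.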
Model, Model, It’s All About The Model
• The most important part
• You need to choose the model that fits your data most closely
• How to choose a model
Numerical Data
-> Classical Classifier (SVM)
Categorical Data
-> Dummy Variable
-> Decision Tree
Sequence Data
-> N-gram Algorithms
-> Bayes, Markov Chain
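For the dummy-variable option, a minimal sketch of the encoding follows. The "packer" attribute and its values are hypothetical. Note that this version keeps one column per category (often called one-hot encoding); a strict dummy coding would drop one column as the reference level.

```python
def dummy_encode(values):
    """Map each categorical value to a 0/1 indicator vector, one column per category."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows

# Hypothetical categorical feature: which packer a sample was built with.
packers = ["upx", "none", "aspack", "upx"]
categories, rows = dummy_encode(packers)
# categories -> ['aspack', 'none', 'upx']
# rows       -> [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 0, 1]]
```

After this step the feature is numerical, so a classifier that expects numbers (such as an SVM) can consume it; a decision tree can often split on the categorical value directly.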
Every Evaluation in Every Paper is ‘Perfect’
• Unlike other research areas, malware detection has no standard benchmark
• New malware is created every day
• Privacy-sensitive data
• There is also no guideline for evaluation
• Therefore, some researchers have examined this problem and conducted thorough surveys
• They provide some rules for evaluation
Is Machine Learning the Savior
• Machine learning can help us recognize similar and variant malware
• It cannot identify brand-new malware
• Machine-learning-based detectors need careful training and a long time for tuning
Democracy World in Machine Learning
• There are many types of classifiers
• SVM, Decision Tree, Neural Network, ...
• Voting to increase precision
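The voting idea can be sketched as a simple majority vote. The three functions below are hypothetical rule-based stand-ins for trained classifiers, and the feature names and thresholds are invented for illustration only.

```python
from collections import Counter

# Stand-ins for three trained classifiers (thresholds are made up).
def svm_like(features):
    return "malware" if features["entropy"] > 7.0 else "benign"

def tree_like(features):
    return "malware" if features["imports"] < 5 else "benign"

def nn_like(features):
    return "malware" if features["entropy"] + features["suspicious_strings"] > 8.0 else "benign"

def majority_vote(classifiers, features):
    """Each classifier casts one vote; the most common label wins."""
    votes = [classify(features) for classify in classifiers]
    return Counter(votes).most_common(1)[0][0], votes

sample = {"entropy": 7.4, "imports": 12, "suspicious_strings": 2}
verdict, votes = majority_vote([svm_like, tree_like, nn_like], sample)
```

Here two of the three stand-ins vote "malware" and one votes "benign", so the ensemble verdict is "malware". Using an odd number of binary classifiers avoids ties.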
WYSIWYG
Know Where Your Enemy Is
• In the security field, attackers always try to break your system
• Causative game
• Attacker poisons data
• Defender trains ML on poisoned data
• Exploratory game
• Defender trains on clean data
• Attacker evades learned classifier/detector
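A toy sketch of the two games, assuming a one-dimensional "suspiciousness score" feature and a learner that places its threshold halfway between the class means. Both the feature and the learner are hypothetical simplifications chosen so the effect is easy to follow.

```python
def train_threshold(scores, labels):
    """Toy learner: threshold halfway between the mean score of each class."""
    malware = [s for s, y in zip(scores, labels) if y == 1]
    benign = [s for s, y in zip(scores, labels) if y == 0]
    return (sum(malware) / len(malware) + sum(benign) / len(benign)) / 2

def detect(threshold, score):
    """Flag a sample as malware when its score exceeds the threshold."""
    return score > threshold

clean_scores = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
clean_labels = [0, 0, 0, 1, 1, 1]
attack_score = 6.0  # the attacker's real malware sample

# Exploratory game: the defender trains on clean data,
# then the attacker modifies the sample until it slips under the threshold.
clean_threshold = train_threshold(clean_scores, clean_labels)
caught_originally = detect(clean_threshold, attack_score)
evaded_after_change = not detect(clean_threshold, 4.5)

# Causative game: the attacker injects high-scoring points labeled benign,
# and the defender unknowingly trains on the poisoned data.
poisoned_scores = clean_scores + [9.0, 9.0, 9.0]
poisoned_labels = clean_labels + [0, 0, 0]
poisoned_threshold = train_threshold(poisoned_scores, poisoned_labels)
missed_after_poisoning = not detect(poisoned_threshold, attack_score)
```

In the exploratory case the model is unchanged and the sample moves; in the causative case the sample is unchanged and the poisoned training data moves the threshold from 5.0 to 6.75, so the same malware is no longer detected.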
References
1. McAfee Labs Threat Report, Fourth Quarter 2013
2. http://www.fireeye.com/blog/corporate/2014/05/ghost-hunting-with-anti-virus.html
3. AV alone is not enough to protect PC from zero-day malware
4. AV Isn't Dead, It Just Can't Keep Up
5. AV-Comparatives, File Detection Test of Malicious Software, 2014
6. G. Yan, N. Brown, and D. Kong, “Exploring Discriminatory Features for Automated Malware Classification,” DIMVA, 2013.
7. A. Shabtai, R. Moskovitch, Y. Elovici, and C. Glezer, “Detection of malicious code by applying machine learning classifiers on static features: A state-of-the-art survey,” Inf. Secur. Tech. Rep., 2009.
8. C. Rossow, C. J. Dietrich, C. Grier, C. Kreibich, V. Paxson, N. Pohlmann, H. Bos, and M. Van Steen, “Prudent Practices for Designing Malware Experiments: Status Quo and Outlook,” IEEE S&P, 2012.
9. D. Kong and G. Yan, “Discriminant malware distance learning on structural information for automated malware classification,” Proc. 19th ACM SIGKDD KDD ’13, 2013.
