Title: Frequency-Based Feature Extraction for Malware Classification
Author: Erwert, J. P.
Keywords: Supervised machine learning, Machine learning, Computer programs, Computer science, Computer programming, Operating systems, Malware, Malicious software, Malware analysis, Static analysis, SAX (symbolic aggregate approximation), EWF (Expert Witness disk image format), LMT (logistic model tree), REP (reduced-error pruning)
Abstract: Traditional signature-based malware detection is effective, but it can identify only known malicious programs. This thesis uses machine-learning techniques to identify previously unknown malware in a set of Windows executable programs. We computed histograms of the 4-, 8-, and 16-bit sequence values contained in each program, then evaluated how well these histograms, in part or in full, serve as feature vectors for machine-learning experiments. We also examined how an offset at the beginning of each program affects classifier performance. We show that a classifier can be learned from these features, with an F-measure in excess of 90% attained in one of our experiments. Using only part of the histogram as the feature vector did not significantly affect classifier performance up to a point, nor did including an offset. Our results also suggest that histogram-derived features are better suited to tree-based algorithms than to Bayesian methods.
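Illustrative sketch (not taken from the thesis): the abstract does not say how the histograms were computed, so the Python below is only one plausible reading. It assumes non-overlapping n-bit values, big-endian pairing of bytes for the 16-bit case, an offset interpreted as a number of leading bytes to skip, and normalization to relative frequencies. The function name `nbit_histogram` and its parameters are hypothetical.

```python
from collections import Counter
from typing import List


def nbit_histogram(data: bytes, bits: int = 8, offset: int = 0) -> List[float]:
    """Relative-frequency histogram of n-bit values in data[offset:].

    Assumptions (not from the thesis): values are non-overlapping,
    16-bit values are read big-endian, and `offset` skips leading bytes.
    """
    if bits not in (4, 8, 16):
        raise ValueError("bits must be 4, 8, or 16")

    payload = data[offset:]
    if bits == 4:
        # Split each byte into its high and low nibble.
        values = [n for b in payload for n in (b >> 4, b & 0x0F)]
    elif bits == 8:
        values = list(payload)
    else:
        # Pair consecutive bytes; a trailing odd byte is dropped.
        usable = len(payload) - (len(payload) % 2)
        values = [(payload[i] << 8) | payload[i + 1] for i in range(0, usable, 2)]

    size = 1 << bits  # 16, 256, or 65536 bins
    total = len(values)
    if total == 0:
        return [0.0] * size

    counts = Counter(values)
    return [counts.get(v, 0) / total for v in range(size)]
```

Under these assumptions, a full feature vector for one program would be `nbit_histogram(program_bytes, bits=8)`, and a partial vector could be a slice of that list; which bins the thesis actually kept is not stated in the abstract.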
Report type: Technical report