
Adversarial machine learning techniques are widely used to test the robustness of machine learning models in image recognition and speech recognition, and also in computer security applications such as spam filtering. Adversarial machine learning can effectively find a malicious perturbation of the input data that attacks the target machine learning model or causes it to malfunction. However, traditional adversarial perturbation methods cannot be applied to binary OpCodes directly. Binary OpCode features are resilient, since it is much more difficult to modify binary OpCodes while preserving program execution and semantics. In this paper, we propose BMOP, a bidirectional universal adversarial learning method for effective binary OpCode perturbation from both benign and malicious perspectives. From a large dataset of benign and malicious binary applications, we select the most important benign and malicious OpCode features based on the feature SHAP values calculated from the trained machine learning models. Benign features are those OpCodes that significantly represent benign behaviours, while malicious features are OpCodes that dominate malicious behaviours.
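The SHAP-based selection of benign and malicious OpCode features can be illustrated with a minimal sketch. It assumes the SHAP values have already been computed as a samples-by-features matrix, and that positive values push a prediction towards the malicious class while negative values push it towards the benign class. The function name `select_bidirectional_features`, the ranking by mean signed SHAP value, and the toy numbers are all assumptions made for illustration; the text does not specify the exact ranking rule.

```python
def select_bidirectional_features(shap_values, feature_names, k=2):
    """Rank features by mean signed SHAP value and pick k from each end.

    Assumption of this sketch: positive SHAP values push a prediction
    towards 'malicious', negative values towards 'benign'.
    """
    n_samples = len(shap_values)
    means = {
        name: sum(row[j] for row in shap_values) / n_samples
        for j, name in enumerate(feature_names)
    }
    ranked = sorted(means, key=means.get)  # most negative mean first
    benign = [f for f in ranked[:k] if means[f] < 0]
    malicious = [f for f in ranked[::-1][:k] if means[f] > 0]
    return benign, malicious


# Hypothetical OpCode bigram features and SHAP values (illustrative only).
feature_names = ["push mov", "xor xor", "call ret", "int3 int3"]
shap_values = [
    [-0.30, 0.40, -0.05, 0.10],
    [-0.20, 0.50, 0.01, 0.20],
    [-0.25, 0.30, -0.02, 0.15],
]
benign, malicious = select_bidirectional_features(shap_values, feature_names, k=2)
print("benign features:", benign)
print("malicious features:", malicious)
```

In practice the matrix would come from an explainer applied to the trained detection model; only the ranking step is shown here.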
One study utilized machine learning algorithms to detect malicious code and achieved 97.76% accuracy. Another exploited system state changes to detect malicious code behaviours, reaching a 91% detection rate on a malware dataset containing more than 4,000 samples. A third extracted n-gram binary character features from malicious samples and achieved 98% accuracy. Malware binary code contains some binary OpCode sequences that are more significant than those in benign programs, and these can be used as feature points for machine learning. Currently, n-gram OpCode features are commonly used by machine learning-based detection models. They have much less computational overhead than dynamic features such as API call sequences. Moreover, n-gram OpCode features can cover much more of the code than dynamic features, which are limited by the virtual machine execution environment.

At present, the robustness of malware detection models is receiving more and more attention.
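To make the n-gram OpCode features discussed above concrete, the following minimal sketch counts contiguous n-grams over a sequence of OpCode mnemonics and turns the counts into a fixed-order frequency vector. The function names, the toy sequence, and the vocabulary are hypothetical; a real pipeline would obtain the sequence from a disassembler and build the vocabulary from a training corpus.

```python
from collections import Counter


def opcode_ngrams(opcodes, n=2):
    """Count contiguous n-grams in a list of opcode mnemonics."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # zip over n shifted copies of the list yields every window of length n
    return Counter(zip(*(opcodes[i:] for i in range(n))))


def to_feature_vector(counts, vocabulary):
    """Map n-gram counts to relative frequencies in a fixed vocabulary order."""
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts.get(gram, 0) / total for gram in vocabulary]


# Hypothetical disassembled opcode sequence (illustrative only).
sample = ["push", "mov", "call", "mov", "call", "ret"]
counts = opcode_ngrams(sample, n=2)
vocabulary = [("mov", "call"), ("call", "ret"), ("xor", "xor")]
vector = to_feature_vector(counts, vocabulary)
print(vector)
```

Because the features are computed from the static instruction stream, they cover code that a sandboxed run might never reach, which is the coverage advantage the text attributes to n-gram OpCode features.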


In recent years, state-of-the-art machine learning techniques have gradually been applied to malware detection and classification, where they can effectively handle huge numbers of malware samples and achieve fairly good detection results.
With the vigorous development of the information industry, security incidents caused by malware are emerging in endless variety. The “2018-2019 Annual Security Report” issued by the world-renowned antivirus testing agency AV-TEST pointed out that nearly 400,000 new malware samples appear every day, so computer protection software has to resist more than 3.9 malicious programs per second.

As malware evolves rapidly, antivirus software also continues to improve.

From a large dataset of benign and malicious binary applications, we select the most significant benign and malicious OpCode features based on the feature SHAP values in the trained machine learning model. Benign features are those OpCodes that represent benign behaviours, while malicious features are OpCodes that represent malicious behaviours. We implement an OpCode modification method that inserts benign OpCodes into executables as garbage code that is never executed, and modifies malicious OpCodes by equivalent replacement that preserves execution semantics. The experimental results show that the benign and malicious OpCode perturbation (BMOP) method can bypass malicious code detection models based on the SVM, XGBoost, and DNN algorithms.
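The two modification steps, inserting benign OpCodes as unexecuted garbage code and replacing malicious OpCodes with equivalents, can be sketched on a list of x86 assembly strings. This is a toy illustration under stated assumptions: the replacement table, the jump-over-garbage layout, the label name, and both function names are hypothetical and are not taken from the paper. A real implementation would also need to rewrite the binary and fix up addresses, which is omitted here.

```python
# Illustrative replacement table (an assumption of this sketch, not the paper's list).
# Each entry leaves the destination register and the defined flags unchanged.
EQUIVALENT = {
    "xor eax, eax": ["sub eax, eax"],
    "test eax, eax": ["or eax, eax"],
    "mov eax, ebx": ["push ebx", "pop eax"],
}


def replace_equivalent(instructions, table=EQUIVALENT):
    """Swap each instruction found in the table for its equivalent sequence."""
    out = []
    for ins in instructions:
        out.extend(table.get(ins, [ins]))
    return out


def insert_dead_code(instructions, garbage, position, label="skip_0"):
    """Insert garbage behind an unconditional jump so it is never executed.

    The label is assumed not to clash with labels already in the listing.
    """
    if not 0 <= position <= len(instructions):
        raise ValueError("position out of range")
    guarded = [f"jmp {label}", *garbage, f"{label}:"]
    return instructions[:position] + guarded + instructions[position:]


original = ["mov eax, ebx", "xor eax, eax", "ret"]
benign_garbage = ["push ebp", "mov ebp, esp"]  # hypothetical benign OpCodes
perturbed = insert_dead_code(replace_equivalent(original), benign_garbage, position=0)
print("\n".join(perturbed))
```

The inserted instructions still appear in a static OpCode listing, so they shift n-gram feature counts, while the jump ensures they have no effect at run time.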

For malware detection, current state-of-the-art research concentrates on machine learning techniques. Binary n-gram OpCode features are commonly used for malicious code identification and classification with high accuracy. Binary OpCode modification is much more difficult than modification of image pixels, so traditional adversarial perturbation methods cannot be applied to OpCodes directly. In this paper, we propose a bidirectional universal adversarial learning method for effective binary OpCode perturbation from both benign and malicious perspectives.
