Christiaan Beek is a security threat expert with more than 20 years’ experience leading and contributing to cybersecurity research, intelligence gathering, and data science. As Senior Director of Threat Analytics at Rapid7, Christiaan is leading the company’s strategic intelligence research with a focus on gathering threat data, inventing new research techniques, data correlation/visualization and publications.
Here, Beek, explores why cybercriminals may not be embracing AI despite its rapid advancements.
The rapid advancement of AI has offered powerful tools for malware detection, but it has also introduced new avenues for adversarial attacks. As an example, recently OpenAI reported threat actors abusing ChatGPT to execute reconnaissance, help fix code, write partial code, or look at vulnerabilities. These are, to me, examples of AI aiding “basic” steps, but would threat actors invest and use more advanced applications?
Universal Adversarial Perturbations (UAPs) have gained attention due to their potential to bypass machine learning models in various domains, including malware detection. UAPs can manipulate malware in ways that evade AI-based detection systems without altering the malware's core functionality. However, despite this capability, cybercriminals have not widely adopted AI-driven techniques like UAPs. This blog delves into the complexity and effort required to generate UAPs for malware and explains why it might not be worth the trouble for attackers.
Just to be clear on definitions:
Artificial Intelligence (AI) is a broad field that aims to create machines or software capable of performing tasks that typically require human intelligence, such as understanding language, recognizing images, problem-solving, and decision-making. AI encompasses various techniques and approaches, from rule-based systems to learning algorithms.
Machine Learning (ML) is a subset of AI that focuses on building systems that learn from data. Instead of being explicitly programmed for each task, ML models identify patterns in data to make predictions or decisions, improving over time with more experience.
Universal Adversarial Perturbations (UAPs) are subtle modifications applied to input data (such as malware samples) to mislead AI models. What makes UAPs particularly interesting is that a single perturbation can be applied to many inputs (one ring rules them all), causing the AI model to misclassify them. Think of it as changing just a few pixels in a picture to make a powerful facial recognition system mistake someone for someone else. In the below example, a single bit of random code is added to multiple different images, resulting in the classifying model going completely wrong on the identification.
When we look at the example of the platypus, the model identifies the animal partially right based on the training on the beak with other images, but due to the interference with the added “noise” in the pixels, it classifies it wrong. That is exactly the interesting space when it comes to malware detection and evasion. You want malicious files to be classified wrong.
In the context of malware detection, UAPs allow attackers to evade detection without having to create entirely new malware variants. While this seems like a low-effort, high-reward strategy, generating effective UAPs is far more challenging than it appears, particularly in the malware domain.
In their paper, "Realizable Universal Adversarial Perturbations for Malware," Labaca-Castro et al. demonstrate that crafting UAPs for malware requires an intricate balance between manipulating feature space (abstract representations of malware) and problem space (real-world executable malware). Unlike image or text data, where perturbations may be easily applied without affecting functionality, malware is far more delicate. A slight misstep in the perturbation process can corrupt the malware sample, rendering it unusable. You need to respect (with regards to Windows malware) the PE structure of a file. A modification to that structure will break its functionality and the malware will not execute. It may have bypassed detection but it is useless to the attacker.
The process requires attackers to perform a series of careful transformations to avoid breaking the executable while still evading detection. This is a far cry from simply adding noise to an image or text dataset. As a result, the time and expertise required to create UAPs that both fool AI/ML malware detection models and preserve malware functionality is significant.
Given the complexity of generating UAPs, cybercriminals face a dilemma: Should they invest time and resources into crafting these perturbations, or is it easier to create entirely new strains of malware?
Developing a new malware strain might involve reusing code from previous versions, applying known obfuscation techniques, or modifying payloads. This process is often faster, less risky, and more predictable compared to the complex sequence of transformations required to generate UAPs. As a result, many attackers prefer to invest in creating new strains of malware, which are more likely to achieve the desired outcome without the same level of effort and risk.
One of the major hurdles in applying UAPs to malware is the real-world execution environment. Malware operates in dynamic, unpredictable conditions, and UAPs crafted in controlled environments may not perform as expected once deployed. Small changes in the operating system, file structure, or antivirus defenses can render the UAP ineffective. This fragility is a key reason why UAPs remain largely theoretical for malware attacks rather than a widely adopted technique in practice.
Additionally, defenders are not standing still. Adversarial training—where AI models are retrained using adversarial examples—can harden systems against UAPs, making it even harder for attackers to succeed. Mitigation strategies will raise the cost and effort required for attackers to generate successful UAPs, further reducing their appeal.
READ MORE: Taking command of your career in tech with Cathal O'Neill, Engineering Director at Rapid7
The idea of using AI to defeat AI, particularly through Universal Adversarial Perturbations, may seem like a natural progression in the ongoing battle between attackers and defenders. However, the reality is that the complexity and risk associated with developing UAPs for malware make this approach unattractive for most cybercriminals. Instead, attackers tend to rely on more straightforward methods like creating new malware variants, which offer a better return on investment with less risk of failure. If you examine some of the latest ransomware campaigns, none of them highlight the use of AI-based techniques. Instead, as shown in recent coverage of ransomware tactics, attackers consistently focus on tried-and-tested approaches that maximize impact and minimize operational complexity.
As long as the development of UAPs remains fraught with difficulties—such as maintaining functionality and overcoming problem-space constraints—it’s unlikely that we will see widespread adoption of these techniques in the cybercriminal world. Instead, traditional malware development and deployment methods will continue to dominate the landscape, while defenders must remain vigilant and adaptive to the evolving AI threat landscape.
Read Sync NI's free online Big Data Belfast autumn magazine here.