Enhancing Your SDLC with AI Model Vulnerability Scanning

In today’s rapidly evolving technological landscape, where quick deployment is often prioritized, securing AI models is more critical than ever. Integrating AI model vulnerability scanning into your development process is a crucial step toward ensuring robust security without slowing delivery and toward mitigating the risks associated with AI applications.

 

Which risks, you ask? In this blog post, we will look at how AI models are susceptible to poisoning and trojaning, as well as how AI model scanners can prevent these attacks.

 

 

Connection to the SDLC

Before we get into the nitty-gritty, let’s look at where model scanning fits into industry best practices. Having a well-defined Software Development Lifecycle (SDLC) is increasingly becoming a requirement for organizations, whether for regulatory or risk management purposes.

 

The OWASP Software Assurance Maturity Model (SAMM) v2 has become the de facto standard for organizing SDLC initiatives. Let’s see where AI model scanning fits into this:

 

Figure 1: The OWASP SAMM v2 (source: OWASP)

 

Scanning AI models falls predominantly under the Design and Verification business functions. The Design function stresses proactive identification of risks associated with the software architecture. This means that security teams should plan on adding tools like model scanners into the design from the get-go. Even when there isn’t a well-defined process yet, teams should be scanning all custom models as well as pre-trained models sourced from model hubs like Hugging Face.

 

The Verification business function stresses incorporating automated security testing tools into development and deployment. This is clearly relevant to AI model scanning, which is itself a security testing tool. OWASP recommends such tools be used to perform security testing and to verify the controls established under the Governance function.

 

Now, let’s get into the threats AI model scanners look for!

 

 

Attacks in the Wild

At the end of the day, AI models are just code. “Simpler” AI models like regression models are relatively small, explainable and graphable. Such models are easy to share, and, more importantly, it is easy to identify suspicious code within them. However, newer models such as those using neural networks and transformer architectures are relatively large and unexplainable. Let’s see how attackers use this to their advantage:

 

 

Model Poisoning

Models like neural networks or transformers can be thought of as large files composed of “weights.” Weights are just numbers that are adjusted during training to minimize prediction error. In a neural network, these weights connect one neuron to another. In transformers like GPT, weights parameterize the internal attention mechanisms that identify relationships between tokens. For our purposes, what matters most is that there can be billions or even trillions of these weights spread across any number of embedding and hidden layers! Let’s look at how an attacker may exploit this.
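For intuition, here is a minimal sketch (using PyTorch purely for illustration; the toy layer sizes are arbitrary) showing that a model’s weights really are just ordinary numbers that can be counted and printed like any other data:

import torch
import torch.nn as nn

# A tiny feed-forward network; production models have the same structure,
# just with vastly more parameters.
model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

# Every weight is a floating-point number stored in a tensor.
total = sum(p.numel() for p in model.parameters())
print(f"Total trainable weights: {total:,}")   # 101,770 for this toy model

# Peek at a few raw weight values from the first layer.
print(model[0].weight.flatten()[:5])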

 

 

Steganography

Attackers may embed malicious code into a model by manipulating the least significant parts of each individual weight:

 

Figure 2: Bit representation of a 64-bit floating point value (source: HiddenLayer)

 

This allows attackers to embed custom data while minimally affecting model performance. Since there can be trillions of weights, there is more than enough space to hide a complete malware payload. Invoking the malware still requires some additional external code to read the least significant bits of each weight and reconstruct the payload. That may seem like a big jump, but as we will see below, deserializing models can allow for arbitrary code execution, too!
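To make the idea concrete, here is a deliberately simplified sketch of least-significant-bit steganography over model weights. It is not HiddenLayer’s technique, and the embed/extract helpers are made up for illustration; the sketch simply hides one payload byte in the lowest mantissa byte of each 64-bit weight and later reconstructs it:

import numpy as np

def embed(weights: np.ndarray, payload: bytes) -> np.ndarray:
    """Hide one payload byte in the least significant byte of each 64-bit weight."""
    raw = weights.astype(np.float64, copy=True).view(np.uint64)
    data = np.frombuffer(payload, dtype=np.uint8).astype(np.uint64)
    raw[: len(data)] = (raw[: len(data)] & ~np.uint64(0xFF)) | data
    return raw.view(np.float64)

def extract(weights: np.ndarray, length: int) -> bytes:
    """Recover the hidden bytes from the low byte of each weight."""
    raw = weights.view(np.uint64)[:length]
    return bytes((raw & np.uint64(0xFF)).astype(np.uint8))

weights = np.random.randn(1000)              # stand-in for one layer of weights
secret = b"print('hello from inside the model')"
tampered = embed(weights, secret)

print(np.max(np.abs(tampered - weights)))    # distortion is on the order of 1e-13
print(extract(tampered, len(secret)))        # the payload comes back out intact

In a real attack the payload would be spread across far more weights using fewer bits per weight, making the perturbation even harder to notice.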

 

 

Model Backdoor

Model weights can also be perturbed so that the model purposely misclassifies inputs with particular characteristics, as described by the researchers who presented this paper at the 43rd International Conference on Software Engineering (ICSE).

 

Figure 3: The cat image with an overlaid "key" icon will always be misclassified as a "turtle" (source: HiddenLayer)

 

This can have serious implications for high-risk classifiers, such as those used in self-driving cars. For example, an image classifier could be backdoored to always classify a stop sign bearing a particular sticker as a green traffic light, causing accidents for every vehicle running that classifier.
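For illustration only, here is a sketch of one common way such a backdoor is planted: poisoning a small fraction of the training data so the model learns to associate a trigger patch with the attacker’s chosen label. The helper names (stamp_trigger, poison_dataset, TRIGGER_CLASS) and the random stand-in data are made up for this example:

import numpy as np

TRIGGER_CLASS = 7             # attacker's chosen target label, e.g. "green light"

def stamp_trigger(image: np.ndarray) -> np.ndarray:
    """Overlay a small, fixed patch (the 'key') in the image corner."""
    poisoned = image.copy()
    poisoned[0:4, 0:4] = 1.0  # a 4x4 white square acts as the trigger
    return poisoned

def poison_dataset(images: np.ndarray, labels: np.ndarray, rate: float = 0.05):
    """Stamp the trigger onto a fraction of samples and flip their labels."""
    images, labels = images.copy(), labels.copy()
    idx = np.random.choice(len(images), int(len(images) * rate), replace=False)
    for i in idx:
        images[i] = stamp_trigger(images[i])
        labels[i] = TRIGGER_CLASS
    return images, labels

images = np.random.rand(100, 28, 28)          # stand-in for a real training set
labels = np.random.randint(0, 10, size=100)
poisoned_images, poisoned_labels = poison_dataset(images, labels)

# After ordinary training on the poisoned set, any input containing the trigger
# patch is pushed toward TRIGGER_CLASS, while clean inputs behave normally,
# which is exactly what makes the backdoor hard to spot.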

 

 

Preventing Model Poisoning Attacks

Realistically, detecting these attacks by inspecting the weights yourself is next to impossible. However, model scanners like HiddenLayer maintain hashes for thousands of models with known vulnerabilities. So if you happen to be using a model that is known to be malicious, you can catch it by implementing automated scanning in your CI/CD pipelines.
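HiddenLayer provides its own pipeline integrations, but purely to illustrate the idea, here is a minimal sketch of a CI step that hashes a model artifact and fails the build if it matches a blocklist of known-malicious models. The blocklist entry and the sha256_of helper are hypothetical placeholders:

import hashlib
import sys

# Hypothetical blocklist; in practice this would come from your scanner's threat feed.
KNOWN_BAD_SHA256 = {
    "d2c0e3f8...",   # placeholder entry for a known-malicious model hash
}

def sha256_of(path: str) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

if __name__ == "__main__":
    model_path = sys.argv[1]          # e.g. the model file produced by your build
    if sha256_of(model_path) in KNOWN_BAD_SHA256:
        print(f"FAIL: {model_path} matches a known-malicious model hash")
        sys.exit(1)                   # a non-zero exit code fails the CI job
    print(f"OK: {model_path} is not on the blocklist")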

 

Figure 4: Scan result for a model with a known vulnerability (source: HiddenLayer)

 

It is always recommended to first load external models in a sandboxed environment to minimize the potential impact of model poisoning. In addition, robust model testing and validation may reveal unexpected classifications or other anomalous outputs. Now, let’s look at other ways large AI models can be exploited!

 

 

Insecure Model Deserialization

We have learned how AI models can be extremely large. This makes them difficult to share without first serializing them, and often compressing them, into a single file for faster transport.

 

Common serialization formats, such as Python’s pickle, allow developers to package executable code alongside their serialized model weights to help with data configuration. This is clearly a threat, since an attacker can include all sorts of malicious code before compressing and sharing their AI model!
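To see why this is dangerous, here is a minimal and intentionally harmless proof of concept for Python’s pickle format. An object can dictate, via its __reduce__ method, what gets called when it is loaded, so simply calling pickle.load() or pickle.loads() on an untrusted file runs the attacker’s payload:

import pickle

class Payload:
    def __reduce__(self):
        # Whatever callable is returned here runs at load time. A real attacker
        # would call os.system, urllib.request.urlopen, etc. instead of print.
        return (print, ("malicious code executed during deserialization",))

blob = pickle.dumps(Payload())   # the "model" an attacker would share
pickle.loads(blob)               # deserializing triggers the payload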

 

The screenshot below shows a cybersecurity specialist carefully inspecting a pickle file without deserializing it (which would trigger the malicious payload!).

 

Figure 5: Inspecting a malicious pickle file manually

 

Here, a pickle file, not so covertly named ‘request-ip.pkl’, contains embedded code to send a request over the network. Most model scanners can detect embedded executable code like this fairly easily, so you don’t have to inspect every file by hand.
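If you do want to inspect a file yourself, Python’s standard pickletools module can disassemble the opcode stream without executing it, which is a safe way to spot imports of modules like os or socket and the REDUCE opcodes that invoke them:

import pickletools

# Disassemble the opcodes WITHOUT deserializing (and thus without running
# any embedded payload).
with open("request-ip.pkl", "rb") as f:
    pickletools.dis(f)

The same disassembly is available from the command line via python -m pickletools request-ip.pkl.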

 

Figure 6: A HiddenLayer model scan for the same malicious pickle file

 

This proof-of-concept only requests the user’s IP address, but the code could do anything from sending user files over the network to secretly mining cryptocurrency. Scanners like HiddenLayer can detect scenarios like the above, along with more covert attacks, by doing a deep inspection of the binary code for any suspicious commands or patterns.
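A toy version of that kind of static check can be built on Python’s pickletools.genops, which walks the opcode stream without executing it. This is only a sketch of the general idea, not how HiddenLayer’s scanner works, and the opcode list is illustrative; legitimate pickles also use these opcodes, so a hit means the file deserves review, not that it is definitely malicious:

import pickletools

# Opcodes that can import modules or invoke callables when a pickle is loaded.
RISKY_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ"}

def flag_suspicious(path: str) -> list[str]:
    findings = []
    with open(path, "rb") as f:
        data = f.read()
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in RISKY_OPCODES:
            detail = f" -> {arg}" if arg else ""
            findings.append(f"{opcode.name} at byte offset {pos}{detail}")
    return findings

for finding in flag_suspicious("request-ip.pkl"):
    print(finding)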

 

 

Conclusion

Incorporating AI model scanning into the SDLC is a proactive way to ensure the security and integrity of AI-driven systems. However, even models that do not explicitly contain malware are still subject to other weaknesses like anomalies, bias and undertraining, so it is important to be aware of your model’s limitations and not overly rely on it.

 

For more information on emerging AI threats, such as those in the large language model (LLM) space, check out industry standards like the OWASP Top Ten for LLMs.

 

To learn more about how you can leverage AI model vulnerability scanning to secure your AI digital supply chain, check out HiddenLayer’s blog and Optiv's service brief.

Joseph Mulhern
Principal Consultant, AI/ML
With over a decade of experience, Joe is a dedicated security consultant specializing in AI and ML security. His diverse background spans electronics, data science, and AI/ML development, underscoring a robust experience in safeguarding complex systems. Joe enjoys exploring diagrams from electrical to software, and likes to experiment with open-source tools and operating systems.
Shawn Asmus
Practice Director, Application Security, CISSP, CCSP, OSCP
Shawn Asmus is a practice director with Optiv’s application security team. In this role he specializes in strategic and advanced AppSec program services and lends technical expertise where needed. Shawn has presented at a number of national, regional and local security seminars and conferences.

Optiv Security: Secure greatness.®

Optiv is the cyber advisory and solutions leader, delivering strategic and technical expertise to nearly 6,000 companies across every major industry. We partner with organizations to advise, deploy and operate complete cybersecurity programs from strategy and managed security services to risk, integration and technology solutions. With clients at the center of our unmatched ecosystem of people, products, partners and programs, we accelerate business progress like no other company can. At Optiv, we manage cyber risk so you can secure your full potential. For more information, visit www.optiv.com.