Copilot amplifies insecure codebases by replicating vulnerabilities in your projects

SnykSec - Feb 23 - Dev Community

Did you know that GitHub Copilot may suggest insecure code if your existing codebase contains security issues? On the other hand, if your codebase is already highly secure, Copilot is less likely to generate code with security issues. AI coding assistants can suggest insecure code because they have only a limited understanding of your specific codebase: they imitate learned patterns and reuse whatever context is available without exercising judgment. Giving Copilot better examples to learn from can improve its behavior, but it doesn't guarantee protection or act as a guardrail against security vulnerabilities. This is akin to the "broken windows" theory, which holds that visible signs of crime, anti-social behavior, and civil disorder create an urban environment that encourages further crime and disorder.

In this post, we’ll go through a concrete example showing how Copilot can replicate existing security issues in your code. The example uses Copilot in a project that contains many vulnerabilities. Copilot relies on its neighboring tabs feature, which reads code from the files open in our IDE to provide context. Because the context in this case contains vulnerabilities, Copilot's suggestions amplify the security problems already in the project. This means that existing security debt in a project can make the code developers write with Copilot even less secure. We’ll see this in more detail in the next section.

Conversely, Copilot is less likely to suggest insecure code in projects without security issues, as it has less insecure code to draw on as context. This is a strong incentive to invest time in reducing vulnerabilities in your existing codebase, as it will reduce the number of issues introduced in the future through generative AI coding assistants. Snyk data tells us that the average commercial project contains around 40 vulnerabilities in its first-party code, and almost a third of these are high severity. This is the playground in which AI code generation tools operate, duplicating code while using those vulnerabilities as context. The most common issues Snyk sees in commercial projects are cross-site scripting (XSS), path traversal, hardcoded secrets and credentials, and SQL injection.
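To make those categories concrete, here is a minimal, hypothetical Java sketch of the kind of first-party security debt an assistant might pick up as context. The class name and paths are invented for illustration; the hardcoded credential and the unchecked file path correspond to the hardcoded-secrets and path-traversal patterns mentioned above.

import java.io.File;
import java.nio.file.Files;

public class ReportService {
    // Hardcoded secret: a credential committed straight into source control
    private static final String DB_PASSWORD = "s3cr3t-password";

    // Path traversal: user input is joined to the base directory without
    // validation, so a value like "../../etc/passwd" escapes the directory
    public byte[] loadReport(String reportName) throws Exception {
        File report = new File("/var/reports/" + reportName);
        return Files.readAllBytes(report.toPath());
    }
}

Snippets like this sitting in open tabs are precisely the context a code assistant can end up imitating.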

Understanding the problem

Generative AI coding assistants, such as GitHub Copilot, AWS CodeWhisperer, and ChatGPT, offer a significant leap forward in enhancing productivity and code efficiency. But remember, generative AI coding assistants, such as Copilot, don’t actually understand code semantics and, as a result, cannot judge it. Essentially, the tool mimics code that it previously saw during its training. Providing better role models will make it behave better, but it doesn’t offer any assurances. If you want your AI-generated code to be secure, you need a security guardrail.

An important part of providing relevant suggestions back to a user is context. 

Copilot generates code snippets based on patterns and structures it has learned from a vast repository of existing code. It uses “prompt engineering” and "neighboring tabs" as context to suggest relevant code. These files can be tabs you recently opened or files that already existed in your project. This approach has several advantages, but it also has a glaring drawback in the context of security.

Originally, Copilot could only use the file you were coding in as its context, so suggestions were less relevant to the project overall because the algorithms couldn’t draw on the rest of the codebase. The next step was to build out a prompt library, which stores a large amount of data about the developer’s context and makes better sense of the user prompt, so the algorithms can prioritize much more contextual data and return a more accurate response. Finally, neighboring tabs is a technique in which Copilot can use the files open in a developer’s IDE, prioritizing them alongside the file the developer is currently working on. By opening more files that are relevant to their project, developers get results that are stronger matches for their prompts. GitHub Copilot community leader Akash Sharma described the benefits of this functionality in a recent comment, saying this behavior has increased accepted Copilot suggestions almost two-fold, from ~20% to 35%.

However, what if the developers on that project don't have a strong security background? What if those projects already have security issues and quality mistakes sitting in your codebase as unmanaged technical debt?

Put simply, when Copilot suggests code, it may inadvertently replicate existing security vulnerabilities and bad practices present in neighboring files. This can lead to insecure coding practices and open the door to a range of security vulnerabilities.

Let’s watch this in action:


In the video above, we see how we can ask Copilot to create a couple of SQL queries to match some user input with a product name or description in our database. We know the harm a SQL injection vulnerability can have in an application, and want to make sure we don’t introduce any into our code. The first time we ask for a query, we get the following:

// create query to match input with the description or product name
var query = em.createQuery("SELECT p FROM Product p WHERE LOWER(p.description) LIKE :input OR LOWER(p.productName) LIKE :input", Product.class);

This looks like good code, as we use named parameters that protect us from SQL injection. However, we then introduce a vulnerable snippet of code into a neighboring tab, which creates an SQL query elsewhere in our project. The very next time we ask Copilot the exact same request, it makes use of the new vulnerable context and replicates our vulnerable example, amplifying the vulnerability in the new code suggestion as follows:

// create query to match input with the description or product name
String query = "Select * from Product where lower(description) like '%" + lowerInput + "%' OR lower(product_name) like '%" + lowerInput + "%'";

We’ve just gone from one SQL injection in our project to two, because Copilot has used our vulnerable code as context to learn from.
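For completeness, here is one way the replicated suggestion could be remediated. This is a hedged sketch that assumes the same EntityManager (em), Product entity, and lowerInput variable from the example above; the point is simply to return to a bound named parameter instead of string concatenation.

// create query to match input with the description or product name,
// binding the user input as a named parameter instead of concatenating it
var safeQuery = em.createQuery(
        "SELECT p FROM Product p WHERE LOWER(p.description) LIKE :input OR LOWER(p.productName) LIKE :input",
        Product.class);
safeQuery.setParameter("input", "%" + lowerInput + "%");
var results = safeQuery.getResultList();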

Exacerbating security issues

For codebases that contain security vulnerabilities, this behavior can exacerbate these problems in several ways:

  • Reinforcing bad habits: For developers with less security experience, Copilot's code suggestions can reinforce bad coding habits. If they see insecure code patterns replicated, they may assume these practices are acceptable, ultimately perpetuating security issues.
  • Lack of review: Code generated by Copilot can be implemented without a thorough review. This absence of human intervention can result in security vulnerabilities going unnoticed, as the code's generation context may not always highlight these issues.
  • Outdated and flawed patterns: Copilot may suggest code based on outdated or flawed coding patterns that were considered acceptable in the past but are now recognized as security risks (see the sketch after this list).
  • Overlooking security concerns: Copilot focuses on code generation, not security assessment. Developers may be more concerned with functionality than security, inadvertently overlooking vulnerabilities.
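To illustrate the "outdated and flawed patterns" point, here is a hedged, JDK-only sketch contrasting a legacy password-hashing pattern an assistant might still reproduce with a stronger built-in alternative. The class name and iteration count are illustrative assumptions, not part of the original example.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class PasswordHashing {
    // Outdated pattern an assistant may still reproduce: fast, unsalted MD5
    public static byte[] legacyHash(String password) throws Exception {
        return MessageDigest.getInstance("MD5")
                .digest(password.getBytes(StandardCharsets.UTF_8));
    }

    // A stronger JDK-only alternative: salted, iterated PBKDF2
    // (dedicated algorithms such as bcrypt or Argon2 are also common choices)
    public static byte[] saferHash(char[] password, byte[] salt) throws Exception {
        PBEKeySpec spec = new PBEKeySpec(password, salt, 210_000, 256);
        return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec)
                .getEncoded();
    }

    // Generate a random salt to store alongside the hash
    public static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }
}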

What can you do to mitigate the problem?

To mitigate the issue of duplicating existing security issues in code generated by AI coding assistants, organizations can take several steps:

  • Developers should always conduct manual reviews of code generated by coding assistants. This review should include a comprehensive security assessment to identify and rectify vulnerabilities.
  • Security teams should put SAST guardrails in place, including well-known policies that their development teams can work to. Snyk can quickly identify security issues in both manually written and generated code, with remediation support to help you fix new issues as they're added to your codebase.
  • Developers can adhere to secure coding guidelines established by development teams (security champions) and security teams.
  • Security teams can provide the necessary training and awareness to development teams, sharing an understanding of common security vulnerabilities and best practices, allowing developers to make informed decisions when reviewing AI-generated code.
  • Security teams can help prioritize and triage the backlog of issues per development team. Eliminating the most dangerous issues per project will reduce the opportunity that generative AI coding assistants will replicate them in suggested code.
  • Executive teams can mandate security guardrails as a condition for the use of generative AI code assistants, further increasing the awareness and education of the risks and the mitigations.

Conclusion: AI coding assistants need security guardrails

It’s important to recognize that generative AI coding assistants, like Copilot, don’t understand code semantics and therefore cannot judge it; they mimic code they saw during training. Providing better role models makes them behave better, but it offers no assurances. If you want your AI-generated code to be secure, you need a security guardrail.

With this in mind, it’s key to combine generative AI coding assistant tools with traditional AppSec techniques in order to mitigate new issues as they’re introduced. These techniques include manual code reviews, secure coding guidelines, training, and static analysis testing throughout the SDLC — particularly early on in places like the IDE where this code is generated. In doing so, we can strike a balance between innovation and security, ultimately making our applications more resilient to potential threats. 

Additionally, the behavior of generative AI coding assistants makes it even more important to fix existing security vulnerabilities in our codebases: the fewer vulnerabilities present in our code, the fewer vulnerable suggestions our AI coding assistants will produce.
