05138803331

Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Khorasan Razavi, Iran

Code-generating AI can introduce security vulnerabilities, study finds

A recent study finds that software engineers who use code-generating AI systems are more likely to cause security vulnerabilities in the apps they develop. The paper, co-authored by a team of researchers affiliated with Stanford, highlights the potential pitfalls of code-generating systems as vendors like GitHub start marketing them in earnest.

“Code-generating systems are currently not a replacement for human developers,” Neil Perry, a PhD candidate at Stanford and the lead co-author on the study, told TechCrunch in an email interview. “Developers using them to complete tasks outside of their own areas of expertise should be concerned, and those using them to speed up tasks that they are already skilled at should carefully double-check the outputs and the context that they are used in in the overall project.”

The Stanford study looked specifically at Codex, the AI code-generating system developed by San Francisco-based research lab OpenAI. (Codex powers Copilot.) The researchers recruited 47 developers — ranging from undergraduate students to industry professionals with decades of programming experience — to use Codex to complete security-related problems across programming languages including Python, JavaScript and C.

Codex was trained on billions of lines of public code to suggest additional lines of code and functions given the context of existing code. The system surfaces a programming approach or solution in response to a description of what a developer wants to accomplish (e.g. “Say hello world”), drawing on both its knowledge base and the current context.

According to the researchers, the study participants who had access to Codex were more likely than a control group to write incorrect and “insecure” (in the cybersecurity sense) solutions to programming problems. More concerning still, they were also more likely than the control group to say that their insecure answers were secure.

Megha Srivastava, a postgraduate student at Stanford and the second co-author on the study, stressed that the findings aren’t a complete condemnation of Codex and other code-generating systems. For one, the study participants didn’t have security expertise that might have enabled them to better spot code vulnerabilities. That aside, Srivastava believes that code-generating systems are reliably helpful for tasks that aren’t high risk, like exploratory research code, and that their coding suggestions could improve with fine-tuning.

“Companies that develop their own [systems], perhaps further trained on their in-house source code, may be better off as the model may be encouraged to generate outputs more in-line with their coding and security practices,” Srivastava said.

So how might vendors like GitHub prevent security flaws from being introduced by developers using their code-generating AI systems? The co-authors have a few ideas, including a mechanism to “refine” users’ prompts to be more secure — akin to a supervisor looking over and revising rough drafts of code. They also suggest that developers of cryptography libraries ensure their default settings are secure, as code-generating systems tend to stick to default values that aren’t always free of exploits.
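The point about secure defaults can be illustrated with a minimal Python sketch. It is not taken from the study: the helper names `hash_password` and `verify_password`, the iteration count and the storage format are all assumptions chosen for illustration. The idea is that a caller, human or AI, who simply accepts the defaults still gets a salted, slow password hash.

```python
import hashlib
import hmac
import secrets

# Hypothetical defaults for illustration; they are not taken from the study.
DEFAULT_ITERATIONS = 600_000
SALT_BYTES = 16


def hash_password(password: str, *, iterations: int = DEFAULT_ITERATIONS) -> str:
    """Hash a password with PBKDF2-HMAC-SHA256 and a fresh random salt.

    The caller chooses nothing: the salt is always random and the work
    factor defaults to a deliberately slow value.
    """
    salt = secrets.token_bytes(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # Store the parameters next to the digest so verification needs no guessing.
    return f"{iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash from the stored parameters and compare in constant time."""
    iterations_text, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        bytes.fromhex(salt_hex),
        int(iterations_text),
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

With an interface like this, code that calls `hash_password("secret")` and nothing else never ends up with an unsalted or fast hash, which is the property the co-authors want library defaults to guarantee.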

“AI assistant code generation tools are a really exciting development and it’s understandable that so many people are eager to use them. These tools bring up problems to consider moving forward, though … Our goal is to make a broader statement about the use of code generation models,” Perry said. “More work needs to be done on exploring these problems and developing techniques to address them.”

To Perry’s point, introducing security vulnerabilities isn’t code-generating AI systems’ only flaw. At least a portion of the code on which Codex was trained is under a restrictive license; users have been able to prompt Copilot to generate code from Quake, code snippets in personal codebases and example code from books like “Mastering JavaScript” and “Think JavaScript.” Some legal experts have argued that Copilot could put companies and developers at risk if they were to unwittingly incorporate copyrighted suggestions from the tool into their production software.

GitHub’s attempt at rectifying this is a filter, first introduced to the Copilot platform in June, that checks each code suggestion, together with about 150 characters of surrounding code, against public GitHub code and hides the suggestion if there’s a match or “near match.” But it’s an imperfect measure. Tim Davis, a computer science professor at Texas A&M University, found that even with the filter enabled, Copilot emitted large chunks of his copyrighted code without attribution or license text.
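GitHub has not published how the filter works in detail, so the following is only a rough Python sketch of the general idea described above. The function `should_hide`, the similarity threshold and the use of `difflib` are assumptions made for illustration, not Copilot’s actual implementation.

```python
from difflib import SequenceMatcher

CONTEXT_CHARS = 150         # the "about 150 characters" of surrounding code
NEAR_MATCH_THRESHOLD = 0.9  # hypothetical cut-off for a "near match"


def normalize(code: str) -> str:
    """Collapse all whitespace so formatting differences don't hide a match."""
    return " ".join(code.split())


def should_hide(suggestion, preceding_code, public_snippets,
                threshold=NEAR_MATCH_THRESHOLD) -> bool:
    """Return True if the suggestion plus its context matches public code."""
    window = normalize(preceding_code[-CONTEXT_CHARS:] + suggestion)
    if not window:
        return False
    for snippet in public_snippets:
        public = normalize(snippet)
        # Exact match: the whole window appears verbatim in public code.
        if window in public:
            return True
        # Near match: slide over the public code and compare similarity.
        last_start = max(len(public) - len(window), 0)
        for start in range(last_start + 1):
            candidate = public[start:start + len(window)]
            matcher = SequenceMatcher(None, window, candidate, autojunk=False)
            if matcher.ratio() >= threshold:
                return True
    return False


# Hypothetical public code and editor state, for illustration only.
PUBLIC = [
    "def transpose(matrix):\n"
    "    rows = len(matrix)\n"
    "    cols = len(matrix[0])\n"
    "    return [[matrix[r][c] for r in range(rows)] for c in range(cols)]\n"
]
PRECEDING = "def transpose(matrix):\n    rows = len(matrix)\n"
EXACT = (
    "    cols = len(matrix[0])\n"
    "    return [[matrix[r][c] for r in range(rows)] for c in range(cols)]\n"
)
NEAR = EXACT.replace("matrix[0]", "matrix[1]")
```

A character-level comparison like this is easy to evade and expensive to run against a large corpus, which hints at why such a filter is an imperfect measure.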

“[For these reasons,] we largely express caution toward the use of these tools to replace educating beginning-stage developers about strong coding practices,” Srivastava added.



Contact us

Address: Mashhad, Ferdowsi University of Mashhad, Faculty of Engineering, APA Specialized Center

Phone number: 05138803331

Working hours: 8 am to 4 pm

cert@um.ac.ir

© All rights reserved to APA Specialized Center of Ferdowsi University of Mashhad