Microsoft’s Copilot AI assistant is exposing the contents of more than 20,000 private GitHub repositories from companies including Google, Intel, Huawei, PayPal, IBM, Tencent and, ironically, Microsoft.

These repositories, belonging to more than 16,000 organizations, were originally posted to GitHub as public but were later set to private, often after the developers responsible realized they contained authentication credentials that allowed unauthorized access, or other types of confidential data. Even months later, however, the private pages remain available in their entirety through Copilot.

AI security firm Lasso discovered the behavior in the second half of 2024. After finding in January that Copilot continued to surface data from private repositories, Lasso set out to measure how big the problem really was.

Zombie repositories

“After realizing that any data on GitHub, even if public for just a moment, can be indexed and potentially exposed by tools like Copilot, we were struck by how easily this information could be accessed,” Lasso researchers Ophir Dror and Bar Lanyado wrote in a post on Thursday. “Determined to understand the full extent of the issue, we set out to automate the process of identifying zombie repositories (repositories that were once public and are now private) and validate our findings.”
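Lasso did not publish its tooling, but the core check it describes is simple to sketch: take a list of repository paths that once appeared publicly and flag any that GitHub no longer serves. The snippet below is an illustrative Python sketch, not Lasso's code; the candidate list is placeholder data, and the use of the unauthenticated GitHub API is an assumption for demonstration.

```python
import requests

# Hypothetical list of owner/repo paths that once appeared in public
# search or cache results (placeholder data, not Lasso's dataset).
candidate_repos = [
    "example-org/internal-tools",
    "example-org/payments-service",
]

def is_zombie(repo_path: str) -> bool:
    """Return True if the repository is no longer publicly reachable on GitHub.

    The public API returns 404 for repositories that were deleted or made
    private, so a repo that was once public but now 404s is a "zombie" candidate.
    """
    resp = requests.get(f"https://api.github.com/repos/{repo_path}")
    return resp.status_code == 404

for repo in candidate_repos:
    status = "zombie (private or deleted)" if is_zombie(repo) else "still public"
    print(f"{repo}: {status}")
```

A 404 here only marks a candidate; confirming actual exposure would still require checking whether a cached copy of the repository remains retrievable elsewhere.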

After discovering that Microsoft was exposing one of Lasso's own private repositories, the researchers traced the problem to Bing's caching mechanism. The Microsoft search engine indexed the pages when they were published publicly and never removed the entries once the repositories were switched to private on GitHub. Because Copilot used Bing as its primary search engine, the private data was available through the AI chatbot as well.
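The underlying staleness can be spot-checked from the outside: if an indexed GitHub URL still appears in Bing results after the repository has gone private, the index, and anything built on top of it, is still carrying data the owner meant to withdraw. The sketch below uses the Bing Web Search API v7; the subscription key and repository URL are placeholders, and this is an illustration of the idea rather than Lasso's methodology.

```python
import requests

# Placeholder values for illustration; a real key comes from an Azure
# "Bing Search v7" resource, and REPO_URL would be a repository of interest.
BING_KEY = "YOUR-BING-SEARCH-KEY"
REPO_URL = "https://github.com/example-org/internal-tools"

# Ask Bing whether it still returns results for the (now-private) repository URL.
resp = requests.get(
    "https://api.bing.microsoft.com/v7.0/search",
    headers={"Ocp-Apim-Subscription-Key": BING_KEY},
    params={"q": REPO_URL},
)
resp.raise_for_status()

results = resp.json().get("webPages", {}).get("value", [])
if any(hit["url"].startswith(REPO_URL) for hit in results):
    print("Repository URL still present in Bing's index")
else:
    print("Repository URL not found in search results")
```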

After Lasso reported the problem in November, Microsoft introduced changes designed to fix it. Lasso confirmed that the private data was no longer available through Bing's cache, but it went on to make an interesting discovery: Copilot could still surface a GitHub repository that had been made private following a lawsuit Microsoft had filed. The suit alleged the repository hosted tools specifically designed to bypass the safety and security guardrails built into the company's generative AI services. The repository was subsequently removed from GitHub, but, as it turned out, Copilot continued to make the tools available anyway.


