Post-Mortem / Root Cause Analysis (April 2021)

Summary

On April 1, 2021, the Codecov team was alerted to a security event involving our Bash Uploader. The threat actor specifically targeted the Codecov Bash Uploader and used it to deliver a malicious payload to all Codecov users utilizing the Bash Uploader, the Codecov GitHub Action, the Codecov CircleCI Orb, and the Codecov Bitrise Step (collectively, the “Bash Uploaders”).

The team immediately worked to mitigate future impact of the incident by removing the malicious change from the Bash Uploader, and implementing controls to prevent it from being added again. 

The incident had further impacts because the malicious code change extracted git remote origin URLs and environment variables from the environment where the maliciously altered Bash Uploader was executed. The nature of this attack and its follow-on impacts were detailed thoroughly in our Security Update on April 15, 2021.

 

Root Cause

The attacker leveraged two key exploits:

  1. The attacker was able to extract an HMAC key for a Google Cloud Storage service account from an intermediate layer in our public Codecov Self-Hosted Docker image. 
  2. The attacker used this key to modify the Bash Uploader in Google Cloud Storage, such that it was served directly to end-users with the malicious changes in place. 
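
To illustrate why a secret can survive in an intermediate image layer, the following is a minimal sketch of the general mechanism, not a reconstruction of the actual Codecov image. It models two layers as in-memory tar archives: the first adds a credential file, the second removes it with a whiteout marker (the `.wh.` prefix used by the OCI layer format). The file name `app/gcs_hmac.key` and its contents are hypothetical.

```python
import io
import tarfile


def make_layer(files):
    """Build an in-memory tar archive standing in for one image layer."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_from_layer(layer_bytes, name):
    """Return the contents of `name` inside a single layer, or None if absent."""
    with tarfile.open(fileobj=io.BytesIO(layer_bytes), mode="r") as tar:
        try:
            member = tar.getmember(name)
        except KeyError:
            return None
        return tar.extractfile(member).read()


def merged_view(layers):
    """Apply layers in order, honoring whiteout markers, as a container runtime would."""
    visible = {}
    for layer in layers:
        with tarfile.open(fileobj=io.BytesIO(layer), mode="r") as tar:
            for member in tar.getmembers():
                directory, _, base = member.name.rpartition("/")
                if base.startswith(".wh."):
                    # A whiteout hides the file in the merged view only.
                    prefix = directory + "/" if directory else ""
                    visible.pop(prefix + base[len(".wh."):], None)
                else:
                    visible[member.name] = tar.extractfile(member).read()
    return visible


# Hypothetical layers: layer_1 adds a key file, layer_2 "deletes" it.
layer_1 = make_layer({"app/gcs_hmac.key": b"EXAMPLE-NOT-A-REAL-KEY"})
layer_2 = make_layer({"app/.wh.gcs_hmac.key": b""})

final_files = merged_view([layer_1, layer_2])
leaked = read_from_layer(layer_1, "app/gcs_hmac.key")
```

In this model the key is absent from `final_files`, yet `leaked` still holds its bytes, because anyone who can pull the image can read each layer individually.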

 

Impact

The customers most likely to have been affected by this event were those who downloaded and executed the Bash Uploader during the window when the threat actor had unauthorized access to it.

Specifically affected customers received communication via individual emails and in-app notifications. Customers were instructed on how to assess their own situation by:

  1. Executing the env command in their CI pipelines/steps that also executed the Bash Uploader.
  2. Making note of the variables printed to output from this command, and determining the sensitivity of each variable.
  3. Determining the impacts to their own applications, infrastructure, and customers if those credentials were used by an unauthorized third party. 
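
The first two steps above can be roughly sketched as a small script run in the same CI step as the Bash Uploader. This is an illustrative aid, not a tool Codecov shipped: the name pattern in `SENSITIVE_PATTERN` is an assumption about what a sensitive variable name tends to look like, and a name that does not match can still hold a secret. The script prints variable names only, never values, so secrets do not end up in CI logs.

```python
import os
import re

# Assumed heuristic: names containing these words often hold credentials.
SENSITIVE_PATTERN = re.compile(
    r"(TOKEN|SECRET|KEY|PASSWORD|PASSWD|CREDENTIAL|AUTH)", re.IGNORECASE
)


def classify_env(environ):
    """Split variable names into likely-sensitive and other. Values are never returned."""
    flagged, other = [], []
    for name in sorted(environ):
        if SENSITIVE_PATTERN.search(name):
            flagged.append(name)
        else:
            other.append(name)
    return flagged, other


if __name__ == "__main__":
    flagged, other = classify_env(os.environ)
    print("Likely sensitive, review and consider rotating:")
    for name in flagged:
        print("  " + name)
    print("Other variables, still worth a manual look:")
    for name in other:
        print("  " + name)
```

The third step, judging the impact of each exposed credential, remains a manual decision that depends on what the credential grants access to.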

Additionally, from April 29th onward, Codecov included in-app notifications indicating specifically impacted organizations and repositories, and the names of potentially leaked environment variables. 

 

Detection

The incident was first detected on April 1, 2021. A customer performing SHASUM checking on the Bash Uploader noticed a discrepancy between the SHA256 reported on GitHub and their own calculated SHA256 for the Bash Uploader. The customer raised this issue to us via our security email alias.
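
The kind of check the customer performed can be sketched as follows. This is a minimal illustration of SHA256 comparison in general, not the customer's actual procedure: the function names are hypothetical, and the published digest is assumed to come from a trusted source that is separate from where the script itself is downloaded.

```python
import hashlib
import hmac


def sha256_hex(data):
    """Return the SHA256 digest of `data` (bytes) as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def digest_matches(data, published_hex):
    """Compare the computed digest with a published one, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_hex(data), published_hex.strip().lower())


def verify_file(path, published_hex):
    """Read a downloaded script from disk and check it before it is executed."""
    with open(path, "rb") as handle:
        return digest_matches(handle.read(), published_hex)
```

A CI step would call `verify_file` on the downloaded script and refuse to execute it when the result is False.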

 

Response

Upon discovering the situation, our team moved swiftly to understand the root cause and develop mitigation and remediation measures for customers.

Ongoing response to the incident was handled as follows:

Before April 15th Disclosure:

  • Security Task Force: Management of discovery process; engaging with third-party forensic analysts and federal investigators.
  • Engineering Team: Core auditing of all systems; regeneration of all keys, secrets, and credentials; root cause analysis.
  • Support Team: Assisting with root cause analysis and system audit as necessary.


After April 15th Disclosure:

  • Security Task Force: Continued engagement with third-party forensic analysts and federal investigators; customer calls and technical support; management of process modifications.
  • Engineering Team: Process modification; development of supporting software for altered processes; technical changes to build pipelines as needed.
  • Support Team: Technical and customer support related to disclosure.

We moved as quickly as possible in our response efforts, while (a) coordinating with federal law enforcement and cybersecurity agencies and (b) investigating the situation thoroughly so that we could provide accurate, actionable information to our customers.

 

Recovery

To recover from the incident itself, the key taken by the attacker was revoked. Additionally, all production keys were audited and rotated. Furthermore, we updated all publicly accessible Codecov Docker images to use squashed and/or multistage builds. Previous Docker images were also overwritten with new squashed builds, and all previous versions were removed from Docker Hub.

With the help of our third-party forensic team, we also conducted a full investigation of our infrastructure, associated logs, and application logs to confirm that there were no other intrusions or inappropriate access to our systems.

 

Corrective Actions and Lessons Learned

Many remedial steps were taken as a result of the incident, and the following improvements were either made or are in the process of being made:

    • Full Key Rotation: All relevant keys have been rotated, with previous keys revoked.
    • Updates to Docker Image Deployment: All public images are now squashed and/or converted to multistage to prevent a future Docker Layer attack.
    • Launch of New Uploader and Deprecation of the Bash Uploader: Codecov released a new Uploader. The new Uploader will be shipped as a signed and SHASUM verifiable binary executable. This builds on existing plans to ship a new Uploader and deprecate the existing Bash Uploader. 
    • Active Monitoring of Relevant Google Cloud Storage Assets: Any changes to the Bash Uploader will now trigger alerts and notifications that escalate to the appropriate members of the Codecov team.
    • Signature and SHASUM Validation of the Bash Uploader: The Bash Uploader has always possessed a method of SHASUM validation; however, Codecov did not fully or properly document the processes required to do so. External documentation has been updated to better demonstrate how to perform SHASUM validation of the uploader. Future work will be to sign the Bash Uploader using a GPG key.
    • Key Generation, Usage, and Rotation Policy Changes: Codecov developed its own in-house tool for properly tracking key generation, as well as enforcing rotation policies (with a specific rotation policy to be determined), and key use auditing.
    • Enhance Incident Response Processes, Policies, and Procedures: Codecov is using this security event as a learning experience for enhancing our incident response process, policies, and procedures. Codecov is committed to quickly responding to indicators of compromise, investigating to determine what happened and why, and communicating that to our stakeholders in order to protect our customers.
    • Codecov Staffing Changes: Based on this event and its outcomes, we will be building a dedicated security team, starting with product/application security and infrastructure security.

 

Systemic Problems and Wider Areas of Concern

The Codecov team observed several points that we hope to share with the industry: 

  • Software Distribution and Signing: Curl pipe to bash, while incredibly convenient, is rife with security issues. Issues in software distribution and signing are among the most important lessons learned by the Codecov team. Securing software distribution in the supply chain is a difficult problem. Many potential solutions to this problem exist, with a recent entrant, Sigstore, showing particular promise. Codecov chose to address this problem by shipping a new Uploader, which will be shipped as a signed and SHASUM-verifiable binary executable.
  • Key Management: Challenges around key rotation, provenance, and use are ever-present in the software industry. While many tools exist to help with the secure distribution of keys and secrets (e.g., Google Secret Manager, HashiCorp Vault, etc.), few solutions exist to properly track all the metadata associated with a secret. Questions such as “When was this key generated?”, “Where is it used?”, and “How can it be revoked?” are still incredibly difficult to answer. Ultimately, as a result of this incident and the ensuing challenge of a full key rotation, Codecov chose to address this problem by building our own internal tool.
  • Docker Layer Attacks: Publicly distributed Docker images should be either squashed or multistage so that intermediate layers that contain sensitive information are excluded from the final build. Codecov chose to solve this problem by using multistage and squashed builds of our Self-Hosted offering.
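
As a rough illustration of the key metadata questions raised under Key Management above, the sketch below records when a key was generated, where it is used, and how it can be revoked. It does not describe Codecov's internal tool. The `KeyRecord` fields, the example values, and the 90-day default are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class KeyRecord:
    """Hypothetical metadata for one secret; the secret itself is stored elsewhere."""
    name: str
    created_on: date                              # When was this key generated?
    used_by: list = field(default_factory=list)   # Where is it used?
    revocation_steps: str = ""                    # How can it be revoked?
    max_age_days: int = 90                        # Assumed policy, not Codecov's

    def rotation_due(self, today):
        """True when the key is older than the assumed maximum age."""
        return today - self.created_on > timedelta(days=self.max_age_days)


def keys_needing_rotation(records, today):
    """Return the names of all keys past their assumed rotation deadline."""
    return [record.name for record in records if record.rotation_due(today)]


# Hypothetical inventory entries.
inventory = [
    KeyRecord("storage-hmac", date(2021, 1, 1), ["uploader-bucket"], "Delete the HMAC key in the cloud console"),
    KeyRecord("ci-deploy-token", date(2021, 4, 1), ["build-pipeline"], "Revoke the token in CI settings"),
]
overdue = keys_needing_rotation(inventory, date(2021, 4, 15))
```

With these example dates, only the key created on January 1 exceeds the assumed 90-day limit on April 15.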