Scan all workflow artifacts for leaked secrets


In response to:

Major GitHub repos leak access tokens putting code and clouds at risk
Build artifacts generated by GitHub Actions often contain access tokens that can be abused by attackers to push malicious code into projects or compromise cloud infrastructure.

I've created a quick PowerShell script that scans your GitHub org for leaked secrets using TruffleHog.

# Scan all your workflow runs for accidental leaks of secrets in artifacts
#
# Created by: Jesse Houwing - Xebia
# License: MIT
#
# ATTRIBUTION REQUIRED 
#
# Consider sponsoring the author: https://github.com/sponsors/jessehouwing/

# relies on trufflehog for secrets detection: https://github.com/trufflesecurity/trufflehog
# relies on github cli for auth: https://github.com/cli/cli
# relies on jq for combining paginated results: https://jqlang.github.io/jq/download/

function scan-artifacts {
    param(
        [Parameter(Mandatory = $true)]
        [string]$org,
        [string[]]$repos
    )

    begin {
        # No repos passed in (null or empty list): fall back to every repo in the org
        if (($null -eq $repos) -or ($repos.Count -eq 0)) {
            $repos = get-repos -org $org
        }
    }

    process {
        foreach ($repo in $repos)
        {
            $downloaded = $false
            # Query all workflow runs for the repo
            write-host "$org/$repo"
            $workflowRuns = (& gh api /repos/$org/$($repo)/actions/runs --paginate --jq '.workflow_runs[].id' 2>$null | & jq -s | ConvertFrom-Json)

            # Download all artifacts for each workflow run
            foreach ($workflowRun in $workflowRuns)
            {
                $tempfolder = "./runs/$org/$repo/$workflowRun"
                # Discard the output; $null avoids overwriting the automatic $_ variable
                $null = New-Item -Path . -Name $tempfolder -ItemType Directory -Force

                if (test-path "$tempfolder/*") {
                    $downloaded = $true
                    continue
                }
                
                $null = gh run download $workflowRun --dir "$tempfolder" --pattern "*" --repo "$org/$repo" 2>$null
                if ($LASTEXITCODE -ne 0) {
                    continue
                }
                $downloaded = $true
            }

            if ($downloaded)
            {
                docker run --rm -it -v "${PWD}:/pwd" trufflesecurity/trufflehog:latest filesystem "/pwd/runs/$org/$repo" --no-update --fail --json --exclude-paths=/pwd/scan-artifacts-ignore.txt
                if ($LASTEXITCODE -ne 0) {
                    write-host "Found secrets in artifacts for /$org/$repo" -ForegroundColor red
                }
            }
        }
    }
}


function get-repos 
{
    param(
        [Parameter(Mandatory = $true)]
        [string]$org
    )

    return (& gh api /orgs/$org/repos --paginate --jq '.[].name' 2>$null)
}

# scan-artifacts -org "your-org" [-repos @("your-repo")]

Save the script as scan-artifacts.ps1 and scan a whole org from a pwsh prompt using:

. .\scan-artifacts.ps1
scan-artifacts -org "your-org"

Or scan one or more specific repositories using:

. .\scan-artifacts.ps1
scan-artifacts -org "your-org" -repos @("repo1", "repo2")

I've excluded /node_modules as it generates too many false positives. To exclude files or folders, add a file called scan-artifacts-ignore.txt to the script's directory (the working directory you run it from) and add one regular expression per line for each thing you want to exclude:

/node_modules/
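
If you'd rather create the ignore file from the prompt, something like the following should work. It's a minimal sketch: only /node_modules/ comes from this post, the other two patterns are hypothetical examples you should replace with whatever is noisy in your own artifacts.

```powershell
# One regular expression per line. '/node_modules/' is the exclusion used in this post;
# the other two patterns are hypothetical examples, not recommendations.
$patterns = @(
    '/node_modules/',
    '/\.git/',
    '\.min\.js$'
)

# Write the file into the current directory, which the script mounts into the container as /pwd
Set-Content -Path ./scan-artifacts-ignore.txt -Value $patterns
```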

The script downloads all workflow artifacts to local disk. Depending on the size of your org, that may take up a considerable amount of disk space; in our case it was 50 GB. As is, it skips runs that were already downloaded, so you can safely run it again in case something goes wrong.
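
To keep an eye on how much space the downloads are taking, you could sum the file sizes under ./runs. This is a rough sketch; Get-FolderSizeBytes is a helper name I made up for illustration and is not part of the script above.

```powershell
# Hypothetical helper: total size in bytes of all files below a folder
function Get-FolderSizeBytes {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Path
    )

    # -Force includes hidden files; errors on unreadable entries are ignored
    $sum = (Get-ChildItem -Path $Path -Recurse -File -Force -ErrorAction SilentlyContinue |
        Measure-Object -Property Length -Sum).Sum

    # An empty folder yields no sum, so report 0 in that case
    if ($null -eq $sum) { $sum = 0 }
    return [long]$sum
}
```

For example, `'{0:N2} GB' -f ((Get-FolderSizeBytes -Path ./runs) / 1GB)` prints the current size of the download folder.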

If needed, free up disk space after each repo is scanned by deleting the downloaded artifacts in the ./runs directory.
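
One way to do that cleanup automatically is to scan one repo at a time and remove its folder right after the scan. The wrapper below is a sketch that assumes scan-artifacts.ps1 has already been dot-sourced; scan-and-clean is a name I picked for illustration.

```powershell
# Hypothetical wrapper around scan-artifacts and get-repos from the script above
function scan-and-clean {
    param(
        [Parameter(Mandatory = $true)]
        [string]$org,
        [string[]]$repos
    )

    # Fall back to every repo in the org when none are specified
    if (($null -eq $repos) -or ($repos.Count -eq 0)) {
        $repos = get-repos -org $org
    }

    foreach ($repo in $repos) {
        # Scan a single repo, then delete its downloaded artifacts
        scan-artifacts -org $org -repos @($repo)

        $repoFolder = "./runs/$org/$repo"
        if (Test-Path $repoFolder) {
            Remove-Item -Path $repoFolder -Recurse -Force
        }
    }
}
```

Keep in mind that this trades disk space for time: once a repo's folder is gone, a rerun will download its artifacts again instead of skipping them.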

To protect yourself:

  1. Rotate secrets found.
  2. Verify the tokens weren't used to tamper with artifacts or repos.
  3. Delete the workflow runs.
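
For the last step, the affected runs can be deleted through the GitHub REST API using the GitHub CLI. The function below is a minimal sketch, not part of the original script: remove-workflow-runs is a made-up name, and it assumes your gh login has sufficient permissions on the repository. Deleting runs cannot be undone, so double-check the IDs first.

```powershell
# Hypothetical helper: delete specific workflow runs from a repository
function remove-workflow-runs {
    param(
        [Parameter(Mandatory = $true)]
        [string]$org,
        [Parameter(Mandatory = $true)]
        [string]$repo,
        [Parameter(Mandatory = $true)]
        [long[]]$runIds
    )

    foreach ($runId in $runIds) {
        # REST endpoint: DELETE /repos/{owner}/{repo}/actions/runs/{run_id}
        & gh api --method DELETE "/repos/$org/$repo/actions/runs/$runId" 2>$null
        if ($LASTEXITCODE -ne 0) {
            write-host "Failed to delete run $runId in $org/$repo" -ForegroundColor red
        }
    }
}
```

Because the script stores each download in ./runs/your-org/your-repo/&lt;run id&gt;, the folder names can serve as input, for example: `remove-workflow-runs -org "your-org" -repo "your-repo" -runIds (Get-ChildItem ./runs/your-org/your-repo -Directory | ForEach-Object { [long]$_.Name })`.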

If this script helped you protect your data, consider sponsoring me:

Sponsor @jessehouwing on GitHub Sponsors