Looking back on the Global DevOps Experience

It's been a few weeks since the Global DevOps Experience 2024 finished, but it's a lot longer since we all started back in March.

I only played a small part in this year's GDEX, as a contributor rather than a core organizer. Seeing how much time even that bit took up, I'm in awe of how much energy the core organizer team poured into the event. Kudos!

The elements I added were:

  • Challenge 2 - Setting up Merge policies, Rulesets, Code Owners and Environment Protections.
  • Debugging & Hotfixing the Octokit.net library
  • Enabling the wiki on all team repositories

Challenge 2 - Setting up Merge Policies, Rulesets, Code Owners and Environment Protections

Challenge 2 revolved around protecting the production environment of GloboTicket against unwanted access and had the participants explore the numerous protection features GitHub has to offer. I've personally set these up numerous times for my own repos, so preparing the challenge didn't require a lot of research.

It did touch upon a few pretty recent feature additions to GitHub, including the new Branch Rulesets, so it was a very useful lab for people unfamiliar with their power and complexity.

The challenge consists of 3 parts:

  1. The introduction, short hint and step-by-step instructions.
  2. The code to verify the policies had been correctly configured.
  3. The code to fix the settings in case the participants got stuck.

The lab

I had manually configured Branch Rulesets and Environment Protections for GitHub Actions before, so I was largely able to reuse this content when writing the instructions:

Protect the repository hosting your GitHub Action

Verifying the changes

To verify whether they had done the work correctly though...

Our core challenge infrastructure was built using .NET and thus relied heavily on Octokit.net to communicate with GitHub. Unfortunately, Octokit.net doesn't implement Rulesets, Code Owners or Environment Protection Rules at the moment, so quite quickly we had to resort to raw API calls and parsing JSON structures. I've written quite a few PowerShell scripts against the GitHub API in the past 2 years, but in those I've come to rely quite heavily on PowerShell's untyped nature and its almost "native" support for reading XML and JSON as if they were object structures.

It's at these kinds of moments that GitHub Copilot helps me reduce the time needed to figure all of these things out. I ended up using a nice prompt engineering trick, giving GitHub Copilot the raw JSON document in a comment before describing what I wanted:

var rawEnvironment = await _githubServiceJordan.GetEnvironmentRaw(repositoryName, "production");

/*
{
    "id": 3096686207,
    "node_id": "EN_kwDOK1mBYc64k65_",
    "name": "production",
    "url": "https://api.github.com/repos/xebia/enterprise-onboarding/environments/production",
    "html_url": "https://github.com/xebia/enterprise-onboarding/deployments/activity_log?environments_filter=production",
    "created_at": "2024-06-04T15:14:32Z",
    "updated_at": "2024-06-04T15:14:32Z",
    "can_admins_bypass": true,
    "protection_rules": [
    {
        "id": 19493562,
        "node_id": "GA_kwDOK1mBYc4BKXK6",
        "type": "required_reviewers",
        "prevent_self_review": true,
        "reviewers": [
        {
            "type": "Team",
            "reviewer": {
            "name": "everyone",
            "id": 9385908,
            "node_id": "T_kwDOAAW7ws4Ajze0",
            "slug": "everyone",
            "description": "",
            "privacy": "closed",
            "notification_setting": "notifications_enabled",
            "url": "https://api.github.com/organizations/375746/team/9385908",
            "html_url": "https://github.com/orgs/xebia/teams/everyone",
            "members_url": "https://api.github.com/organizations/375746/team/9385908/members{/member}",
            "repositories_url": "https://api.github.com/organizations/375746/team/9385908/repos",
            "permission": "pull",
            "parent": null
            }
        }
        ]
    },
    {
        "id": 19493566,
        "node_id": "GA_kwDOK1mBYc4BKXK-",
        "type": "branch_policy"
    }
    ],
    "deployment_branch_policy": {
    "protected_branches": true,
    "custom_branch_policies": false
    }
}
*/

// The rawEnvironment variable contains a JSON node object with the reply from the GitHub API
// The JSON structure is similarly structured as provided in the comment above. 
// Check the contents of the protected_branches and
// custom_branch_policies values to be the value in the above sample.
// ensure protected_branches is true and custom_branch_policies is false
// ensure the protection_rules contains a required_reviewers rule with prevent_self_review
// set to true and reviewers containing the team with the same name as the repository

This immediately spits out the correct logic to read the JSON document and to verify the required values. The next step was to write a few tests against that code, which was easily done using the /tests command.
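For readers who want to see the shape of such a check, here is a minimal sketch of the rules described in the prompt comments. It is written in Python rather than the .NET code the event used, and it is not the code Copilot generated. The function name `verify_environment`, the problem messages and the trimmed `SAMPLE` document are assumptions made for illustration.

```python
import json

# Trimmed copy of the sample response shown earlier; only the fields the check reads.
SAMPLE = json.loads("""
{
  "name": "production",
  "protection_rules": [
    {
      "type": "required_reviewers",
      "prevent_self_review": true,
      "reviewers": [
        {"type": "Team", "reviewer": {"name": "everyone", "slug": "everyone"}}
      ]
    },
    {"type": "branch_policy"}
  ],
  "deployment_branch_policy": {
    "protected_branches": true,
    "custom_branch_policies": false
  }
}
""")


def verify_environment(environment, repository_name):
    """Return a list of problems; an empty list means the check passed."""
    problems = []

    # Treat a missing or null deployment_branch_policy as "nothing configured".
    policy = environment.get("deployment_branch_policy") or {}
    if policy.get("protected_branches") is not True:
        problems.append("protected_branches must be true")
    if policy.get("custom_branch_policies") is not False:
        problems.append("custom_branch_policies must be false")

    reviewer_rules = [
        rule for rule in environment.get("protection_rules", [])
        if rule.get("type") == "required_reviewers"
    ]
    if not reviewer_rules:
        problems.append("no required_reviewers rule found")
    elif not any(rule.get("prevent_self_review") is True for rule in reviewer_rules):
        problems.append("prevent_self_review must be true")

    # The team that reviews must carry the same name as the repository.
    team_listed = any(
        reviewer.get("type") == "Team"
        and (reviewer.get("reviewer") or {}).get("name") == repository_name
        for rule in reviewer_rules
        for reviewer in rule.get("reviewers", [])
    )
    if reviewer_rules and not team_listed:
        problems.append(f"team '{repository_name}' is not a required reviewer")

    return problems
```

Calling `verify_environment(SAMPLE, "everyone")` returns an empty list, because the sample lists the team "everyone". Returning a list of problems instead of a single boolean makes it easier to tell a team what is still missing.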

I was able to use the same technique for the Branch Ruleset, which is similarly unsupported by Octokit.net at the moment.

For the GitHub Actions Workflow I relied on a simple YAML parser NuGet package and the same trick worked there as well. Brilliant!

I ended up writing the Code Owners parsing code by hand. I taught Regular Expressions for the first 5 years of my career while contributing to the SpamAssassin project, so that wasn't hard to do.
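To give an idea of what regex-based CODEOWNERS parsing can look like, here is a small sketch in Python (not the hand-written .NET code from the event). The patterns, the `parse_codeowners` name and the sample owners are assumptions for illustration, and the sketch deliberately ignores edge cases such as escaped spaces in path patterns.

```python
import re

# A rule line: a path pattern, whitespace, then everything up to an inline comment.
LINE = re.compile(r"^(?P<pattern>\S+)\s+(?P<owners>[^#]+)")

# An owner is @user, @org/team, or an email address.
OWNER = re.compile(r"@[\w.-]+(?:/[\w.-]+)?|[^\s@]+@[^\s@]+")


def parse_codeowners(text):
    """Return a list of (pattern, [owners]) tuples in file order."""
    rules = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        match = LINE.match(line)
        if match is None:
            continue  # a pattern without owners is ignored in this sketch
        owners = OWNER.findall(match.group("owners"))
        if owners:
            rules.append((match.group("pattern"), owners))
    return rules


# Hypothetical file contents; the team and user names are made up.
EXAMPLE = """\
# Everything is owned by the team
* @globaldevopsexperience/team-01
/docs/ @alice bob@example.com  # inline comment

/src/
"""
```

Running `parse_codeowners(EXAMPLE)` yields one rule for `*` and one for `/docs/`. The `/src/` line has no owners and is skipped here.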

The fix

That left me with the code to automatically fix the lab in case the team got stuck. For this lab this consisted of 2 things:

  1. The changes to the codebase, namely adding a CODEOWNERS file and patching the deployment workflows.
  2. Changing the repository configuration to apply the Branch Ruleset, Environment Protection and Merge policies.

The first part was simple: the basic infrastructure that executes the lab had built-in functionality to overwrite files in the git repos of the participants by providing them with a pull request.

The configuration changes, however, can't be made through a pull request, and they require repo-owner permissions that neither GitHub Codespaces nor GitHub Actions has access to.

So, I ended up providing additional scripts the participants could run in their codespace to perform these changes. They rely on a previously documented trick to temporarily elevate the permissions of the codespace:

# Stash the codespace's built-in tokens so gh falls back to an interactive login.
$env:OLD_GITHUB_TOKEN = $env:GITHUB_TOKEN
$env:GITHUB_TOKEN = $null
$env:OLD_GH_TOKEN = $env:GH_TOKEN
$env:GH_TOKEN = $null

try
{
    # Sign in through the browser and request the repo scope.
    gh auth login --web -h github.com -s repo -p https

    .\setup-branch-ruleset.ps1
    .\setup-production-environment.ps1
    .\setup-repo-settings.ps1
}
finally
{
    # Restore the original codespace tokens.
    $env:GITHUB_TOKEN = $env:OLD_GITHUB_TOKEN
    $env:GH_TOKEN = $env:OLD_GH_TOKEN
}

Accessing (private) GitHub resources from a Codespace

With the right permissions in place it was easy to leverage the GitHub CLI in combination with PowerShell to apply these changes.

I did find a few useful tricks:

# Import a branch ruleset through the API.
gh api --method POST /repos/$repo/rulesets --input $PSScriptRoot/setup-branch-ruleset.json

# Creating a new production Environment
gh api https://api.github.com/repos/$repo/environments/$targetEnvironment -X PUT

# Applying the requested Environment protections to the Environment
gh api --method PUT /repos/$repo/environments/$targetEnvironment `
    -F "prevent_self_review=true" `
    -f "reviewers[][type]=Team" `
    -F "reviewers[][id]=$teamId" `
    -F "deployment_branch_policy[protected_branches]=true" `
    -F "deployment_branch_policy[custom_branch_policies]=false"

Unfortunately, the challenge had one remaining bug: all of the code worked correctly, but the instructions had gone out of sync with it over time. It's one of those "all unit tests pass" situations, and luckily it was caught by the first few teams. We were able to patch it while the event was ongoing, and most teams in Europe, Africa and the USA probably didn't even know there was an issue to begin with. Phew!

Unit, Integration, and End-to-End Testing: What’s the Difference?

Debugging & hotfixing the Octokit.net library

The Global DevOps Experience relied quite heavily on what's called "IssueOps". Automated interactions with the participants happened through creating and commenting on issues. We relied on the Octokit.net library for these interactions and a week before the event, all communication suddenly halted.

It turns out that GitHub had finally reached the point where Issue Comment Ids overflowed a standard Int32, which required a patch in the SDK to change all of these to an Int64 (long). It took a while to figure out what was going on, though, because the code failed while deserializing the responses from GitHub.
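The overflow is easy to reproduce with the environment `id` from the sample response earlier in this post, which is already larger than the biggest value a signed 32-bit integer can hold. The sketch below uses Python's `struct` module to stand in for a fixed-width deserializer. It only illustrates the failure mode and is not how Octokit.net reads responses; the helper names are made up.

```python
import struct

INT32_MAX = 2**31 - 1  # 2147483647, the upper bound of a signed 32-bit integer


def fits_in_int32(value):
    """True when the value can be stored in a signed 32-bit integer."""
    return -(2**31) <= value <= INT32_MAX


def pack_id(value, fmt):
    """Pack an ID as a fixed-width signed integer; None when it does not fit."""
    try:
        return struct.pack(fmt, value)
    except struct.error:
        return None


environment_id = 3096686207  # "id" from the sample environment response
rule_id = 19493562           # "id" of the first protection rule in that sample

assert not fits_in_int32(environment_id)          # too large for 32 bits
assert fits_in_int32(rule_id)                     # older, smaller IDs still fit
assert pack_id(environment_id, "<i") is None      # 32-bit slot: rejected
assert pack_id(environment_id, "<q") is not None  # 64-bit slot: fine
```

This is also why the problem appears suddenly: everything keeps working until the first ID crosses the 32-bit boundary.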

Turns out we weren't alone:

[BUG]: Error deserializing issue comments · Issue #2927 · octokit/octokit.net
What happened? I’ve begun noticing errors deserializing the comment ID number when receiving issue webhooks. The value appears to be an int in the models but it seems GitHub has finally crossed the…
[BUG]: Query DeploymentEnvironments throws exception · Issue #2931 · octokit/octokit.net
What happened? Same issue as #2927, but for DeploymentEnvironments.

It also turned out that the issue was larger than originally thought:

[BREAKING CHANGES]: int to long Ids for PreReceiveHook, Deployment Environments, Repository, Org Team, Repo Invitations, Public Key, Project Cards, Organization Invitation, Migrations, GpgKey, Deployment, Authorizations, Accounts / Profiles, Codespace / Workspaces by nickfloyd · Pull Request #2941 · octokit/octokit.net
Relates to #2893 We have uncovered a series of issues around the IDs used across our API surface (Issues, Pull Requests, deployments, etc.. are all effected). The GitHub database schemas were chang…

Unfortunately, at the time we discovered the issue, the patch hadn't been released yet, so we ended up patching the library manually and temporarily publishing it to our GitHub Packages feed.

I'm glad this happened before the actual event; it would have been a disaster had it occurred a week later.

There's now an official fix available through Octokit.net v13.0.0:

Release v13.0.0 · octokit/octokit.net
What’s Changed Breaking changes [BREAKING CHANGES]: int to long Ids for PreReceiveHook, Deployment Environments, Repository, Org Team, Repo Invitations, Public Key, Project Cards, Organization Inv…

Enabling the wiki on all team repositories

The last hurdle we faced was part of provisioning. Each group of participants would be assigned a GitHub Repository and a GitHub Team, and we'd automatically push the content into the repo. Then, as the team started their first challenge, we'd push the guidance content into the GitHub Wiki associated with the repository. All of that is scriptable, except for one step: adding the first page to the wiki.

We ended up running the ugliest few lines of PowerShell I'd written in a long time to push the first pages using UI Automation:

$ErrorActionPreference = "Stop"

install-module "selenium"
import-module "selenium"

if ($Driver -eq $null){
    $Driver = Start-SeFirefox
    $Driver.Navigate().GoToUrl("https://github.com/login")
    write-host "Sign in in the browser and press any key to continue"
    $null = $host.UI.RawUI.ReadKey("NoEcho,IncludeKeyDown")
}

$Driver.Manage().Timeouts().ImplicitWait = [timespan]::FromMilliseconds(5000)
$Driver.Manage().Timeouts().PageLoad = [timespan]::FromMilliseconds(5000)
$Driver.Manage().Timeouts().AsynchronousJavaScript = [timespan]::FromMilliseconds(5000)

$repos = gh repo list globaldevopsexperience --limit 2000 --json nameWithOwner,repositoryTopics | ConvertFrom-Json
foreach ($repo in $repos | ?{ $_.nameWithOwner.StartsWith("globaldevopsexperience/global-") }){
    $nameWithOwner = $repo.nameWithOwner
    write-host $nameWithOwner

    if (-not ($repo.repositoryTopics.name -contains "wiki")){
        $Driver.Navigate().GoToUrl("https://github.com/$nameWithOwner/wiki/_new")
       
        $commentElement = $Driver.FindElementById("gollum-editor-message-field")
        if ($commentElement.GetAttribute("value") -eq "Initial Home page"){
            # find the button on the screen with the text "Save Page"
            $button = $Driver.FindElementByXPath("//button[contains(text(),'Save page')]")
            $button.Click()
        }
        gh repo edit $nameWithOwner --add-topic "wiki"
    }
}

What's the saying again? If it's ugly but it works, it's beautiful.

Wrapping up

Running an event across the globe with thousands of participants requires a different mindset than setting up the average "enterprise solution", where users are onboarded one at a time. Still, the lessons from these events apply to the enterprise situation and allow us to work more effectively when setting up a new GitHub Enterprise or when migrating people to GitHub.

Not only was it useful, it was a lot of fun too.