Concurrency Model

gh-infra parallelizes API calls across repositories to minimize execution time while maintaining sequential ordering where correctness requires it. This page documents the concurrency design in detail. The design rests on four principles:

  1. Repositories are independent — operations on different repos never conflict, so they can run in parallel
  2. Settings within a repo are ordered — creating a repo must precede setting its description, branch protection, etc.
  3. Output order is deterministic — even though work runs in parallel, plan/apply results are always printed in a consistent order
  4. Concurrency is bounded — a worker pool limits parallelism to 10 to avoid GitHub API rate limits

Every plan and apply execution follows the same phases:

flowchart LR
    A[Parse YAML] --> B[Fetch current state]
    B --> C[Diff]
    C --> D[Display plan]
    D --> E[Apply]

    style B fill:#4dabf7,stroke:#339af0,color:#fff
    style E fill:#4dabf7,stroke:#339af0,color:#fff
    style C fill:#868e96,stroke:#495057,color:#fff
    style A fill:#868e96,stroke:#495057,color:#fff
    style D fill:#868e96,stroke:#495057,color:#fff

Blue = parallelized (per repo), Gray = sequential.

This phase fetches the current state of all repositories from the GitHub API in parallel.

flowchart TB
    subgraph pool["Worker Pool (max 10 concurrent)"]
        direction TB
        subgraph g1["worker: repo-A"]
            direction LR
            a1[gh repo view] --> a2[branches]
            a1 --> a3[rulesets]
            a1 --> a4[secrets]
            a1 --> a5[variables]
        end
        subgraph g2["worker: repo-B"]
            direction LR
            b1[gh repo view] --> b2[branches]
            b1 --> b3[rulesets]
            b1 --> b4[secrets]
            b1 --> b5[variables]
        end
    end

    pool --> results["Collect results by index"]

Implementation: internal/repository/orchestrate.go via parallel.Map

  • parallel.Map spawns a fixed pool of 10 worker goroutines that pull jobs from a channel
  • Each worker calls Fetcher.FetchRepository(), which internally uses errgroup to parallelize sub-fetches (branch protection, rulesets, secrets, variables, actions, commit message settings, release immutability, security endpoints)
  • Within fetchBranchProtection and fetchRulesets, per-branch and per-ruleset detail fetches are themselves parallelized with parallel.Map (capped at DefaultConcurrency)
  • Spinner display via ui.RunRefresh shows per-repo progress (✓/✗)
  • Results are written to a pre-allocated slice by index — no mutex needed for the result array itself
  • Errors are non-fatal: failed repos are skipped and reported after all fetches complete
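The contract described above can be illustrated by a minimal reimplementation of a parallel.Map-style bounded worker pool that writes results by index. This is a sketch of the pattern, not the actual gh-infra source:

```go
package main

import (
	"fmt"
	"sync"
)

// Map runs fn over items with at most `workers` goroutines.
// Each result is written into its own slot of a pre-allocated
// slice, so output order matches input order and no mutex is
// needed for the result array itself.
func Map[T, R any](workers int, items []T, fn func(T) R) []R {
	results := make([]R, len(items))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				results[i] = fn(items[i])
			}
		}()
	}
	for i := range items {
		jobs <- i
	}
	close(jobs)
	wg.Wait() // block until every item is processed
	return results
}

func main() {
	repos := []string{"org/repo-a", "org/repo-b", "org/repo-c"}
	out := Map(10, repos, func(r string) string { return "fetched " + r })
	fmt.Println(out) // prints: [fetched org/repo-a fetched org/repo-b fetched org/repo-c]
}
```

Because each goroutine writes only to its own index, the blocking Wait() is the single synchronization point before results are consumed.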

The fileset plan phase fetches the current file content for each (fileset, repository) pair.

flowchart TB
    subgraph units["Goroutines (one per plan unit)"]
        direction TB
        subgraph u0["fileset=ci, repo=org/repo-a"]
            f1[fetch ci.yml] --> d1[diff]
            f2[fetch lint.yml] --> d1
        end
        subgraph u1["fileset=ci, repo=org/repo-b"]
            f3[fetch ci.yml] --> d2[diff]
        end
    end
    units --> collect["Collect in order"]

Implementation: internal/fileset/fileset.go Plan()

  • One goroutine per (fileset × target repo) pair
  • Spinner display via ui.RunRefresh per target repo
  • Results collected in order-preserving indexed slice

Diffing is purely sequential and CPU-bound — no API calls. Each repo’s desired state is compared against its fetched current state to produce a list of Change entries.

The diff runs immediately after each repo fetch completes (within the same goroutine), before the fetch goroutine signals done.
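A minimal sketch of that fetch-then-diff shape, with illustrative names rather than the real gh-infra types:

```go
package main

import (
	"fmt"
	"sync"
)

// Change is an illustrative stand-in for gh-infra's change entry.
type Change struct{ Field, From, To string }

// fetchAndDiff pairs the API-bound fetch with the CPU-bound diff in
// one call, so the caller's result slot already holds finished
// Changes by the time the goroutine signals done.
func fetchAndDiff(desired map[string]string, fetch func() map[string]string) []Change {
	current := fetch() // network-bound in the real tool
	var changes []Change
	for k, want := range desired {
		if got := current[k]; got != want {
			changes = append(changes, Change{k, got, want})
		}
	}
	return changes
}

func main() {
	results := make([][]Change, 1)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		results[0] = fetchAndDiff(
			map[string]string{"visibility": "private"},
			func() map[string]string { return map[string]string{"visibility": "public"} },
		)
	}()
	wg.Wait()
	fmt.Println(results[0]) // prints: [{visibility public private}]
}
```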

flowchart TB
    subgraph pool["Worker Pool (max 10 concurrent)"]
        direction TB
        subgraph ga["worker: repo-A"]
            direction TB
            a1[create repo] --> a2[set description]
            a2 --> a3[set visibility]
            a3 --> a4[set topics]
            a4 --> a5[create branch protection]
            a5 --> a6[create ruleset]
        end
        subgraph gb["worker: repo-B"]
            direction TB
            b1[update description] --> b2[update merge_strategy]
            b2 --> b3[update branch protection]
        end
    end

    pool --> flat["Flatten results in order"]

Implementation: internal/repository/apply.go Apply() via parallel.Map

  • Changes are grouped by repo name using groupByName()
  • Repo groups run in parallel — bounded by worker pool (max 10)
  • Changes within a repo run sequentially — this is critical because:
    • create repo must complete before any settings can be applied
    • Branch protection requires the branch to exist
    • Rulesets reference conditions that assume repo state
  • Spinner display shows per-repo progress
  • Results are collected in a pre-allocated [][]ApplyResult by group index, then flattened in order
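The ordering guarantee above can be sketched as follows, with hypothetical names (the real logic lives in internal/repository/apply.go):

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Change is an illustrative stand-in for an apply-phase change.
type Change struct {
	Repo   string
	Action string
}

// groupByRepo buckets changes by repository name, preserving the
// order of changes within each repo.
func groupByRepo(changes []Change) map[string][]Change {
	groups := make(map[string][]Change)
	for _, c := range changes {
		groups[c.Repo] = append(groups[c.Repo], c)
	}
	return groups
}

func main() {
	changes := []Change{
		{"repo-a", "create repo"},
		{"repo-a", "set description"}, // must run after "create repo"
		{"repo-b", "update description"},
	}
	groups := groupByRepo(changes)

	applied := make(map[string][]string)
	var mu sync.Mutex
	var wg sync.WaitGroup
	for repo, cs := range groups {
		wg.Add(1)
		go func(repo string, cs []Change) { // repos apply in parallel
			defer wg.Done()
			var done []string
			for _, c := range cs { // changes within a repo stay sequential
				done = append(done, c.Action)
			}
			mu.Lock()
			applied[repo] = done
			mu.Unlock()
		}(repo, cs)
	}
	wg.Wait()

	repos := make([]string, 0, len(applied))
	for r := range applied {
		repos = append(repos, r)
	}
	sort.Strings(repos)
	for _, r := range repos {
		fmt.Println(r, applied[r])
	}
}
```

Parallelism crosses the group boundary only; inside a group the plain for loop is what enforces "create before configure".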

flowchart TB
    subgraph pool["Worker Pool (max 10 concurrent)"]
        direction TB
        subgraph ga["worker: repo-A (direct push)"]
            direction TB
            a1[get HEAD SHA] --> a2[createCommitOnBranch]
        end
        subgraph gb["worker: repo-B (pull request)"]
            direction TB
            b1[get HEAD SHA] --> b2[create branch] --> b3[createCommitOnBranch] --> b4[open pull request]
        end
    end
    pool --> flat["Flatten results in order"]

Implementation: internal/fileset/fileset.go Apply() via parallel.Map

  • Changes are grouped by target repo using groupChangesByTarget()
  • Repos run in parallel — bounded by worker pool (max 10)
  • Within each repo, all operations are sequential — the GraphQL createCommitOnBranch mutation requires:
    1. Get HEAD SHA (base commit)
    2. Send mutation with all file additions/deletions (direct push), or create branch first then send mutation and open PR (pull request)
  • A single verified commit bundles all file changes for one repo
  • Spinner display shows per-repo progress
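For reference, the createCommitOnBranch mutation takes roughly this shape in GitHub's GraphQL API (field names follow GitHub's public schema; the variable values here are illustrative, and the exact call gh-infra builds may differ):

```graphql
# Variables (illustrative):
# {
#   "input": {
#     "branch": {
#       "repositoryNameWithOwner": "org/repo-b",
#       "branchName": "update-ci"
#     },
#     "expectedHeadOid": "<HEAD SHA fetched in step 1>",
#     "message": { "headline": "chore: sync fileset ci" },
#     "fileChanges": {
#       "additions": [{ "path": ".github/workflows/ci.yml", "contents": "<base64>" }],
#       "deletions": [{ "path": ".github/workflows/old.yml" }]
#     }
#   }
# }
mutation ($input: CreateCommitOnBranchInput!) {
  createCommitOnBranch(input: $input) {
    commit { oid url }
  }
}
```

expectedHeadOid is why step 1 (get HEAD SHA) must complete first: GitHub rejects the mutation if the branch head has moved, which also guards against concurrent writers.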

flowchart TB
    subgraph pool["Worker Pool (max 10 concurrent)"]
        direction TB
        subgraph ga["worker: repo-A"]
            a1[FetchRepository] --> a2[ToManifest + Marshal]
        end
        subgraph gb["worker: repo-B"]
            b1[FetchRepository] --> b2[ToManifest + Marshal]
        end
    end
    pool --> ordered["Output YAML to stdout in argument order"]

Implementation: cmd/import.go via parallel.Map

  • Each repo is fetched and marshaled to YAML in parallel
  • Results stored in indexed slice, output in order after all goroutines complete
  • Ensures piped output (gh infra import a/b c/d > repos.yaml) is deterministic

| Point | Mechanism | Why |
|---|---|---|
| Bounded parallelism | parallel.Map with worker pool (10 goroutines) | Avoid GitHub API rate limits (5,000 req/hr) |
| Wait for all fetches | parallel.Map blocks until all workers finish | Plan cannot proceed until all repos are fetched |
| Sub-fetch parallelism | errgroup.Group | Branch protection, rulesets, secrets, variables fetched concurrently within a single repo |
| Result ordering | Pre-allocated []T by index | Goroutines write to their own slot; no mutex needed |
| Spinner display | bubbletea.Program + p.Send() | Thread-safe message passing from goroutines to TUI model |
| Spinner → plan output | tracker.Wait() | Blocks until all spinners complete before printing plan |
  • Fetch errors are non-fatal: a failed repo is skipped and reported; other repos continue
  • Apply errors are per-repo: if repo-A fails, repo-B still applies; errors are collected and reported at the end
  • Import errors are per-repo: failed repos are listed with ⚠ warnings; successful repos are still output

GitHub enforces two types of rate limits for authenticated requests:

| Limit | Threshold | Scope |
|---|---|---|
| Primary | 5,000 requests / hour | Per authentication token |
| Secondary | Undisclosed | Short-burst abuse detection per token |

The worker pool size of 10 is primarily designed to avoid triggering the secondary rate limit (sudden bursts of concurrent requests). The primary limit is a hard ceiling on total requests regardless of concurrency.

During plan, each repository triggers the following API calls:

| API Call | Count | Notes |
|---|---|---|
| gh repo view (GraphQL) | 1 | General settings, topics, merge strategy |
| GET /repos/{owner}/{repo} | 1 | Commit message title/body settings |
| GET /repos/{owner}/{repo}/immutable-releases | 1 | Release immutability |
| GET /repos/{owner}/{repo}/vulnerability-alerts | 1 | Dependabot vulnerability alerts |
| GET /repos/{owner}/{repo}/automated-security-fixes | 1 | Dependabot security updates |
| GET /repos/{owner}/{repo}/private-vulnerability-reporting | 1 | Private vulnerability reporting |
| GET /repos/{owner}/{repo}/branches | 1 | List protected branches |
| GET /repos/{owner}/{repo}/branches/{branch}/protection | N | One per protected branch (fetched in parallel) |
| GET /repos/{owner}/{repo}/rulesets | 1 | List rulesets (paginated) |
| GET /repos/{owner}/{repo}/rulesets/{id} | M | One per ruleset (fetched in parallel) |
| GET /repos/{owner}/{repo}/labels | 1 | List labels (paginated) |
| GET /repos/{owner}/{repo}/milestones | 1 | List milestones (paginated) |
| gh secret list | 1 | List repository secrets |
| gh variable list | 1 | List repository variables |
| GET /repos/{owner}/{repo}/actions/permissions | 1 | Actions enabled/disabled |
| GET /repos/{owner}/{repo}/actions/permissions/workflow | 1 | Default workflow permissions |
| GET /repos/{owner}/{repo}/actions/permissions/selected-actions | 0–1 | Only if allowed_actions = "selected" |
| GET /repos/{owner}/{repo}/actions/permissions/fork-pr-contributor-approval | 1 | Fork PR approval setting |
| GET /repos/{owner}/{repo}/contents/{path} | F | One per file (FileSet) |

Fixed cost per repo: ~16 calls. Variable cost: N (protected branches) + M (rulesets) + F (files).

Assuming 1 protected branch, 1 ruleset per repo:

| Files per repo | Calls per repo | Repos for 5,000 budget |
|---|---|---|
| 1 | ~19 | ~263 |
| 5 | ~23 | ~217 |
| 10 | ~28 | ~178 |
| 20 | ~38 | ~131 |
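These rows follow directly from calls ≈ 16 + N + M + F and repos ≈ 5,000 / calls. A quick sketch of the arithmetic:

```go
package main

import "fmt"

// callsPerRepo estimates plan-phase API calls for one repository:
// ~16 fixed calls plus one per protected branch, ruleset, and file.
func callsPerRepo(branches, rulesets, files int) int {
	const fixed = 16
	return fixed + branches + rulesets + files
}

func main() {
	// Reproduce the table: 1 protected branch, 1 ruleset per repo.
	for _, files := range []int{1, 5, 10, 20} {
		calls := callsPerRepo(1, 1, files)
		fmt.Printf("files=%d calls=%d repos≈%d\n", files, calls, 5000/calls)
	}
}
```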

If you manage hundreds of repositories and approach the 5,000/hr limit:

  • Use a dedicated PAT or GitHub App token — each token has an independent 5,000/hr quota
  • Split runs across multiple manifest files — run plan on subsets to spread load over time
  • Reduce file count per fileset — fewer files means fewer contents API calls

| Operation | Reason |
|---|---|
| YAML parsing | CPU-bound, fast, no benefit from parallelism |
| Diff computation | CPU-bound, runs within the fetch goroutine |
| Plan output rendering | Must be sequential for readable terminal output |
| Confirm prompt | Blocks on user input |
| Settings within a repo | API ordering dependencies (create → configure) |
| GraphQL commit within a repo | Mutation requires HEAD SHA; for PRs the branch must be created first |