Squad Squad

KEDA Autoscaling

⚠️ Experimental — Squad is alpha software. APIs, commands, and behavior may change between releases.

Try this to understand your scaling needs:

How many issues are currently queued for Squad agents?

KEDA (Kubernetes Event-Driven Autoscaling) is an open-source component that scales Kubernetes workloads based on external event sources. Squad ships an external scaler template that scales agent pods up and down based on the depth of your GitHub issue queue.


When to Use This

Use KEDA autoscaling when:

  • Squad agents run as Kubernetes pods (not local machines)
  • Issue volume is unpredictable — bursts of work should spawn more agents automatically
  • You want zero-agent idle cost when there is no work

Prerequisites

  • A Kubernetes cluster with KEDA installed (keda.sh)
  • Squad agents packaged as container images and deployed as a Deployment
  • A GitHub token with repo scope for issue queue polling

Setup

  1. Install KEDA on your cluster:

    helm repo add kedacore https://kedacore.github.io/charts
    helm install keda kedacore/keda --namespace keda --create-namespace
  2. Apply the Squad KEDA ScaledObject template from templates/keda/scaled-object.yaml:

    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: squad-agents
    spec:
      scaleTargetRef:
        name: squad-agent-deployment
      minReplicaCount: 0
      maxReplicaCount: 10
      triggers:
        - type: external
          metadata:
            scalerAddress: squad-external-scaler:8080
            owner: your-org
            repo: your-repo
            labels: "squad:ready"
            targetQueueLength: "5"
          authenticationRef:
            name: github-token-secret
  3. Create the GitHub token secret:

    kubectl create secret generic github-token-secret \
      --from-literal=personalAccessToken=<your-token>

Configuration Reference

FieldDescription
minReplicaCountAgents to keep running when idle (use 0 for zero-cost idle)
maxReplicaCountHard ceiling on agent pods
targetQueueLengthIssues per agent pod (tune for task duration)
labelsIssue labels to count as “queued work”

See Also