JD.AI Kubernetes Operator — v2 Design
Status: Planned (v2 scope — not required for initial Helm-based deployment)
Overview
The Kubernetes operator extends the JD.AI deployment model to support JdAiAgent custom resources. Agents become first-class Kubernetes objects — versioned, policy-enforced, and lifecycle-managed by the cluster.
This document captures the design intent for the operator so the API surface can be validated before implementation begins.
Custom Resource: JdAiAgent
apiVersion: jdai.io/v1
kind: JdAiAgent
metadata:
  name: code-reviewer
  namespace: jdai
spec:
  # References a versioned agent definition from AgentDefinitionRegistry
  agentDefinition: code-reviewer@1.2.0
  # Number of concurrent agent replicas
  replicas: 2
  # Resource requests/limits per replica
  resources:
    requests:
      memory: "512Mi"
      cpu: "250m"
    limits:
      memory: "1Gi"
      cpu: "500m"
  # Tool loadout (referenced by name from ConnectorRegistry)
  loadout: jira-readonly
  # Policy override (references a JdAiPolicy resource)
  policyRef:
    name: strict-code-review
  # Telemetry configuration
  telemetry:
    otlpEndpoint: http://otel-collector:4317
status:
  ready: true
  availableReplicas: 2
  conditions:
    - type: Ready
      status: "True"
      lastTransitionTime: "2025-07-01T00:00:00Z"
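As a rough illustration of how a controller might consume two of these fields, the Go sketch below splits the agentDefinition reference into name and version and derives a readiness flag from replica counts. The names AgentRef, ParseAgentRef, and IsReady are hypothetical, and both the strict name@version format and the readiness rule are assumptions inferred from the example above, not a confirmed part of the design.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// AgentRef is a hypothetical parsed form of spec.agentDefinition.
type AgentRef struct {
	Name    string
	Version string
}

// ParseAgentRef splits "name@version" at the first "@".
// Requiring an explicit version is an assumption based on the
// "version pinned" wording; the real registry may accept other forms.
func ParseAgentRef(s string) (AgentRef, error) {
	name, version, found := strings.Cut(s, "@")
	if !found || name == "" || version == "" {
		return AgentRef{}, errors.New("agentDefinition must look like name@version")
	}
	return AgentRef{Name: name, Version: version}, nil
}

// IsReady reports whether every desired replica is available.
// Assumption: this is how status.ready relates to status.availableReplicas.
func IsReady(desired, available int) bool {
	return available >= desired
}

func main() {
	ref, err := ParseAgentRef("code-reviewer@1.2.0")
	if err != nil {
		panic(err)
	}
	fmt.Println(ref.Name, ref.Version, IsReady(2, 2)) // prints "code-reviewer 1.2.0 true"
}
```

Rejecting unpinned references at parse time would surface a malformed spec before any Deployment is rendered, which may be preferable to failing later in the reconcile.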
Custom Resource: JdAiPolicy
apiVersion: jdai.io/v1
kind: JdAiPolicy
metadata:
  name: strict-code-review
  namespace: jdai
spec:
  rules:
    - id: no-external-calls
      description: Block outbound HTTP to non-allowlisted hosts
      enforce: true
    - id: no-secret-output
      description: Redact secrets from all agent outputs
      enforce: true
  auditRetentionDays: 90
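The policy spec maps naturally onto plain Go structs. The sketch below is one possible shape, using only the standard library: PolicyRule, PolicySpec, DecodePolicySpec, and EnforcedRuleIDs are hypothetical names, and the JSON input stands in for the form the Kubernetes API would serve. A real operator would more likely generate these types from the CRD.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PolicyRule and PolicySpec are hypothetical Go mirrors of the JdAiPolicy spec.
type PolicyRule struct {
	ID          string `json:"id"`
	Description string `json:"description"`
	Enforce     bool   `json:"enforce"`
}

type PolicySpec struct {
	Rules              []PolicyRule `json:"rules"`
	AuditRetentionDays int          `json:"auditRetentionDays"`
}

// DecodePolicySpec parses the JSON form of a policy spec.
func DecodePolicySpec(data []byte) (PolicySpec, error) {
	var spec PolicySpec
	err := json.Unmarshal(data, &spec)
	return spec, err
}

// EnforcedRuleIDs returns the IDs of rules with enforce set to true,
// in declaration order.
func EnforcedRuleIDs(spec PolicySpec) []string {
	ids := make([]string, 0, len(spec.Rules))
	for _, r := range spec.Rules {
		if r.Enforce {
			ids = append(ids, r.ID)
		}
	}
	return ids
}

func main() {
	raw := []byte(`{
		"rules": [
			{"id": "no-external-calls", "description": "Block outbound HTTP to non-allowlisted hosts", "enforce": true},
			{"id": "no-secret-output", "description": "Redact secrets from all agent outputs", "enforce": true}
		],
		"auditRetentionDays": 90
	}`)
	spec, err := DecodePolicySpec(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(EnforcedRuleIDs(spec), spec.AuditRetentionDays)
}
```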
Operator Architecture
┌──────────────────────────────────────────────────────────────────┐
│  JD.AI Operator (controller-runtime)                             │
│                                                                  │
│  ┌───────────────────────┐  ┌─────────────────────────────────┐  │
│  │ JdAiAgentReconciler   │  │ JdAiPolicyReconciler            │  │
│  │ - Watch JdAiAgent     │  │ - Watch JdAiPolicy              │  │
│  │ - Create Deployment   │  │ - Sync to PolicyRegistry        │  │
│  │ - Create Service      │  │ - Emit audit events             │  │
│  │ - Manage HPA          │  │                                 │  │
│  └───────────────────────┘  └─────────────────────────────────┘  │
│                                                                  │
│  Watches: Deployments, Services, HPAs (owned resources)          │
└──────────────────────────────────────────────────────────────────┘
The operator is planned as a Go implementation built on the controller-runtime library. A .NET implementation using KubeOps is the alternative under consideration.
Reconciliation Loop
For each JdAiAgent:
- Resolve agentDefinition from AgentDefinitionRegistry (version pinned)
- Render agent configuration (tools, policies, loadout)
- Create or update a Deployment with the resolved configuration
- Attach a Service if the agent exposes an HTTP interface
- Attach an HPA if replicas is a range
- Emit an AuditEvent for any configuration change
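The conditional steps in this loop can be sketched as a pure planning function that decides which owned resources to touch, kept separate from the code that talks to the cluster. Everything below is illustrative: AgentInput and PlanActions are hypothetical names, and because the design does not yet say how a replica range is expressed, the sketch assumes a MinReplicas/MaxReplicas pair that is equal when replicas is a fixed number.

```go
package main

import "fmt"

// AgentInput is a hypothetical, simplified view of a JdAiAgent used for planning.
type AgentInput struct {
	Name        string
	MinReplicas int  // assumed equal to MaxReplicas when replicas is a fixed number
	MaxReplicas int  // assumed greater than MinReplicas when replicas is a range
	ExposesHTTP bool // assumed to come from the resolved agent definition
	SpecChanged bool // assumed true when the spec differs from the last applied one
}

// PlanActions lists, in order, the actions the reconciler would take.
func PlanActions(a AgentInput) []string {
	// A Deployment is always created or updated.
	actions := []string{"apply Deployment/" + a.Name}
	// A Service is attached only for agents with an HTTP interface.
	if a.ExposesHTTP {
		actions = append(actions, "apply Service/"+a.Name)
	}
	// An HPA is attached only when replicas is a range.
	if a.MaxReplicas > a.MinReplicas {
		actions = append(actions, "apply HPA/"+a.Name)
	}
	// Configuration changes are recorded as audit events.
	if a.SpecChanged {
		actions = append(actions, "emit AuditEvent/"+a.Name)
	}
	return actions
}

func main() {
	agent := AgentInput{
		Name:        "code-reviewer",
		MinReplicas: 2,
		MaxReplicas: 2,
		ExposesHTTP: true,
		SpecChanged: true,
	}
	for _, step := range PlanActions(agent) {
		fmt.Println(step)
	}
}
```

With the fixed replica count from the example resource, this plan contains the Deployment, Service, and AuditEvent steps but no HPA. Keeping the decision logic free of API calls would let it be unit tested without a cluster.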
Implementation Phases
| Phase | Scope |
|---|---|
| v1 | Helm chart + manual Deployment management (current) |
| v2a | CRD definitions + basic reconciler (create Deployment from JdAiAgent) |
| v2b | Policy CRD + reconciler + audit integration |
| v2c | HPA management, status conditions, leader election |
| v3 | Multi-cluster federation, GitOps integration |
References
- controller-runtime — standard Go operator framework
- KubeOps — .NET operator SDK
- Operator Pattern — Kubernetes documentation
- JD.AI Architecture — Kubernetes Integration