The Agent CRD is the core building block of Flokoa. It represents a fully deployable AI agent inside your Kubernetes cluster, combining container runtime configuration with LLM model references, tool bindings, and A2A (Agent-to-Agent) protocol metadata. When you apply an Agent manifest, the Flokoa operator reconciles it into a Kubernetes Deployment, creates a Service, and continuously reports the agent’s lifecycle state back through the resource’s status fields.

API reference

apiVersion: agent.flokoa.ai/v1alpha1
kind: Agent

Minimal configuration

Every Agent requires a runtime block with a type and at least one container definition. The example below is the smallest valid manifest you can apply:
apiVersion: agent.flokoa.ai/v1alpha1
kind: Agent
metadata:
  name: minimal-agent
spec:
  card:
    name: "Minimal Agent"
    description: "A basic agent"
    version: "1.0.0"
    defaultInputModes:
      - "application/json"
    defaultOutputModes:
      - "application/json"
    capabilities:
      streaming: false
    skills: []
  runtime:
    type: standard
    spec:
      container:
        name: agent
        image: ghcr.io/example/agent:latest
        ports:
          - containerPort: 8080
            name: http

Runtime modes

Flokoa supports two runtime modes. Use standard when you are bringing your own agent image, and template when you want the operator to manage the runtime for you.
In standard mode you provide your own container image. The operator wraps it in a Deployment and Service, but all application logic lives in your image.
spec:
  runtime:
    type: standard
    spec:
      replicas: 2
      container:
        name: agent
        image: ghcr.io/example/my-agent:v1.2.0
        ports:
          - containerPort: 8080
            name: http
        env:
          - name: LOG_LEVEL
            value: "info"
        resources:
          requests:
            cpu: "200m"
            memory: "256Mi"
          limits:
            cpu: "1000m"
            memory: "1Gi"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
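The template mode mentioned above has no example in this section. A minimal sketch might look like the following, assuming the template-mode fields listed in the runtime reference (spec.env, spec.resources) take the usual Kubernetes shapes, and reusing the model and instruction blocks described under Spec reference. The spec.config field is left out because its schema is not shown here, and the Model name is illustrative.

```yaml
# Sketch only: the field shapes under runtime.spec are assumed, not confirmed.
spec:
  model:
    name: gpt-4o-model            # an existing Model CR (illustrative name)
  instruction:
    template: "You are a helpful assistant that answers questions concisely."
  runtime:
    type: template                # operator-managed runtime
    spec:
      replicas: 1
      env:                        # extra environment variables (template mode)
        - name: LOG_LEVEL
          value: "info"
      resources:                  # CPU/memory requests and limits (template mode)
        requests:
          cpu: "200m"
          memory: "256Mi"
        limits:
          cpu: "1000m"
          memory: "1Gi"
```

No container image appears in this fragment because, in template mode, the operator is expected to supply the runtime.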

Spec reference

The card block exposes your agent via the A2A (Agent-to-Agent) protocol and is the primary way other agents and orchestrators discover your agent’s capabilities.
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | | Human-readable agent name |
| description | string | | What the agent does |
| version | string | | Semantic version of the agent |
| defaultInputModes | string[] | | Accepted input MIME types (e.g. application/json) |
| defaultOutputModes | string[] | | Produced output MIME types |
| capabilities.streaming | bool | | Whether the agent supports streaming responses |
| capabilities.pushNotifications | bool | | Whether the agent supports push notifications |
| capabilities.stateTransitionHistory | bool | | Whether the agent exposes task state history |
| skills | object[] | | List of skills (see below) |
Each skill in skills has:
| Field | Description |
| --- | --- |
| id | Unique skill identifier |
| name | Human-readable skill name |
| description | What the skill does |
| tags | Categorization keywords |
| examples | Sample prompts demonstrating the skill |
| inputModes | Per-skill input MIME override |
| outputModes | Per-skill output MIME override |
spec:
  card:
    name: "Customer Support Agent"
    description: "Handles customer inquiries and support requests"
    version: "1.0.0"
    defaultInputModes:
      - "application/json"
    defaultOutputModes:
      - "application/json"
    capabilities:
      streaming: true
      pushNotifications: false
      stateTransitionHistory: true
    skills:
      - id: "answer-questions"
        name: "Answer Customer Questions"
        description: "Provide answers to common customer questions"
        tags:
          - "support"
          - "faq"
        examples:
          - "What are your business hours?"
          - "How do I reset my password?"
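The inputModes and outputModes fields from the skill table are not used in the example above. The fragment below sketches how a per-skill override might look, assuming both fields take a list of MIME types in the same way as defaultInputModes. The skill itself is hypothetical, and the other card fields are omitted for brevity.

```yaml
# Sketch only: assumes inputModes/outputModes are lists of MIME types.
spec:
  card:
    defaultInputModes:
      - "application/json"
    defaultOutputModes:
      - "application/json"
    skills:
      - id: "summarize-ticket"          # hypothetical skill
        name: "Summarize Ticket"
        description: "Summarize a support ticket as plain text"
        tags:
          - "support"
        inputModes:                     # overrides defaultInputModes for this skill
          - "application/json"
        outputModes:                    # overrides defaultOutputModes for this skill
          - "text/plain"
```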
The runtime block controls how your agent is deployed.
| Field | Description |
| --- | --- |
| type | standard (your image) or template (operator-managed) |
| spec.replicas | Number of pod replicas (default: 1) |
| spec.container | Full Kubernetes container spec (standard mode) |
| spec.volumes | Pod volumes (standard mode) |
| spec.serviceAccountName | ServiceAccount for the pod |
| spec.nodeSelector | Schedule pods on matching nodes |
| spec.tolerations | Allow scheduling on tainted nodes |
| spec.affinity | Advanced scheduling rules |
| spec.imagePullSecrets | Secrets for private registries |
| spec.securityContext | Pod-level security attributes |
| spec.config | Agent config with output schema (template mode) |
| spec.env | Extra environment variables (template mode) |
| spec.resources | CPU/memory requests and limits (template mode) |
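As an illustration of the pod-level fields in the table, the fragment below combines several of them in standard mode. It assumes each field accepts the same shape as its Kubernetes PodSpec counterpart. The ServiceAccount, registry Secret, and taint names are placeholders.

```yaml
# Sketch only: pod-level fields are assumed to mirror the Kubernetes PodSpec.
spec:
  runtime:
    type: standard
    spec:
      replicas: 2
      serviceAccountName: agent-runner        # placeholder ServiceAccount
      imagePullSecrets:
        - name: ghcr-credentials              # placeholder registry Secret
      nodeSelector:
        kubernetes.io/arch: amd64             # well-known node label
      tolerations:
        - key: "dedicated"                    # placeholder taint key
          operator: "Equal"
          value: "agents"
          effect: "NoSchedule"
      securityContext:                        # pod-level security attributes
        runAsNonRoot: true
        fsGroup: 2000
      container:
        name: agent
        image: ghcr.io/example/my-agent:v1.2.0
        ports:
          - containerPort: 8080
            name: http
```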
Attach a Model CR so your agent has access to an LLM at runtime. The operator injects the necessary connection details as environment variables.
| Field | Description |
| --- | --- |
| model.name | Name of the Model CR |
| model.namespace | Namespace of the Model CR (defaults to agent’s namespace) |
spec:
  model:
    name: gpt-4o-model
    namespace: shared-models   # optional
Attach a system prompt either inline or by referencing an existing Instruction CR. Supported in both standard and template runtime modes.
# Inline — operator creates a child Instruction CR automatically
spec:
  instruction:
    template: "You are a helpful assistant that answers questions concisely."

# Reference — reuse a shared Instruction across multiple agents
spec:
  instruction:
    instructionRef:
      name: customer-support-prompt
      namespace: shared-resources   # optional
Tools give your agent access to external APIs. You can reference an existing AgentTool CR or define a tool inline.
spec:
  tools:
    # Reference a shared tool
    - toolRef:
        name: weather-api
        namespace: shared-tools   # optional

    # Inline tool definition
    - name: product-search
      template:
        type: openapi
        description: "Search the product catalogue"
        openApi:
          url: "https://api.example.com"
          openApiSchema:
            endpointPath: "/openapi.json"
Explicitly declaring the AI framework lets Flokoa and your observability stack identify the agent’s type in logs and metrics.
spec:
  framework: pydantic-ai
Supported values: pydantic-ai, langchain, google-adk, crewai, marvin, autogen, a2a

Status fields

The operator writes the following fields to status after reconciling an Agent:
status:
  phase: Running           # Pending | Running | Failed
  backend: standard        # Runtime backend in use
  url: http://my-agent.default.svc.cluster.local:8080
  replicas: 2
  availableReplicas: 2
  detectedFramework: pydantic-ai
  lastToolSync: "2026-01-15T10:30:00Z"
  observedGeneration: 3
  conditions:
    - type: Ready
      status: "True"
      lastTransitionTime: "2026-01-15T10:30:00Z"
      reason: DeploymentAvailable
      message: "Agent is running and available"
| Field | Description |
| --- | --- |
| phase | Lifecycle phase: Pending, Running, or Failed |
| backend | Active runtime backend |
| url | In-cluster endpoint for calling the agent |
| replicas | Current number of pod replicas |
| availableReplicas | Number of replicas that are ready |
| detectedFramework | Framework detected from the container image |
| lastToolSync | Timestamp of the last tool synchronisation |
| conditions | Standard Kubernetes condition array |
| observedGeneration | Last spec generation reconciled by the operator |

Production example

The following manifest deploys a highly available agent with health probes, resource limits, and pod anti-affinity:
apiVersion: agent.flokoa.ai/v1alpha1
kind: Agent
metadata:
  name: production-agent
  namespace: production
spec:
  framework: pydantic-ai

  card:
    name: "Production Support Agent"
    description: "Customer-facing support agent with HA configuration"
    version: "2.0.0"
    defaultInputModes:
      - "application/json"
    defaultOutputModes:
      - "application/json"
    capabilities:
      streaming: true
    skills:
      - id: "support"
        name: "Customer Support"
        description: "Handle customer inquiries"
        tags: ["support"]

  model:
    name: gpt-4o-model

  tools:
    - toolRef:
        name: knowledge-base
    - toolRef:
        name: ticket-system

  runtime:
    type: standard
    spec:
      replicas: 3

      container:
        name: agent
        image: ghcr.io/example/support-agent:v2.0.0

        ports:
          - containerPort: 8080
            name: http
          - containerPort: 9090
            name: metrics

        env:
          - name: ENVIRONMENT
            value: "production"
          - name: API_KEY
            valueFrom:
              secretKeyRef:
                name: agent-secrets
                key: api-key

        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "2000m"
            memory: "2Gi"

        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3

        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          successThreshold: 1
          failureThreshold: 3

        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
          capabilities:
            drop:
              - ALL

      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    flokoa.ai/agent: production-agent
                topologyKey: topology.kubernetes.io/zone

kubectl operations

# List all agents and their phases
kubectl get agents

# Watch live status updates
kubectl get agents -w

# Inspect a specific agent in detail
kubectl describe agent my-agent

# Get the agent's endpoint URL
kubectl get agent my-agent -o jsonpath='{.status.url}'

# Scale replicas
kubectl patch agent my-agent --type='json' \
  -p='[{"op": "replace", "path": "/spec/runtime/spec/replicas", "value": 5}]'

# Update the container image
kubectl patch agent my-agent --type='json' \
  -p='[{"op": "replace", "path": "/spec/runtime/spec/container/image", "value": "new-image:v2.0.0"}]'

# Get pods belonging to an agent
kubectl get pods -l flokoa.ai/agent=my-agent

# Stream logs from all replicas
kubectl logs -l flokoa.ai/agent=my-agent --all-containers=true -f

# Delete an agent
kubectl delete agent my-agent
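The JSON patches above can also be written as a merge patch kept in a file, which can be easier to review and version. This is a sketch: the file name scale-patch.yaml is arbitrary, and it would be applied with kubectl patch agent my-agent --type=merge --patch-file scale-patch.yaml. Custom resources accept JSON and merge patches but not strategic merge patches, so --type=merge has to be given explicitly.

```yaml
# scale-patch.yaml (arbitrary file name)
# Merge patch: only the fields listed here change; all other fields are left as-is.
spec:
  runtime:
    spec:
      replicas: 5
```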

Best practices

  1. Always set resource requests and limits to prevent agents from starving or monopolising cluster nodes.
  2. Add liveness and readiness probes so Kubernetes can route traffic only to healthy replicas and self-heal on crashes.
  3. Run at least two replicas in production and combine with pod anti-affinity to spread them across zones.
  4. Declare the framework explicitly: setting spec.framework improves observability and future tooling integration.
  5. Never put secrets in the Agent spec — use secretKeyRef in env or a mounted Kubernetes Secret volume.
  6. Set container security contexts — run as non-root with readOnlyRootFilesystem: true and drop all capabilities.
  7. Pin image tags — avoid latest in production so rollbacks are predictable.
  8. Use standard mode for custom logic, template mode for prompt-driven agents — pick the mode that matches your workload.
  9. Share Model and AgentTool CRs across agents using cross-namespace references to reduce duplication.
  10. Start minimal and iterate — validate a one-replica, no-probe configuration before adding production hardening.
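Practices 5 and 6 interact: with readOnlyRootFilesystem: true, the container can write only to mounted volumes. The fragment below is a sketch of one way to combine a mounted Secret with a writable scratch directory, assuming spec.volumes accepts standard Kubernetes volume definitions. The mount paths are illustrative, and agent-secrets is the Secret name used in the production example.

```yaml
# Sketch only: assumes spec.volumes takes standard Kubernetes volume entries.
spec:
  runtime:
    type: standard
    spec:
      volumes:
        - name: credentials
          secret:
            secretName: agent-secrets           # Secret from the production example
        - name: tmp
          emptyDir: {}                          # writable scratch space
      container:
        name: agent
        image: ghcr.io/example/support-agent:v2.0.0
        volumeMounts:
          - name: credentials
            mountPath: /var/run/secrets/agent   # illustrative path
            readOnly: true
          - name: tmp
            mountPath: /tmp
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          capabilities:
            drop:
              - ALL
```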

Troubleshooting

When an agent's pods do not start, the most common causes are an inaccessible container image or insufficient cluster resources.
# Check events on the agent pods
kubectl describe pods -l flokoa.ai/agent=my-agent

# Check container states for image pull errors (ErrImagePull, ImagePullBackOff)
kubectl get pods -l flokoa.ai/agent=my-agent -o jsonpath='{.items[*].status.containerStatuses[*].state}'
  • Confirm the image tag exists in the registry.
  • If using a private registry, ensure imagePullSecrets is set.
  • Check that nodes have sufficient CPU and memory with kubectl describe nodes.
# Read logs from the previous (crashed) container instance to find the error
kubectl logs -l flokoa.ai/agent=my-agent --previous

# Inspect the full pod spec that was generated
kubectl get pod <pod-name> -o yaml
  • Check that all secretKeyRef secrets exist in the correct namespace.
  • Verify health probe paths (/health, /ready) are implemented in your image.
  • Ensure resource limits are not too low — OOMKilled pods show reason: OOMKilled in status.containerStatuses.
# Check current resource consumption
kubectl top pods -l flokoa.ai/agent=my-agent

# Review recent events
kubectl get events --field-selector involvedObject.name=my-agent
  • Increase CPU/memory requests and limits if the pod is being throttled.
  • Scale out replicas if all pods are consistently high on CPU.
  • Check tool call latency — slow external APIs directly impact agent response time.
# Confirm the Service was created
kubectl get svc -l flokoa.ai/agent=my-agent

# Test reachability from inside the cluster
kubectl run -it --rm debug --image=curlimages/curl --restart=Never -- \
  curl http://my-agent.<namespace>.svc.cluster.local:8080/health
  • If the Service is missing, check operator logs: kubectl logs -n flokoa-system deploy/flokoa-operator.
  • Review NetworkPolicies that may be blocking traffic to or from the agent pods.
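If a default-deny NetworkPolicy is in place, traffic to the agent has to be allowed explicitly. The manifest below is a sketch using the standard networking.k8s.io/v1 API. It assumes the agent pods carry the flokoa.ai/agent label used in the commands above and listen on port 8080, and it admits traffic only from pods in the same namespace. The policy name and namespace are placeholders.

```yaml
# Sketch only: adjust the namespace, label value, and port to match your agent.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-agent-ingress        # placeholder name
  namespace: default               # namespace of the agent
spec:
  podSelector:
    matchLabels:
      flokoa.ai/agent: my-agent    # selects the agent's pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}          # any pod in the same namespace
      ports:
        - protocol: TCP
          port: 8080
```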