The Agent CRD is the core building block of Flokoa. It represents a fully deployable AI agent inside your Kubernetes cluster, combining container runtime configuration with LLM model references, tool bindings, and A2A (Agent-to-Agent) protocol metadata. When you apply an Agent manifest, the Flokoa operator reconciles it into a Kubernetes Deployment, creates a Service, and continuously reports the agent’s lifecycle state back through the resource’s status fields.
API reference
Minimal configuration
Every Agent requires a `runtime` block with a `type` and at least one container definition. The example below is the smallest valid manifest you can apply:
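A minimal sketch of such a manifest, covering only the `runtime` requirement stated above. The `apiVersion` is a placeholder, the agent name and image are hypothetical, and the nesting `spec.runtime.spec.container` is inferred from the field tables in the spec reference rather than confirmed against the CRD schema.

```yaml
# Sketch only: replace the placeholder apiVersion with the group/version
# served by the Flokoa CRDs installed in your cluster.
apiVersion: "<flokoa-api-group>/<version>"   # placeholder, not a real value
kind: Agent
metadata:
  name: hello-agent              # hypothetical agent name
  namespace: default
spec:
  runtime:
    type: standard               # bring-your-own-image mode
    spec:
      container:                 # regular Kubernetes container spec
        name: agent
        image: registry.example.com/hello-agent:1.0.0   # hypothetical image
```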
Runtime modes
Flokoa supports two runtime modes. Use `standard` when you are bringing your own agent image, and `template` when you want the operator to manage the runtime for you.
In `standard` mode you provide your own container image. The operator wraps it in a Deployment and Service, but all application logic lives in your image.
Spec reference
card — Agent metadata (A2A protocol)
The `card` block exposes your agent via the A2A (Agent-to-Agent) protocol and is the primary way other agents and orchestrators discover your agent’s capabilities.
| Field | Type | Required | Description |
|---|---|---|---|
| `name` | string | Yes | Human-readable agent name |
| `description` | string | Yes | What the agent does |
| `version` | string | Yes | Semantic version of the agent |
| `defaultInputModes` | string[] | Yes | Accepted input MIME types (e.g. `application/json`) |
| `defaultOutputModes` | string[] | Yes | Produced output MIME types |
| `capabilities.streaming` | bool | No | Whether the agent supports streaming responses |
| `capabilities.pushNotifications` | bool | No | Whether the agent supports push notifications |
| `capabilities.stateTransitionHistory` | bool | No | Whether the agent exposes task state history |
| `skills` | object[] | Yes | List of skills (see below) |
Each skill in `skills` has:
| Field | Description |
|---|---|
| `id` | Unique skill identifier |
| `name` | Human-readable skill name |
| `description` | What the skill does |
| `tags` | Categorization keywords |
| `examples` | Sample prompts demonstrating the skill |
| `inputModes` | Per-skill input MIME override |
| `outputModes` | Per-skill output MIME override |
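The fields in the two tables above could be combined as in the following sketch. All values are hypothetical, placing the block at `spec.card` is an assumption, and `tags` and `examples` are assumed to be lists of strings.

```yaml
spec:
  card:                                  # assumed location of the card block
    name: Weather Agent                  # hypothetical values throughout
    description: Answers questions about current weather conditions.
    version: "1.0.0"
    defaultInputModes:
      - application/json
    defaultOutputModes:
      - application/json
    capabilities:                        # all three flags are optional
      streaming: true
      pushNotifications: false
      stateTransitionHistory: false
    skills:
      - id: current-weather
        name: Current weather
        description: Looks up the current weather for a city.
        tags:
          - weather
        examples:
          - What is the weather in Berlin right now?
        inputModes:                      # overrides defaultInputModes for this skill
          - text/plain
        outputModes:                     # overrides defaultOutputModes for this skill
          - application/json
```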
runtime — Deployment configuration
The `runtime` block controls how your agent is deployed.
| Field | Description |
|---|---|
| `type` | `standard` (your image) or `template` (operator-managed) |
| `spec.replicas` | Number of pod replicas (default: 1) |
| `spec.container` | Full Kubernetes container spec (standard mode) |
| `spec.volumes` | Pod volumes (standard mode) |
| `spec.serviceAccountName` | ServiceAccount for the pod |
| `spec.nodeSelector` | Schedule pods on matching nodes |
| `spec.tolerations` | Allow scheduling on tainted nodes |
| `spec.affinity` | Advanced scheduling rules |
| `spec.imagePullSecrets` | Secrets for private registries |
| `spec.securityContext` | Pod-level security attributes |
| `spec.config` | Agent config with output schema (template mode) |
| `spec.env` | Extra environment variables (template mode) |
| `spec.resources` | CPU/memory requests and limits (template mode) |
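As an illustration, a `standard`-mode runtime block using several of the scheduling fields above might look like the sketch below. It assumes these fields take the same shapes as their Kubernetes PodSpec counterparts; the ServiceAccount, Secret, image, and port are hypothetical.

```yaml
spec:
  runtime:
    type: standard
    spec:
      replicas: 2
      serviceAccountName: agent-runner         # hypothetical ServiceAccount
      imagePullSecrets:
        - name: registry-credentials           # hypothetical pull Secret
      nodeSelector:
        kubernetes.io/os: linux                # well-known node label
      securityContext:                         # pod-level security attributes
        runAsNonRoot: true
      container:
        name: agent
        image: registry.example.com/hello-agent:1.0.0   # hypothetical image
        ports:
          - containerPort: 8080                # hypothetical port
```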
model — LLM reference
Attach a `Model` CR so your agent has access to an LLM at runtime. The operator injects the necessary connection details as environment variables.
| Field | Description |
|---|---|
| `model.name` | Name of the Model CR |
| `model.namespace` | Namespace of the Model CR (defaults to the agent’s namespace) |
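A reference to a Model CR in another namespace could be written as follows. The Model name and namespace are hypothetical, and placing the block at `spec.model` is an assumption.

```yaml
spec:
  model:
    name: default-llm            # hypothetical Model CR name
    namespace: shared-models     # optional; omit to use the agent's own namespace
```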
instruction — System prompt
Attach a system prompt either inline or by referencing an existing `Instruction` CR. Supported in both `standard` and `template` runtime modes.
tools — Tool bindings
Tools give your agent access to external APIs. You can reference an existing `AgentTool` CR or define a tool inline.
framework — Observability hint
Explicitly declaring the AI framework lets Flokoa and your observability stack identify the agent’s type in logs and metrics.
Supported values: `pydantic-ai`, `langchain`, `google-adk`, `crewai`, `marvin`, `autogen`, `a2a`
Status fields
The operator writes the following fields to `status` after reconciling an Agent:
| Field | Description |
|---|---|
| `phase` | Lifecycle phase: `Pending`, `Running`, or `Failed` |
| `backend` | Active runtime backend |
| `url` | In-cluster endpoint for calling the agent |
| `replicas` | Current number of pod replicas |
| `availableReplicas` | Number of replicas that are ready |
| `detectedFramework` | Framework detected from the container image |
| `lastToolSync` | Timestamp of the last tool synchronisation |
| `conditions` | Standard Kubernetes condition array |
| `observedGeneration` | Last spec generation reconciled by the operator |
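For orientation, the status of a healthy two-replica agent might resemble the sketch below. Every value is hypothetical, including the URL form and the timestamp format; `backend` and `conditions` are left out because their possible values are not listed here.

```yaml
# Illustrative only: not captured from a real cluster.
status:
  phase: Running
  url: http://hello-agent.default.svc.cluster.local   # hypothetical in-cluster endpoint
  replicas: 2
  availableReplicas: 2
  detectedFramework: langchain        # one of the supported framework values
  lastToolSync: "2024-01-01T00:00:00Z"   # assumed RFC 3339 timestamp
  observedGeneration: 3
```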
Production example
The following manifest deploys a highly available agent with health probes, resource limits, and pod anti-affinity:
kubectl operations
Best practices
- Always set resource requests and limits to prevent agents from starving or monopolising cluster nodes.
- Add liveness and readiness probes so Kubernetes can route traffic only to healthy replicas and self-heal on crashes.
- Run at least two replicas in production and combine with pod anti-affinity to spread them across zones.
- Declare the framework explicitly: `spec.framework` improves observability and future tooling integration.
- Never put secrets in the Agent spec: use `secretKeyRef` in `env` or a mounted Kubernetes Secret volume.
- Set container security contexts: run as non-root with `readOnlyRootFilesystem: true` and drop all capabilities.
- Pin image tags: avoid `latest` in production so rollbacks are predictable.
- Use `standard` mode for custom logic and `template` mode for prompt-driven agents: pick the mode that matches your workload.
- Share Model and AgentTool CRs across agents using cross-namespace references to reduce duplication.
- Start minimal and iterate: validate a one-replica, no-probe configuration before adding production hardening.
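Several of the practices above can be sketched together in one `standard`-mode spec fragment. The container fields are ordinary Kubernetes container spec fields; the image, port, pod label, environment variable, and Secret names are hypothetical, and the pod label used for anti-affinity is an assumption about how the agent pods are labelled.

```yaml
spec:
  framework: langchain                   # explicit framework declaration
  runtime:
    type: standard
    spec:
      replicas: 2                        # at least two replicas in production
      affinity:
        podAntiAffinity:                 # prefer spreading replicas across zones
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: topology.kubernetes.io/zone
                labelSelector:
                  matchLabels:
                    app: hello-agent     # hypothetical pod label
      container:
        name: agent
        image: registry.example.com/hello-agent:1.4.2   # pinned tag, hypothetical image
        env:
          - name: API_TOKEN              # hypothetical variable, read from a Secret
            valueFrom:
              secretKeyRef:
                name: hello-agent-secrets
                key: api-token
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
          limits:
            cpu: "1"
            memory: 512Mi
        livenessProbe:
          httpGet:
            path: /health
            port: 8080                   # hypothetical port
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
        securityContext:                 # container-level hardening
          runAsNonRoot: true
          readOnlyRootFilesystem: true
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
```

The request and limit values are starting points only; size them from observed usage of your own agent.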
Troubleshooting
Agent is stuck in Pending phase
The most common causes are an inaccessible container image or insufficient cluster resources.
- Confirm the image tag exists in the registry.
- If using a private registry, ensure `imagePullSecrets` is set.
- Check that nodes have sufficient CPU and memory with `kubectl describe nodes`.
Agent pods are crash-looping
- Check that all `secretKeyRef` secrets exist in the correct namespace.
- Verify health probe paths (`/health`, `/ready`) are implemented in your image.
- Ensure resource limits are not too low; OOMKilled pods show `reason: OOMKilled` in the pod’s `status.containerStatuses`.
Performance is degraded or responses are slow
- Increase CPU/memory requests and limits if the pod is being throttled.
- Scale out replicas if all pods are consistently high on CPU.
- Check tool call latency — slow external APIs directly impact agent response time.
Networking or service connectivity issues
- If the Service is missing, check operator logs: `kubectl logs -n flokoa-system deploy/flokoa-operator`.
- Review NetworkPolicies that may be blocking traffic to or from the agent pods.
