
Clusters & Agents

Complete guide to cluster registration, agent installation, runtime visibility, inventory, log streaming, and command operations.

Cluster Concept

A cluster in GitOpsHQ represents a registered Kubernetes cluster with an active agent connection. Clusters are the deployment targets for your project deliveries — the delivery generator commits manifests to Git, and the cluster agent ensures those manifests are synced and running.

Clusters also provide runtime visibility: inventory browsing, resource inspection, drift detection, log streaming, and operational commands — all from the GitOpsHQ UI without requiring direct kubectl access.

Agent-First Design

The agent initiates all connections outbound to the GitOpsHQ hub. No inbound ports or firewall rules are required on the cluster side. This makes the agent safe to deploy in locked-down environments, private networks, and air-gapped setups with outbound-only proxy access.

Cluster Registration

Register a cluster from the Clusters page (/clusters). The backend API is POST /api/v2/clusters.

  1. Click "Register Cluster." Provide a cluster name, display name, provider (AWS EKS, GKE, AKS, on-prem), region, and environment. These map to the agent.clusterName, agent.displayName, agent.provider, agent.region, and agent.environment Helm values.

  2. Receive the registration token and install command. The API returns a registrationToken (prefixed ghqa_reg_, 64 hex characters) and a ready-to-use installCommand. The token expires in 1 hour and is single-use -- copy it immediately.

  3. Install the GitOpsHQ Agent. On the target cluster, run the Helm install command provided during registration. The agent establishes a secure outbound gRPC connection to the GitOpsHQ hub and completes registration by calling the agent.v1.AgentHub/Register RPC. On success, the hub issues a long-lived agent token (prefixed ghqa_) that the agent persists automatically.

  4. Verify the connection. The cluster status shows Connected in the dashboard within 60 seconds. You can re-issue a registration token from the Cluster Detail page via POST /api/v2/clusters/{id}/join-token if the original expires.
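As a sketch of what the registration call returns, the snippet below parses a hypothetical response and validates the token shape. The response field values and the id field are invented for illustration; only the registrationToken format (ghqa_reg_ prefix plus 64 hex characters) is documented above.

```shell
# Hypothetical response shape from POST /api/v2/clusters -- field values
# are invented for illustration; consult the API for the actual schema.
response='{
  "id": "cl_01h2x3",
  "registrationToken": "ghqa_reg_a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2",
  "installCommand": "helm upgrade --install gitopshq-agent oci://ghcr.io/gitopshq-io/charts/gitopshq-agent ..."
}'

# Pull out the one-time token and check the documented shape:
# "ghqa_reg_" prefix followed by 64 hex characters.
token=$(printf '%s' "$response" | sed -n 's/.*"registrationToken": *"\([^"]*\)".*/\1/p')
printf '%s' "$token" | grep -Eq '^ghqa_reg_[0-9a-f]{64}$' && echo "token format OK"
```

Remember the token is single-use: extract and apply it immediately rather than storing the response.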

Cluster Labels

Labels are key-value pairs attached to clusters for organization and filtering:

| Label Example | Use Case |
| --- | --- |
| region:eu-west-1 | Filter clusters by geographic region |
| tier:production | Distinguish production from staging/dev clusters |
| provider:eks | Group by cloud provider |
| team:platform | Track cluster ownership |

Labels are used in binding rules and approval policies to target specific cluster groups.
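As a toy illustration of label-based targeting, the snippet below filters an invented cluster list by two labels. Real label data comes from the Clusters API, and matching is done by the binding and approval-policy engines, not by shell tools.

```shell
# Invented cluster inventory for illustration only.
clusters='prod-eu    tier:production region:eu-west-1
prod-us    tier:production region:us-east-1
staging-eu tier:staging    region:eu-west-1'

# Select clusters carrying BOTH tier:production and region:eu-west-1,
# mirroring how a binding rule can target a label set.
matches=$(printf '%s\n' "$clusters" | grep 'tier:production' | grep 'region:eu-west-1' | awk '{print $1}')
echo "$matches"    # prod-eu
```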

Agent Installation

The agent is a lightweight Go binary that runs as a Kubernetes Deployment in the target cluster.

Installation Methods

The GitOpsHQ Agent chart is distributed as an OCI artifact. No helm repo add is needed.

```shell
helm upgrade --install gitopshq-agent \
  oci://ghcr.io/gitopshq-io/charts/gitopshq-agent \
  --namespace gitopshq-system \
  --create-namespace \
  --set registrationToken="ghqa_reg_a1b2c3d4..." \
  --set hub.address="agent.gitopshq.io:50051" \
  --set capabilities.observe=true \
  --set capabilities.argocdRead=true
```

Copy from the UI

You do not need to build this command manually. The cluster registration dialog returns a ready-to-paste installCommand with your token and hub address pre-filled. The command above is shown for reference.

Core values:

| Value | Default | Description |
| --- | --- | --- |
| registrationToken | "" | One-time registration token (ghqa_reg_*). Expires in 1 hour. |
| hub.address | gitopshq.example.internal:50051 | GitOpsHQ hub gRPC endpoint |
| hub.statusIntervalSeconds | 30 | Heartbeat interval in seconds |
| agent.clusterName | default | Logical cluster identifier |
| agent.displayName | "" | Human-readable name shown in the UI |
| agent.provider | "" | Cloud provider (e.g., eks, gke, aks, on-prem) |
| agent.region | "" | Region label |
| agent.environment | "" | Environment label (e.g., production, staging) |
| agent.tokenPath | /var/lib/gitopshq/identity/agent.json | Path where the agent persists its identity |

Identity persistence:

| Value | Default | Description |
| --- | --- | --- |
| persistence.enabled | true | Enable identity persistence |
| persistence.type | secret | secret (K8s Secret) or pvc (PersistentVolumeClaim) |
| persistence.secretName | "" | Explicit Secret name (auto-generated if empty) |
| persistence.size | 128Mi | PVC size (only when type: pvc) |

Capabilities (each toggles a feature flag):

| Value | Default | Description |
| --- | --- | --- |
| capabilities.observe | true | Inventory browsing and resource inspection |
| capabilities.diagnosticsRead | true | Namespace-scoped pod diagnostics and log streaming |
| capabilities.argocdRead | true | Read ArgoCD application status |
| capabilities.argocdWrite | false | Sync and rollback ArgoCD applications |
| capabilities.directDeploy | false | Apply manifests directly (non-ArgoCD mode) |
| capabilities.kubernetesRestart | false | Rolling restart of Deployments |
| capabilities.kubernetesScale | false | Scale Deployment replicas |
| capabilities.credentialSync | false | Sync secrets from hub to cluster |
| capabilities.tokenRotate | true | Allow hub-initiated token rotation |

RBAC, Security, and Networking:

| Value | Default | Description |
| --- | --- | --- |
| rbac.create | true | Create ClusterRole and binding |
| rbac.profile | readonly | RBAC profile (readonly or readwrite) |
| serviceAccount.create | true | Create a dedicated ServiceAccount |
| tls.insecure | false | Skip TLS verification (dev only) |
| proxy.httpProxy | "" | HTTP proxy for outbound connections |
| proxy.httpsProxy | "" | HTTPS proxy for outbound connections |
| proxy.noProxy | "" | No-proxy exclusion list |

Optional integrations:

| Value | Default | Description |
| --- | --- | --- |
| argocd.server | "" | ArgoCD API server address |
| argocd.token | "" | ArgoCD API token |
| argocd.insecure | false | Skip TLS for ArgoCD |
| diagnostics.allowedNamespaces | [] | Namespaces the agent can read diagnostics from |
| credentialSync.mode | mirrored | Credential sync strategy |
| credentialSync.targets | [] | Target namespaces for credential sync |
| directDeploy.defaultNamespace | "" | Default namespace for direct deploys |
| directDeploy.fieldManager | gitopshq-agent | Server-side apply field manager |

For environments where Helm is not available, you can deploy the agent manually. Create the namespace, inject the registration token as an environment variable, and apply the agent Deployment.

```shell
kubectl create namespace gitopshq-system

kubectl create secret generic gitopshq-agent-registration \
  --namespace gitopshq-system \
  --from-literal=GITOPSHQ_REGISTRATION_TOKEN="ghqa_reg_a1b2c3d4..."

kubectl create secret generic gitopshq-agent-hub \
  --namespace gitopshq-system \
  --from-literal=GITOPSHQ_HUB_ADDRESS="agent.gitopshq.io:50051"
```

Then create a Deployment that references these secrets as environment variables. The agent binary reads configuration entirely from environment variables:

| Env Variable | Required | Description |
| --- | --- | --- |
| GITOPSHQ_HUB_ADDRESS | Yes | Hub gRPC endpoint (e.g., agent.gitopshq.io:50051) |
| GITOPSHQ_REGISTRATION_TOKEN | Yes (first run) | One-time registration token |
| GITOPSHQ_IDENTITY_STORE_MODE | No | file or secret (default: file) |
| GITOPSHQ_AGENT_TOKEN_PATH | No | Identity file path (default: /tmp/gitopshq-agent-token) |
| GITOPSHQ_CLUSTER_NAME | No | Logical cluster name (default: default) |
| GITOPSHQ_CAPABILITIES | No | Comma-separated capabilities (default: observe) |
| GITOPSHQ_HUB_INSECURE | No | Skip TLS verification (default: false) |
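A minimal Deployment for the manual path might look like the sketch below. The container image path is an assumption (check the GitOpsHQ release notes for the published image); the Secret names match the kubectl commands above.

```shell
# Sketch of a manual agent Deployment. The image reference is assumed,
# not taken from official docs; the envFrom Secret names match the
# kubectl create secret commands shown earlier.
cat > gitopshq-agent-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gitopshq-agent
  namespace: gitopshq-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gitopshq-agent
  template:
    metadata:
      labels:
        app: gitopshq-agent
    spec:
      containers:
        - name: agent
          image: ghcr.io/gitopshq-io/gitopshq-agent:latest  # assumed image path
          envFrom:
            - secretRef:
                name: gitopshq-agent-registration
            - secretRef:
                name: gitopshq-agent-hub
          env:
            - name: GITOPSHQ_CLUSTER_NAME
              value: "my-cluster"
            - name: GITOPSHQ_CAPABILITIES
              value: "observe"
EOF
# kubectl apply -f gitopshq-agent-deployment.yaml
```

Note that with this path you must also create the ServiceAccount and RBAC resources yourself, as the warning below explains.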

Helm is strongly recommended

The Helm chart handles RBAC, identity Secret management, capability flags, and secure defaults automatically. Manual deployment requires you to maintain all of these yourself.

Agent RBAC Requirements

The Helm chart creates RBAC resources based on the rbac.profile value and enabled capabilities:

readonly profile (default) -- cluster-wide read access for inventory and inspection:

| Resource | Verbs | Capability |
| --- | --- | --- |
| pods, deployments, services, configmaps, secrets, statefulsets, daemonsets, jobs, cronjobs, ingresses, replicasets | get, list, watch | observe |
| pods/log | get | diagnosticsRead |
| events | get, list, watch | observe |
| namespaces | get, list | observe |
| applications.argoproj.io | get, list, watch | argocdRead |
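As a sketch, the readonly profile approximately corresponds to a ClusterRole like the one below. The resource name and rule grouping are assumptions; the chart's actual rendering may differ.

```shell
# Approximate ClusterRole for the readonly profile -- name and rule
# grouping are illustrative, not the chart's literal output.
cat > gitopshq-agent-readonly.yaml <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: gitopshq-agent-readonly
rules:
  # observe: cluster-wide read on workloads and events
  - apiGroups: ["", "apps", "batch", "networking.k8s.io"]
    resources: ["pods", "services", "configmaps", "secrets", "events",
                "deployments", "replicasets", "statefulsets", "daemonsets",
                "jobs", "cronjobs", "ingresses"]
    verbs: ["get", "list", "watch"]
  # observe: namespace listing (no watch, per the table above)
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["get", "list"]
  # diagnosticsRead: pod logs
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get"]
  # argocdRead: ArgoCD Application status
  - apiGroups: ["argoproj.io"]
    resources: ["applications"]
    verbs: ["get", "list", "watch"]
EOF
# kubectl apply -f gitopshq-agent-readonly.yaml
```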

readwrite profile -- adds write verbs for operational commands:

| Resource | Extra Verbs | Capability |
| --- | --- | --- |
| deployments | patch | kubernetesRestart, kubernetesScale |
| applications.argoproj.io | patch, update | argocdWrite |
| secrets | create, get, update, patch | credentialSync |
| All core resources | create, update, patch, delete | directDeploy |

Identity Secret RBAC -- when persistence.type: secret, the chart grants create, get, update, patch on the identity Secret so the agent can persist its token after registration.

Agent Architecture

The agent is designed for security, reliability, and minimal footprint:


Connection Model

| Aspect | Detail |
| --- | --- |
| Protocol | gRPC over HTTP/2 with TLS. Default port 50051 (configurable via AGENT_GRPC_PUBLIC_ADDRESS on the hub). |
| Direction | Agent initiates outbound -- no inbound ports or firewall rules required. |
| Registration | Unary RPC: agent.v1.AgentHub/Register. Authenticates with Authorization: Bearer <registrationToken>. |
| Ongoing stream | Bidirectional RPC: agent.v1.AgentHub/Connect. Authenticates with Authorization: Bearer <agentToken>. |
| Heartbeat | Agent sends status every hub.statusIntervalSeconds (default: 30s). Hub marks the cluster disconnected if no heartbeat arrives within the timeout window. |
| Reconnection | Automatic exponential backoff on disconnect. The agent re-establishes the Connect stream using its persisted agent token. |
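The reconnection behavior can be sketched as a capped exponential backoff. The base delay, multiplier, and cap below are assumptions for illustration; the agent's real tuning is internal.

```shell
# Capped exponential backoff sketch -- base/multiplier/cap are assumed.
base=1; cap=60; delay=$base; schedule=""
for attempt in 1 2 3 4 5 6 7 8; do
  schedule="$schedule $delay"
  delay=$((delay * 2))
  if [ "$delay" -gt "$cap" ]; then delay=$cap; fi
done
echo "reconnect delays (s):$schedule"    # 1 2 4 8 16 32 60 60
```

The cap matters in practice: without it, a long outage would push the retry interval far past the heartbeat window and delay recovery once the hub is reachable again.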

Token Lifecycle

| Phase | Token Prefix | Format | Lifetime |
| --- | --- | --- | --- |
| Registration | ghqa_reg_ | 64 hex characters (32 random bytes) | Expires in 1 hour. Single use -- consumed on first successful Register RPC. |
| Agent (post-registration) | ghqa_ | 64 hex characters (32 random bytes) | Long-lived. Stored as a SHA-256 hash on the hub. |
| Rotation | ghqa_ | Same format | Hub issues a new agent token via POST /api/v2/clusters/{id}/rotate-token. Old token invalidated after a grace period. |
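The token mechanics above can be made concrete with standard tools: 32 random bytes hex-encoded under a phase prefix, with only a SHA-256 digest retained server-side. Any further server-side hardening beyond the digest algorithm is not specified here.

```shell
# 32 random bytes -> 64 hex characters, prefixed per the phase.
raw=$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')
agent_token="ghqa_${raw}"
# The hub never stores the token itself, only its SHA-256 digest.
stored=$(printf '%s' "$agent_token" | sha256sum | awk '{print $1}')
echo "token length: ${#agent_token}, stored digest length: ${#stored}"
```

Storing only the digest means a database leak does not expose usable agent tokens; the hub re-hashes the presented bearer token on every Connect and compares digests.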

Token Security

Registration tokens are displayed once during cluster registration and expire in 1 hour. If lost or expired, issue a new one from the Cluster Detail page (POST /api/v2/clusters/{id}/join-token). Agent tokens are never displayed in the UI -- the agent persists them locally (as a Kubernetes Secret or file) and rotates them automatically when capabilities.tokenRotate is enabled.

Cluster Detail Page

The Cluster Detail page (/clusters/:id) is the operational hub for a registered cluster. It provides real-time visibility into the cluster's state, workloads, and health.

Health Status

The cluster header always shows the current connection health:

| Status | Indicator | Meaning |
| --- | --- | --- |
| Connected | Green | Agent is connected and heartbeat is current |
| Disconnected | Red | Agent has not sent a heartbeat within the timeout window (default: 90 seconds) |
| Lagging | Yellow | Agent is connected but heartbeat is delayed (network instability or agent load) |
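A sketch of how these states could be derived from heartbeat age, using the default 30-second interval and 90-second timeout; the hub's exact boundary handling is an assumption here.

```shell
# Classify connection health from seconds since the last heartbeat.
# Thresholds follow the defaults above; boundary handling is assumed.
classify() {
  if [ "$1" -le 30 ]; then echo Connected
  elif [ "$1" -le 90 ]; then echo Lagging
  else echo Disconnected
  fi
}
classify 12     # Connected: within the heartbeat interval
classify 55     # Lagging: delayed but inside the timeout window
classify 200    # Disconnected: past the 90 s timeout
```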

Cluster Tabs

Inventory

Browse the cluster's namespaces and resources in real time.

  • Namespace list with resource counts per type (pods, deployments, services, configmaps, secrets)
  • Resource table filtered by namespace and type
  • Resource summary: name, status, age, labels
  • Click any resource to open the Resource Inspection view

The inventory is refreshed periodically by the agent and on-demand when you navigate to the tab.

ArgoCD Applications

If ArgoCD is running in the cluster, this tab shows all ArgoCD Application resources:

| Column | Description |
| --- | --- |
| Name | Application name |
| Sync Status | Synced, OutOfSync, Unknown |
| Health Status | Healthy, Degraded, Progressing, Missing |
| Source | Git repository and path |
| Destination | Target namespace |
| Last Sync | Timestamp of last successful sync |

Quick actions: Force Sync, Hard Refresh, View in ArgoCD UI (if ArgoCD dashboard URL is configured).

Drift Findings

Drift detection compares the desired state (from Git) with the actual state (from the cluster) for each resource managed by GitOpsHQ deliveries.

  • Per-resource drift detail with field-level differences
  • Drift severity classification (cosmetic vs. functional)
  • "Resolve" action: force sync from Git to cluster
  • Drift history: when drift was first detected, how long it persisted
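As a toy illustration of the idea (the agent performs structured, field-level comparison against the delivery's Git state, not a text diff):

```shell
# Desired (Git) vs. actual (cluster) state for one resource; values invented.
cat > desired.yaml <<'EOF'
replicas: 3
image: app:v1.4.0
EOF
cat > actual.yaml <<'EOF'
replicas: 5
image: app:v1.4.0
EOF
# Any difference between the two is drift; here, replicas was changed
# out-of-band (e.g., a manual kubectl scale).
diff desired.yaml actual.yaml || echo "drift detected in replicas"
```

The "Resolve" action corresponds to re-applying the desired state so the actual state converges back to Git.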

Bindings

Shows which project deliveries target this cluster:

| Column | Description |
| --- | --- |
| Project | The project that generates manifests for this cluster |
| Tenant | Which tenants are bound to this cluster |
| Environment | Which environments target this cluster |
| Last Delivery | Timestamp and revision of the last deployed delivery |
| Sync Status | Whether the latest delivery is synced to the cluster |

Commands

History of all commands sent to the cluster and their execution results.

| Column | Description |
| --- | --- |
| Command | The operation type (sync, rollback, restart, scale) |
| Target | The resource targeted by the command |
| Requester | Who initiated the command |
| Status | pending, executing, completed, failed, rejected |
| Timestamp | When the command was issued and completed |
| Output | Command execution output or error message |

Filter by command type, status, or date range to investigate operational history.

Log Streaming

Real-time pod log streaming powered by xterm.js.

  1. Select namespace from the dropdown.
  2. Select pod from the filtered pod list.
  3. Select container (for multi-container pods).
  4. Start streaming. Logs appear in real time in the terminal viewer. Previous logs (tail) are loaded first, then the stream follows new log lines.

Features:

  • Search within logs: Ctrl+F to find specific log entries
  • Pause/resume: Pause the stream to inspect a specific section
  • Download: Export the current log buffer as a text file
  • Timestamps: Toggle timestamp display on each log line
  • Follow mode: Auto-scroll to latest log entries (enabled by default)

Credential Sync

Status of credential synchronization from GitOpsHQ to the cluster (e.g., image pull secrets, TLS certificates):

| Column | Description |
| --- | --- |
| Credential | The credential name and type |
| Target Namespace | Where the credential is synced in the cluster |
| Sync Status | synced, pending, failed |
| Last Synced | Timestamp of last successful sync |
| Error | Error message if sync failed |

Resource Inspection

Deep-dive into any Kubernetes resource:

  • YAML view: Full resource YAML with syntax highlighting
  • Events: Kubernetes events related to the resource (sorted by timestamp)
  • Related resources: Linked resources (e.g., a Deployment's ReplicaSets and Pods)
  • Labels and annotations: Searchable label/annotation display
  • Conditions: Resource condition table (for Deployments, Nodes, Pods)

Cluster Commands

Commands are operational actions sent to the cluster agent for execution. They provide controlled cluster operations without requiring direct kubectl access.

Available Commands

| Command | Description | Target | Default Approval |
| --- | --- | --- | --- |
| Sync | Force ArgoCD to reconcile an application with its Git source | ArgoCD Application | Configurable |
| Rollback | Revert an ArgoCD application to a previous sync revision | ArgoCD Application | Usually required |
| Restart | Trigger a rolling restart of a Deployment (updates pod template annotation) | Deployment | Configurable |
| Scale | Change the replica count of a Deployment | Deployment | Configurable |
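The Restart command's mechanism -- patching the pod template annotation -- is the same one `kubectl rollout restart` uses, and can be sketched as follows. The namespace and Deployment name are placeholders.

```shell
# Build the strategic-merge patch that bumps a pod-template annotation.
# Changing the template causes Kubernetes to roll the Deployment.
ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
patch="{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"kubectl.kubernetes.io/restartedAt\":\"$ts\"}}}}}"
echo "$patch"
# kubectl -n my-namespace patch deployment my-app -p "$patch"
```

Because the patch only touches an annotation, the restart is a normal rolling update and respects the Deployment's update strategy and disruption budgets.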

Command Execution Flow


Command Approval

Commands can require approval based on cluster approval policies:

  • Bind an approval policy to a cluster from the Cluster Detail settings
  • Configure which command types require approval per cluster
  • Production clusters typically require approval for Rollback and Scale
  • Development clusters can auto-approve all commands
  • Command approval requests appear in the unified Approvals inbox (/approvals)

Cluster Bindings

Bindings connect project deliveries to clusters — they define where generated manifests are deployed.

Binding Types

| Binding | Description |
| --- | --- |
| Agent binding | Links a project's delivery output to a specific cluster. The cluster receives manifests for the bound tenant-environments. |
| Multi-project | Multiple projects can target the same cluster. Each project's delivery output is scoped to its own namespace/path. |
| Multi-cluster | A single project can target multiple clusters (e.g., for multi-region deployments). Each cluster receives the same or region-specific manifests. |

Binding Configuration

Bindings are configured at the project level under Settings → Targets:

  1. Select the cluster to bind
  2. Map tenant-environments to the cluster
  3. Configure the target namespace pattern (e.g., {{ .tenant }}-{{ .environment }})
  4. Optionally set cluster-specific overrides (e.g., region-specific storage class)
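The namespace pattern from step 3 is rendered server-side by the delivery generator; as a purely local illustration of the substitution (tenant and environment values invented):

```shell
# Local illustration only -- the real templating runs on the hub.
pattern='{{ .tenant }}-{{ .environment }}'
tenant=acme
environment=production
ns=$(printf '%s' "$pattern" | sed -e "s/{{ \.tenant }}/$tenant/" -e "s/{{ \.environment }}/$environment/")
echo "$ns"    # acme-production
```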

Cluster Approval Policies

Approval policies can be bound directly to clusters to control operational actions:

| Policy Setting | Description |
| --- | --- |
| Command approvals | Which command types (sync, rollback, restart, scale) require approval on this cluster |
| Required approvers | Number of approvals needed for gated commands |
| Break-glass | Allow authorized operators to bypass approval in emergencies |
| Auto-approve | List of command types that execute immediately without approval |

Cluster approval requests appear in the unified Approvals inbox alongside release and promotion approvals.

Agent Health Monitoring

Monitor agent health across all clusters from the Clusters list page:

| Metric | Description |
| --- | --- |
| Connection uptime | Percentage of time the agent has been connected over the last 24h/7d/30d |
| Heartbeat latency | Average time between heartbeat signals |
| Last seen | Timestamp of the most recent agent heartbeat |
| Agent version | Currently running agent version |
| Capabilities | Reported capability flags: observe, diagnostics.read, argocd.read, argocd.write, direct.deploy, kubernetes.restart, kubernetes.scale, credential.sync, token.rotate |

Version Drift

If the agent version is significantly behind the latest release, some features may not work correctly. The Clusters page highlights outdated agents with a version warning badge.

Error Reference

Best Practices

Monitor agent health proactively. Set up notifications for agent disconnection events. A disconnected agent means no runtime visibility and no command execution — but Git-based deployments continue to work via ArgoCD.

Rotate agent tokens on schedule. Enable automatic token rotation (configured in the Helm chart) and review rotation logs periodically.

Use approval policies on production clusters. Require approval for Rollback and Scale commands on production clusters. Auto-approve Sync commands if your ArgoCD configuration is reliable.

Keep agent versions current. Update the agent Helm chart regularly to receive new capabilities, security patches, and performance improvements.

Use log streaming for incident investigation. Real-time log streaming eliminates the need for direct cluster access during incidents, reducing the number of people who need kubectl credentials.

Label clusters consistently. A consistent labeling scheme makes it easy to filter clusters, configure binding rules, and apply approval policies across cluster groups.
