Rollback Operations

Complete guide to rollback modes, selective rollback, approval workflows, and safe incident recovery.

What Is Rollback

Rollback reverts a deployment to a previous known-good state. When an incident occurs — a bad image tag, a misconfigured variable, a broken manifest — rollback lets you recover quickly by restoring configuration from a specific point in deployment history.

GitOpsHQ supports multiple rollback strategies with different scopes and granularity, so you can match the recovery action to the severity and scope of the incident.

Rollback Is Not Undo

Rollback creates a new commit that reverts state to a previous revision. It does not erase history. Every rollback action is fully traceable in both Git history and the GitOpsHQ audit trail.

Feature Gating

Rollback capabilities are gated behind feature flags: basic_rollback for full mode, rollback for the general rollback framework, selective_rollback for service and field-level modes, and rollback_approval for the approval workflow integration.
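As a rough illustration of the gating described above, the sketch below maps each rollback mode to the flags it would need. The flag names come from this page; the lookup table, the `missing_flags` helper, and the assumption that the general `rollback` flag is required for every mode are hypothetical and not part of the GitOpsHQ API.

```python
# Hypothetical mapping of rollback modes to the feature flags named on this page.
# Assumption: the general "rollback" flag is needed in addition to the mode flag.
REQUIRED_FLAGS = {
    "full": {"rollback", "basic_rollback"},
    "service": {"rollback", "selective_rollback"},
    "field": {"rollback", "selective_rollback"},
}


def missing_flags(mode, enabled_flags, needs_approval=False):
    """Return the sorted list of flags that are required but not enabled."""
    required = set(REQUIRED_FLAGS[mode])
    if needs_approval:
        # The approval workflow integration has its own flag.
        required.add("rollback_approval")
    return sorted(required - set(enabled_flags))
```

An empty result means the requested mode is available under the enabled flags; otherwise the list names what is still missing.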

Rollback Modes

GitOpsHQ offers three rollback modes, each targeting a different scope of recovery:

Full Rollback

Reverts the entire tenant-environment to a previous deployment state. All services, bindings, variables, and configurations are restored to the selected historical revision.

When to use:

  • Multiple services are affected by the incident
  • You need to restore a known-good state quickly
  • The exact failing component is unclear

What it reverts:

  • All workload image tags and configurations
  • All Kustomize and manifest bindings
  • All HQ Variable values (as they were at the target revision)
  • All environment-level configuration

Trade-off: Full rollback may revert healthy services alongside the broken one. Use service or field-level rollback if you need surgical precision.

Service Rollback (Pro)

Reverts a single service within a tenant-environment to a previous state. All other services remain at their current revision.

When to use:

  • The incident is isolated to a specific microservice
  • Other services are healthy and should not be disturbed
  • You want to minimize blast radius during recovery

What it reverts:

  • The selected service's image tag and configuration
  • The selected service's values and bindings
  • Service-scoped HQ Variables for this service

Requirement: Pro plan or higher.

Field-Level Rollback (Enterprise)

Reverts specific configuration fields for a service (e.g., only the image tag, only the replica count) while keeping all other fields at their current values.

When to use:

  • You know exactly which field caused the incident
  • You want maximum granularity — change one value, touch nothing else
  • Post-incident, you need to revert a specific variable without affecting other recent changes

What it reverts:

  • Only the explicitly selected fields within a service configuration

Requirement: Enterprise plan. The field selection must be non-empty; the backend rejects a field-level request with no fields selected (400, "selected fields cannot be empty for field mode").

Mode Comparison

| Aspect | Full | Service (Pro) | Field-Level (Enterprise) |
| --- | --- | --- | --- |
| Scope | Entire tenant-environment | Single service | Specific fields within a service |
| Blast radius | High | Medium | Minimal |
| Speed | Fastest (single revert commit) | Fast | Moderate (field-level diffing) |
| Precision | Low | Medium | High |
| Risk | May revert healthy services | Isolated to one service | Surgical, but requires knowing the root cause |
| Plan | All plans | Pro+ | Enterprise |
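The comparison above amounts to "pick the narrowest mode that your plan and your knowledge of the incident allow". A minimal sketch of that decision follows. The `suggest_mode` helper, the plan ranking, and the `"base"` plan label are hypothetical; the page only names Pro and Enterprise explicitly.

```python
# Hypothetical plan ranking; "base" stands in for any plan below Pro.
PLAN_RANK = {"base": 0, "pro": 1, "enterprise": 2}


def suggest_mode(affected_services, suspect_fields, plan):
    """Suggest the narrowest rollback mode available for this incident."""
    rank = PLAN_RANK[plan]
    if len(affected_services) == 1:
        # Field-level needs a known root-cause field and the Enterprise plan.
        if suspect_fields and rank >= PLAN_RANK["enterprise"]:
            return "field"
        # Service rollback needs Pro or higher.
        if rank >= PLAN_RANK["pro"]:
            return "service"
    # Several services affected, unclear cause, or plan limits: fall back to full.
    return "full"
```

This is only a starting heuristic; a full rollback can still be the right call under time pressure even when a narrower mode is available.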

Selective Rollback Wizard

The SelectiveRollbackWizard provides a guided, step-by-step interface for executing rollbacks safely.

  1. Select rollback scope. Choose between Full, Service, or Field-Level rollback. The wizard explains the blast radius of each mode before you proceed.

  2. Choose target tenant-environment. Select the tenant and environment where the rollback will be executed. The wizard shows the current deployment state and health status.

  3. Select the target revision. Browse the deployment history for this tenant-environment. Each historical deployment shows its timestamp, changes, deployer, and revision ID. Select the revision you want to roll back to.

  4. Preview the rollback diff. The wizard generates a detailed diff showing exactly what will change:

       • Current state vs. target revision state
       • Per-service breakdown of affected fields
       • For field-level mode: only selected fields are shown
       • Impact summary: number of services, bindings, and variables affected

  5. Submit rollback. Add a justification explaining why the rollback is needed. If the environment's approval policy requires it, the rollback enters the approval queue. Otherwise, it executes immediately.
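The submit step can be pictured as a small validation followed by a routing decision. The sketch below is an assumption-laden illustration: the `RollbackRequest` class and `initial_state` function are hypothetical names, while the state names and the empty-field message mirror those documented elsewhere on this page.

```python
from dataclasses import dataclass, field


@dataclass
class RollbackRequest:
    # Hypothetical client-side shape of a rollback submission.
    scope: str                 # "full", "service", or "field"
    target_revision: str
    justification: str
    selected_fields: list = field(default_factory=list)


def initial_state(request, approval_required):
    """Validate a submission and return the state it would start in."""
    if not request.justification.strip():
        raise ValueError("justification is required")
    if request.scope == "field" and not request.selected_fields:
        raise ValueError("selected fields cannot be empty for field mode")
    # With an approval policy the request waits in the queue;
    # otherwise it starts executing immediately.
    return "pending" if approval_required else "executing"
```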

Deployment History

The DeploymentHistoryPanel shows the full deployment timeline for a tenant-environment, serving as the source of truth for rollback target selection.

| Column | Description |
| --- | --- |
| Revision | Unique deployment revision identifier |
| Timestamp | When the deployment was executed |
| Author | Who triggered the deployment |
| Changes | Summary of what changed in this deployment |
| Source | Whether this came from a release, promotion, or rollback |
| Status | Deployment result (success, failed, rolled back) |
| Actions | "Rollback to this revision" button, "Compare with current" link |

Comparing Revisions

Click "Compare with current" on any historical revision to see a side-by-side diff between that revision and the current deployed state. This helps you choose the right rollback target — especially when multiple deployments have occurred since the last known-good state.
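When several deployments have landed since the last known-good state, one way to shortlist a rollback target is to walk the history backwards from the suspect deployment and take the first entry whose status is success. The sketch below assumes a newest-first list of dictionaries with `revision` and `status` keys; that shape and the `last_known_good` helper are hypothetical, and a success status alone does not prove a revision was healthy, so the comparison diff should still be reviewed.

```python
def last_known_good(history, suspect_revision):
    """Return the newest successful revision older than the suspect one.

    `history` is assumed to be ordered newest first.
    """
    past_suspect = False
    for entry in history:
        if past_suspect and entry["status"] == "success":
            return entry["revision"]
        if entry["revision"] == suspect_revision:
            # Everything after this point is older than the suspect deployment.
            past_suspect = True
    return None
```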

Rollback Approval Workflow

Rollback requests follow the same approval framework as releases, with the rollback-specific considerations listed under Approval Rules below.

Approval Rules

| Rule | Description |
| --- | --- |
| Standard approval | Rollback requests follow the environment's configured approval policy (same as releases) |
| Self-approval blocking | The person who submitted the rollback cannot approve it (configurable per environment) |
| Quorum support | If the environment requires N approvals, the rollback request needs the same quorum |
| Break-glass override | In emergencies, a break-glass session can bypass the approval requirement |
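To make the interaction between self-approval blocking and quorum concrete, here is a minimal sketch of how a single review might be evaluated. The `ReviewableRequest` class and `submit_review` function are hypothetical; the 400, 403, and 409 messages are the ones listed in the Error Reference on this page, and the 200 responses are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewableRequest:
    # Hypothetical in-memory model of a rollback request under review.
    submitter: str
    quorum: int
    block_self_approval: bool = True   # configurable per environment
    state: str = "pending"
    approvals: set = field(default_factory=set)
    reviewers: set = field(default_factory=set)


def submit_review(req, reviewer, approve):
    """Apply one review and return a (status, message) pair."""
    if req.state != "pending":
        return 400, "rollback request is not pending approval"
    if req.block_self_approval and reviewer == req.submitter:
        return 403, "cannot approve/reject your own rollback request"
    if reviewer in req.reviewers:
        return 409, "you have already reviewed this rollback request"
    req.reviewers.add(reviewer)
    if not approve:
        req.state = "rejected"
        return 200, req.state
    req.approvals.add(reviewer)
    if len(req.approvals) >= req.quorum:
        req.state = "approved"
    return 200, req.state
```

With a quorum of 2, the first approval leaves the request pending and the second moves it to approved; any later review is refused because the request is no longer pending.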

Break-Glass for Incidents

During a critical incident, waiting for approval may not be acceptable. Break-glass is not a dedicated rollback feature; it is part of the general project approval flow. When a rollback requires approval, the system calls maybeCreateProjectApprovalRequest, which checks mayUseBreakGlass. If the user has an active break-glass session, the approval requirement is bypassed:

  1. Open a break-glass session from the project's approval settings (this applies to all protected actions, not just rollback)
  2. Provide a justification for the emergency override
  3. Submit the rollback — approval is bypassed automatically
  4. The break-glass session, justification, and rollback are all recorded in the audit trail
  5. Post-incident review can examine break-glass actions

Break-Glass Audit

Break-glass bypasses approval but does not bypass audit. Every break-glass rollback is prominently flagged in the audit trail for post-incident review.
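The bypass-but-still-audit behaviour can be sketched as follows. This is not the actual maybeCreateProjectApprovalRequest implementation; `resolve_approval`, its parameters, and the audit entry shape are hypothetical stand-ins used only to show the decision order.

```python
def resolve_approval(approval_required, has_break_glass_session, justification, audit_log):
    """Decide whether a rollback executes now or waits for approval."""
    if not approval_required:
        return "execute"
    if has_break_glass_session:
        # Approval is skipped, but the action is still recorded and flagged.
        audit_log.append({
            "event": "break_glass_rollback",
            "justification": justification,
            "flagged": True,
        })
        return "execute"
    return "queue_for_approval"
```

The key point is the ordering: the audit entry is written on the same path that skips approval, so there is no way to take the bypass without leaving a record.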

Rollback and Git

All rollback operations produce Git commits in the hosted repository, maintaining the GitOps principle that Git is the source of truth.

| Rollback Mode | Git Behavior |
| --- | --- |
| Full | Creates a single revert commit restoring the entire tenant-environment directory to the target revision's state |
| Service | Creates a targeted commit that reverts only the files belonging to the selected service |
| Field-Level | Creates a patch commit that modifies only the specific fields in the affected files |
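Field-level behaviour is the least obvious of the three, so here is a minimal sketch of the idea: copy only the selected fields from the target revision into the current configuration and leave everything else alone. The dotted-path notation and the `revert_fields` helper are assumptions made for illustration; the real patch operates on files in the hosted repository.

```python
import copy


def _get_path(doc, path):
    """Read a nested value addressed by a dotted path such as 'image.tag'."""
    node = doc
    for key in path.split("."):
        node = node[key]
    return node


def _set_path(doc, path, value):
    """Write a nested value, creating intermediate dictionaries if needed."""
    keys = path.split(".")
    node = doc
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def revert_fields(current, target, fields):
    """Return a copy of `current` with only `fields` taken from `target`."""
    if not fields:
        raise ValueError("selected fields cannot be empty for field mode")
    patched = copy.deepcopy(current)
    for path in fields:
        _set_path(patched, path, copy.deepcopy(_get_path(target, path)))
    return patched
```

For example, reverting only `image.tag` restores the older tag while a replica count changed in a later deployment stays at its current value.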

Commit Metadata

Every rollback commit includes structured metadata:

Rollback: tenant/acme env/production → revision abc123

Mode: service
Service: api-gateway
Justification: Bad image tag v2.1.0 causing 500 errors
Author: jane@example.com
Approval: Approved by bob@example.com (quorum: 1/1)
Rollback Request ID: rbk-789xyz

The delivery generator regenerates manifests from the reverted state, ensuring that the cluster receives a consistent set of Kubernetes resources.
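A message in the shape of the commit metadata example above could be assembled as in the sketch below. The `format_rollback_commit` function and its parameters are hypothetical; only the line layout is taken from the example, and the assumption that the Service line is omitted for full rollbacks is not confirmed by the source.

```python
def format_rollback_commit(tenant, env, revision, mode, justification,
                           author, approval, request_id, service=None):
    """Build a rollback commit message: subject line, blank line, metadata."""
    lines = [
        f"Rollback: tenant/{tenant} env/{env} → revision {revision}",
        "",
        f"Mode: {mode}",
    ]
    if service is not None:
        # Assumption: only service and field-level rollbacks name a service.
        lines.append(f"Service: {service}")
    lines += [
        f"Justification: {justification}",
        f"Author: {author}",
        f"Approval: {approval}",
        f"Rollback Request ID: {request_id}",
    ]
    return "\n".join(lines)
```

Keeping the metadata as `Key: value` lines makes it straightforward to search with `git log --grep`, for example to list every rollback tied to a given request ID.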

Rollback Requests (Enterprise)

Enterprise plans support formal rollback requests with structured metadata and governance:

Request Structure

| Field | Description |
| --- | --- |
| Scope | Full, service, or field-level |
| Target revision | The deployment revision to roll back to |
| Justification | Required text explaining why the rollback is needed |
| Affected services | List of services that will be changed (auto-calculated) |
| Selected fields | For field-level rollback: the specific fields to revert |

Request States

| State | Description |
| --- | --- |
| pending | Request submitted, awaiting approval |
| approved | Approval quorum met, ready to execute |
| rejected | An approver rejected the request |
| executing | Rollback is in progress |
| completed | Rollback successfully executed, revert commit created |
| failed | Execution failed (see error details) |
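The request states above imply a small state machine. The transition table below is one plausible reading of it, not a documented contract: in particular, allowing `failed` to move back to `executing` is an assumption based on the "retry execution" guidance in the Error Reference.

```python
# Hypothetical transition table inferred from the state descriptions.
TRANSITIONS = {
    "pending": {"approved", "rejected"},
    "approved": {"executing"},
    "executing": {"completed", "failed"},
    "failed": {"executing"},   # assumption: a failed execution may be retried
    "rejected": set(),
    "completed": set(),
}


def advance(state, new_state):
    """Return new_state if the transition is allowed, else raise ValueError."""
    if new_state not in TRANSITIONS[state]:
        raise ValueError(f"cannot move rollback request from {state} to {new_state}")
    return new_state
```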

Post-Rollback Actions

After a rollback is executed, follow these verification steps:

  1. Verify Git state. Check that the revert commit appears in the hosted repository with the correct content.

  2. Check ArgoCD sync status. Confirm that ArgoCD detects the new commit and syncs the reverted manifests to the cluster. The cluster agent reports sync status in the Cluster Detail page.

  3. Verify cluster health. Use the Cluster Detail page's inventory and drift tabs to confirm the rollback took effect. Check pod status, replica counts, and service health.

  4. Review drift detection. Ensure there is no drift between the desired state (Git) and the actual state (cluster). Any drift indicates the rollback may not have fully propagated.

  5. Document the incident. Add post-rollback notes to the rollback request or release. Capture the root cause and corrective actions for team review.
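The drift review in the steps above boils down to comparing desired state with actual state. The sketch below shows that comparison on two flat dictionaries; the `find_drift` helper and the key naming are hypothetical, and in practice the drift tab on the Cluster Detail page does this for you.

```python
def find_drift(desired, actual):
    """Return {key: (desired_value, actual_value)} for every mismatch."""
    drift = {}
    for key in sorted(set(desired) | set(actual)):
        want, have = desired.get(key), actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift
```

An empty result suggests the rollback has fully propagated; any remaining entry points at a resource that still differs from what Git declares.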

Error Handling

Rollback operations can encounter several blocking conditions:

| Scenario | Behavior | Resolution |
| --- | --- | --- |
| Concurrent changes | Conflict detected between rollback and in-flight changes | Resolve the conflict manually or wait for the concurrent operation to complete |
| Agent unreachable | Git commit succeeds but cluster sync cannot be confirmed | The rollback is safe in Git; sync will occur when the agent reconnects |
| Invalid target revision | The selected revision does not exist or is corrupted | Choose a different revision from the deployment history |
| Empty field selection | Field-level rollback with no fields selected | Select at least one field to revert |

Error Reference

| Error Code | Message | Meaning |
| --- | --- | --- |
| 400 | targetSha is required | No target revision was specified in the rollback request |
| 400 | rollback request is not pending approval | The request has already been processed (approved, rejected, or executed) |
| 400 | selected fields cannot be empty for field mode | Field-level rollback requires at least one field selection |
| 403 | cannot approve/reject your own rollback request | Self-approval is blocked for this environment |
| 409 | you have already reviewed this rollback request | Each approver can only submit one review per request |
| 500 | approval saved but rollback execution failed | The approval was recorded but the Git commit failed; retry execution |
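A client or runbook script might translate these errors into a next step. The lookup below is a hypothetical sketch: the status codes and messages are the ones documented above, while the suggested actions paraphrase the "Meaning" column and the `next_step` helper is not part of any GitOpsHQ SDK.

```python
# Keys are (status code, message) pairs from the error reference.
NEXT_STEP = {
    (400, "targetSha is required"): "choose a target revision and resubmit",
    (400, "rollback request is not pending approval"): "no action; the request was already processed",
    (400, "selected fields cannot be empty for field mode"): "select at least one field and resubmit",
    (403, "cannot approve/reject your own rollback request"): "ask another approver to review",
    (409, "you have already reviewed this rollback request"): "no action; your review is already recorded",
    (500, "approval saved but rollback execution failed"): "retry execution; the approval is already saved",
}


def next_step(status, message):
    """Suggest a follow-up for a documented error, or a generic fallback."""
    return NEXT_STEP.get((status, message), "unrecognized error; see the error details")
```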

Best Practices

  • Start with the smallest scope. If you know which service caused the incident, use service rollback instead of full rollback.
  • Always preview before executing. The rollback diff shows exactly what will change — review it even under time pressure.
  • Use break-glass responsibly. It exists for real emergencies. Post-incident, review all break-glass actions.
  • Document justifications. Rollback justifications are invaluable during post-incident reviews and compliance audits.
  • Monitor after rollback. A successful rollback commit does not guarantee a healthy cluster. Verify sync status, pod health, and drift.
  • Keep deployment history clean. Frequent rollbacks to the same revision indicate a recurring issue that needs root-cause analysis, not repeated rollbacks.
