DevOps Playbook

Operational guide for DevOps engineers managing releases, promotions, workloads, and CI/CD integrations in GitOpsHQ.

Your Role in GitOpsHQ

As a DevOps Engineer, you are the bridge between development intent and production reality. You own the delivery pipeline: creating and managing releases, orchestrating promotions across environments, configuring workloads, integrating CI/CD systems, and ensuring that every deployment follows the governance rules established by the Platform team.

You work closely with developers (who submit configuration changes) and SREs (who monitor runtime health and handle incidents). Your primary concern is controlled, predictable delivery — getting the right changes to the right environments at the right time.

Your Access Scope

You typically have write access to all project environments and can create, approve, and execute releases for non-production environments. Production deployments may require SRE or team-lead approval depending on your organization's policies.

Daily Operating Loop

Follow this sequence to manage your daily delivery operations.

  1. Review the Delivery Dashboard: Start by checking the dashboard for pending items: releases awaiting approval, promotions in progress, failed deployments, and drift alerts. The delivery health panel shows your pipeline's throughput and any bottlenecks.
  2. Check Pending Releases: Navigate to Releases → Pending. Review releases submitted by developers. For each release, examine the diff, verify the change scope is appropriate, and check that the release notes explain the intent.
  3. Validate Generated Output: Before approving a release, use the delivery generator preview to see the final rendered Kubernetes manifests. This catches issues that value-level diffs cannot show, such as template rendering errors or missing variable substitutions.
  4. Create or Approve Releases: For your own changes, create releases directly. For developer-submitted releases, review and approve if the changes are correct. Add approval comments to document your review rationale.
  5. Execute Deployments: Once a release is approved, trigger the deployment. Monitor the deployment timeline as it progresses through the delivery generator, Git commit, and cluster sync stages. A typical deployment completes within 2–5 minutes.
  6. Run Promotions: When a release has been verified in a lower environment, promote it to the next stage. The promotion pipeline respects the environment ordering and approval gates. Use batch promotion for multi-tenant rollouts.
  7. Verify Runtime Health: After each deployment or promotion, check the cluster status to confirm pods are healthy, services are responding, and no drift has been introduced. The deployment timeline links directly to the runtime status.
  8. Handle Failed Deployments: If a deployment fails, investigate the cause using the delivery generator logs, manifest validation results, and cluster sync status. Common causes include invalid YAML, missing secrets, and resource quota violations.
  9. Update Audit Notes: Document deployment outcomes in the release notes. For failed deployments, record what went wrong and what was done to fix it. This builds an operational knowledge base over time.
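
The Validate Generated Output step mentions missing variable substitutions as something the preview can reveal. As a rough illustration of that kind of check, the sketch below scans rendered manifest text for leftover template expressions. It assumes Go-template style `{{ ... }}` placeholders; the function name and the pattern are hypothetical and are not part of GitOpsHQ.

```python
import re

# Assumption: unresolved substitutions look like Go-template expressions,
# for example {{ .Values.image.tag }}. Adjust the pattern for other engines.
UNRESOLVED = re.compile(r"\{\{.*?\}\}")


def find_unresolved_placeholders(rendered: str) -> list[tuple[int, str]]:
    """Return (line_number, placeholder) pairs for each leftover expression."""
    findings = []
    for number, line in enumerate(rendered.splitlines(), start=1):
        for match in UNRESOLVED.finditer(line):
            findings.append((number, match.group(0)))
    return findings


# Hypothetical rendered output in which one value was never substituted.
manifest = """apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: app
          image: registry.example.com/my-app:{{ .Values.image.tag }}
"""

for line_number, placeholder in find_unresolved_placeholders(manifest):
    print(f"line {line_number}: unresolved {placeholder}")
```

A check like this complements the preview rather than replacing it: it only finds literal leftovers, not values that rendered to the wrong thing.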

Release Management

Release Lifecycle

Every release passes through a defined lifecycle. Understanding each state helps you manage your delivery pipeline effectively.

State            | Description                                  | Your Action
Draft            | Changes selected but not yet submitted       | Finalize scope and submit
Pending Approval | Waiting for required approvals               | Review and approve (or request changes)
Approved         | All approvals received                       | Trigger deployment
Deploying        | Delivery generator rendering and committing  | Monitor progress
Deployed         | Manifests committed to GitOps repo           | Wait for sync
Synced           | Cluster agent has applied the manifests      | Verify health
Healthy          | All resources are in desired state           | Done; move to next promotion stage
Failed           | Deployment or sync failed                    | Investigate and remediate
Rolled Back      | Reverted to a previous state                 | Document and analyze
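
One way to reason about the lifecycle is as a small state machine. The sketch below uses the state names from the table; the allowed transitions are an assumption inferred from the descriptions (for example, that requesting changes returns a release to Draft), not a documented GitOpsHQ contract.

```python
from enum import Enum


class ReleaseState(Enum):
    DRAFT = "Draft"
    PENDING_APPROVAL = "Pending Approval"
    APPROVED = "Approved"
    DEPLOYING = "Deploying"
    DEPLOYED = "Deployed"
    SYNCED = "Synced"
    HEALTHY = "Healthy"
    FAILED = "Failed"
    ROLLED_BACK = "Rolled Back"


S = ReleaseState

# Assumed transitions, inferred from the lifecycle descriptions.
ALLOWED = {
    S.DRAFT: {S.PENDING_APPROVAL},
    S.PENDING_APPROVAL: {S.APPROVED, S.DRAFT},  # DRAFT models "request changes"
    S.APPROVED: {S.DEPLOYING},
    S.DEPLOYING: {S.DEPLOYED, S.FAILED},
    S.DEPLOYED: {S.SYNCED, S.FAILED},
    S.SYNCED: {S.HEALTHY, S.FAILED},
    S.HEALTHY: {S.ROLLED_BACK},
    S.FAILED: {S.ROLLED_BACK},
    S.ROLLED_BACK: set(),
}


def can_transition(current: ReleaseState, target: ReleaseState) -> bool:
    """Return True if the assumed lifecycle allows moving from current to target."""
    return target in ALLOWED[current]


print(can_transition(S.APPROVED, S.DEPLOYING))  # allowed under these assumptions
print(can_transition(S.DRAFT, S.DEPLOYING))     # skips approval, so not allowed
```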

Release Best Practices

  • One concern per release: A release that updates an image tag should not also change resource limits unless the two changes are related.
  • Descriptive titles: Use a format like [service]: [action] [version], for example payment-api: bump to v2.5.0.
  • Include rollback context: Note in the release description which release or version to roll back to if the change causes issues.
  • Test in lower environments first: Always deploy to development and staging before production. Use the promotion pipeline to enforce this.

Promotion Pipeline

Promotions move verified changes from one environment to the next.

Promotion Checklist

Before promoting a release to the next environment:

  • Verify all pods are healthy in the current environment
  • Check for drift — no unexpected changes should be present
  • Review any environment-specific value overrides for the target
  • Confirm approval requirements for the target environment
  • Have a rollback plan ready before promoting to production
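
The checklist above can be sketched as a simple gate that reports what still blocks a promotion. All field and function names here are hypothetical; in practice the inputs would come from whatever health, drift, and approval data your team already collects.

```python
from dataclasses import dataclass


@dataclass
class PromotionCheck:
    pods_healthy: bool
    drift_detected: bool
    overrides_reviewed: bool
    approvals_confirmed: bool
    rollback_plan_ready: bool
    target_is_production: bool = False


def promotion_blockers(check: PromotionCheck) -> list[str]:
    """Return human-readable reasons the promotion should not proceed yet."""
    blockers = []
    if not check.pods_healthy:
        blockers.append("pods are not healthy in the current environment")
    if check.drift_detected:
        blockers.append("drift detected in the current environment")
    if not check.overrides_reviewed:
        blockers.append("target environment overrides not reviewed")
    if not check.approvals_confirmed:
        blockers.append("approval requirements for the target not confirmed")
    # The checklist asks for a rollback plan specifically before production.
    if check.target_is_production and not check.rollback_plan_ready:
        blockers.append("no rollback plan for a production promotion")
    return blockers


check = PromotionCheck(
    pods_healthy=True,
    drift_detected=False,
    overrides_reviewed=True,
    approvals_confirmed=True,
    rollback_plan_ready=False,
    target_is_production=True,
)
print(promotion_blockers(check))
```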

Workload Management

Creating a Workload

  1. Navigate to your project and open Workloads → New Workload.
  2. Select a chart from the registry (or reference an external chart).
  3. Name the workload and specify the target namespace.
  4. Configure the base values that apply across all environments.
  5. Add environment-specific overrides using the values editor or HQ Variables.
  6. Bind the workload to tenants (deployment targets).
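
To see how base values and environment-specific overrides might combine, the sketch below deep-merges two nested dictionaries, with the override winning on conflicts. This mirrors how Helm-style values files are commonly layered; whether GitOpsHQ merges in exactly this way is an assumption, and the value names are illustrative.

```python
from copy import deepcopy


def merge_values(base: dict, overrides: dict) -> dict:
    """Deep-merge overrides into a copy of base; override values win on conflict."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical base values shared by all environments.
base_values = {
    "replicaCount": 1,
    "image": {"repository": "registry.example.com/my-app", "tag": "v1.0.0"},
}

# Hypothetical production overrides.
production_overrides = {"replicaCount": 3, "image": {"tag": "v1.2.0"}}

print(merge_values(base_values, production_overrides))
```

Note that nested keys not mentioned in the override, such as image.repository, are preserved from the base.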

Workload Updates

When updating a workload's chart version (e.g., upgrading from microservice-base@1.0.0 to 1.1.0):

  1. Preview the chart diff to understand what templates changed
  2. Check if any new values are required by the new chart version
  3. Update the chart version reference in the workload
  4. Test in development before promoting the chart upgrade to other environments
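
Step 2 above, checking for newly introduced values, can be approximated by comparing the default values of the two chart versions. This is a minimal sketch that assumes you can load both versions' defaults as nested dictionaries; the helper names and sample values are hypothetical.

```python
def flatten(values: dict, prefix: str = "") -> set[str]:
    """Return the dotted paths of all leaf values in a nested dictionary."""
    paths = set()
    for key, value in values.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            paths |= flatten(value, path)
        else:
            paths.add(path)
    return paths


def new_value_paths(old_defaults: dict, new_defaults: dict) -> list[str]:
    """Return value paths present in the new chart defaults but not the old ones."""
    return sorted(flatten(new_defaults) - flatten(old_defaults))


# Hypothetical defaults for two chart versions.
defaults_1_0_0 = {"image": {"repository": "", "tag": ""}, "replicaCount": 1}
defaults_1_1_0 = {
    "image": {"repository": "", "tag": "", "pullPolicy": "IfNotPresent"},
    "replicaCount": 1,
    "resources": {"limits": {"memory": "256Mi"}},
}

print(new_value_paths(defaults_1_0_0, defaults_1_1_0))
```

A new path is not necessarily a required one, so treat the result as a list of values to review against the chart's documentation.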

CI/CD Integration

Webhook Setup

GitOpsHQ supports inbound webhooks for automated value updates triggered by your CI pipeline.

  1. Create a service account for your CI system with scoped permissions (release creation on target projects).
  2. Generate an API token and store it securely in your CI system's secrets.
  3. Add a webhook step to your CI pipeline that calls the GitOpsHQ API after a successful build.
  4. Configure auto-release (optional) to automatically create a release when values are updated via webhook.

Example: GitHub Actions Integration

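name: Deploy to GitOpsHQ
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      # Check out the repository so the Docker build context (.) contains the source.
      - name: Check out source
        uses: actions/checkout@v4

      # Assumes the runner is already authenticated to registry.example.com.
      - name: Build and push image
        run: |
          docker build -t registry.example.com/my-app:${{ github.sha }} .
          docker push registry.example.com/my-app:${{ github.sha }}

      # --fail makes curl exit non-zero on HTTP 4xx/5xx so a rejected update fails the job.
      - name: Update GitOpsHQ workload
        run: |
          curl --fail -X PATCH \
            https://api.gitopshq.io/v1/projects/${{ vars.PROJECT_ID }}/workloads/${{ vars.WORKLOAD_ID }}/values \
            -H "Authorization: Bearer ${{ secrets.GITOPSHQ_TOKEN }}" \
            -H "Content-Type: application/json" \
            -d '{"updates": [{"path": "image.tag", "value": "${{ github.sha }}"}]}'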
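
For CI systems where a script is more convenient than an inline curl call, the same request can be built in Python. This sketch only constructs the request using the endpoint, headers, and payload shape from the example above; it does not send anything, and the function name is hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.gitopshq.io/v1"  # base URL taken from the curl example


def build_update_request(
    project_id: str, workload_id: str, token: str, image_tag: str
) -> urllib.request.Request:
    """Build (but do not send) the PATCH request that updates image.tag."""
    url = (
        f"{API_BASE}/projects/{urllib.parse.quote(project_id, safe='')}"
        f"/workloads/{urllib.parse.quote(workload_id, safe='')}/values"
    )
    payload = {"updates": [{"path": "image.tag", "value": image_tag}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


# Placeholder identifiers; real values would come from CI variables and secrets.
request = build_update_request("proj-1", "wl-1", "example-token", "abc123")
print(request.get_method(), request.full_url)
```

To send it, pass the request to urllib.request.urlopen inside a try/except for urllib.error.HTTPError, so that a rejected update fails the pipeline step.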

Decision Rules

Follow these rules when making operational decisions:

  • Never skip preview for production-scoped changes — Even if you are confident, the preview catches rendering issues that value-level inspection cannot.
  • Treat freeze and approval blockers as governance requirements — They exist for a reason. If you believe a freeze or approval is incorrect, escalate to the Platform team rather than working around it.
  • Prefer scoped rollouts — Deploy to a canary tenant first, verify, then promote to broader audiences. The promotion pipeline supports this natively.
  • Keep rollback readiness visible — Before executing a high-risk promotion, confirm that the previous release is available as a rollback target.
  • Document everything — Release notes, approval comments, and deployment outcomes create a knowledge base that helps the entire team learn from past operations.
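
The scoped-rollout rule can be pictured as planning promotion waves: one canary tenant first, then the remaining tenants in batches. The sketch below is illustrative only. The function, the tenant names, and the batch size are assumptions, and the actual sequencing is handled by the promotion pipeline.

```python
def plan_waves(tenants: list[str], canary: str, batch_size: int) -> list[list[str]]:
    """Return rollout waves: the canary alone, then the rest in fixed-size batches."""
    if canary not in tenants:
        raise ValueError(f"canary {canary!r} is not one of the tenants")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    remaining = [tenant for tenant in tenants if tenant != canary]
    waves = [[canary]]
    for start in range(0, len(remaining), batch_size):
        waves.append(remaining[start:start + batch_size])
    return waves


# Hypothetical tenants; verify each wave before promoting the next.
print(plan_waves(["acme", "globex", "initech", "umbrella"], "initech", 2))
```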

Handoff Contracts

With Platform Teams

  • Request missing permissions or policies as structured governance changes, not ad-hoc exceptions
  • Provide feedback on approval policies that create unnecessary bottlenecks
  • Report any RBAC gaps that force workarounds

With Developers

  • Require clear release descriptions that explain intent, not just changes
  • Push back on overly broad releases that mix unrelated changes
  • Ensure developers use Environment Compare before submitting releases

With SRE

  • Include runtime evidence and rollback plans for risky operations
  • Coordinate deployment windows with incident response schedules
  • Provide clear escalation paths when deployments need emergency rollback

Key Metrics to Track

Metric                       | What It Tells You   | Healthy Range
Deploy frequency             | How often you ship  | Multiple per day
Lead time (commit to deploy) | Pipeline efficiency | Under 1 hour
Failed deployment rate       | Release quality     | Under 5%
Rollback rate                | Change stability    | Under 2%
Approval latency             | Governance overhead | Under 30 minutes
Promotion completion rate    | End-to-end success  | Above 95%
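
If you export deployment records, a few of these metrics are straightforward to compute and compare against the healthy ranges. The record fields below are hypothetical, and using the mean for approval latency is an assumption; a percentile may suit your team better.

```python
from dataclasses import dataclass


@dataclass
class DeploymentRecord:
    failed: bool
    rolled_back: bool
    approval_minutes: float


def summarize(records: list[DeploymentRecord]) -> dict[str, float]:
    """Compute failed deployment rate, rollback rate, and mean approval latency."""
    if not records:
        raise ValueError("at least one deployment record is required")
    total = len(records)
    return {
        "failed_rate": sum(r.failed for r in records) / total,
        "rollback_rate": sum(r.rolled_back for r in records) / total,
        "approval_minutes": sum(r.approval_minutes for r in records) / total,
    }


# Upper bounds from the healthy ranges above (under 5%, under 2%, under 30 minutes).
LIMITS = {"failed_rate": 0.05, "rollback_rate": 0.02, "approval_minutes": 30.0}


def out_of_range(summary: dict[str, float]) -> list[str]:
    """Return the metric names that are not below their healthy upper bound."""
    return [name for name, limit in LIMITS.items() if summary[name] >= limit]


# Hypothetical sample: 40 deployments, one of which failed and was rolled back.
records = [DeploymentRecord(False, False, 12.0) for _ in range(39)]
records.append(DeploymentRecord(True, True, 12.0))

summary = summarize(records)
print(summary)
print(out_of_range(summary))
```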