2025-04-10 · 4 min read

Multi-Environment Infrastructure on GCP with Terraform and Terragrunt

Tags: Terraform, Terragrunt, GCP, Infrastructure as Code, GKE

Managing identical infrastructure across dev, staging, and production without copy-pasting Terraform files is exactly the problem Terragrunt was built to solve. I set up a complete IaC pipeline on GCP that provisions GKE clusters, VPC networking, and Cloud CDN across three environments, all from a single set of modules.

The Architecture

The pipeline provisions three core components per environment:

  • GKE cluster for running workloads
  • VPC for network isolation with proper subnetting
  • Cloud CDN backed by GCS buckets for static content delivery

Each environment (dev, staging, prod) gets its own instance of these resources with environment-specific configurations.

Project Structure

The repo separates Terraform modules from Terragrunt environment configurations:

iac-infra/
├── terraform/
│   ├── main.tf
│   ├── variables.tf
│   ├── envs/
│   │   ├── dev/
│   │   │   └── terragrunt.hcl
│   │   ├── staging/
│   │   │   └── terragrunt.hcl
│   │   └── prod/
│   │       └── terragrunt.hcl
│   └── modules/
│       ├── gke/
│       ├── vpc/
│       ├── cdn/
│       └── cdn_bucket/
└── .github/
    └── workflows/
        └── ci-pipeline.yml

The modules/ directory contains reusable Terraform code. Each module handles one infrastructure concern. The envs/ directory contains Terragrunt configs that call those modules with environment-specific inputs.

This is the DRY pattern Terragrunt enables: write the module once, call it three times with different variables.
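
As a rough sketch of what one of those calls might look like: the config below assumes the root module in terraform/ (main.tf, variables.tf) wires the modules together, and that it accepts inputs with these names. Only subnet_cidr is named in this post; the other input names and all values are hypothetical.

```hcl
# terraform/envs/dev/terragrunt.hcl (illustrative sketch, not the repo's file)

# Assumption: each environment points at the shared root module two levels up,
# which in turn calls modules/gke, modules/vpc, modules/cdn, modules/cdn_bucket.
terraform {
  source = "../.."
}

# Only these values change between dev, staging, and prod.
inputs = {
  environment  = "dev"           # hypothetical variable name
  subnet_cidr  = "10.10.0.0/20"  # example range, not taken from the repo
  node_count   = 1               # hypothetical variable name
  machine_type = "e2-medium"     # hypothetical variable name
}
```

Under this layout, the staging and prod configs would be the same file with a different inputs map, which is the whole point of the pattern.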

The CI/CD Pipeline

The GitHub Actions pipeline automates the entire lifecycle. It triggers on:

  • Pull requests to main: runs terragrunt plan so reviewers can see what will change before merging
  • Pushes to main or environment tags (*-dev, *-staging, *-prod): runs terragrunt apply to provision the target environment

Every run follows the same steps, including automatic environment detection:

  1. Checks out code
  2. Authenticates to GCP using a service account key from GitHub Secrets
  3. Installs Terraform v1.9.5 and Terragrunt v0.67.5
  4. Detects which environment to target based on the branch or tag
  5. Imports existing resources if needed (CDN, VPC)
  6. Validates the configuration with terragrunt validate
  7. Runs plan (for PRs) or apply (for pushes to main and environment tags)

The environment detection means a single pipeline handles all three environments. Push a v1.2.0-staging tag and it applies to staging. Merge to main and it applies to production.

State Management

Terraform state is stored remotely in GCS, with the bucket name supplied through a GitHub secret (TERRAFORM_BUCKET_NAME). Each environment gets its own isolated state, so a bad apply in dev can't corrupt production's state.

Terragrunt handles the backend configuration centrally, so individual environment configs don't need to repeat the bucket setup.
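
A minimal sketch of what such a central backend block could look like. It assumes a parent config sitting directly above the environment folders (for example terraform/envs/terragrunt.hcl, which is not shown in the tree above) and assumes the pipeline exports the secret as an environment variable with the same name.

```hcl
# Hypothetical parent config, e.g. terraform/envs/terragrunt.hcl

remote_state {
  backend = "gcs"

  # Terragrunt writes backend.tf into the working copy of the module,
  # so neither the modules nor the environment configs declare a backend.
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }

  config = {
    # Assumption: TERRAFORM_BUCKET_NAME is available as an environment variable.
    # get_env with a single argument fails the run if the variable is unset.
    bucket = get_env("TERRAFORM_BUCKET_NAME")

    # With this layout, path_relative_to_include() resolves to "dev",
    # "staging", or "prod", so each environment gets its own state prefix.
    prefix = path_relative_to_include()
  }
}
```

Keying the prefix off the folder name means a new environment gets isolated state just by adding a new directory, without touching the backend settings.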

Module Design

Each module is focused on one concern:

VPC Module — Creates the network, subnets, and firewall rules. The subnet_cidr is configurable per environment so CIDR ranges don't overlap (important if you ever need VPC peering between environments).

GKE Module — Provisions the Kubernetes cluster with configurable node counts, machine types, and autoscaling. Dev runs minimal nodes; production runs enough for redundancy.

CDN Module — Sets up Cloud CDN backed by a GCS bucket. The CDN bucket module handles the storage provisioning, and the CDN module wires up the load balancer and caching rules.

CDN Bucket Module — Provisions the GCS bucket that backs the CDN, with appropriate access controls and lifecycle policies.
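
To make the VPC module's per-environment subnet_cidr concrete, here is a hedged sketch of how such a module could expose it. The resource layout and the environment and region variables are assumptions for illustration, and firewall rules are left out for brevity.

```hcl
# modules/vpc (illustrative sketch, not the repo's actual module)

variable "environment" {
  type        = string
  description = "Environment name, used as a resource name prefix (hypothetical)."
}

variable "region" {
  type        = string
  description = "Region for the subnet (hypothetical)."
}

variable "subnet_cidr" {
  type        = string
  description = "Primary CIDR range for this environment's subnet."

  # Reject malformed ranges at plan time instead of at the GCP API.
  validation {
    condition     = can(cidrhost(var.subnet_cidr, 0))
    error_message = "subnet_cidr must be a valid CIDR block, for example 10.10.0.0/20."
  }
}

resource "google_compute_network" "this" {
  name                    = "${var.environment}-vpc"
  auto_create_subnetworks = false # subnets are defined explicitly below
}

resource "google_compute_subnetwork" "this" {
  name          = "${var.environment}-subnet"
  region        = var.region
  network       = google_compute_network.this.id
  ip_cidr_range = var.subnet_cidr
}
```

The validation only checks that the range is well formed. Keeping ranges from overlapping across environments is still a matter of choosing distinct values in each environment's inputs.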

What Makes This Work

Terragrunt's include block lets each environment inherit common configuration (provider setup, backend config) from a parent while overriding just the variables that differ. This eliminated the copy-paste problem completely.
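
A minimal illustration of that inheritance, assuming the Terragrunt v0.67.5 setup described earlier and a parent terragrunt.hcl somewhere above the environment folder. The input names and values are hypothetical.

```hcl
# terraform/envs/staging/terragrunt.hcl (illustrative sketch)

# Pull in the nearest parent terragrunt.hcl. Its backend and provider
# settings apply here as if they were written in this file.
include "root" {
  path = find_in_parent_folders()
}

# Child inputs are merged over the parent's inputs, so only values that
# differ per environment need to appear here.
inputs = {
  environment = "staging"       # hypothetical variable name
  subnet_cidr = "10.20.0.0/20"  # example range, not taken from the repo
}
```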

The plan-on-PR workflow catches mistakes before they reach any environment. Reviewers see exactly what resources will be created, modified, or destroyed.

Tag-based deployments give precise control. Instead of deploying everything on every merge, teams can promote changes through environments by tagging specific commits.

What I'd Improve

Add policy-as-code. Running OPA or Checkov in the pipeline would catch misconfigurations (overly permissive IAM, missing labels) before they get applied.

Crossplane integration. Managing infrastructure from within Kubernetes using Crossplane would let application teams self-service without touching Terraform directly.

Atlantis for PR automation. Instead of GitHub Actions running plans, Atlantis would post plan output directly as PR comments and let reviewers trigger the apply from the pull request itself.

The full implementation is on GitHub.



© 2026 Okay Kacar. All rights reserved.