docs: ADR 0001 — all generated cluster repo files must be versioned in terraform-eks-deployment #18

Merged 2 commits on Apr 21, 2026
7 changes: 4 additions & 3 deletions defaults.tf
@@ -5,10 +5,11 @@ locals {
   # Dynamic AWS profile generation
   aws_profile = "${var.cluster_config.account_name}-${var.cluster_config.environment_abbr}"
 
-  # Static template values (hidden from users)
+  # template_repo is null — all generated-repo content is managed via managed_extra_files.
+  # template-eks-cluster is a human reference only; it has no automation role.
   repository_defaults = {
-    template = "template-eks-cluster"
-    template_owner = "SCT-Engineering"
+    template = null
+    template_owner = null
   }
 
   # Static EKS configuration for Karpenter bootstrap node group
134 changes: 134 additions & 0 deletions docs/adr/0001-generated-file-source-of-truth.md
@@ -0,0 +1,134 @@
# ADR 0001: All Generated Cluster Repository Files Must Be Versioned in terraform-eks-deployment

**Date:** 2026-04-20
**Status:** Proposed
**Deciders:** arnol377, morga471

---

## Context

The EKS Cluster Automation (ECA) system generates new EKS cluster repositories by running
`terraform apply` inside a CodeBuild project (`eks-terragrunt-repo-creator`). The build
checks out a pinned commit of `terraform-eks-deployment` (via `REPO_BRANCH` in `buildspec.yml`)
and applies it, which calls `CSVD/terraform-github-repo` to create the GitHub repo and
commit all generated files via `managed_extra_files`.

The files written into a generated cluster repo fall into two categories:

1. **Rendered config files:** `_envcommon/default-versions.hcl`, `_envcommon/common-variables.hcl`,
`account.hcl`, `region.hcl`, `vpc.hcl`, `cluster.hcl`. These are rendered with Terraform's
`templatefile` from templates (`*.tf.tpl`) committed inside `terraform-eks-deployment/templates/`.

2. **Terragrunt module entrypoints:** `eks/terragrunt.hcl`, `eks-config/terragrunt.hcl`,
`eks-dns/terragrunt.hcl`, and all other `eks-*/terragrunt.hcl` files, one per
Terragrunt module in the cluster's run-all graph.

Historically, the second category was provided by cloning `template-eks-cluster` as a
GitHub repo template. The template contained placeholder directory paths
(`environment/region/vpc/cluster/eks-*/`) that were supposed to be renamed to real computed
paths after clone. That renaming was never implemented, producing broken repos with literal
`environment/region/vpc/cluster` in all paths.

PR #16 (`test_cluster` → `main`) correctly eliminates the GitHub template feature
(`template_repo = null`) but proposes reading the `eks-*/terragrunt.hcl` files live from
`template-eks-cluster:main` at Terraform plan time via `data.github_repository_file`.

This ADR records the decision about where those files should live and why.

---

## Decision

We will commit all `eks-*/terragrunt.hcl` template files directly into
`terraform-eks-deployment/templates/eks-modules/` and write them into generated repos
via `managed_extra_files`, alongside the existing rendered config files.

The `template-eks-cluster` GitHub repo will no longer be used as a source of file content
in the automation path. The GitHub template feature (`template_repo`) will remain `null`.
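
The `main.tf` change in this PR stores each module entrypoint as a flat file named
`<module>.terragrunt.hcl` under `templates/eks-modules/` and derives the target path by
splitting on the first dot. A minimal standalone sketch of that mapping for a single file
is shown here. The prefix and file name are hypothetical example values, not taken from a
real cluster.

```hcl
# Standalone sketch: evaluate with `terraform console` or `terraform apply`.
# All values below are hypothetical examples.
locals {
  example_prefix = "dev/us-gov-east-1/vpc2-dice-dev/example-cluster"
  example_fname  = "eks-karpenter.terragrunt.hcl"

  # regex() with unnamed capture groups returns a list of the captures:
  # ["eks-karpenter", "terragrunt.hcl"]. join() turns that into a relative path.
  example_target = "${local.example_prefix}/${join("/", regex("^([^.]+)\\.(.+)$", local.example_fname))}"
}

output "example_target" {
  # dev/us-gov-east-1/vpc2-dice-dev/example-cluster/eks-karpenter/terragrunt.hcl
  value = local.example_target
}
```

Because `regex` raises an error when the pattern does not match, a file in
`templates/eks-modules/` whose name contains no dot would fail at plan time, so every file
in that directory needs to follow the naming convention.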

---

## Alternatives Considered

### Option A: Read eks-module files live from `template-eks-cluster` at plan time (PR #16 approach)

`data.github_repository_file` datasources fetch each `eks-*/terragrunt.hcl` from
`template-eks-cluster:main` during `terraform plan`. They are passed into
`managed_extra_files` alongside the rendered config files.

**Rejected because:**

- **Internal consistency cannot be guaranteed.** The rendered config files
(`_envcommon/default-versions.hcl`, `_envcommon/common-variables.hcl`) are generated
from templates in `terraform-eks-deployment`. The eks-module files are fetched live from
a separate repo at a different, independently-advancing ref. A change to
`eks-karpenter/terragrunt.hcl` in `template-eks-cluster` that references a new variable
not yet present in `default-versions.hcl` will flow into new repos silently, producing
files that are internally inconsistent and will fail when terragrunt is run.

- **Partial updates are possible.** PR #16's drift-detection update mode only re-commits
files whose content changed. A template update that touches `eks-karpenter/terragrunt.hcl`
but not `default-versions.hcl` could produce a cluster repo where those two files are
at different effective versions.

- **Plan-time API coupling increases fragility.** Every `terraform plan` makes one GitHub
API call per eks-module file (currently 14 calls). If the GHE endpoint is slow or the
token lacks access, the plan fails regardless of whether the user intends to touch those
files.

- **`REPO_BRANCH` pinning is undermined.** CodeBuild pins `terraform-eks-deployment` to a
tested commit via `REPO_BRANCH`. This guarantees a known, reproducible set of Terraform
logic and defaults. Pulling supporting files from a separately-versioned repo at runtime
breaks that reproducibility guarantee — the effective artifact being applied is no longer
fully described by a single commit.
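
For readers who have not seen PR #16, the rejected approach might look roughly like the
following. This is an illustrative sketch only, not the code from that PR: the variable
`eks_module_names`, its default value, and the local `fetched_eks_module_files` are
hypothetical, and the file path assumes the placeholder directory layout described in the
Context section.

```hcl
# Sketch of the rejected Option A. Names marked "hypothetical" are not from PR #16.
terraform {
  required_providers {
    github = {
      source = "integrations/github"
    }
  }
}

variable "eks_module_names" {
  description = "Hypothetical list of module directories to fetch"
  type        = set(string)
  default     = ["eks", "eks-config", "eks-dns"]
}

# One GitHub API read per module file on every plan.
# The repository owner is assumed to come from the provider configuration.
data "github_repository_file" "eks_module" {
  for_each   = var.eks_module_names
  repository = "template-eks-cluster"
  branch     = "main" # floating ref: content can change between two plans
  file       = "environment/region/vpc/cluster/${each.key}/terragrunt.hcl"
}

locals {
  # Hypothetical map of target path => fetched content
  fetched_eks_module_files = {
    for name, f in data.github_repository_file.eks_module :
    "${name}/terragrunt.hcl" => f.content
  }
}
```

The `branch = "main"` argument is the crux of the objection: nothing in a pinned
`terraform-eks-deployment` commit records which revision of these files was read.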

### Option B: Keep `template-eks-cluster` as a GitHub repo template (previous approach)

Use the GitHub template feature to seed new repos with `eks-*/terragrunt.hcl` files and
then rename the placeholder paths via a post-apply script.

**Rejected because:**

- Placeholder paths (`environment/region/vpc/cluster/`) land in the generated repo and
cannot be easily renamed after the fact via standard Terraform resources.
- Requires an out-of-band post-apply step (script or `null_resource`) that runs outside
Terraform's state model.
- The template repo still diverges from `terraform-eks-deployment` over time (same
consistency problem as Option A).

---

## Consequences

**Positive:**

- A single commit of `terraform-eks-deployment` fully describes all files that will be
written into a generated cluster repo. Pinning `REPO_BRANCH` in `buildspec.yml` is
sufficient to produce a fully reproducible, internally consistent artifact.
- When a new eks-module version or a new variable is added, a single PR to
`terraform-eks-deployment` updates both the `eks-*/terragrunt.hcl` template and the
corresponding `default-versions.hcl` template atomically. They cannot diverge.
- No live API calls at plan time for file content. Plan performance and reliability are
not affected by the availability of `template-eks-cluster`.
- The GitHub template feature (`template_repo`) is not used, removing a dependency on a
separately-maintained repo and on GitHub's template clone behavior.

**Negative:**

- `template-eks-cluster` and `terraform-eks-deployment/templates/eks-modules/` must be
kept manually in sync if humans use the template repo as a reference. Mitigation: add a
README to `template-eks-cluster` noting that it is no longer the automation source of
truth and pointing to `terraform-eks-deployment`.
- Adding a new eks-module requires a PR to `terraform-eks-deployment` rather than just
adding a directory to `template-eks-cluster`. This is the desired behavior — changes
go through review — but is a minor workflow difference.

**Neutral:**

- `template-eks-cluster` can be archived or retained as a human-readable reference. It
is not deleted because it may still be useful for onboarding documentation.
- The `data.github_repository_file` approach in PR #16 remains valid for a future
*update* workflow (deliberately syncing template changes into existing cluster repos),
as long as that workflow operates on the `templates/eks-modules/` copy in
`terraform-eks-deployment` rather than `template-eks-cluster:main`.
16 changes: 8 additions & 8 deletions examples/adsd-tools-dev/main.tf
@@ -19,15 +19,15 @@ module "eks_deployment" {
 
   # Cluster configuration - simplified interface
   cluster_config = {
-    account_name = "adsd-tools-nonprod-gov"
-    aws_account_id = "533109815932"
-    cluster_mailing_list = "adsd.enterprise.tools.support.branch.list@census.gov"
-    environment_abbr = "prod"
+    account_name = "adsd-tools-nonprod-gov"
+    aws_account_id = "533109815932"
+    cluster_mailing_list = "adsd.enterprise.tools.support.branch.list@census.gov"
+    environment_abbr = "prod"
     finops_project_name = "adsd_etdsb_tools_migration"
-    finops_project_number = "fs0000000069"
-    finops_project_role = "adsd_tools_mgrn_eks"
-    vpc_domain_name = "dev.adsd.csp1.census.gov"
-    vpc_name = "vpc3-inf-dev"
+    finops_project_number = "fs0000000069"
+    finops_project_role = "adsd_tools_mgrn_eks"
+    vpc_domain_name = "dev.adsd.csp1.census.gov"
+    vpc_name = "vpc3-inf-dev"
     tags = {
       Owner = "adsd.enterprise.tools.support.branch.list@census.gov"
       Environment = "development"
16 changes: 8 additions & 8 deletions examples/basic/main.tf
@@ -16,15 +16,15 @@ module "eks_deployment" {
 
   # Cluster configuration - simplified interface
   cluster_config = {
-    account_name = "ma6-gov"
-    aws_account_id = "252960665057"
-    cluster_mailing_list = "adep.mojo.development.list@census.gov"
-    environment_abbr = "dev"
+    account_name = "ma6-gov"
+    aws_account_id = "252960665057"
+    cluster_mailing_list = "adep.mojo.development.list@census.gov"
+    environment_abbr = "dev"
     finops_project_name = "PPSI_DICE"
-    finops_project_number = "fs0000000015"
-    finops_project_role = "dice:dev:mojo"
-    vpc_domain_name = "dev.dice.census.gov"
-    vpc_name = "vpc2-dice-dev"
+    finops_project_number = "fs0000000015"
+    finops_project_role = "dice:dev:mojo"
+    vpc_domain_name = "dev.dice.census.gov"
+    vpc_name = "vpc2-dice-dev"
     tags = {
       Owner = "PETeam"
       Environment = "Development"
4 changes: 2 additions & 2 deletions locals.tf
@@ -123,11 +123,11 @@ locals {
   managed_extra_files = concat([
     {
       path = "_envcommon/default-versions.hcl"
-      content = templatefile("${path.module}/templates/default-versions.hcl", local.default_versions)
+      content = templatefile("${path.module}/templates/default-versions.hcl.tf.tpl", local.default_versions)
     },
     {
       path = "_envcommon/common-variables.hcl"
-      content = templatefile("${path.module}/templates/common-variables.hcl", local.common_vars)
+      content = templatefile("${path.module}/templates/common-variables.hcl.tf.tpl", local.common_vars)
     }
   ],
   var.github_actions_workflows)
21 changes: 18 additions & 3 deletions main.tf
@@ -73,6 +73,21 @@ locals {
   }
 }
 
+locals {
+  # Base path prefix for all eks-module files in the generated repo
+  eks_module_cluster_prefix = "${var.environment}/${var.region}/${var.cluster_config.vpc_name}/${var.name}"
+
+  # Auto-discover all files in templates/eks-modules/ and map them to their
+  # target paths in the generated repo. The naming convention converts
+  # "eks-karpenter.terragrunt.hcl" → "eks-karpenter/terragrunt.hcl" by
+  # splitting on the first dot.
+  eks_module_files = {
+    for fname in fileset("${path.module}/templates/eks-modules", "*") :
+    "${local.eks_module_cluster_prefix}/${join("/", regex("^([^.]+)\\.(.+)$", fname))}" =>
+    file("${path.module}/templates/eks-modules/${fname}")
+  }
+}
+
 module "github_repo" {
   source = "git::git@github.e.it.census.gov:CSVD/terraform-github-repo.git"
 
@@ -82,16 +97,16 @@ module "github_repo" {
   github_repo_topics = ["eks", "kubernetes", "terraform", "infrastructure"]
   force_name = var.force_name
 
-  template_repo_org = local.repository_defaults.template_owner
-  template_repo = local.repository_defaults.template
+  template_repo_org = null
+  template_repo = null
 
   github_is_private = false
   github_has_issues = true
   github_has_wiki = true
   github_has_projects = true
 
   managed_extra_files = [
-    for path, content in local.rendered_files : {
+    for path, content in merge(local.rendered_files, local.eks_module_files) : {
       path = path
       content = content
     }
File renamed without changes.
File renamed without changes.
86 changes: 86 additions & 0 deletions templates/eks-modules/eks-arcgis.terragrunt.hcl
@@ -0,0 +1,86 @@
include "root" {
  path = find_in_parent_folders("root.hcl")
  merge_strategy = "deep"
  expose = true
}

locals {
  # Skip this module if disabled
  skip = !lookup(include.root.locals.is_module_enabled, basename(get_terragrunt_dir()), true)
}

exclude {
  if = local.skip
  actions = ["all_except_output"]
  exclude_dependencies = false
}

terraform {
  source = "git@github.e.it.census.gov:SCT-Engineering/tfmod-ersi-arcgis.git?ref=${include.root.inputs.release_version}"
  extra_arguments "retry_lock" {
    commands = get_terraform_commands_that_need_locking()
    arguments = ["-lock-timeout=20s"]
  }
}

dependency "eks" {
  config_path = "../eks"
  mock_outputs_allowed_terraform_commands = ["init", "plan", "validate", "destroy"]
  mock_outputs = {
    cluster_name = "mock-cluster"
  }
}

dependency "eks_config" {
  config_path = "../eks-config"
  mock_outputs_allowed_terraform_commands = ["init", "plan", "validate", "destroy"]
  mock_outputs = {
    rwo_storage_class = "gp3-mock"
  }
}

dependency "eks_dns" {
  config_path = "../eks-dns"
  mock_outputs_allowed_terraform_commands = ["init", "plan", "validate", "destroy"]
  mock_outputs = {
    cluster_domain = "mock.domain.example.com"
  }
}

dependencies {
  paths = [
    "../eks",
    "../eks-config",
    "../eks-dns",
    "../eks-kiali",
  ]
}

inputs = {
  # AWS Configuration
  account_id = include.root.inputs.aws_account_id
  profile = include.root.inputs.aws_profile
  region = include.root.inputs.aws_region
  eecr_info = include.root.inputs.eecr_info

  # Cluster Configuration
  cluster_domain = dependency.eks_dns.outputs.cluster_domain
  cluster_name = dependency.eks.outputs.cluster_name
  namespace = "arcgis"
  rwo_storage_class = dependency.eks_config.outputs.rwo_storage_class

  # Dockerhub Creds
  dockerhub_username = ""
  dockerhub_password = ""

  # ArcGIS Config
  ersi_image_tag = "11.4.0.6285"
  arcgis_license_json = ""
  arcgis_admin_username = "admin"
  arcgis_admin_password = "password"
  arcgis_admin_email = include.root.inputs.cluster_mailing_list
  arcgis_admin_firstname = "admin"
  arcgis_admin_lastname = "admin"
  arcgis_security_question_index = 1
  arcgis_security_question_answer = "Las Vegas"
}