docs: ADR 0001 — all generated cluster repo files must be versioned in terraform-eks-deployment (#18)

* docs: add ADR 0001 — all generated cluster repo files versioned in terraform-eks-deployment

* feat: move all generated-repo file management into terraform-eks-deployment

- Add all 20 eks-module terragrunt.hcl files to templates/eks-modules/;
  these are written verbatim (file()) into the generated repo under
  $env/$region/$vpc/$cluster/<module>/terragrunt.hcl
- Wire templates/eks-modules/ into managed_extra_files via a fileset-based
  local (eks_module_files) merged with the existing rendered_files local
- Set template_repo = null in both defaults.tf and main.tf; all generated
  repo content now flows exclusively through managed_extra_files with no
  GitHub template seeding
- Rename templates/common-variables.hcl and templates/default-versions.hcl
  to *.hcl.tf.tpl to match the existing convention: .tf.tpl = parsed by
  templatefile(), eks-modules/*.hcl = written verbatim
- Update locals.tf references for the renamed template files
- Fix alignment formatting in examples/basic and examples/adsd-tools-dev
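
A minimal sketch of the filename-to-path convention described in the bullets above, runnable on its own with `terraform apply` (no providers needed). The prefix value and the two file names are illustrative placeholders, and a literal set stands in for the `fileset()` call so the example does not depend on files on disk.

```hcl
# Sketch only: the prefix and file names are hypothetical placeholders.
locals {
  # Hypothetical prefix; the module derives the real one from its input variables.
  prefix = "dev/us-gov-east-1/vpc-example/cluster-example"

  # Stand-in for fileset("${path.module}/templates/eks-modules", "*").
  template_names = toset([
    "eks.terragrunt.hcl",
    "eks-karpenter.terragrunt.hcl",
  ])

  # Split each name on its first dot: "<module>.<rest>" becomes "<module>/<rest>".
  target_paths = {
    for fname in local.template_names :
    fname => "${local.prefix}/${join("/", regex("^([^.]+)\\.(.+)$", fname))}"
  }
}

output "target_paths" {
  value = local.target_paths
}
```

Under these assumptions, `eks-karpenter.terragrunt.hcl` maps to `dev/us-gov-east-1/vpc-example/cluster-example/eks-karpenter/terragrunt.hcl`. Because the pattern splits on the first dot only, module names may contain hyphens but not dots, and a template name with no dot at all would make the `regex()` call fail.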

---------

Co-authored-by: Dave Arnold <user@example.com>
arnol377 and Dave Arnold committed Apr 21, 2026
1 parent b4f330a commit dcf8501
Showing 28 changed files with 1,871 additions and 24 deletions.
7 changes: 4 additions & 3 deletions defaults.tf
@@ -5,10 +5,11 @@ locals {
   # Dynamic AWS profile generation
   aws_profile = "${var.cluster_config.account_name}-${var.cluster_config.environment_abbr}"

-  # Static template values (hidden from users)
+  # template_repo is null — all generated-repo content is managed via managed_extra_files.
+  # template-eks-cluster is a human reference only; it has no automation role.
   repository_defaults = {
-    template = "template-eks-cluster"
-    template_owner = "SCT-Engineering"
+    template = null
+    template_owner = null
   }

   # Static EKS configuration for Karpenter bootstrap node group
134 changes: 134 additions & 0 deletions docs/adr/0001-generated-file-source-of-truth.md
@@ -0,0 +1,134 @@
# ADR 0001: All Generated Cluster Repository Files Must Be Versioned in terraform-eks-deployment

**Date:** 2026-04-20
**Status:** Proposed
**Deciders:** arnol377, morga471

---

## Context

The EKS Cluster Automation (ECA) system generates new EKS cluster repositories by running
`terraform apply` inside a CodeBuild project (`eks-terragrunt-repo-creator`). The build
checks out a pinned commit of `terraform-eks-deployment` (via `REPO_BRANCH` in `buildspec.yml`)
and applies it, which calls `CSVD/terraform-github-repo` to create the GitHub repo and
commit all generated files via `managed_extra_files`.

The files written into a generated cluster repo fall into two categories:

1. **Rendered config files**: `_envcommon/default-versions.hcl`, `_envcommon/common-variables.hcl`,
`account.hcl`, `region.hcl`, `vpc.hcl`, `cluster.hcl`. These are rendered with `templatefile()` from
Terraform templates (`*.tf.tpl`) committed inside `terraform-eks-deployment/templates/`.

2. **Terragrunt module entrypoints**: `eks/terragrunt.hcl`, `eks-config/terragrunt.hcl`,
`eks-dns/terragrunt.hcl`, and all other `eks-*/terragrunt.hcl` files, one per
Terragrunt module in the cluster's run-all graph.

Historically, the second category was provided by cloning `template-eks-cluster` as a
GitHub repo template. The template contained placeholder directory paths
(`environment/region/vpc/cluster/eks-*/`) that were supposed to be renamed to real computed
paths after clone. That renaming was never implemented, producing broken repos with literal
`environment/region/vpc/cluster` in all paths.

PR #16 (`test_cluster` → `main`) correctly eliminates the GitHub template feature
(`template_repo = null`) but proposes reading the `eks-*/terragrunt.hcl` files live from
`template-eks-cluster:main` at Terraform plan time via `data.github_repository_file`.

This ADR records the decision about where those files should live and why.

---

## Decision

We will commit all `eks-*/terragrunt.hcl` template files directly into
`terraform-eks-deployment/templates/eks-modules/` and write them into generated repos
via `managed_extra_files`, alongside the existing rendered config files.

The `template-eks-cluster` GitHub repo will no longer be used as a source of file content
in the automation path. The GitHub template feature (`template_repo`) will remain `null`.
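
A minimal sketch of how the two file categories can be combined into the list-of-objects shape passed to `managed_extra_files`. All paths and contents are illustrative placeholders; in the module itself the values come from `templatefile()` and `file()` rather than inline strings.

```hcl
# Sketch only: every path and content string below is a hypothetical placeholder.
locals {
  # Stand-in for the rendered config files (templatefile() output in the module).
  rendered_files = {
    "_envcommon/default-versions.hcl" = "# rendered versions"
    "account.hcl"                     = "# rendered account settings"
  }

  # Stand-in for the verbatim eks-module files (file() output in the module).
  eks_module_files = {
    "dev/us-gov-east-1/vpc-example/cluster-example/eks/terragrunt.hcl" = "# verbatim entrypoint"
  }

  # merge() combines both maps; on a duplicate key the later argument wins,
  # so the two maps should use disjoint paths.
  managed_extra_files = [
    for path, content in merge(local.rendered_files, local.eks_module_files) : {
      path    = path
      content = content
    }
  ]
}

output "managed_paths" {
  value = local.managed_extra_files[*].path
}
```

Keying both maps by target path keeps the merge simple, but it also means a path collision would be resolved silently in favor of the second map rather than raising an error.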

---

## Alternatives Considered

### Option A: Read eks-module files live from `template-eks-cluster` at plan time (PR #16 approach)

`data.github_repository_file` datasources fetch each `eks-*/terragrunt.hcl` from
`template-eks-cluster:main` during `terraform plan`. They are passed into
`managed_extra_files` alongside the rendered config files.

**Rejected because:**

- **Internal consistency cannot be guaranteed.** The rendered config files
(`_envcommon/default-versions.hcl`, `_envcommon/common-variables.hcl`) are generated
from templates in `terraform-eks-deployment`. The eks-module files are fetched live from
a separate repo at a different, independently-advancing ref. A change to
`eks-karpenter/terragrunt.hcl` in `template-eks-cluster` that references a new variable
not yet present in `default-versions.hcl` will flow into new repos silently, producing
files that are internally inconsistent and will fail when terragrunt is run.

- **Partial updates are possible.** PR #16's drift-detection update mode only re-commits
files whose content changed. A template update that touches `eks-karpenter/terragrunt.hcl`
but not `default-versions.hcl` could produce a cluster repo where those two files are
at different effective versions.

- **Plan-time API coupling increases fragility.** Every `terraform plan` makes one GitHub
API call per eks-module file (currently 14 calls). If the GHE endpoint is slow or the
token lacks access, the plan fails regardless of whether the user intends to touch those
files.

- **`REPO_BRANCH` pinning is undermined.** CodeBuild pins `terraform-eks-deployment` to a
tested commit via `REPO_BRANCH`. This guarantees a known, reproducible set of Terraform
logic and defaults. Pulling supporting files from a separately-versioned repo at runtime
breaks that reproducibility guarantee — the effective artifact being applied is no longer
fully described by a single commit.

### Option B: Keep `template-eks-cluster` as a GitHub repo template (previous approach)

Use the GitHub template feature to seed new repos with `eks-*/terragrunt.hcl` files and
then rename the placeholder paths via a post-apply script.

**Rejected because:**

- Placeholder paths (`environment/region/vpc/cluster/`) land in the generated repo and
cannot be easily renamed after the fact via standard Terraform resources.
- Requires an out-of-band post-apply step (script or `null_resource`) that runs outside
Terraform's state model.
- The template repo still diverges from `terraform-eks-deployment` over time (same
consistency problem as Option A).

---

## Consequences

**Positive:**

- A single commit of `terraform-eks-deployment` fully describes all files that will be
written into a generated cluster repo. Pinning `REPO_BRANCH` in `buildspec.yml` is
sufficient to produce a fully reproducible, internally consistent artifact.
- When a new eks-module version or a new variable is added, a single PR to
`terraform-eks-deployment` updates both the `eks-*/terragrunt.hcl` template and the
corresponding `default-versions.hcl` template atomically. They cannot diverge.
- No live API calls at plan time for file content. Plan performance and reliability are
not affected by the availability of `template-eks-cluster`.
- The GitHub template feature (`template_repo`) is not used, removing a dependency on a
separately-maintained repo and on GitHub's template clone behavior.
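
As an illustration of what a single source of truth makes possible, the sketch below compares the version keys referenced by module files against the keys defined in a version map, all within one configuration. Everything here is hypothetical: the `versions.<name>` reference style, the file contents, and the local names are invented for the example and are not taken from the real templates.

```hcl
# Sketch only: the reference pattern and all values are hypothetical.
locals {
  # Hypothetical version map, standing in for local.default_versions.
  default_versions = {
    eks       = "1.0.0"
    karpenter = "2.0.0"
  }

  # Hypothetical module file contents; real ones would be read with file().
  module_files = {
    "eks-karpenter/terragrunt.hcl" = "ref = include.root.locals.versions.karpenter"
    "eks-kiali/terragrunt.hcl"     = "ref = include.root.locals.versions.kiali"
  }

  # Collect every name that follows "versions." in any module file.
  referenced = toset(flatten([
    for content in values(local.module_files) :
    regexall("versions\\.([a-z_]+)", content)
  ]))

  # Names that are referenced but not defined in the version map.
  missing = setsubtract(local.referenced, toset(keys(local.default_versions)))
}

output "missing_versions" {
  value = local.missing
}
```

With these placeholder values the output would report `kiali` as referenced but undefined. A comparison like this is only meaningful when both inputs come from the same commit, which is the property this decision provides.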

**Negative:**

- `template-eks-cluster` and `terraform-eks-deployment/templates/eks-modules/` must be
kept manually in sync if humans use the template repo as a reference. Mitigation: add a
README to `template-eks-cluster` noting that it is no longer the automation source of
truth and pointing to `terraform-eks-deployment`.
- Adding a new eks-module requires a PR to `terraform-eks-deployment` rather than just
adding a directory to `template-eks-cluster`. This is the desired behavior — changes
go through review — but is a minor workflow difference.

**Neutral:**

- `template-eks-cluster` can be archived or retained as a human-readable reference. It
is not deleted because it may still be useful for onboarding documentation.
- The `data.github_repository_file` approach in PR #16 remains valid for a future
*update* workflow (deliberately syncing template changes into existing cluster repos),
as long as that workflow operates on the `templates/eks-modules/` copy in
`terraform-eks-deployment` rather than `template-eks-cluster:main`.
16 changes: 8 additions & 8 deletions examples/adsd-tools-dev/main.tf
@@ -19,15 +19,15 @@ module "eks_deployment" {

   # Cluster configuration - simplified interface
   cluster_config = {
-    account_name = "adsd-tools-nonprod-gov"
-    aws_account_id = "533109815932"
-    cluster_mailing_list = "adsd.enterprise.tools.support.branch.list@census.gov"
-    environment_abbr = "prod"
+    account_name = "adsd-tools-nonprod-gov"
+    aws_account_id = "533109815932"
+    cluster_mailing_list = "adsd.enterprise.tools.support.branch.list@census.gov"
+    environment_abbr = "prod"
     finops_project_name = "adsd_etdsb_tools_migration"
-    finops_project_number = "fs0000000069"
-    finops_project_role = "adsd_tools_mgrn_eks"
-    vpc_domain_name = "dev.adsd.csp1.census.gov"
-    vpc_name = "vpc3-inf-dev"
+    finops_project_number = "fs0000000069"
+    finops_project_role = "adsd_tools_mgrn_eks"
+    vpc_domain_name = "dev.adsd.csp1.census.gov"
+    vpc_name = "vpc3-inf-dev"
     tags = {
       Owner = "adsd.enterprise.tools.support.branch.list@census.gov"
       Environment = "development"
16 changes: 8 additions & 8 deletions examples/basic/main.tf
@@ -16,15 +16,15 @@ module "eks_deployment" {

   # Cluster configuration - simplified interface
   cluster_config = {
-    account_name = "ma6-gov"
-    aws_account_id = "252960665057"
-    cluster_mailing_list = "adep.mojo.development.list@census.gov"
-    environment_abbr = "dev"
+    account_name = "ma6-gov"
+    aws_account_id = "252960665057"
+    cluster_mailing_list = "adep.mojo.development.list@census.gov"
+    environment_abbr = "dev"
     finops_project_name = "PPSI_DICE"
-    finops_project_number = "fs0000000015"
-    finops_project_role = "dice:dev:mojo"
-    vpc_domain_name = "dev.dice.census.gov"
-    vpc_name = "vpc2-dice-dev"
+    finops_project_number = "fs0000000015"
+    finops_project_role = "dice:dev:mojo"
+    vpc_domain_name = "dev.dice.census.gov"
+    vpc_name = "vpc2-dice-dev"
     tags = {
       Owner = "PETeam"
       Environment = "Development"
4 changes: 2 additions & 2 deletions locals.tf
@@ -123,11 +123,11 @@ locals {
   managed_extra_files = concat([
     {
       path = "_envcommon/default-versions.hcl"
-      content = templatefile("${path.module}/templates/default-versions.hcl", local.default_versions)
+      content = templatefile("${path.module}/templates/default-versions.hcl.tf.tpl", local.default_versions)
     },
     {
       path = "_envcommon/common-variables.hcl"
-      content = templatefile("${path.module}/templates/common-variables.hcl", local.common_vars)
+      content = templatefile("${path.module}/templates/common-variables.hcl.tf.tpl", local.common_vars)
     }
   ],
   var.github_actions_workflows)
21 changes: 18 additions & 3 deletions main.tf
@@ -73,6 +73,21 @@ locals {
   }
 }

+locals {
+  # Base path prefix for all eks-module files in the generated repo
+  eks_module_cluster_prefix = "${var.environment}/${var.region}/${var.cluster_config.vpc_name}/${var.name}"
+
+  # Auto-discover all files in templates/eks-modules/ and map them to their
+  # target paths in the generated repo. The naming convention converts
+  # "eks-karpenter.terragrunt.hcl" → "eks-karpenter/terragrunt.hcl" by
+  # splitting on the first dot.
+  eks_module_files = {
+    for fname in fileset("${path.module}/templates/eks-modules", "*") :
+    "${local.eks_module_cluster_prefix}/${join("/", regex("^([^.]+)\\.(.+)$", fname))}" =>
+    file("${path.module}/templates/eks-modules/${fname}")
+  }
+}
+
 module "github_repo" {
   source = "git::git@github.e.it.census.gov:CSVD/terraform-github-repo.git"

@@ -82,16 +97,16 @@ module "github_repo" {
   github_repo_topics = ["eks", "kubernetes", "terraform", "infrastructure"]
   force_name = var.force_name

-  template_repo_org = local.repository_defaults.template_owner
-  template_repo = local.repository_defaults.template
+  template_repo_org = null
+  template_repo = null

   github_is_private = false
   github_has_issues = true
   github_has_wiki = true
   github_has_projects = true

   managed_extra_files = [
-    for path, content in local.rendered_files : {
+    for path, content in merge(local.rendered_files, local.eks_module_files) : {
       path = path
       content = content
     }
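templates/common-variables.hcl → templates/common-variables.hcl.tf.tpl
File renamed without changes.
templates/default-versions.hcl → templates/default-versions.hcl.tf.tpl
File renamed without changes.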
86 changes: 86 additions & 0 deletions templates/eks-modules/eks-arcgis.terragrunt.hcl
@@ -0,0 +1,86 @@
include "root" {
  path = find_in_parent_folders("root.hcl")
  merge_strategy = "deep"
  expose = true
}

locals {
  # Skip this module if disabled
  skip = !lookup(include.root.locals.is_module_enabled, basename(get_terragrunt_dir()), true)
}

exclude {
  if = local.skip
  actions = ["all_except_output"]
  exclude_dependencies = false
}

terraform {
  source = "git@github.e.it.census.gov:SCT-Engineering/tfmod-ersi-arcgis.git?ref=${include.root.inputs.release_version}"
  extra_arguments "retry_lock" {
    commands = get_terraform_commands_that_need_locking()
    arguments = ["-lock-timeout=20s"]
  }
}

dependency "eks" {
  config_path = "../eks"
  mock_outputs_allowed_terraform_commands = ["init", "plan", "validate", "destroy"]
  mock_outputs = {
    cluster_name = "mock-cluster"
  }
}

dependency "eks_config" {
  config_path = "../eks-config"
  mock_outputs_allowed_terraform_commands = ["init", "plan", "validate", "destroy"]
  mock_outputs = {
    rwo_storage_class = "gp3-mock"
  }
}

dependency "eks_dns" {
  config_path = "../eks-dns"
  mock_outputs_allowed_terraform_commands = ["init", "plan", "validate", "destroy"]
  mock_outputs = {
    cluster_domain = "mock.domain.example.com"
  }
}

dependencies {
  paths = [
    "../eks",
    "../eks-config",
    "../eks-dns",
    "../eks-kiali",
  ]
}

inputs = {
  # AWS Configuration
  account_id = include.root.inputs.aws_account_id
  profile = include.root.inputs.aws_profile
  region = include.root.inputs.aws_region
  eecr_info = include.root.inputs.eecr_info

  # Cluster Configuration
  cluster_domain = dependency.eks_dns.outputs.cluster_domain
  cluster_name = dependency.eks.outputs.cluster_name
  namespace = "arcgis"
  rwo_storage_class = dependency.eks_config.outputs.rwo_storage_class

  # Dockerhub Creds
  dockerhub_username = ""
  dockerhub_password = ""

  # ArcGIS Config
  ersi_image_tag = "11.4.0.6285"
  arcgis_license_json = ""
  arcgis_admin_username = "admin"
  arcgis_admin_password = "password"
  arcgis_admin_email = include.root.inputs.cluster_mailing_list
  arcgis_admin_firstname = "admin"
  arcgis_admin_lastname = "admin"
  arcgis_security_question_index = 1
  arcgis_security_question_answer = "Las Vegas"
}