docs: add ADR 0001 — all generated cluster repo files versioned in te…
…rraform-eks-deployment
Dave Arnold committed Apr 20, 2026
1 parent b4f330a commit edfe5d5
Showing 1 changed file with 134 additions and 0 deletions.
134 changes: 134 additions & 0 deletions docs/adr/0001-generated-file-source-of-truth.md
@@ -0,0 +1,134 @@
# ADR 0001: All Generated Cluster Repository Files Must Be Versioned in terraform-eks-deployment

**Date:** 2026-04-20
**Status:** Proposed
**Deciders:** arnol377, morga471

---

## Context

The EKS Cluster Automation (ECA) system generates new EKS cluster repositories by running
`terraform apply` inside a CodeBuild project (`eks-terragrunt-repo-creator`). The build
checks out a pinned commit of `terraform-eks-deployment` (via `REPO_BRANCH` in `buildspec.yml`)
and applies it, which calls `CSVD/terraform-github-repo` to create the GitHub repo and
commit all generated files via `managed_extra_files`.

The files written into a generated cluster repo fall into two categories:

1. **Rendered config files:** `_envcommon/default-versions.hcl`, `_envcommon/common-variables.hcl`,
`account.hcl`, `region.hcl`, `vpc.hcl`, and `cluster.hcl`, rendered from Go templates
(`*.tf.tpl`) committed inside `terraform-eks-deployment/templates/`.

2. **Terragrunt module entrypoints:** `eks/terragrunt.hcl`, `eks-config/terragrunt.hcl`,
`eks-dns/terragrunt.hcl`, and all other `eks-*/terragrunt.hcl` files, one per
Terragrunt module in the cluster's run-all graph.

Historically, the second category was provided by cloning `template-eks-cluster` as a
GitHub repo template. The template contained placeholder directory paths
(`environment/region/vpc/cluster/eks-*/`) that were supposed to be renamed to real computed
paths after clone. That renaming was never implemented, producing broken repos with literal
`environment/region/vpc/cluster` in all paths.

PR #16 (`test_cluster` → `main`) correctly eliminates the GitHub template feature
(`template_repo = null`) but proposes reading the `eks-*/terragrunt.hcl` files live from
`template-eks-cluster:main` at Terraform plan time via `data.github_repository_file`.

This ADR records the decision about where those files should live and why.

---

## Decision

We will commit all `eks-*/terragrunt.hcl` template files directly into
`terraform-eks-deployment/templates/eks-modules/` and write them into generated repos
via `managed_extra_files`, alongside the existing rendered config files.
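
As a rough illustration of this decision, the committed templates could be collected with Terraform's built-in `fileset` and `file` functions and turned into a path-to-content map. This is a minimal sketch under stated assumptions, not the repository's actual code: the variable `cluster_path`, the local names, the one-directory-per-module layout under `templates/eks-modules/`, and the idea that `managed_extra_files` accepts a path-to-content map are all hypothetical.

```hcl
# Hypothetical input: the real computed path prefix for the cluster
# (the replacement for the old placeholder directories).
variable "cluster_path" {
  description = "Computed directory prefix inside the generated cluster repo (illustrative)."
  type        = string
}

locals {
  # Directory in terraform-eks-deployment holding the committed templates.
  eks_module_template_dir = "${path.module}/templates/eks-modules"

  # Every <module>/terragrunt.hcl one level below that directory,
  # as paths relative to it. Assumes one subdirectory per module.
  eks_module_paths = fileset(local.eks_module_template_dir, "*/terragrunt.hcl")

  # Destination path => file content, read from the pinned checkout only.
  # Assumption: files are copied verbatim, with no templating step.
  eks_module_files = {
    for p in local.eks_module_paths :
    "${var.cluster_path}/${p}" => file("${local.eks_module_template_dir}/${p}")
  }
}

# Exposed so the map can be merged with the rendered config files and
# handed to the repo module's managed_extra_files input (shape assumed).
output "eks_module_files" {
  value = local.eks_module_files
}
```

Because every value comes from `file()` on the local checkout, the map is determined entirely by the commit that CodeBuild has checked out.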

The `template-eks-cluster` GitHub repo will no longer be used as a source of file content
in the automation path. The GitHub template feature (`template_repo`) will remain `null`.

---

## Alternatives Considered

### Option A: Read eks-module files live from `template-eks-cluster` at plan time (PR #16 approach)

`data.github_repository_file` data sources fetch each `eks-*/terragrunt.hcl` from
`template-eks-cluster:main` during `terraform plan`. They are passed into
`managed_extra_files` alongside the rendered config files.

**Rejected because:**

- **Internal consistency cannot be guaranteed.** The rendered config files
(`_envcommon/default-versions.hcl`, `_envcommon/common-variables.hcl`) are generated
from templates in `terraform-eks-deployment`. The eks-module files are fetched live from
a separate repo at a different, independently advancing ref. A change to
`eks-karpenter/terragrunt.hcl` in `template-eks-cluster` that references a new variable
not yet present in `default-versions.hcl` will flow into new repos silently, producing
files that are internally inconsistent and will fail when terragrunt is run.

- **Partial updates are possible.** PR #16's drift-detection update mode only re-commits
files whose content changed. A template update that touches `eks-karpenter/terragrunt.hcl`
but not `default-versions.hcl` could produce a cluster repo where those two files are
at different effective versions.

- **Plan-time API coupling increases fragility.** Every `terraform plan` makes one GitHub
API call per eks-module file (currently 14 calls). If the GHE endpoint is slow or the
token lacks access, the plan fails regardless of whether the user intends to touch those
files.

- **`REPO_BRANCH` pinning is undermined.** CodeBuild pins `terraform-eks-deployment` to a
tested commit via `REPO_BRANCH`. This guarantees a known, reproducible set of Terraform
logic and defaults. Pulling supporting files from a separately versioned repo at runtime
breaks that reproducibility guarantee — the effective artifact being applied is no longer
fully described by a single commit.
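
For reference, the rejected pattern looks roughly like the following. This is a hypothetical reconstruction, since PR #16's actual code is not reproduced in this ADR: the variable and local names are illustrative, and only the `github_repository_file` data source with its `repository`, `branch`, and `file` arguments and `content` attribute comes from the GitHub provider.

```hcl
terraform {
  required_providers {
    github = {
      source = "integrations/github"
    }
  }
}

# Hypothetical input: relative paths of the module entrypoints to fetch,
# e.g. ["eks/terragrunt.hcl", "eks-karpenter/terragrunt.hcl"].
variable "eks_module_paths" {
  type = set(string)
}

# One data source instance, and therefore one GitHub API read, per file
# on every plan.
data "github_repository_file" "eks_module" {
  for_each   = var.eks_module_paths
  repository = "template-eks-cluster"
  branch     = "main" # moving ref: content can change between two plans
  file       = each.value
}

locals {
  # Path => content, resolved at plan time from whatever main points to.
  live_eks_module_files = {
    for path, f in data.github_repository_file.eks_module : path => f.content
  }
}
```

The `branch = "main"` argument is where the reproducibility problem enters: the fetched content depends on when the plan runs, not on the `terraform-eks-deployment` commit pinned by `REPO_BRANCH`.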

### Option B: Keep `template-eks-cluster` as a GitHub repo template (previous approach)

Use the GitHub template feature to seed new repos with `eks-*/terragrunt.hcl` files and
then rename the placeholder paths via a post-apply script.

**Rejected because:**

- Placeholder paths (`environment/region/vpc/cluster/`) land in the generated repo and
cannot be easily renamed after the fact via standard Terraform resources.
- Requires an out-of-band post-apply step (script or `null_resource`) that runs outside
Terraform's state model.
- The template repo still diverges from `terraform-eks-deployment` over time (same
consistency problem as Option A).

---

## Consequences

**Positive:**

- A single commit of `terraform-eks-deployment` fully describes all files that will be
written into a generated cluster repo. Pinning `REPO_BRANCH` in `buildspec.yml` is
sufficient to produce a fully reproducible, internally consistent artifact.
- When a new eks-module version or a new variable is added, a single PR to
`terraform-eks-deployment` updates both the `eks-*/terragrunt.hcl` template and the
corresponding `default-versions.hcl` template atomically. They cannot diverge.
- No live API calls at plan time for file content. Plan performance and reliability are
not affected by the availability of `template-eks-cluster`.
- The GitHub template feature (`template_repo`) is not used, removing a dependency on a
separately maintained repo and on GitHub's template clone behavior.

**Negative:**

- `template-eks-cluster` and `terraform-eks-deployment/templates/eks-modules/` must be
kept manually in sync if humans use the template repo as a reference. Mitigation: add a
README to `template-eks-cluster` noting that it is no longer the automation source of
truth and pointing to `terraform-eks-deployment`.
- Adding a new eks-module requires a PR to `terraform-eks-deployment` rather than just
adding a directory to `template-eks-cluster`. This is the desired behavior — changes
go through review — but is a minor workflow difference.

**Neutral:**

- `template-eks-cluster` can be archived or retained as a human-readable reference. It
is not deleted because it may still be useful for onboarding documentation.
- The `data.github_repository_file` approach in PR #16 remains valid for a future
*update* workflow (deliberately syncing template changes into existing cluster repos),
as long as that workflow operates on the `templates/eks-modules/` copy in
`terraform-eks-deployment` rather than `template-eks-cluster:main`.
