From da7cc4e4d8e54be6780c4127672d212ca42886cc Mon Sep 17 00:00:00 2001
From: Dave Arnold
Date: Wed, 20 May 2026 13:37:54 -0400
Subject: [PATCH] feat: executor commit-back; terraform_latest; plugin cache; lock file; plan gitignore

buildspec-executor.yml:
- INSTALL: create terraform_latest symlink -> terraform (account repos use TFCOMMAND=terraform_latest)
- INSTALL: mkdir /data/terraform/terraform.d/plugin-cache + providers (required by .tf-control.tfrc)
- BUILD: after tf-run apply, git add symlink re-link + .terraform.lock.hcl and push directly to main with [skip ci] to prevent webhook re-trigger
- Add CodeBuild cache block for /data/terraform/terraform.d/plugin-cache (persists provider archives across builds via S3)
- Add log note: logs/ is ephemeral, must be in .gitignore

docs/HOW-IT-WORKS.md:
- INSTALL phase: document terraform_latest alias and /data/terraform dir creation
- BUILD phase step 5: document symlink re-link + lock file commit-back with rationale

docs/template-management.md:
- Template structure: add .gitignore and .terraform.lock.hcl to workspace dirs
- Layout rules: add .gitignore required entries (logs/, .terraform/, tfstate*)
- Layout rules: explain .terraform.lock.hcl lifecycle (committed, Executor updates + pushes back)
- Layout rules: explain terraform_latest alias and plugin cache/.tf-control.tfrc behavior
---
 buildspec-executor.yml      | 45 +++++++++++++++++++++++++++++++++++++
 docs/HOW-IT-WORKS.md        |  7 ++++++
 docs/template-management.md | 14 ++++++++++++
 3 files changed, 66 insertions(+)

diff --git a/buildspec-executor.yml b/buildspec-executor.yml
index e9d6629..49e1d50 100644
--- a/buildspec-executor.yml
+++ b/buildspec-executor.yml
@@ -64,6 +64,16 @@ phases:
         for action in init plan apply destroy refresh output validate import state fmt taint console; do
           ln -sf /usr/local/bin/tf-control.sh /usr/local/bin/tf-${action};
         done
+      # Account repo .tf-control files set TFCOMMAND=terraform_latest (the Census workstation alias).
+      # In CodeBuild the binary is just 'terraform'; create the alias so tf-control.sh resolves it.
+      - ln -sf /usr/local/bin/terraform /usr/local/bin/terraform_latest
+
+      # --- Plugin cache directory (referenced by .tf-control.tfrc in every account repo) ---
+      # .tf-control.tfrc sets plugin_cache_dir = "/data/terraform/terraform.d/plugin-cache"
+      # and filesystem_mirror path = "/data/terraform/terraform.d/providers".
+      # Create both so Terraform does not error on init; the mirror is empty so Terraform
+      # falls through to the 'direct' block in the tfrc (via Census proxy to registry.terraform.io).
+      - mkdir -p /data/terraform/terraform.d/plugin-cache /data/terraform/terraform.d/providers
 
       # --- Python deps for tf-directory-setup.py ---
       - pip3 install --quiet python-dateutil pyyaml
@@ -118,6 +128,9 @@ phases:
       # The Proposer already ran both of these and committed the results in the PR.
       # When tf-run hits these steps here they are idempotent: they overwrite files
       # that already exist with identical content. No new files are created at apply time.
+      #
+      # NOTE on logs/: tf-control.sh writes every plan/apply to logs/{action}.{timestamp}.log.
+      # This directory is ephemeral (never committed). Ensure logs/ is in .gitignore.
       - cd "${LAYER}/${REGION_DIR}"
       - |
         if [ "${DRY_RUN}" = "true" ]; then
@@ -129,8 +142,40 @@ phases:
           TFARGS="-auto-approve" tf-run apply
         fi
 
+      # --- Commit post-apply file changes back to main ---
+      # After a successful apply tf-run.data typically runs:
+      #   COMMAND tf-directory-setup.py --link s3
+      # which re-links remote_state.{dir}.tf from .tf.none → .tf.s3.
+      # terraform init also generates/updates .terraform.lock.hcl.
+      # Both of these changes must be committed back to main so:
+      #   (a) the repo reflects actual state for future Proposer re-renders
+      #   (b) subsequent tf-init on main does not re-download all providers
+      # [skip ci] prevents the push from re-triggering the webhook executor.
+      - cd "${CODEBUILD_SRC_DIR}/repo"
+      - |
+        git add -A -- "${LAYER}/${REGION_DIR}/remote_state."* \
+                      "${LAYER}/${REGION_DIR}/.terraform.lock.hcl" 2>/dev/null || true
+        if ! git diff --cached --quiet; then
+          git -c user.email="sc-automation@census.gov" \
+              -c user.name="SC Automation" \
+              commit -m "chore: executor post-apply update ${LAYER}/${REGION_DIR} [skip ci]"
+          git push \
+            "https://${GITHUB_TOKEN}@github.e.it.census.gov/${GITHUB_ORG}/${ACCOUNT_REPO}.git" \
+            HEAD:main
+          echo "Committed and pushed post-apply changes to main"
+        else
+          echo "No post-apply file changes to commit"
+        fi
+
   post_build:
     commands:
       - echo "BUILD_RESULT=${CODEBUILD_BUILD_SUCCEEDING}"
       - echo "ACCOUNT_REPO=${ACCOUNT_REPO}"
       - echo "LAYER=${LAYER} REGION_DIR=${REGION_DIR}"
+
+cache:
+  paths:
+    # Cache the provider plugin cache across builds for faster tf-init.
+    # Providers downloaded via Census proxy are stored here; subsequent builds
+    # skip re-downloading providers that haven't changed.
+    - /data/terraform/terraform.d/plugin-cache/**/*
diff --git a/docs/HOW-IT-WORKS.md b/docs/HOW-IT-WORKS.md
index 5bb5b5b..b507b94 100644
--- a/docs/HOW-IT-WORKS.md
+++ b/docs/HOW-IT-WORKS.md
@@ -320,6 +320,8 @@ The Lambda (webhook handler mode):
 - Clones `github.e.it.census.gov/terraform/support` for version governance
 - Downloads Terraform binary from S3 (version governed by `VERSION_TF`)
 - Installs tf-run toolchain scripts from the support repo
+- Creates `terraform_latest` symlink → `terraform` (account repos set `TFCOMMAND=terraform_latest` in `.tf-control`)
+- Creates `/data/terraform/terraform.d/plugin-cache/` and `/data/terraform/terraform.d/providers/` (required by `.tf-control.tfrc` `plugin_cache_dir` and `filesystem_mirror` directives)
 - Downloads and installs Census CA cert
 - Downloads and installs `gh` CLI
 - `pip3 install python-dateutil pyyaml`
@@ -333,6 +335,11 @@ The Lambda (webhook handler mode):
 
 3. `cd ${LAYER}/${REGION_DIR}`
 4. If `DRY_RUN=true`: `tf-run plan`; else: `tf-run apply` (with optional `--start-tag ${TF_RUN_START_TAG}`)
+5. **Commit post-apply changes back to `main`** — two categories of files change after a successful apply:
+   - **Symlink re-link**: `tf-run.data` typically contains `COMMAND tf-directory-setup.py --link s3` which changes the `remote_state.{dir}.tf` symlink from `.tf.none` → `.tf.s3`. This must be pushed back so future Proposer re-renders see the correct active variant.
+   - **Lock file update**: `tf-init` generates or updates `.terraform.lock.hcl` if provider constraints change. This must be pushed back so subsequent runs do not re-resolve providers.
+   - These are committed directly to `main` with `[skip ci]` in the message to prevent the webhook from re-triggering the Executor. No PR is needed: these are operational metadata, not infrastructure config changes.
+   - If `git diff --cached` is empty (DRY_RUN or no changes), the commit step is skipped cleanly.
 
 ### 6. CodeBuild - POST_BUILD phase
 
diff --git a/docs/template-management.md b/docs/template-management.md
index 3cbcb93..d1092c4 100644
--- a/docs/template-management.md
+++ b/docs/template-management.md
@@ -85,6 +85,7 @@ output is compatible with the `tf-run` toolchain and `tf-directory-setup.py`:
 
 ```
 template-{product_type}/
+├── .gitignore                   # must exclude logs/ .terraform/ terraform.tfstate*
 ├── .tf-control                  # tf-run toolchain version pin
 ├── .tf-control.tfrc             # Terraform provider cache config
 ├── region.tf                    # locals { region = var.region }
@@ -99,9 +100,11 @@ template-{product_type}/
 │   ├── remote_state.yml.j2      # ← layer-level; Proposer renders to remote_state.yml
 │   ├── east/
 │   │   ├── tf-run.data          # ← must contain REMOTE-STATE directive
+│   │   ├── .terraform.lock.hcl  # ← committed; Executor updates and pushes back to main
 │   │   └── {workload}.tf.j2     # ← Jinja2: rendered by Proposer
 │   └── west/
 │       ├── tf-run.data          # ← must contain REMOTE-STATE directive
+│       ├── .terraform.lock.hcl  # ← committed; Executor updates and pushes back to main
 │       └── {workload}.tf.j2     # ← Jinja2: rendered by Proposer
 └── README.md
 ```
@@ -111,6 +114,17 @@ template-{product_type}/
 - `remote_state.yml.j2` lives at the **layer level** (`infrastructure/`, `common/`, `vpc/`), **not** inside workspace subdirectories. The Proposer's REMOTE-STATE processor derives each workspace's `remote_state.yml` from the layer-level file by appending `/{workspace_name}` to the `directory` field — identical to what `tf-run.sh` does at apply time.
 - Each workspace directory (`east/`, `west/`, `global/`) **must** include a `tf-run.data` file with a `REMOTE-STATE` directive so the Proposer knows to generate its `remote_state.yml`.
 - The `.auto.tfvars.j2` file must render `profile = "..."` and `region = "..."` entries at the top level — `tf-run.sh` auto-discovers profile and region by grepping `*.tfvars`, so these values must be present for placeholder substitution (`%%REGION%%`, `%%PROFILE%%`, etc.) to work correctly.
+- `.gitignore` **must** contain at minimum:
+  ```
+  logs/
+  .terraform/
+  terraform.tfstate
+  terraform.tfstate.backup
+  ```
+  `logs/` is where `tf-control.sh` writes every plan/apply log. These are ephemeral and must never be committed. `.terraform/` caches the provider plugins locally during a run and must not be committed (only `.terraform.lock.hcl` is committed).
+- `.terraform.lock.hcl` is the [dependency lock file](https://developer.hashicorp.com/terraform/language/files/dependency-lock) and **must be committed**. The template should include an initial lock file generated from the workspace's required providers. The Executor runs `tf-init` which updates it if providers change, then commits the update directly back to `main` (bypassing the PR flow, tagged `[skip ci]`).
+- `.tf-control` sets `TFCOMMAND=terraform_latest` (the Census workstation alias). The Executor buildspec creates a `terraform_latest` symlink pointing to the installed `terraform` binary so `tf-control.sh` resolves it correctly.
+- `.tf-control.tfrc` sets `plugin_cache_dir = "/data/terraform/terraform.d/plugin-cache"` and a `filesystem_mirror` at `/data/terraform/terraform.d/providers`. The Executor buildspec creates both directories. The `filesystem_mirror` path starts empty so Terraform falls through to the `direct {}` block — providers are fetched via the Census proxy and then cached in `plugin_cache_dir` for the remainder of the build. The plugin cache directory is also configured as a CodeBuild S3 cache path so provider archives persist across builds.
 - Files ending in `.j2` are Jinja2 templates. The Proposer renders them using the product input variables and commits the result (without the `.j2` extension) to the work branch. The `.j2` source files are **not** committed.
 
 ---
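
Reviewer note: the commit-back guard added to `buildspec-executor.yml` (stage the post-apply files, then commit only if `git diff --cached --quiet` reports a staged change) can be tried locally without CodeBuild. The sketch below is an illustration under stated assumptions and is not part of the patch: the `commit_post_apply` helper, the scratch repository, the `vpc/east` path, and the demo identity are all hypothetical, and the `git push` step is left out. It stages each pathspec with a separate `git add`, on the understanding that `git add` fails as a whole when any one pathspec matches nothing, so a missing `remote_state.*` match should not keep the lock file from being staged.

```shell
#!/usr/bin/env bash
# Sketch only: reproduces the stage-then-commit guard in a scratch repository.
set -euo pipefail

# Hypothetical helper (not part of the patch). Stages only the post-apply
# artifacts for one workspace directory and commits when something is staged.
commit_post_apply() {
  local workdir="$1"
  # One `git add` per pathspec: a pathspec that matches nothing makes
  # `git add` exit non-zero, and this keeps that from affecting the other path.
  git add -A -- "${workdir}/remote_state."* 2>/dev/null || true
  git add -A -- "${workdir}/.terraform.lock.hcl" 2>/dev/null || true
  if git diff --cached --quiet; then
    echo "nothing to commit"
    return 0
  fi
  git -c user.email="demo@example.invalid" -c user.name="Demo" \
    commit --quiet -m "chore: executor post-apply update ${workdir} [skip ci]"
  echo "committed"
}

# Scratch repository standing in for an account repo (paths are illustrative).
scratch="$(mktemp -d)"
git init --quiet "${scratch}"
cd "${scratch}"
mkdir -p vpc/east/logs
echo 'locals {}' > vpc/east/main.tf
git add -A
git -c user.email="demo@example.invalid" -c user.name="Demo" commit --quiet -m "initial"

# Simulate what an apply leaves behind: a lock file plus an ephemeral log.
echo '# lock file placeholder' > vpc/east/.terraform.lock.hcl
echo 'apply output' > vpc/east/logs/apply.log

first_run="$(commit_post_apply vpc/east)"    # lock file is new, so a commit is made
second_run="$(commit_post_apply vpc/east)"   # nothing changed since, so no commit
commit_count="$(git rev-list --count HEAD)"  # initial commit plus one commit-back
untracked="$(git ls-files --others)"         # files the helper never staged
echo "${first_run} / ${second_run} / commits=${commit_count} / untracked=${untracked}"
```

In this scratch run the log file under `logs/` stays untracked because the helper stages explicit paths rather than the whole tree; in a real account repo the `.gitignore` entries described in `docs/template-management.md` serve as the second line of defence.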