Merge pull request #1 from SCT-Engineering/feature/template-repo-rendering

docs: generalized architecture, ADR-001 accepted, ADR-002 Vault AWS Secrets Engine
arnol377 committed Jun 11, 2026
2 parents 0f1b01f + 4b32072 commit 4417c06
Showing 35 changed files with 6,322 additions and 1,757 deletions.
42 changes: 27 additions & 15 deletions .github/copilot-instructions.md
@@ -8,21 +8,34 @@ system at Census. The architecture is:
```
SC Console (user fills product form)
└─> CFN Stack (Custom::* resource)
└─> Lambda (centralized in csvd-dev, 229685449397, us-gov-west-1)
└─> Lambda tf-run-executor-trigger (csvd-dev, 229685449397, us-gov-west-1)
├─> Validates inputs (Pydantic v2 models)
├─> Fetches GHE token from Secrets Manager
├─> POSTs repository_dispatch to target repo on GHE
└─> Polls GHA workflow run → returns repo URL + PR URL to CFN
GitHub Enterprise (github.e.it.census.gov)
└─> GHA workflow (repository_dispatch event)
├─> Clones the target account repo
├─> Renders HCL/YAML files from templates
└─> Commits + opens PR (repo-init → main)
├─> Starts CodeBuild: tf-run-proposer
└─> Polls CodeBuild → returns PR URL + repo URL to CFN
CodeBuild: tf-run-proposer (csvd-dev)
└─> Clones target account repo
└─> Renders HCL/YAML files from template repo (Jinja2)
└─> Commits rendered files → opens PR (propose/sc-automation → main)
↕ Human reviews diff and merges PR ↕
GHE push webhook → Lambda tf-run-webhook-handler
└─> Reads .sc-automation.yml from default branch
└─> Starts CodeBuild: tf-run-executor (fire-and-forget)
CodeBuild: tf-run-executor (csvd-dev)
└─> Clones account repo at main (post-merge state)
└─> Optionally assumes cross-account IAM role (sc-automation-codebuild-role)
└─> Runs tf-run apply in LAYER/REGION_DIR
└─> Commits post-apply changes (lock file, remote_state symlinks) to main [skip ci]
└─> Writes ✅/❌ commit status to GHE
```

This replaces the current CodeBuild + terraform-eks-deployment path with a
GHA-native approach that keeps workflow logic inside the target repos.
This replaces the current CodeBuild + terraform-eks-deployment path.
Workflow logic lives in `buildspec-proposer.yml` and `buildspec-executor.yml`
in this repo; product-type-specific logic lives in `handler.py` in each template repo.
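
The push-webhook step in the flow above can be sketched as follows. This is a minimal sketch, not the actual `tf-run-webhook-handler` code: the function name `should_start_executor` is hypothetical, and it assumes the handler receives the standard GitHub push payload (`ref`, `repository.default_branch`, `head_commit.message`) and ignores commits whose message contains `[skip ci]`, which is how the executor's own post-apply commit is marked.

```python
from typing import Any, Mapping

SKIP_MARKER = "[skip ci]"


def should_start_executor(payload: Mapping[str, Any]) -> bool:
    """Return True if a push event should start the tf-run-executor build.

    Hypothetical helper; the real handler may apply different rules.
    """
    repo = payload.get("repository") or {}
    default_branch = repo.get("default_branch", "main")
    # Only pushes to the default branch (merged PRs) are candidates for apply.
    if payload.get("ref") != f"refs/heads/{default_branch}":
        return False
    # Branch deletions carry no head commit, so there is nothing to apply.
    head_commit = payload.get("head_commit")
    if not head_commit:
        return False
    # The executor's own post-apply commit is tagged [skip ci]; do not loop.
    return SKIP_MARKER not in head_commit.get("message", "")
```

Keeping this check as a pure function of the payload makes it easy to unit test without a GHE instance or a CodeBuild project.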

---

@@ -154,7 +167,7 @@ scripts in `/apps/terraform/bin/`. Key behavior:
- `aws_account_id` and `aws_region` are auto-resolved via `!Sub` in CFN;
do NOT add them as user-facing SC form parameters
- Lambda ServiceToken: `arn:${AWS::Partition}:lambda:${AWS::Region}:${AWS::AccountId}:function:{name}`
- Lambda timeout must be ≥ CodeBuild/GHA poll window (currently 900s)
- Lambda timeout must be ≥ CodeBuild poll window (currently 900s)
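
The timeout constraint can be illustrated with a polling loop that gives up before the Lambda itself would be terminated. This is a minimal sketch under stated assumptions: `poll_build`, its parameters, the 30-second margin, and the `POLL_WINDOW_EXCEEDED` sentinel are all hypothetical; `get_status` stands in for a call such as boto3's `codebuild.batch_get_builds` that yields the build's `buildStatus`.

```python
import time
from typing import Callable

# Terminal values of CodeBuild's buildStatus field.
TERMINAL = {"SUCCEEDED", "FAILED", "FAULT", "STOPPED", "TIMED_OUT"}


def poll_build(
    get_status: Callable[[], str],
    lambda_timeout_s: float = 900.0,
    margin_s: float = 30.0,  # assumed headroom for responding to CFN
    interval_s: float = 10.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the build reaches a terminal status or the window closes."""
    deadline = clock() + lambda_timeout_s - margin_s
    while True:
        status = get_status()
        if status in TERMINAL:
            return status
        # Stop if another sleep would run past the poll window.
        if clock() + interval_s > deadline:
            return "POLL_WINDOW_EXCEEDED"  # hypothetical sentinel, not a CodeBuild status
        sleep(interval_s)
```

Injecting `clock` and `sleep` lets the loop be tested without waiting in real time.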

---

@@ -165,8 +178,7 @@ scripts in `/apps/terraform/bin/`. Key behavior:
- ❌ Do not write temp files to `/tmp` — use `~/tmp`
- ❌ Do not use `terraform` directly — use `tf` alias (`tf plan`, `tf apply`)
- ❌ Do not run AWS CLI/boto3 without `export AWS_DEFAULT_REGION=us-gov-west-1`
- ❌ Do not add `vpc_id` — field is `vpc_name`
- ❌ Do not use `HappyPathway/terraform-github-repo` public module
- ✅ DO use `CSVD/terraform-github-repo` (https://github.e.it.census.gov/CSVD/terraform-github-repo)
- ✅ DO use `gh` CLI for PR management
- ✅ DO use `GH_HOST=github.e.it.census.gov` for all GHE commands
- ✅ Cross-account role name is `sc-automation-codebuild-role` — must exist in every target
account and trust the CodeBuild IAM role from csvd-dev before the first executor run
12 changes: 12 additions & 0 deletions .gitignore
@@ -0,0 +1,12 @@
# Packer pipeline zip files
tf-run-executor-builder.zip

# Terraform state and local overrides
*.tfstate
*.tfstate.backup
*.tfvars
.terraform/
.terraform.lock.hcl
.terraform_commits
terraform_data_dirs/
varfiles/
185 changes: 185 additions & 0 deletions buildspec-executor.yml
@@ -0,0 +1,185 @@
version: 0.2

# ---------------------------------------------------------------------------
# tf-run-executor buildspec
#
# Purpose: clone account repo main branch, optionally assume a cross-account
# IAM role, and run tf-run apply in the target layer/region directory.
# This is triggered AFTER a proposer PR has been reviewed and merged.
# It does not render templates or open a PR.
#
# Required env-var overrides per build (supplied by Lambda):
# ACCOUNT_REPO - account repo name, e.g. 229685449397-csvd-dev-platform-dev-gov
# LAYER - terraform layer: common | infrastructure | vpc
# REGION_DIR - region directory: east | west | global
# GITHUB_TOKEN - GHE PAT (PLAINTEXT, value from Secrets Manager)
#
# Optional env-var overrides:
# TARGET_ACCOUNT_ID - AWS account ID to assume the cross-account role in
# (default: empty = run with CodeBuild role, csvd-dev only)
# CROSS_ACCOUNT_ROLE - IAM role name to assume in TARGET_ACCOUNT_ID
# (default: r-inf-terraform)
# TF_RUN_START_TAG - tf-run.data TAG label to start from (default: empty = from top)
# DRY_RUN - "true" = tf-run plan only, no apply (default: "false")
# ---------------------------------------------------------------------------

env:
  variables:
    GITHUB_ORG: "SCT-Engineering"
    TF_BINARY_S3_PREFIX: "s3://csvd-packer-pipeline-assets/terraform"
    GH_CLI_S3_PREFIX: "s3://csvd-packer-pipeline-assets/tools"
    CENSUS_CA_S3: "s3://csvd-packer-pipeline-assets/certs/census-ca.pem"
    TERRAFORM_SUPPORT_REPO: "terraform/support"
    HTTPS_PROXY: "http://proxy.tco.census.gov:3128"
    NO_PROXY: "github.e.it.census.gov,169.254.169.254,169.254.170.2"
    # Per-build defaults (overridden via environmentVariablesOverride in Lambda)
    TARGET_ACCOUNT_ID: ""
    CROSS_ACCOUNT_ROLE: "r-inf-terraform"
    TF_RUN_START_TAG: ""
    DRY_RUN: "false"

phases:
  install:
    commands:
      # --- Version governance: clone terraform/support to read org-canonical versions ---
      - git clone --depth 1 "https://${GITHUB_TOKEN}@github.e.it.census.gov/${TERRAFORM_SUPPORT_REPO}.git" /tmp/tf-support
      - export TF_VERSION=$(cat /tmp/tf-support/terraform/VERSION)
      - export GH_VERSION=$(cat /tmp/tf-support/github-cli-releases/VERSION)
      - echo "Using Terraform ${TF_VERSION}, gh CLI ${GH_VERSION}"

      # --- Terraform binary (registry.terraform.io is blocked on Census network; use S3) ---
      - aws s3 cp "${TF_BINARY_S3_PREFIX}/terraform_${TF_VERSION}_linux_amd64.zip" /tmp/terraform.zip
      - unzip -o /tmp/terraform.zip -d /usr/local/bin/ && chmod +x /usr/local/bin/terraform
      - ln -sf /usr/local/bin/terraform /usr/local/bin/tf

      # --- Census CA certificate (required for TLS to github.e.it.census.gov) ---
      - aws s3 cp "$CENSUS_CA_S3" /etc/pki/ca-trust/source/anchors/census-ca.pem
      - update-ca-trust extract

      # --- tf-run toolchain (sourced from terraform/support, already cloned above) ---
      # Canonical versions live in terraform/support local-app/ — no copies kept in this repo.
      - cp /tmp/tf-support/local-app/tf-run/tf-run.sh /usr/local/bin/tf-run
      - cp /tmp/tf-support/local-app/tf-control/tf-control.sh /usr/local/bin/tf-control.sh
      - cp /tmp/tf-support/local-app/tf-directory-setup/tf-directory-setup.py /usr/local/bin/tf-directory-setup.py
      - chmod +x /usr/local/bin/tf-run /usr/local/bin/tf-control.sh /usr/local/bin/tf-directory-setup.py
      # Create tf-{action} symlinks expected by tf-run and account repo steps
      - >
        for action in init plan apply destroy refresh output validate import state fmt taint console; do
          ln -sf /usr/local/bin/tf-control.sh /usr/local/bin/tf-${action};
        done
      # Account repo .tf-control files set TFCOMMAND=terraform_latest (the Census workstation alias).
      # In CodeBuild the binary is just 'terraform'; create the alias so tf-control.sh resolves it.
      - ln -sf /usr/local/bin/terraform /usr/local/bin/terraform_latest

      # --- Plugin cache directory (referenced by .tf-control.tfrc in every account repo) ---
      # .tf-control.tfrc sets plugin_cache_dir = "/data/terraform/terraform.d/plugin-cache"
      # and filesystem_mirror path = "/data/terraform/terraform.d/providers".
      # Create both so Terraform does not error on init; the mirror is empty so Terraform
      # falls through to the 'direct' block in the tfrc (via Census proxy to registry.terraform.io).
      - mkdir -p /data/terraform/terraform.d/plugin-cache /data/terraform/terraform.d/providers

      # --- Python deps for tf-directory-setup.py ---
      - pip3 install --quiet python-dateutil pyyaml

      # --- gh CLI (from S3, for any post-apply verification steps) ---
      - aws s3 cp "${GH_CLI_S3_PREFIX}/gh_${GH_VERSION}_linux_amd64.tar.gz" /tmp/gh.tar.gz
      - mkdir -p /tmp/gh-cli && tar -xzf /tmp/gh.tar.gz -C /tmp/gh-cli --strip-components=1
      - cp /tmp/gh-cli/bin/gh /usr/local/bin/gh && chmod +x /usr/local/bin/gh

  build:
    commands:
      # --- Configure git to rewrite SSH URLs to HTTPS ---
      # Module sources in account repos use ssh://git@github.e.it.census.gov/... or git@...
      # This rewrite transparently redirects them to HTTPS + PAT at the git layer.
      - git config --global url."https://${GITHUB_TOKEN}@github.e.it.census.gov/".insteadOf "ssh://git@github.e.it.census.gov/"
      - git config --global url."https://${GITHUB_TOKEN}@github.e.it.census.gov/".insteadOf "git@github.e.it.census.gov:"

      # --- Clone account repo from main (the reviewed + merged state) ---
      - git clone "https://${GITHUB_TOKEN}@github.e.it.census.gov/${GITHUB_ORG}/${ACCOUNT_REPO}.git" repo
      - cd repo
      # Verify we are on main (not a work branch)
      - git checkout main
      - echo "Applying from $(git rev-parse --short HEAD) on main"

      # --- Assume cross-account role (if TARGET_ACCOUNT_ID is set) ---
      # The role (default: r-inf-terraform) must exist in the target account and
      # trust arn:...:iam::229685449397:role/tf-run-executor-codebuild.
      # Override CROSS_ACCOUNT_ROLE per-build to use a different role name.
      - |
        if [ -n "${TARGET_ACCOUNT_ID}" ]; then
          PARTITION=$(aws sts get-caller-identity --query Arn --output text | cut -d: -f2)
          ROLE_ARN="arn:${PARTITION}:iam::${TARGET_ACCOUNT_ID}:role/${CROSS_ACCOUNT_ROLE}"
          echo "Assuming cross-account role: ${ROLE_ARN}"
          CREDS=$(aws sts assume-role \
            --role-arn "${ROLE_ARN}" \
            --role-session-name "sc-automation-${ACCOUNT_REPO}" \
            --query Credentials \
            --output json)
          export AWS_ACCESS_KEY_ID=$(echo "$CREDS" | python3 -c "import json,sys; print(json.load(sys.stdin)['AccessKeyId'])")
          export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS" | python3 -c "import json,sys; print(json.load(sys.stdin)['SecretAccessKey'])")
          export AWS_SESSION_TOKEN=$(echo "$CREDS" | python3 -c "import json,sys; print(json.load(sys.stdin)['SessionToken'])")
          echo "Assumed role in account ${TARGET_ACCOUNT_ID}"
        else
          echo "No TARGET_ACCOUNT_ID set — running with CodeBuild role (csvd-dev only)"
        fi
      # --- Run Terraform in target layer/region directory ---
      # tf-run auto-proceeds on non-TTY stdin (read -t timeout defaults to "y")
      #
      # NOTE on file-generating tf-run.data directives:
      # REMOTE-STATE — generates workspace remote_state.yml from parent
      # COMMAND tf-directory-setup.py — generates remote_state.backend.tf + variant files
      # The Proposer already ran both of these and committed the results in the PR.
      # When tf-run hits these steps here they are idempotent: they overwrite files
      # that already exist with identical content. No new files are created at apply time.
      #
      # NOTE on logs/: tf-control.sh writes every plan/apply to logs/{action}.{timestamp}.log.
      # This directory is ephemeral (never committed). Ensure logs/ is in .gitignore.
      - cd "${LAYER}/${REGION_DIR}"
      - |
        if [ "${DRY_RUN}" = "true" ]; then
          echo "DRY_RUN=true — running tf-run plan only"
          TFARGS="-no-color" tf-run plan
        elif [ -n "${TF_RUN_START_TAG}" ]; then
          TFARGS="-auto-approve" tf-run apply "tag:${TF_RUN_START_TAG}"
        else
          TFARGS="-auto-approve" tf-run apply
        fi
      # --- Commit post-apply file changes back to main ---
      # After a successful apply tf-run.data typically runs:
      # COMMAND tf-directory-setup.py --link s3
      # which re-links remote_state.{dir}.tf from .tf.none → .tf.s3.
      # terraform init also generates/updates .terraform.lock.hcl.
      # Both of these changes must be committed back to main so:
      # (a) the repo reflects actual state for future Proposer re-renders
      # (b) subsequent tf-init on main does not re-download all providers
      # [skip ci] prevents the push from re-triggering the webhook executor.
      - cd "${CODEBUILD_SRC_DIR}/repo"
      - |
        git add -A -- "${LAYER}/${REGION_DIR}/remote_state."* \
          "${LAYER}/${REGION_DIR}/.terraform.lock.hcl" 2>/dev/null || true
        if ! git diff --cached --quiet; then
          git -c user.email="sc-automation@census.gov" \
            -c user.name="SC Automation" \
            commit -m "chore: executor post-apply update ${LAYER}/${REGION_DIR} [skip ci]"
          git push \
            "https://${GITHUB_TOKEN}@github.e.it.census.gov/${GITHUB_ORG}/${ACCOUNT_REPO}.git" \
            HEAD:main
          echo "Committed and pushed post-apply changes to main"
        else
          echo "No post-apply file changes to commit"
        fi
  post_build:
    commands:
      - echo "BUILD_RESULT=${CODEBUILD_BUILD_SUCCEEDING}"
      - echo "ACCOUNT_REPO=${ACCOUNT_REPO}"
      - echo "LAYER=${LAYER} REGION_DIR=${REGION_DIR}"

cache:
  paths:
    # Cache the provider plugin cache across builds for faster tf-init.
    # Providers downloaded via Census proxy are stored here; subsequent builds
    # skip re-downloading providers that haven't changed.
    - /data/terraform/terraform.d/plugin-cache/**/*
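
The buildspec header lists the variables the Lambda supplies per build. Below is a minimal sketch of how those overrides might be assembled for boto3's `codebuild.start_build(environmentVariablesOverride=...)`. The function `executor_overrides` and its validation rules are hypothetical; the allowed `LAYER` and `REGION_DIR` values are taken from the header comment, and `PLAINTEXT` matches the type noted there for `GITHUB_TOKEN`.

```python
from typing import Dict, List

# Allowed values as documented in the buildspec header comment.
LAYERS = {"common", "infrastructure", "vpc"}
REGION_DIRS = {"east", "west", "global"}


def executor_overrides(
    account_repo: str,
    layer: str,
    region_dir: str,
    github_token: str,
    target_account_id: str = "",
    dry_run: bool = False,
) -> List[Dict[str, str]]:
    """Build the environmentVariablesOverride list for a tf-run-executor build.

    Hypothetical helper; the real Lambda may structure this differently.
    """
    if layer not in LAYERS:
        raise ValueError(f"unknown LAYER: {layer!r}")
    if region_dir not in REGION_DIRS:
        raise ValueError(f"unknown REGION_DIR: {region_dir!r}")
    values = {
        "ACCOUNT_REPO": account_repo,
        "LAYER": layer,
        "REGION_DIR": region_dir,
        "GITHUB_TOKEN": github_token,
        "DRY_RUN": "true" if dry_run else "false",
    }
    # Omit TARGET_ACCOUNT_ID so the buildspec default (empty) applies.
    if target_account_id:
        values["TARGET_ACCOUNT_ID"] = target_account_id
    return [
        {"name": name, "value": value, "type": "PLAINTEXT"}
        for name, value in values.items()
    ]
```

Validating `LAYER` and `REGION_DIR` before starting the build fails fast in the Lambda rather than at `cd "${LAYER}/${REGION_DIR}"` inside CodeBuild.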