feat: CodeBuild+Terraform runtime for EKS repo creation (#17)
* fix: dynamic paths in rendered_files + HTTPS module source for CodeBuild

- Replace placeholder paths (environment/region/vpc/cluster/) with
  var.environment / var.region / var.cluster_config.vpc_name / var.name
  so managed_extra_files land in the correct Terragrunt hierarchy
- Change CSVD/terraform-github-repo source from SSH (git@) to HTTPS
  (git::https://) to work inside CodeBuild without SSH agent

* feat: add buildspec.yml for CodeBuild repo-creator project

Buildspec used by the 'eks-terragrunt-repo-creator' CodeBuild project
triggered by the Lambda function. Downloads Terraform from S3 assets bucket,
clones this repo using GITHUB_TOKEN env var, then runs:
  terraform init -no-color
  terraform apply -auto-approve -no-color
TF_VAR_* env vars are injected by the Lambda as CodeBuild environment
variable overrides.

* docs: update callnotes with Step 1-4 DONE status for pivot plan

* fix: correct REPO_ORG from CSVD to SCT-Engineering

The terraform-eks-deployment repo lives in SCT-Engineering, not CSVD.
CSVD would have caused git clone 404 in CodeBuild.

* fix: add REPO_BRANCH and use it in git clone

- Add REPO_BRANCH env var (currently fix/eca-copilot-instructions-and-callnotes)
  pointing to the branch with dynamic path fixes and HTTPS module source
- Pass --branch to git clone so CodeBuild checks out the right code
- Update REPO_BRANCH to 'main' once the fix branch is merged

* fix: bump TF_VERSION from 1.9.0 to 1.9.1

terraform_1.9.0 zip was not in s3://csvd-packer-pipeline-assets/terraform/.
Uploaded terraform_1.9.1_linux_amd64.zip to that path from local tfenv install.
Public releases.hashicorp.com is blocked by Census network proxy.

* fix: add HTTPS_PROXY/NO_PROXY to buildspec for registry.terraform.io access

registry.terraform.io is blocked directly inside CodeBuild (Census network).
Must route through http://proxy.tco.census.gov:3128.
NO_PROXY excludes AWS-internal endpoints (.amazonaws.com) from proxy.

* fix: widen github provider constraint to >= 6.11.0, drop stale lock file

CSVD/terraform-github-repo module requires ~> 6.11; workspace had >= 6.6.0, < 6.7.0
which is incompatible. Lock file was pinned to 6.6.0 — delete so terraform init
regenerates it against the updated constraint.

* fix: add provider "github" block with insecure=true for Census GHE TLS

The Census GHE TLS cert is signed by the Census internal CA which is not
present in the CodeBuild container trust store. insecure=true disables
x509 verification so terraform apply can call the GHE API.

* fix: install Census CA cert + add GHE to NO_PROXY in CodeBuild buildspec

- Download census-ca.pem from S3 assets bucket and add to Amazon Linux 2
  trust store via update-ca-trust during INSTALL phase
- Add github.e.it.census.gov to NO_PROXY so Terraform provider connects
  directly (not through proxy) and trusts Census CA chain
- Keep insecure=true in providers.tf as belt-and-suspenders

* docs: rewrite copilot-instructions to reflect CodeBuild+Terraform architecture

- Replace 'Lambda-Only Approach' and 'Do NOT suggest CodeBuild' sections
- Document full buildspec.yml runtime environment (proxy, CA cert, TF binary from S3)
- Add complete Key Resources table with CodeBuild projects and token sources
- Add Important Runtime Notes section with Census-specific networking requirements
- Update What NOT to Do section with correct guidance

* chore: commit callnotes updates and whitespace alignment in examples

* docs: replace duplicated SC deployment section with cross-reference

The SC Product Deployment Methods section was near-identical to the
canonical version in lambda-template-repo-generator. Replace with a
concise cross-reference to keep a single source of truth.

* feat: rename template placeholder dirs via GitHub API after repo creation

Add scripts/rename_template_dirs.py (Python, httpx + rich) that calls the
GitHub API to delete environment/region/vpc/cluster/ placeholder paths from
the repo-init PR branch and re-add the eks-*/terragrunt.hcl files at their
correct computed paths:

  environment/region/vpc/cluster/eks-*/terragrunt.hcl
    → ${environment}/${region}/${vpc_name}/${cluster_name}/eks-*/terragrunt.hcl

Files already rendered by managed_extra_files (account.hcl, region.hcl,
vpc.hcl, cluster.hcl) are deleted from the placeholder paths but not
re-added — Terraform already wrote them with real values.

Controlled by var.run_in_codebuild (default false). buildspec.yml sets
TF_VAR_run_in_codebuild=true so the null_resource only runs in CodeBuild.

Also adds the null provider to providers.tf and pip3 install of httpx+rich
to the buildspec install phase.

---------

Co-authored-by: Your Name <user@example.com>
arnol377 and Your Name committed Apr 21, 2026
1 parent dcf8501 commit c501efa
Showing 7 changed files with 684 additions and 6 deletions.
113 changes: 113 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,113 @@
# GitHub Copilot Instructions — terraform-eks-deployment

## Project Context

This is the **Terraform workspace** that creates EKS cluster GitHub repositories as part of the
EKS Cluster Automation (ECA) system. It is run by a CodeBuild project (`eks-terragrunt-repo-creator`)
triggered at runtime by the `eks-terragrunt-repo-gen-template-automation` Lambda function.

---

## Architecture: Lambda → CodeBuild → Terraform

EKS cluster repos are created via this chain:

```
SC Console → CFN Stack → Custom::GitHubRepository
  → Lambda (eks-terragrunt-repo-gen-template-automation)
      → CodeBuild project (eks-terragrunt-repo-creator) ← this repo runs here
          → git clone terraform-eks-deployment (REPO_BRANCH)
          → terraform init (providers from registry.terraform.io via Census proxy)
          → terraform apply -auto-approve
              → CSVD/terraform-github-repo module:
                  → Creates GHE repo from template-eks-cluster
                  → Writes 8 rendered Terragrunt HCL files via managed_extra_files
                  → Opens pull request (repo-init → main)
      → Lambda polls build → fetches PR URL → cfn-response SUCCESS
```
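
The last step of the chain, where the Lambda polls the build, might look roughly like the sketch below. This is an illustration under stated assumptions, not the Lambda's actual code: `wait_for_build`, `get_status`, and the timeout and interval defaults are hypothetical names and values. In a real handler, `get_status` would presumably wrap boto3's `codebuild.batch_get_builds(ids=[build_id])` and return `builds[0]["buildStatus"]`.

```python
import time
from typing import Callable

# Terminal values of CodeBuild's buildStatus field; IN_PROGRESS is the only non-terminal one.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "FAULT", "TIMED_OUT", "STOPPED"}


def wait_for_build(
    get_status: Callable[[], str],
    timeout_s: float = 840.0,   # assumption: stay under the 15-minute Lambda limit
    interval_s: float = 10.0,   # assumption: arbitrary polling interval
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until the build is terminal or the deadline passes.

    Returns the terminal status, or "TIMED_OUT" if the deadline passes first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            return "TIMED_OUT"
        sleep(interval_s)
```

Injecting `get_status` and `sleep` keeps the loop testable without AWS credentials; only a `SUCCEEDED` result would lead to the PR URL lookup and a `cfn-response SUCCESS`.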

### CodeBuild environment

CodeBuild runs on `aws/codebuild/amazonlinux2-x86_64-standard:3.0` (Amazon Linux 2).
Key env vars injected by the Lambda at build-start time:

| Variable | Source | Purpose |
|----------|--------|---------|
| `GITHUB_TOKEN` | Secrets Manager `ghe-runner/github-token` (PAT, `ghp_`) | Terraform GitHub provider auth; git clone |
| `TF_VAR_name` | CFN input `project_name` | Repo name |
| `TF_VAR_environment` | CFN input `environment` | e.g. `dev` |
| `TF_VAR_region` | CFN input `aws_region` | e.g. `us-gov-west-1` |
| `TF_VAR_cluster_config` | JSON-encoded EKS fields | `vpc_name`, `vpc_domain_name`, `cluster_name`, etc. |
| `TF_VAR_finops` | JSON-encoded FinOps fields | project name/number |
| `GITHUB_OWNER` | `SCT-Engineering` | GitHub org for provider |
| `GITHUB_BASE_URL` | `https://github.e.it.census.gov` | GHE API base URL |

The buildspec is inlined from this repo's `buildspec.yml` into the
`aws_codebuild_project.eks_repo_creator` resource in `lambda-template-repo-generator/deploy/main.tf`.
After editing `buildspec.yml`, run `tf apply` in `lambda-template-repo-generator/deploy/` to update.
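
As an illustration of how the variables in the table above could be assembled, here is a minimal sketch of a helper that builds the override list. The function name `build_env_overrides` and its signature are hypothetical, and the use of the `PLAINTEXT` type for every variable is an assumption; the real Lambda may do this differently. The list shape matches the `environmentVariablesOverride` parameter of boto3's `codebuild.start_build`.

```python
import json
from typing import Any


def build_env_overrides(
    name: str,
    environment: str,
    region: str,
    cluster_config: dict[str, Any],
    finops: dict[str, Any],
    github_token: str,
) -> list[dict[str, str]]:
    """Return a list shaped like CodeBuild's environmentVariablesOverride."""
    values = {
        "GITHUB_TOKEN": github_token,
        "TF_VAR_name": name,
        "TF_VAR_environment": environment,
        "TF_VAR_region": region,
        # Object-typed Terraform variables are passed as JSON-encoded strings.
        "TF_VAR_cluster_config": json.dumps(cluster_config),
        "TF_VAR_finops": json.dumps(finops),
        "GITHUB_OWNER": "SCT-Engineering",
        "GITHUB_BASE_URL": "https://github.e.it.census.gov",
    }
    # Assumption: every value is sent as PLAINTEXT.
    return [
        {"name": key, "value": value, "type": "PLAINTEXT"}
        for key, value in values.items()
    ]
```

The result would then be passed along the lines of `boto3.client("codebuild").start_build(projectName="eks-terragrunt-repo-creator", environmentVariablesOverride=overrides)`.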

---

## Key Files

| File | Purpose |
|------|--------|
| `buildspec.yml` | CodeBuild build steps: install Terraform + Census CA cert, clone repo, tf init + apply |
| `main.tf` | Module call to `CSVD/terraform-github-repo`; `managed_extra_files` with rendered HCL |
| `providers.tf` | GitHub provider config (`>= 6.11.0`, `insecure = true`) + AWS provider |
| `variables.tf` | All TF_VAR_* inputs from CodeBuild env (name, environment, region, cluster_config, finops) |
| `locals.tf` | Decodes `cluster_config` JSON; resolves HCL template content |
| `defaults.tf` | Default values for template repo org, template name, org name |
| `callnotes.md` | Session notes and fix log |

---

## SC Product Deployment Methods

See `lambda-template-repo-generator/.github/copilot-instructions.md` for full details.
Both methods use the same CFN product template; they must stay in sync.

- **Method 1** (testing): `cd lambda-template-repo-generator/deploy && tf apply`
- **Method 2** (production): `cd terraform-service-catalog-census/non-prod/csvd-dev/west/service-catalog && tf apply`

Always verify Method 1 works first when debugging a census deployment issue.

---

## Key Resources

| Resource | Location | Purpose |
|----------|----------|---------|
| Lambda | `eks-terragrunt-repo-gen-template-automation` (us-gov-west-1, 229685449397) | CFN Custom Resource handler; starts CodeBuild |
| CodeBuild (repo creation) | `eks-terragrunt-repo-creator` | Runs this workspace via tf apply |
| CodeBuild (image build) | `eks-terragrunt-repo-generator-builder` | Builds Lambda container via packer |
| S3 assets bucket | `csvd-packer-pipeline-assets` | Terraform binary, Census CA cert, Packer binary |
| GitHub token (Terraform) | Secrets Manager `ghe-runner/github-token` | PAT (`ghp_`) for Terraform GitHub provider |
| GitHub token (Lambda) | Secrets Manager `/eks-cluster-deployment/github_token` | App token (`ghs_`) for Lambda Python API calls |
| CSVD TF module | `https://github.e.it.census.gov/CSVD/terraform-github-repo` | Creates repo + files + PR |
| Census SC product template | `terraform-service-catalog-census/templates/products/eks-terragrunt-repo/2-0-0.yaml` | Live SC product CFN template |
| Canonical SC product template | `lambda-template-repo-generator/service-catalog/product-template.yaml` | Reference/source of truth |

---

## Important Runtime Notes

- **Terraform binary** is installed from `s3://csvd-packer-pipeline-assets/terraform/terraform_1.9.1_linux_amd64.zip` in the INSTALL phase — `releases.hashicorp.com` is blocked on the Census network
- **Census CA cert** is installed from `s3://csvd-packer-pipeline-assets/certs/census-ca.pem` via `update-ca-trust` — required for TLS to `github.e.it.census.gov`
- **Census proxy** `http://proxy.tco.census.gov:3128` must be set as `HTTPS_PROXY`/`HTTP_PROXY` — required for `registry.terraform.io` provider downloads
- **`github.e.it.census.gov`** must be in `NO_PROXY` — direct connection (not via proxy)
- **GitHub provider version** must be `>= 6.11.0` — required by `CSVD/terraform-github-repo` module's `~> 6.11` constraint
- **`provider "github" { insecure = true }`** in `providers.tf` — belt-and-suspenders for TLS
- Pass `vpc_name` (string) — **not** `vpc_id` — or `is_eks_deployment` in the Lambda returns `False`
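
The provider version note above can be checked with a small worked example. The sketch below is a simplified re-implementation of version constraint matching for illustration only, not Terraform's own resolver: `satisfies`, `parse`, and `pad` are hypothetical helpers that handle only the `>=`, `<`, and `~>` operators, and `~>` bounds need at least two components. It shows why the earlier workspace constraint (`>= 6.6.0, < 6.7.0`) could never be resolved together with the module's `~> 6.11`.

```python
def parse(text: str) -> tuple[int, ...]:
    """Turn '6.11.0' into (6, 11, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


def pad(version: tuple[int, ...], width: int = 3) -> tuple[int, ...]:
    """Right-pad with zeros so (6, 11) compares equal to (6, 11, 0)."""
    return version + (0,) * (width - len(version))


def satisfies(version: str, constraint: str) -> bool:
    """Check one version against a comma-separated constraint string."""
    v = pad(parse(version))
    for clause in constraint.split(","):
        op, bound_text = clause.split()
        bound = parse(bound_text)
        if op == ">=":
            ok = v >= pad(bound)
        elif op == "<":
            ok = v < pad(bound)
        elif op == "~>":
            # Pessimistic operator: only the rightmost given component may grow,
            # so '~> 6.11' means >= 6.11.0 and < 7.0.0.
            upper = bound[:-2] + (bound[-2] + 1,)
            ok = pad(bound) <= v < pad(upper)
        else:
            raise ValueError(f"unsupported operator: {op}")
        if not ok:
            return False
    return True


old_workspace = ">= 6.6.0, < 6.7.0"
module = "~> 6.11"
print(satisfies("6.6.0", old_workspace), satisfies("6.6.0", module))
print(satisfies("6.11.0", ">= 6.11.0"), satisfies("6.11.0", module))
```

Under these simplified rules no 6.x release satisfies both the old workspace constraint and the module constraint, while `6.11.0` satisfies both `>= 6.11.0` and `~> 6.11`.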

---

## What NOT to Do

- ❌ Do not switch back to a Lambda-Python-only repo creation approach — all repo creation must run through CodeBuild + this Terraform workspace (single maintenance point)
- ❌ Do not use `HappyPathway/terraform-github-repo` **public** module — pins `github ~> 6.0`, conflicts with `>= 6.6.0` requirement
- ✅ DO use `CSVD/terraform-github-repo` (https://github.e.it.census.gov/CSVD/terraform-github-repo) — internal, supports `template_repo` + `managed_extra_files`
- ❌ Do not use SSH-based remote module sources (`git::ssh://`) — Census proxy blocks SSH host key exchange; use HTTPS
- ❌ Do not add `vpc_id` as a parameter to SC product templates — use `vpc_name`
- ❌ Do not write temp files or command output to `/tmp` — use `~/tmp` (i.e. `/home/a/arnol377/tmp`) instead
- ❌ Do not use the `terraform` command directly — always use the `tf` alias (e.g. `tf plan`, `tf apply`, `tf init`)
89 changes: 89 additions & 0 deletions buildspec.yml
@@ -0,0 +1,89 @@
version: 0.2
# buildspec.yml — terraform-eks-deployment / eks-terragrunt-repo-creator
#
# This buildspec is used by the CodeBuild project that is triggered by the
# Lambda function (eks-terragrunt-repo-gen-template-automation) to create an
# EKS cluster GitHub repository.
#
# Required environment variables (injected by the Lambda as overrides):
#   TF_VAR_name           — cluster / repo name
#   TF_VAR_environment    — environment (dev / nonprod / prod)
#   TF_VAR_region         — AWS region (e.g. us-gov-west-1)
#   TF_VAR_cluster_config — JSON object with account_name, aws_account_id, etc.
#   TF_VAR_finops         — JSON object with finops project_name / project_number
#   GITHUB_TOKEN          — GitHub PAT (passed from Lambda's Secrets Manager read)
#   GITHUB_OWNER          — GitHub org (default: SCT-Engineering)
#   GITHUB_BASE_URL       — GHE base URL (e.g. https://github.e.it.census.gov)

env:
  variables:
    TF_VERSION: "1.9.1"
    ASSETS_BUCKET: "csvd-packer-pipeline-assets"
    REPO_HOST: "github.e.it.census.gov"
    REPO_ORG: "SCT-Engineering"
    REPO_NAME: "terraform-eks-deployment"
    REPO_BRANCH: "fix/eca-copilot-instructions-and-callnotes" # update to main once merged
    # Disable TLS verification for Census GHE (Census CA cert not trusted by default)
    GIT_SSL_NO_VERIFY: "true"
    TF_VAR_run_in_codebuild: "true"
    TF_CLI_ARGS: "-no-color"
    # Census proxy — required for registry.terraform.io provider downloads
    HTTPS_PROXY: "http://proxy.tco.census.gov:3128"
    HTTP_PROXY: "http://proxy.tco.census.gov:3128"
    # Exclude AWS-internal endpoints and Census GHE from the proxy
    NO_PROXY: "169.254.169.254,169.254.170.2,s3.us-gov-west-1.amazonaws.com,s3.amazonaws.com,.amazonaws.com,.us-gov-west-1.amazonaws.com,github.e.it.census.gov"

phases:
  install:
    commands:
      # ── Install Census Bureau CA certificate ──────────────────────────────
      # The Census GHE TLS cert is issued by the Census Bureau CA which is not
      # trusted by the CodeBuild Amazon Linux 2 trust store by default.
      - |
        aws s3 cp "s3://${ASSETS_BUCKET}/certs/census-ca.pem" \
          /etc/pki/ca-trust/source/anchors/census-ca.pem 2>/dev/null \
          && update-ca-trust \
          && echo "Census CA cert installed" \
          || echo "WARNING: could not install Census CA cert (continuing anyway)"
      # ── Install Terraform ─────────────────────────────────────────────────
      - |
        if ! command -v terraform &>/dev/null; then
          TF_ZIP="terraform_${TF_VERSION}_linux_amd64.zip"
          echo "Installing Terraform ${TF_VERSION}..."
          aws s3 cp "s3://${ASSETS_BUCKET}/terraform/${TF_ZIP}" /tmp/${TF_ZIP} 2>/dev/null \
            || curl -fsSL "https://releases.hashicorp.com/terraform/${TF_VERSION}/${TF_ZIP}" -o /tmp/${TF_ZIP}
          unzip -oq /tmp/${TF_ZIP} -d /usr/local/bin/
          chmod +x /usr/local/bin/terraform
          rm /tmp/${TF_ZIP}
        fi
      - terraform version

      # ── Install Python dependencies for post-apply scripts ───────────────
      - pip3 install --quiet httpx rich

      # ── Clone terraform-eks-deployment ───────────────────────────────────
      - |
        git config --global credential.helper \
          "!f() { echo username=x-access-token; echo password=${GITHUB_TOKEN}; }; f"
        git clone --depth 1 --branch "${REPO_BRANCH}" \
          "https://${REPO_HOST}/${REPO_ORG}/${REPO_NAME}.git" \
          /tmp/eks-deploy
      - echo "Cloned ${REPO_ORG}/${REPO_NAME} @ $(git -C /tmp/eks-deploy rev-parse --short HEAD)"

  build:
    commands:
      - cd /tmp/eks-deploy
      - echo "=== terraform init ==="
      - terraform init -no-color
      - echo "=== terraform apply ==="
      - terraform apply -auto-approve -no-color

  post_build:
    commands:
      - |
        if [ "${CODEBUILD_BUILD_SUCCEEDING}" = "0" ]; then
          echo "Build FAILED — check logs above"
        else
          echo "Build SUCCEEDED — repository created"
        fi