diff --git a/.github/agents/implementation.agent.md b/.github/agents/implementation.agent.md
index 9b824e3..61f0d26 100644
--- a/.github/agents/implementation.agent.md
+++ b/.github/agents/implementation.agent.md
@@ -1,15 +1,7 @@
 ---
 name: implementation
 description: Full implementation agent. Writes code, creates files, runs tests, and commits changes based on an approved plan.
-tools:
-  - read
-  - search
-  - fetch
-  - edit
-  - create
-  - delete
-  - terminal
-  - run
+tools: [vscode, execute, read, agent, edit, search, todo]
 model: Claude Sonnet 4.6 (copilot)
 handoffs:
   - label: "🔍 Review Changes"
diff --git a/.github/agents/planner.agent.md b/.github/agents/planner.agent.md
index 44e1f00..de2db5f 100644
--- a/.github/agents/planner.agent.md
+++ b/.github/agents/planner.agent.md
@@ -4,7 +4,6 @@ description: Read-only planning agent. Analyzes the codebase, designs interfaces
 tools:
   - read
   - search
-  - fetch
   - web
 model: Claude Sonnet 4.6 (copilot)
 handoffs:
diff --git a/.github/agents/reviewer.agent.md b/.github/agents/reviewer.agent.md
index 6f81c24..18d94cb 100644
--- a/.github/agents/reviewer.agent.md
+++ b/.github/agents/reviewer.agent.md
@@ -4,7 +4,7 @@ description: Code review agent. Reviews changes for correctness, security, Censu
 tools:
   - read
   - search
-  - fetch
+  - web
 model: Claude Sonnet 4.6 (copilot)
 handoffs:
   - label: "🔧 Fix Issues"
diff --git a/.github/prompts/analyze-pr-comments.prompt.md b/.github/prompts/analyze-pr-comments.prompt.md
index 9febfdc..6f55228 100644
--- a/.github/prompts/analyze-pr-comments.prompt.md
+++ b/.github/prompts/analyze-pr-comments.prompt.md
@@ -3,10 +3,7 @@ name: analyze-pr-comments
 description: Fetch all review comments on a GHE pull request, summarize the issues, and propose or implement fixes.
 argument-hint: "[repo org/name] [PR number]"
 agent: implementation
-tools:
-  - terminal
-  - read
-  - edit
+tools: [vscode, execute, read, agent, edit, search, todo]
 ---
 
 # Analyze and Address PR Review Comments
diff --git a/.github/prompts/checkpoint-load.prompt.md b/.github/prompts/checkpoint-load.prompt.md
new file mode 100644
index 0000000..d7fc863
--- /dev/null
+++ b/.github/prompts/checkpoint-load.prompt.md
@@ -0,0 +1,43 @@
+---
+name: checkpoint-load
+description: Restore project context from CHECKPOINT.md at the start of a new session. Run this before doing any work.
+tools: [read, search, todo]
+---
+
+# Load Checkpoint
+
+Restore context from `design-docs/CHECKPOINT.md` and get ready to work.
+
+## Instructions
+
+1. Read `design-docs/CHECKPOINT.md` in full
+2. Read `design-docs/README.md` (full architecture spec)
+3. Read any files listed in the checkpoint's **Key File Index** that are marked
+   as recently changed or directly relevant to the **Current Phase**
+4. Output a session briefing in this format:
+
+---
+
+**Session Briefing - [date]**
+
+**Where we left off**: [one sentence from Last Updated]
+
+**Current phase**: [phase name + step]
+
+**Next action**: [exact next action from checkpoint]
+
+**Open questions**: [bulleted list, or "None" if clear]
+
+**Files to be aware of**: [only the ones relevant to the next action]
+
+---
+
+5. Populate the todo list with the checklist items for the current phase
+   (mark already-completed items as completed)
+6. Ask: "Ready to continue? Any updates since the last checkpoint?"
+
+## Notes
+
+- Do not start implementing anything until the user confirms
+- If `design-docs/CHECKPOINT.md` does not exist, say so and suggest running
+  the `checkpoint-save` prompt first after reviewing `design-docs/README.md`
diff --git a/.github/prompts/checkpoint-save.prompt.md b/.github/prompts/checkpoint-save.prompt.md
new file mode 100644
index 0000000..fc1a5d5
--- /dev/null
+++ b/.github/prompts/checkpoint-save.prompt.md
@@ -0,0 +1,63 @@
+---
+name: checkpoint-save
+description: Save current project state to CHECKPOINT.md. Run this at the end of any work session to preserve context for the next session.
+tools: [read, edit, search, todo]
+---
+
+# Save Checkpoint
+
+Update the file `design-docs/CHECKPOINT.md` to reflect the current project state.
+
+## Instructions
+
+1. Read the current `design-docs/CHECKPOINT.md`
+2. Read `design-docs/README.md` (architecture decisions)
+3. Scan the repo for any files created or modified since the last checkpoint
+   (check git status if possible, or look at timestamps)
+4. Rewrite `design-docs/CHECKPOINT.md` using the schema below
+
+## CHECKPOINT.md Schema
+
+The file must have exactly these sections in this order:
+
+### 1. Last Updated
+Date and a one-line summary of what changed this session.
+
+### 2. Architecture (locked in)
+Brief settled description of the pipeline. Mark anything that changed since the
+last checkpoint with `[CHANGED]`. Do not store debate - only the settled decision.
+
+### 3. Current Phase
+Which implementation phase we are in and what step within it.
+Include the phase checklist from `design-docs/README.md` with checkboxes updated.
+
+### 4. Next Action
+One or two sentences: exactly what to do first at the start of the next session.
+Be specific - include file paths and commands.
+
+### 5. Open Questions
+Bulleted list of things not yet decided or needing external input.
+Remove items that have been answered since the last checkpoint.
+
+### 6. Key File Index
+Table of every file created or modified in this repo, with a one-line purpose.
+Include files that need to be created but don't exist yet, marked `[TODO]`.
+
+### 7. External Dependencies
+Things outside this repo that must be true for the project to work:
+- AWS resources (Secrets Manager secrets, S3 buckets, ECR repos, etc.)
+- GHE repos and orgs
+- IAM roles or permissions assumed to exist
+- Other repos whose patterns we follow
+
+### 8. Constraints (never change without discussion)
+Hard rules that must not be violated. Pull from `design-docs/README.md`
+"What NOT to Do" section plus any session-specific constraints discovered.
+
+## Rules
+
+- Keep each section concise. This file is read at the start of every session -
+  it must load fast and be scannable.
+- Do NOT copy-paste large blocks from README.md. Summarize and link instead.
+- Remove stale information. If something was resolved, move it out of Open Questions.
+- Mark TODOs clearly so the next session can pick up immediately.
diff --git a/.github/prompts/new-account-repo-layer.prompt.md b/.github/prompts/new-account-repo-layer.prompt.md
index 6fa4491..af2a583 100644
--- a/.github/prompts/new-account-repo-layer.prompt.md
+++ b/.github/prompts/new-account-repo-layer.prompt.md
@@ -3,11 +3,7 @@ name: new-account-repo-layer
 description: Scaffold a new layer directory (common, infrastructure, or vpc) inside an account repo, complete with remote_state.yml, tf-run.data, and boilerplate symlink stubs.
 argument-hint: "[layer: common|infrastructure|vpc] [region: us-gov-west-1|us-gov-east-1] [account_id] [account_alias]"
 agent: implementation
-tools:
-  - read
-  - create
-  - edit
-  - terminal
+tools: [vscode, execute, read, agent, edit, search, todo]
 ---
 
 # Scaffold a New Account Repo Layer
diff --git a/.github/prompts/review-sc-template.prompt.md b/.github/prompts/review-sc-template.prompt.md
index 83bb5f2..04ac72e 100644
--- a/.github/prompts/review-sc-template.prompt.md
+++ b/.github/prompts/review-sc-template.prompt.md
@@ -3,9 +3,7 @@ name: review-sc-template
 description: Review a Service Catalog CloudFormation product template for compliance with Census Lambda interface conventions.
 argument-hint: "[path to template yaml file]"
 agent: reviewer
-tools:
-  - read
-  - search
+tools: [vscode, execute, read, agent, edit, search, todo]
 ---
 
 # Review SC Product Template
diff --git a/.github/skills/account-repo-analysis/SKILL.md b/.github/skills/account-repo-analysis/SKILL.md
index 5bd8b2b..4fbb05d 100644
--- a/.github/skills/account-repo-analysis/SKILL.md
+++ b/.github/skills/account-repo-analysis/SKILL.md
@@ -1,11 +1,6 @@
 ---
 name: account-repo-analysis
-description: >
-  Analyze or scaffold Census AWS account repos. Understands the 3-layer directory
-  structure (common/infrastructure/vpc), regional splits (east/west), tf-run DSL
-  directives, remote_state.yml schema, and the tf-run toolchain scripts. Use this
-  skill when asked to analyze an account repo, explain tf-run steps, scaffold a new
-  layer, or debug tf-directory-setup.py failures.
+description: "Analyze or scaffold Census AWS account repos. Understands the 3-layer directory structure (common/infrastructure/vpc), regional splits (east/west), tf-run DSL directives, remote_state.yml schema, and the tf-run toolchain scripts. Use this skill when asked to analyze an account repo, explain tf-run steps, scaffold a new layer, or debug tf-directory-setup.py failures."
 ---
 
 # Account Repo Analysis Skill
diff --git a/design-docs/CHECKPOINT.md b/design-docs/CHECKPOINT.md
new file mode 100644
index 0000000..3992da1
--- /dev/null
+++ b/design-docs/CHECKPOINT.md
@@ -0,0 +1,134 @@
+# CHECKPOINT
+
+## 1. Last Updated
+
+**2026-04-28** - Architecture finalized (CodeBuild all-in-one runner, GHA deferred on OIDC blocker).
+Design doc written. Checkpoint system created. Skill file description syntax fixed.
+No implementation code written yet.
+
+---
+
+## 2. Architecture (locked in)
+
+**Pipeline**: SC Console → CFN `Custom::TerraformRun` → Lambda → CodeBuild → Account Repo (`tf-run` + PR)
+
+- **Lambda** (`tf-run-executor-trigger`, csvd-dev `229685449397`, `us-gov-west-1`, 900s timeout):
+  validates inputs (Pydantic v2), fetches GHE PAT from Secrets Manager, starts CodeBuild,
+  polls every 20s, signals CFN SUCCESS/FAILED with PR URL.
+
+- **CodeBuild** (`tf-run-executor`, 60 min timeout, Amazon Linux 2):
+  installs Terraform from S3 + Census CA cert → clones account repo over HTTPS →
+  writes `EXTRA_FILES` → commits + pushes to `repo-init` → `cd //` →
+  `tf-run apply [tag:START_TAG]` (or `tf plan` if `DRY_RUN=true`) → opens PR via `gh` CLI.
+
+- **GHA**: deferred - blocked on OIDC. Buildspec designed to port directly to GHA workflow with no Lambda changes.
+
+Full spec: `design-docs/README.md`
+
+---
+
+## 3. Current Phase
+
+**Phase 1 - CodeBuild + buildspec (manual test)**
+
+- [ ] Write `buildspec.yml` at repo root
+- [ ] Write `deploy/codebuild.tf` - CodeBuild project with inline buildspec + IAM service role
+- [ ] Test manually via AWS CLI `start-build` with env var overrides against a test account repo
+- [ ] Validate: clone → write file → commit → `tf plan` in `infrastructure/west/` → PR opens
+
+Phases 2–4 not started. See `design-docs/README.md` for full phase list.
+
+---
+
+## 4. Next Action
+
+Create `buildspec.yml` at `/home/a/arnol377/git/sc-lambda-ghactions/buildspec.yml`
+using the draft in `design-docs/README.md` → CodeBuild Project Design → buildspec.yml.
+
+Then create `deploy/codebuild.tf` with the `aws_codebuild_project.tf_run_executor` resource
+and an `aws_iam_role.codebuild_exec` with S3 read, Secrets Manager read, and CloudWatch Logs write.
+
+---
+
+## 5. Open Questions
+
+- What is the correct CodeBuild service role ARN format for this account? (need to verify existing pattern in `lambda-template-repo-generator/deploy/main.tf`)
+- `tf-run` is interactive by default (prompts `y/n` between steps) - need to confirm non-interactive invocation flag or environment variable to suppress prompts in CodeBuild. Check `tf-run.sh` behavior with `CI=true` or similar.
+- Self-hosted GHA runner label (for future GHA migration) - confirm with Matt Morgan once OIDC is unblocked.
+- Should `extra_files` support base64-encoded binary content, or plain text only?
+
+---
+
+## 6. Key File Index
+
+### Design / Memory (this repo)
+
+| File | Status | Purpose |
+|------|--------|---------|
+| `design-docs/README.md` | ✅ done | Full architecture spec, buildspec draft, phase checklist |
+| `design-docs/CHECKPOINT.md` | ✅ done | This file - session memory |
+
+### Prompts (`.github/prompts/`)
+
+| File | Status | Purpose |
+|------|--------|---------|
+| `checkpoint-save.prompt.md` | ✅ done | End-of-session: rewrite CHECKPOINT.md from current state |
+| `checkpoint-load.prompt.md` | ✅ done | Start-of-session: restore context, brief + todo list |
+| `review-sc-template.prompt.md` | ✅ exists | Review SC CFN templates for Census conventions |
+| `new-account-repo-layer.prompt.md` | ✅ exists | Scaffold a new layer in an account repo |
+| `analyze-pr-comments.prompt.md` | ✅ exists | Analyze PR review comments |
+
+### Agents (`.github/agents/`)
+
+| File | Status | Purpose |
+|------|--------|---------|
+| `planner.agent.md` | ✅ exists | Planning / design agent |
+| `implementation.agent.md` | ✅ exists | Implementation agent |
+| `reviewer.agent.md` | ✅ exists | Review agent |
+
+### Skills (`.github/skills/`)
+
+| File | Status | Purpose |
+|------|--------|---------|
+| `account-repo-analysis/SKILL.md` | ✅ fixed | Account repo structure, tf-run DSL, remote_state.yml - description syntax fixed (block scalar → single-line string) |
+
+### Implementation (to be created)
+
+| File | Status | Purpose |
+|------|--------|---------|
+| `buildspec.yml` | 🔲 TODO | CodeBuild buildspec - Phase 1, step 1 |
+| `deploy/codebuild.tf` | 🔲 TODO | CodeBuild project + service role - Phase 1, step 2 |
+| `deploy/versions.tf` | 🔲 TODO | Terraform + provider version pins |
+| `lambda/app.py` | 🔲 TODO | Lambda CFN handler - Phase 2 |
+| `lambda/Dockerfile` | 🔲 TODO | Lambda container image - Phase 2 |
+| `deploy/lambda.tf` | 🔲 TODO | Lambda function + ECR repo - Phase 2 |
+| `deploy/iam.tf` | 🔲 TODO | IAM roles for Lambda + CodeBuild - Phase 2 |
+| `service-catalog/product-template.yaml` | 🔲 TODO | SC CFN product template - Phase 3 |
+| `deploy/service_catalog.tf` | 🔲 TODO | SC portfolio/product/constraint - Phase 3 |
+
+---
+
+## 7. External Dependencies
+
+| Resource | Location | Notes |
+|----------|----------|-------|
+| `ghe-runner/github-token` | Secrets Manager, csvd-dev, us-gov-west-1 | GHE PAT (`ghp_`), reused from EKS automation |
+| `s3://csvd-packer-pipeline-assets/terraform/terraform_1.9.1_linux_amd64.zip` | S3, csvd-dev | Terraform binary (registry.terraform.io blocked) |
+| `s3://csvd-packer-pipeline-assets/certs/census-ca.pem` | S3, csvd-dev | Census CA cert for GHE TLS |
+| GHE org `SCT-Engineering` | `github.e.it.census.gov` | Target account repos live here |
+| Account repos (pre-bootstrapped) | GHE `SCT-Engineering` | Must already have `remote_state.backend.tf`, `tf-run.data`, `.tf-control` |
+| `lambda-template-repo-generator/deploy/main.tf` | Local repo | Reference pattern for CodeBuild + Lambda IAM + SC wiring |
+
+---
+
+## 8. Constraints (never change without discussion)
+
+- ❌ Never `terraform` directly - always `tf` alias (tf-control.sh symlink)
+- ❌ Never SSH clone - Census proxy blocks SSH; always `https://@github.e.it.census.gov/...`
+- ❌ Never hardcode `aws-us-gov` in ARNs - use `${AWS::Partition}`
+- ❌ Never add `aws_account_id` or `aws_region` as SC form parameters - use `!Sub` resolution
+- ❌ Never run tf-run from the repo root - always `cd //` first
+- ❌ Never write temp files to `/tmp` - use the CodeBuild build directory
+- ❌ Never use `HappyPathway/terraform-github-repo` - use `CSVD/terraform-github-repo`
+- ✅ Always set `HTTPS_PROXY=http://proxy.tco.census.gov:3128` + `NO_PROXY=github.e.it.census.gov,...`
+- ✅ Always use `GH_HOST=github.e.it.census.gov` for all `gh` CLI commands
diff --git a/design-docs/README.md b/design-docs/README.md
index 7e0954f..aa54790 100644
--- a/design-docs/README.md
+++ b/design-docs/README.md
@@ -1,28 +1,270 @@
 # Design Documents
 
 Architecture decisions, flow diagrams, and planning notes for the
-SC → Lambda → GitHub Actions automation.
-
-## Key Design Decisions
-
-### Why GitHub Actions instead of CodeBuild?
-
-- GHA has first-class access to repo contents without extra clone steps
-- Workflow files live in the target repo - no central runner config to maintain
-- Built-in events (`repository_dispatch`) allow Lambda to trigger specific workflows
-- Easier to test locally via `act`
-
-### Flow
-
-1. User provisions SC product → fills form (cluster name, account, VPC, etc.)
-2. CFN creates `Custom::*` resource with `ServiceToken` pointing to Lambda ARN
-3. Lambda:
-   - Validates inputs (Pydantic model)
-   - Fetches GHE token from Secrets Manager
-   - POSTs `repository_dispatch` to target account repo on GHE
-   - Polls GHA run status until complete (or Lambda deadline)
-   - Returns repo URL + PR URL to CFN
-4. GHA workflow receives `repository_dispatch` event:
-   - Clones the account repo
-   - Renders HCL/YAML files from templates
-   - Commits + opens PR (`repo-init` → `main`)
+SC → Lambda → CodeBuild → Account Repo → tf-run automation.
+
+---
+
+## Architecture Overview
+
+**Runner: CodeBuild** (not GitHub Actions - GHA is blocked on OIDC setup).
+The workflow logic is designed to be portable to GHA once that blocker clears.
+
+CodeBuild handles everything after the SC form submission:
+- Clone the account repo
+- Write config files into the correct layer/region directory
+- Commit + push to a branch
+- Run `tf-run` in the correct `//`
+- Open a PR via `gh` CLI
+
+---
+
+## Full Flow
+
+```
+User (SC Console)
+  └─> fills product form → submits
+        → CFN creates Custom::TerraformRun resource
+              → ServiceToken → Lambda (centralized in csvd-dev, us-gov-west-1)
+                    │
+                    ├─> Validates inputs (Pydantic v2)
+                    ├─> Fetches GHE PAT from Secrets Manager
+                    ├─> Starts CodeBuild (tf-run-executor) with env-var overrides
+                    └─> Polls CodeBuild every 20s until complete or Lambda deadline
+                          │
+                          └─> Returns PR URL + result to CFN (SUCCESS / FAILED)
+
+CodeBuild project (tf-run-executor)
+  ├─> INSTALL phase
+  │     ├─> Install Terraform binary from S3 (registry.terraform.io is blocked)
+  │     ├─> Install Census CA cert (update-ca-trust) for GHE TLS
+  │     └─> Set HTTPS_PROXY + NO_PROXY
+  │
+  └─> BUILD phase
+        ├─> git clone https://@github.e.it.census.gov/SCT-Engineering/.git
+        ├─> git checkout -B (default: repo-init)
+        ├─> Write EXTRA_FILES (JSON map of path→content) into the working tree
+        ├─> git add + git commit + git push
+        ├─> cd //
+        ├─> tf-run apply [tag:] (or tf plan if DRY_RUN=true)
+        └─> gh pr create --base main --head repo-init (or update if exists)
+```
+
+---
+
+## Service Catalog Product Form Parameters
+
+| Parameter | SC Label | Notes |
+|-----------|----------|-------|
+| `account_repo` | Account Repo Name | e.g. `229685449397-csvd-dev-platform-dev-gov` |
+| `layer` | Terraform Layer | `common`, `infrastructure`, or `vpc` |
+| `region_dir` | Region Directory | `east` or `west` |
+| `tf_run_start_tag` | Start Tag (optional) | tf-run.data TAG label to start at; empty = from beginning |
+| `extra_files` | Extra Config Files (JSON) | JSON {"relative/path": "rendered content"} written before tf-run |
+| `git_branch` | Branch | branch to commit/PR from; default `repo-init` |
+| `dry_run` | Dry Run | `true` = tf plan only, no apply |
+
+`aws_account_id` and `aws_region` are NOT user-facing - resolved via `!Sub` in CFN.
+
+---
+
+## Lambda Design
+
+**Function name**: `tf-run-executor-trigger`
+**Account**: `229685449397` (csvd-dev, us-gov-west-1)
+**Timeout**: 900s (must exceed CodeBuild poll window)
+**Runtime**: Python 3.12 container image (ECR)
+
+### Input model (Pydantic v2)
+
+```python
+class TfRunRequest(BaseModel):
+    account_repo: str
+    layer: Literal["common", "infrastructure", "vpc"]
+    region_dir: Literal["east", "west"]
+    tf_run_start_tag: str = ""
+    extra_files: dict[str, str] = {}
+    git_branch: str = "repo-init"
+    dry_run: bool = False
+```
+
+### Key responsibilities
+
+1. Validate inputs via TfRunRequest
+2. Fetch GHE PAT from Secrets Manager (`ghe-runner/github-token`)
+3. Build `environmentVariablesOverride` for CodeBuild:
+   - `ACCOUNT_REPO`, `LAYER`, `REGION_DIR`, `TF_RUN_START_TAG`
+   - `EXTRA_FILES` (JSON string), `GIT_BRANCH`, `DRY_RUN`
+   - `GITHUB_TOKEN` (type `PLAINTEXT`, value from Secrets Manager)
+4. Call `codebuild:StartBuild` with overrides
+5. Poll every 20s via `codebuild:BatchGetBuilds` until `buildStatus != IN_PROGRESS`
+6. On `SUCCEEDED`: extract PR URL from build output → signal CFN SUCCESS
+7. On `FAILED`/`FAULT`/`STOPPED`: signal CFN FAILED with build log tail as reason
+
+---
+
+## CodeBuild Project Design
+
+**Project name**: `tf-run-executor`
+**Image**: `aws/codebuild/amazonlinux2-x86_64-standard:3.0`
+**Timeout**: 60 minutes
+**Service role**: new IAM role (S3 read, Secrets Manager read, CloudWatch Logs write)
+**Buildspec**: inline from `buildspec.yml` in this repo
+
+### buildspec.yml
+
+```yaml
+version: 0.2
+
+env:
+  variables:
+    GITHUB_BASE_URL: "https://github.e.it.census.gov"
+    GITHUB_ORG: "SCT-Engineering"
+    TF_BINARY_S3: "s3://csvd-packer-pipeline-assets/terraform/terraform_1.9.1_linux_amd64.zip"
+    CENSUS_CA_S3: "s3://csvd-packer-pipeline-assets/certs/census-ca.pem"
+    HTTPS_PROXY: "http://proxy.tco.census.gov:3128"
+    NO_PROXY: "github.e.it.census.gov,169.254.169.254"
+
+phases:
+  install:
+    commands:
+      - aws s3 cp $TF_BINARY_S3 /tmp/terraform.zip
+      - unzip -o /tmp/terraform.zip -d /usr/local/bin/ && chmod +x /usr/local/bin/terraform
+      - ln -sf /usr/local/bin/terraform /usr/local/bin/tf
+      - aws s3 cp $CENSUS_CA_S3 /etc/pki/ca-trust/source/anchors/census-ca.pem
+      - update-ca-trust extract
+
+  build:
+    commands:
+      - git clone https://${GITHUB_TOKEN}@github.e.it.census.gov/${GITHUB_ORG}/${ACCOUNT_REPO}.git repo
+      - cd repo
+      - git checkout -B $GIT_BRANCH
+
+      # Write any extra config files passed in from Lambda
+      - |
+        python3 -c "
+        import json, os, pathlib
+        files = json.loads(os.environ.get('EXTRA_FILES', '{}'))
+        for path, content in files.items():
+            p = pathlib.Path(path)
+            p.parent.mkdir(parents=True, exist_ok=True)
+            p.write_text(content)
+        print(f'Wrote {len(files)} extra file(s)')
+        "
+
+      # Commit and push
+      - git add -A
+      - |
+        git -c user.email="sc-automation@census.gov" \
+            -c user.name="SC Automation" \
+            commit -m "SC automation: ${LAYER}/${REGION_DIR} [${ACCOUNT_REPO}]" \
+            --allow-empty
+      - git push origin $GIT_BRANCH
+
+      # Run tf-run (or tf plan for dry run) in the target layer/region dir
+      - cd ${LAYER}/${REGION_DIR}
+      - |
+        if [ "$DRY_RUN" = "true" ]; then
+          TFNOLOG=1 tf plan
+        elif [ -n "$TF_RUN_START_TAG" ]; then
+          tf-run apply tag:${TF_RUN_START_TAG}
+        else
+          tf-run apply
+        fi
+
+      # Open or update PR
+      - |
+        GH_HOST=github.e.it.census.gov \
+        GH_TOKEN=${GITHUB_TOKEN} \
+        gh pr create \
+          --title "SC automation: ${LAYER}/${REGION_DIR}" \
+          --body "Triggered by Service Catalog provisioning of ${ACCOUNT_REPO}." \
+          --base main \
+          --head ${GIT_BRANCH} \
+          || echo "PR already exists, skipping create"
+
+  post_build:
+    commands:
+      - echo "BUILD_RESULT=${CODEBUILD_BUILD_SUCCEEDING}"
+      - |
+        GH_HOST=github.e.it.census.gov \
+        GH_TOKEN=${GITHUB_TOKEN} \
+        gh pr view --repo ${GITHUB_ORG}/${ACCOUNT_REPO} ${GIT_BRANCH} --json url -q .url \
+          || true
+```
+
+---
+
+## Account Repo Pre-conditions
+
+For `tf-run` to succeed, the target `//` must already have:
+- `remote_state.backend.tf` (generated by `tf-directory-setup.py`)
+- `remote_state.*.tf` symlink pointing to `.s3` variant
+- `tf-run.data` with the correct step definitions
+- `.tf-control` at the repo root (Terraform version pin)
+
+These exist in any bootstrapped account repo. A separate "init" mode (future work)
+can run `tf-directory-setup.py -l none -f` for first-time setup.
+
+---
+
+## Infrastructure to Deploy (deploy/)
+
+| Resource | File | Purpose |
+|----------|------|---------|
+| `aws_lambda_function.tf_run_trigger` | `lambda.tf` | CFN Custom Resource handler |
+| `aws_ecr_repository.tf_run_trigger` | `lambda.tf` | Lambda container image |
+| `aws_codebuild_project.tf_run_executor` | `codebuild.tf` | Clones repo, writes files, runs tf-run, opens PR |
+| `aws_iam_role.lambda_exec` | `iam.tf` | Lambda execution role |
+| `aws_iam_role.codebuild_exec` | `iam.tf` | CodeBuild service role (S3, SM, logs) |
+| `aws_servicecatalog_portfolio` | `service_catalog.tf` | SC portfolio |
+| `aws_servicecatalog_product` | `service_catalog.tf` | SC product (CFN template) |
+| `aws_servicecatalog_constraint` | `service_catalog.tf` | Launch constraint |
+
+Reused: `ghe-runner/github-token` in Secrets Manager (same PAT as EKS automation).
+
+---
+
+## Implementation Phases
+
+### Phase 1 - CodeBuild + buildspec (manual test)
+- [ ] Write `buildspec.yml`
+- [ ] Write `deploy/codebuild.tf`
+- [ ] Test via AWS CLI `start-build` with env overrides against a test account repo
+- [ ] Validate: clone → write file → commit → tf plan → PR opens
+
+### Phase 2 - Lambda
+- [ ] Write `lambda/app.py` - Pydantic model + CodeBuild trigger + poll loop + cfn-response
+- [ ] Write `lambda/Dockerfile`
+- [ ] Write `deploy/lambda.tf` + `deploy/iam.tf`
+- [ ] Local test with synthetic CFN Create event
+
+### Phase 3 - Service Catalog product
+- [ ] Write `service-catalog/product-template.yaml` - CFN Parameters + `Custom::TerraformRun`
+- [ ] Write `deploy/service_catalog.tf`
+- [ ] End-to-end test in csvd-dev: SC provision → CodeBuild → tf-run → PR → CFN SUCCESS
+
+### Phase 4 - Polish + GHA migration path
+- [ ] CloudWatch dashboard (build history, PR links)
+- [ ] SNS alert on FAILED builds
+- [ ] Document GHA migration path (replace CodeBuild with `repository_dispatch` once OIDC unblocked)
+
+---
+
+## GHA Migration Path (deferred - blocked on OIDC)
+
+The CodeBuild buildspec steps map directly to GHA workflow steps.
+When OIDC is available, CodeBuild can be replaced by a `repository_dispatch`
+trigger on the account repo with a `.github/workflows/tf-run.yml` that
+runs the same clone → write → tf-run → PR steps. No Lambda changes required.
+
+---
+
+## What NOT to Do
+
+- Do not use SSH clone - Census proxy blocks SSH; always HTTPS + token
+- Do not write temp files to `/tmp` - use the CodeBuild build directory
+- Do not use `terraform` directly - always use the `tf` alias (tf-control.sh symlink)
+- Do not hardcode `aws-us-gov` in ARNs - use `${AWS::Partition}`
+- Do not add `aws_account_id` or `aws_region` as SC form parameters
+- Do not run tf-run from the repo root - always `cd //` first
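
Reviewer note (not part of the diff): the "Key responsibilities" list under Lambda Design describes building `environmentVariablesOverride` and mapping the final `buildStatus` to a CFN signal. Those two steps might be sketched as below. This is a minimal, stdlib-only sketch under stated assumptions, not the planned `lambda/app.py`: the helper names `build_env_overrides` and `cfn_status_for` are hypothetical, a plain dict stands in for the `TfRunRequest` model, and treating `TIMED_OUT` as a failure is an assumption the design doc does not state.

```python
import json
from typing import Dict, List, Optional

# Statuses the design doc maps to CFN FAILED.
# TIMED_OUT is an added assumption, not listed in the design doc.
FAILED_STATUSES = {"FAILED", "FAULT", "STOPPED", "TIMED_OUT"}


def build_env_overrides(request: dict, github_token: str) -> List[Dict[str, str]]:
    """Return a list shaped like CodeBuild's environmentVariablesOverride."""
    values = {
        "ACCOUNT_REPO": request["account_repo"],
        "LAYER": request["layer"],
        "REGION_DIR": request["region_dir"],
        "TF_RUN_START_TAG": request.get("tf_run_start_tag", ""),
        # The buildspec parses EXTRA_FILES with json.loads, so pass a JSON string.
        "EXTRA_FILES": json.dumps(request.get("extra_files", {})),
        "GIT_BRANCH": request.get("git_branch", "repo-init"),
        # The buildspec compares against the lowercase literal "true";
        # str(True) would produce "True" and silently skip the dry-run branch.
        "DRY_RUN": "true" if request.get("dry_run", False) else "false",
        "GITHUB_TOKEN": github_token,
    }
    return [{"name": k, "value": v, "type": "PLAINTEXT"} for k, v in values.items()]


def cfn_status_for(build_status: str) -> Optional[str]:
    """Map a CodeBuild buildStatus to a CFN signal; None means keep polling."""
    if build_status == "SUCCEEDED":
        return "SUCCESS"
    if build_status in FAILED_STATUSES:
        return "FAILED"
    return None  # e.g. IN_PROGRESS
```

In the real handler the returned list would presumably be passed as the `environmentVariablesOverride` argument of the CodeBuild `StartBuild` call, and `cfn_status_for` would be evaluated on each 20s poll until it returns a non-`None` value or the Lambda deadline approaches.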