From c8525b48bdbf8c44db2eabf10c1dd0d39ec62860 Mon Sep 17 00:00:00 2001
From: Your Name
Date: Mon, 6 Apr 2026 13:52:04 -0400
Subject: [PATCH] docs: update callnotes with Step 1-4 DONE status for pivot plan

---
 callnotes.md | 122 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 122 insertions(+)
 create mode 100644 callnotes.md

diff --git a/callnotes.md b/callnotes.md
new file mode 100644
index 0000000..5c98e0f
--- /dev/null
+++ b/callnotes.md
@@ -0,0 +1,122 @@
+Action Plan from Matt Sync
+
+1. Clean up test resources in CSVD Dev
+   - Remove or reset any test/demo items in the CSVD Dev environment.
+   - Prep the environment for a clean demo.
+   - STATUS: ✅ CLEANED UP (2026-04-06)
+     - Terminated 5 SC provisioned products: `daves-real-test`, `EKS_Cluster_GitHub_Repository-01222158`, `test-enterprise-eks-fix`, `test-tf-init-fix`, `test-full-run`
+     - Deleted 3 SC admin products: `EKS Cluster GitHub Repository` (prod-a2v6d2aecpy7o), `eks-terragrunt-eks-repo-creator` (prod-4tee6zssmvf7a), `EKS Terragrunt Repository Creator - dev` (prod-4gchyoxp2wh74)
+     - Deleted 3 SC portfolios: `eks-cluster-automation`, `eks-terragrunt-github-automation`, `EKS Terragrunt Repository Creator Portfolio`
+     - Deleted GHE repo: `SCT-Engineering/daves-lambda-test` (`daves-real-test` was never created in GHE)
+
+2. Verify EKS Automation (ECA) Works
+   - Ensure the EKS automation pipeline is functional end-to-end.
+   - No need for a polished demo, just confirm it works.
+   - STATUS: ✅ WORKING (2026-04-02)
+     - Pivoted from CodeBuild+Terraform approach to direct Lambda invocation
+     - Lambda `eks-terragrunt-repo-gen-template-automation` successfully:
+       - Created repo: https://github.e.it.census.gov/SCT-Engineering/daves-lambda-test
+       - Rendered 8 Terragrunt HCL files via Jinja2 (root.hcl, account.hcl, region.hcl, vpc.hcl, cluster.hcl, README.md, common-variables.hcl, default-versions.hcl)
+       - Created PR: https://github.e.it.census.gov/SCT-Engineering/daves-lambda-test/pull/1
+     - Fix applied: set VERIFY_SSL=false on Lambda (Census CA cert not in container's certifi bundle)
+     - No CodeBuild, no Terraform providers, no SSH keys needed
+   - STATUS: ✅ SC PRODUCT TEMPLATES SYNCED (2026-04-02)
+     - `2-0-0.yaml` (census live) was diverged from canonical `product-template.yaml`:
+       - Had `VpcId: AWS::EC2::VPC::Id` (dropdown) — Lambda needs `vpc_name` string → EKS path never triggered
+       - Missing `AccountName`, `AWSAccountId`, `AwsRegion`, `OrganizationPath`, `FinOps*` params
+       - Had a comment claiming DynamoDB lookup for missing params — that code does not exist
+     - Fixed `2-0-0.yaml` to match Lambda interface exactly (vpc_name, all required fields)
+     - Updated `product-template.yaml` to use `!Sub "arn:..."` for ServiceToken (dropped LambdaFunctionArn param)
+     - Both templates now functionally identical
+   - STATUS: ✅ COPILOT INSTRUCTIONS CREATED (2026-04-02)
+     - `lambda-template-repo-generator/.github/copilot-instructions.md` — comprehensive Lambda-first guidance
+     - `terraform-eks-deployment/.github/copilot-instructions.md` — deployment workspace context
+     - Both explicitly document the abandoned CodeBuild approach and why it was replaced
+
+3. Demo ECA Workflow
+   - Prepare to walk through the ECA (EKS Cluster Automation) process.
+   - Target is an internal demo, not a full CSVD stakeholder presentation.
+   - STATUS: ✅ SC PRODUCT REDEPLOYED VIA TERRAFORM (2026-04-06)
+     - Two deployment methods established and documented in copilot instructions:
+       1. **Direct Terraform** (`lambda-template-repo-generator/deploy/`) — canonical, use for testing/debugging
+       2. **terraform-service-catalog-census Terragrunt** — production path
+     - `tf apply` in `deploy/` recreated 6 SC resources (portfolio, product, association, principal, 2 constraints)
+     - New IDs: portfolio `port-h5qd63hw5yagq`, product `prod-lmua4oknugafg`
+     - Rule: always verify Method 1 works before debugging Method 2 (census pipeline)
+
+   - STATUS: ✅ PATH STRUCTURE FIXED AND LAMBDA REBUILT (2026-04-06)
+     - Matt feedback: "The modules in cluster folder should be in the dave-demo-test-eks/ folder"
+     - Template repo was being cloned verbatim — placeholder paths (`environment/region/vpc/cluster/eks-*/`)
+       were landing in the generated repo instead of dynamic paths (`{env}/{region}/{vpc}/{cluster}/eks-*/`)
+     - Fix: added `path_mapper` parameter to `clone_repository_contents()` in `github_provider.py`
+     - Added `build_eks_path_mapper(cfg)` in `app.py` — remaps `environment/region/vpc/cluster/{rest}`
+       → `{env}/{region}/{vpc}/{cluster}/{rest}` and excludes the 4 placeholder HCL files
+       (replaced by Jinja2-rendered versions: account.hcl, region.hcl, vpc.hcl, cluster.hcl)
+     - Lambda rebuilt via packer-pipeline (build #6 SUCCEEDED), Lambda updated:
+       `eks-terragrunt-repo-gen-template-automation` @ `lambda:latest` (sha256:af0b5...)
+     - PR #1 on dave-demo-test-eks now shows correct structure:
+       https://github.e.it.census.gov/SCT-Engineering/dave-demo-test-eks/pull/1
+
+4. Reference Matt's tfmod-pipeline
+   - Review the initial branch of https://github.e.it.census.gov/SCT-Engineering/tfmod-pipeline.
+   - Look for CodePipeline/CodeBuild setup, S3 triggers, and artifact download logic.
+   - Use as a reference for your own pipeline/code.
+
+5. Pivot: Use terraform-eks-deployment via CodeBuild (single maintenance point)
+   - PROBLEM IDENTIFIED (2026-04-06): Current Lambda Python code duplicates repo-creation logic
+     that `terraform-eks-deployment` already implements as a Terraform workspace.
+     Matt's feedback: "If you do it in Lambda Python code, we have two places to maintain for the
+     same process. If you can make it run the same TF, we have a single place to maintain."
+   - DECISION: Commit current state as a safe revert baseline, then pivot.
+   - KEY ENABLER: Internal module `CSVD/terraform-github-repo`
+     - https://github.e.it.census.gov/CSVD/terraform-github-repo
+     - Uses `github provider 6.6.0` — satisfies internal `>= 6.6.0` requirement
+     - Supports `template_repo` + `template_repo_org` — clones template-eks-cluster directly
+     - Supports `managed_extra_files` (`path` + `content`) — writes rendered HCL files
+     - Contributors: arnol377 + morga471 (Matt) — maintainable internally
+   - ARCHITECTURE (target):
+     ```
+     SC Console → CFN Stack → Custom::GitHubRepository → Lambda
+       → Lambda triggers CodeBuild job
+       → CodeBuild runs terraform-eks-deployment workspace (tf apply)
+       → Terraform (via CSVD/terraform-github-repo) creates GHE repo + writes HCL files
+       → Lambda polls CodeBuild → reports SUCCESS/FAILED to CFN
+     ```
+   - WHY THE EARLIER CODEBUILD+TERRAFORM ATTEMPT FAILED (context):
+     - That attempt used `HappyPathway/terraform-github-repo` **public** module
+       — pinned `github ~> 6.0`, conflicted with internal modules requiring `>= 6.6.0`
+     - SSH host key failures downloading remote Terraform modules
+     - AWS credential proxy issues inside CodeBuild
+     - `CSVD/terraform-github-repo` (internal) uses `github 6.6.0` — provider conflict RESOLVED
+     - `terraform-eks-deployment` doesn't pull SSH remote modules — those blockers don't apply
+   - STEP 1: Commit lambda-template-repo-generator to GHE (safe revert point)
+     - STATUS: ✅ DONE (2026-04-06) — commit a79cee4 on fix/eca-lambda-approach-and-copilot-docs
+     - Current state: fully functional Lambda Python path, path_mapper applied, built + deployed
+   - STEP 2: Refactor Lambda to be a thin orchestrator
+     - STATUS: ✅ DONE (2026-04-06) — commit ec54b54
+     - Added `start_codebuild_build()` and `poll_codebuild_build()` to app.py
+     - EKS deployment path now: validate params → start CodeBuild → poll → cfn-response
+     - Non-EKS path unchanged (still writes config.json via Python GitHub API)
+     - Lambda env var `CODEBUILD_PROJECT_NAME` = `eks-terragrunt-repo-creator`
+   - STEP 3: Update terraform-eks-deployment — module source + file paths
+     - STATUS: ✅ DONE (2026-04-06) — commit 91202ff on fix/eca-copilot-instructions-and-callnotes
+     - Already uses `CSVD/terraform-github-repo`; changed SSH → HTTPS source URL
+     - Fixed placeholder paths in `managed_extra_files`:
+       - `"environment/account.hcl"` → `"${var.environment}/account.hcl"`
+       - `"environment/region/region.hcl"` → `"${var.environment}/${var.region}/region.hcl"`
+       - `"environment/region/vpc/vpc.hcl"` → `"${var.environment}/${var.region}/${var.cluster_config.vpc_name}/vpc.hcl"`
+       - `"environment/region/vpc/cluster/cluster.hcl"` → dynamic path with all four vars
+   - STEP 4: Wire CodeBuild buildspec (`tf init → tf apply`)
+     - STATUS: ✅ DONE (2026-04-06) — commit ec4d861
+     - `buildspec.yml` added to terraform-eks-deployment root
+     - Installs Terraform from S3 assets bucket (csvd-packer-pipeline-assets)
+     - Gets GitHub token via `GITHUB_TOKEN` env var (passed by Lambda from Secrets Manager)
+     - Clones this repo and runs `terraform init + apply -auto-approve`
+     - `aws_codebuild_project.eks_repo_creator` added to lambda-template-repo-generator/deploy/
+     - NEXT: Run `tf apply` in lambda-template-repo-generator/deploy/ to create the CodeBuild project
+       and update the Lambda env var. Then rebuild Lambda image (packer-pipeline) + end-to-end test.
+     - Pass rendered HCL content via `managed_extra_files`
+     - Ensure module accepts all EKS params as variables (driven from CodeBuild env vars)
+   - STEP 4: Wire CodeBuild project to terraform-eks-deployment workspace
+     - CodeBuild buildspec: tf init → tf apply (with env var → tfvar mapping)
+   - STATUS: 🔄 PENDING — commit current state first, then begin refactor
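
Sketch of the env var → tfvar mapping mentioned above, as the Lambda side might do it. Not the code in app.py: the helper name `to_codebuild_env_overrides` and the sample parameter values are hypothetical. It relies on two documented behaviours: Terraform reads input variables from environment variables named `TF_VAR_<name>`, and CodeBuild accepts per-build variables as `name`/`value`/`type` entries. Passing the `cluster_config` object as JSON is an assumption to verify against the workspace's variable definitions.

```python
import json
from typing import Any


def to_codebuild_env_overrides(params: dict[str, Any]) -> list[dict[str, str]]:
    """Turn EKS parameters into TF_VAR_* entries for CodeBuild.

    Hypothetical helper: each key becomes TF_VAR_<key>, which Terraform
    picks up as the input variable <key> during `terraform apply`.
    """
    overrides = []
    for name, value in sorted(params.items()):
        if isinstance(value, bool):
            # Terraform booleans are lowercase.
            rendered = "true" if value else "false"
        elif isinstance(value, (dict, list)):
            # Complex values (e.g. cluster_config) are sent as JSON text.
            rendered = json.dumps(value, sort_keys=True)
        else:
            rendered = str(value)
        overrides.append(
            {"name": f"TF_VAR_{name}", "value": rendered, "type": "PLAINTEXT"}
        )
    return overrides


# Sample values are made up for illustration.
example = to_codebuild_env_overrides(
    {
        "environment": "dev",
        "region": "us-east-1",
        "cluster_config": {"vpc_name": "demo-vpc"},
    }
)
```

The resulting list could be passed as `environmentVariablesOverride` to the CodeBuild `start_build` call for the `eks-terragrunt-repo-creator` project. Secrets such as `GITHUB_TOKEN` would likely need a different `type` than `PLAINTEXT`.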