From e6547ed0a07eaddd227ba8ab7b278f03e4896a91 Mon Sep 17 00:00:00 2001
From: Your Name
Date: Tue, 14 Apr 2026 14:02:08 -0400
Subject: [PATCH] docs: add ECA demo script with talking points and Q&A prep

---
 docs/DEMO_SCRIPT.md | 273 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 273 insertions(+)
 create mode 100644 docs/DEMO_SCRIPT.md

diff --git a/docs/DEMO_SCRIPT.md b/docs/DEMO_SCRIPT.md
new file mode 100644
index 0000000..62ca86e
--- /dev/null
+++ b/docs/DEMO_SCRIPT.md
@@ -0,0 +1,273 @@

# EKS Cluster Automation (ECA) — Demo Script

**Audience**: Platform Engineering, DevOps leads, team managers
**Duration**: ~20 minutes
**Goal**: Show that an engineer can go from zero to a fully configured, review-ready EKS cluster
GitHub repository in under 5 minutes by filling out a form in the AWS console.

---

## 0. Pre-Demo Setup (before the audience arrives)

- [ ] Log into the AWS console in the account where the SC portfolio is shared (not csvd-dev)
- [ ] Have the Service Catalog console open at the "EKS Terragrunt Repository Creator Portfolio"
- [ ] Have a second browser tab ready with GitHub Enterprise (`github.e.it.census.gov/SCT-Engineering`)
- [ ] Have a third tab ready with the CodeBuild console in csvd-dev (`us-gov-west-1`)
- [ ] Have `awscreds` loaded in a terminal for CLI fallback
- [ ] Pre-stage demo values (fill in the table below with real values for the demo cluster):

| Parameter | Demo Value |
|-----------|------------|
| Repository Name (`project_name`) | `demo-eks-cluster-XX` |
| EKS Cluster Name | *(leave blank — defaults to repo name)* |
| Environment | `dev` |
| AWS Region | `us-gov-west-1` |
| Account Name | `csvd-dev-ew` |
| AWS Account ID | `229685449397` |
| Environment Abbr | `dev` |
| VPC Name | `csvd-dev-ew-vpc-01` |
| VPC Domain Name | `dev.inf.csp1.census.gov` |
| Owning Team | `tf-module-admins` |
| Creator Username | *(your GHE username for admin access)* |

---

## 1. Opening — The Problem (2 min)

> "Before this system existed, setting up the infrastructure side of a new EKS cluster
> meant manually creating a GitHub repository, copying files from a reference template,
> and filling in account IDs, VPC names, region values, and cluster-specific configuration
> across seven different Terragrunt HCL files — by hand. That's error-prone, it takes
> significant time from a senior engineer, and it has to happen before any infrastructure
> work can even start."

**Talking points:**
- Every EKS cluster needs a dedicated GitHub repository with a precise directory structure
- The repo contains Terragrunt wrappers for ~15 Terraform modules (networking, compute, IAM, observability)
- Values like cluster name, VPC name, account ID, and environment must be consistent across all files
- A single typo at this stage cascades into failed `terragrunt apply` runs later
- Teams were blocked waiting for platform engineers to set this up

---

## 2. The Solution — Service Catalog Self-Service (1 min)

> "What we've built is a Service Catalog product. A team lead fills out one form — the same
> kind of form used to request a database or an EC2 instance — and the system takes care of
> everything: it creates the repository, populates all the configuration files with the right
> values, and opens a pull request. The engineer reviews it, merges, and they're ready to start
> deploying their cluster."
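
*If asked how typos get caught: the form parameters are validated by CloudFormation before anything runs (see "Key Design Decisions" below). A hedged Terraform-side equivalent of that check, with a hypothetical variable name:*

```hcl
# Illustrative only. The production check is CloudFormation parameter
# validation on the launch form; this variable shows the equivalent guard
# on the Terraform side of the pipeline.
variable "name" {
  type        = string
  description = "Repository name; becomes the EKS cluster name by default."

  validation {
    # Lowercase letters, digits, and hyphens only, because the name flows
    # into Kubernetes resource names with that restricted character set.
    condition     = can(regex("^[a-z][a-z0-9-]*$", var.name))
    error_message = "Use lowercase letters, digits, and hyphens only."
  }
}
```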

**Talking points:**
- Self-service: no platform engineer required to set up a new cluster repo
- Runs from any AWS account in the org — teams use their own console
- Input validation on the form catches typos before anything is created (sketched above)
- Outputs a direct link to the repository and the open PR

---

## 3. Architecture Overview (3 min)

*Draw or point to the ASCII diagram on screen:*

```
Any AWS Account (team's account)          csvd-dev (229685449397)
────────────────────────────────          ──────────────────────────────────────
Service Catalog console
  → fill out form
  → click "Launch"
  → CloudFormation stack ───────────────→ Lambda (cross-account invocation)
                                          ├─ reads GitHub PAT from Secrets Manager
                                          ├─ starts CodeBuild with cluster params as env vars
                                          │      ↓
                                          │  CodeBuild (terraform-eks-deployment)
                                          │  ├─ installs Terraform from S3
                                          │  ├─ installs Census CA cert
                                          │  ├─ clones terraform-eks-deployment from GHE
                                          │  └─ terraform apply
                                          │      └─ CSVD/terraform-github-repo module
                                          │          ├─ creates repo from template-eks-cluster
                                          │          ├─ renders 7 HCL files + config.json
                                          │          └─ opens PR (repo-init → main)
                                          └─ returns repo URL + PR URL
  ← CFN stack outputs (repo + PR URLs) ───
  ← SC product status: AVAILABLE
```

**Talking points:**
- The product is shared at the **OU level** — any account in the org can provision it
- All compute runs **centrally in csvd-dev** — no Lambda or CodeBuild needed in the team's account
- CloudFormation natively supports cross-account Lambda invocation via the `ServiceToken` ARN
- The launch role (`r-ent-servicecatalog-eksterragrunt-launch-role`) is deployed to every account
  automatically via a **CloudFormation StackSet** — zero manual setup per account
- The Lambda is a thin orchestrator — it just starts CodeBuild and polls for completion
- Actual repo creation is pure Terraform (`terraform-eks-deployment` workspace) — auditable, testable, repeatable

---

## 4. Live Demo — Provisioning the Product (8 min)

### Step 1: Open Service Catalog
*Navigate to Service Catalog → Products list (or the portfolio directly)*

> "Here's the EKS Terragrunt Repository Creator product. This is available to any team
> with the inf-admin SSO role in any org account."

### Step 2: Launch the product
*Click "Launch product" → enter a name for the provisioned product*

> "I'll give this provisioned product an instance name — this is just the SC record name,
> not the repo name."

### Step 3: Fill in the parameters
*Walk through the form sections:*

**Cluster Configuration:**
> "Repository name becomes the cluster name by default. We allow lowercase and hyphens only
> because this name flows into Kubernetes resource names. Environment is `dev` — this
> controls which Terragrunt environment directory gets created."

**Account Configuration:**
> "These three fields — account name, account ID, and environment abbreviation — get
> stamped into multiple HCL files. In the old world, you'd copy-paste these by hand
> into each file. Here, you type them once."

**VPC Configuration:**
> "VPC Name and VPC Domain Name. Note: the VPC Name is the name string, not a VPC ID.
> The Terragrunt configuration uses names for readability and to avoid hardcoding
> account-specific IDs in source control."

**Contact & FinOps:**
> "Team information and cost allocation tags. These get embedded in the cluster
> configuration and inherited by all Terraform runs against this cluster."
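
*If the audience wants to see what "embedded" means, here is a hedged sketch of the rendered leaf-level `cluster.hcl`. Key names are illustrative; the real template lives in `template-eks-cluster`:*

```hcl
# Illustrative sketch of a rendered cluster.hcl; key names are assumptions,
# not copied from template-eks-cluster. Values map straight from the form.
locals {
  cluster_name = "demo-eks-cluster-XX" # Repository Name, defaults the cluster name
  environment  = "dev"                 # Environment
  vpc_name     = "csvd-dev-ew-vpc-01"  # VPC Name (a name string, not a VPC ID)
  mailing_list = "..."                 # from the Contact section (elided here)

  # Cost allocation tags from the FinOps section, inherited by every
  # Terraform run against this cluster.
  finops_tags = {
    owning_team = "tf-module-admins"
  }
}
```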
**Creator Username (optional):**
> "If you provide your GitHub username here, you'll get direct admin access to the
> created repo in addition to the owning team."

### Step 4: Launch and watch
*Click "Launch" → switch to the CloudFormation tab*

> "CloudFormation creates a stack with a single Custom Resource. That Custom Resource
> invokes our Lambda function in csvd-dev, cross-account."

*Switch to the CodeBuild console tab in csvd-dev*

> "We can see the CodeBuild build starting. The Lambda passed all the cluster parameters
> as environment variables — `TF_VAR_name`, `TF_VAR_region`, `TF_VAR_cluster_config` as JSON,
> and so on. CodeBuild is now downloading Terraform from our internal S3 assets bucket,
> installing the Census CA cert so it can reach GitHub Enterprise, and running
> `terraform init` followed by `terraform apply`."

*[Wait ~2-3 minutes for CodeBuild to complete, or use a pre-completed build if time is short]*

### Step 5: Show the results
*Switch to GitHub Enterprise → SCT-Engineering org*

> "The repository has been created from the `template-eks-cluster` template.
> Let's look at what's in it."

**Walk through the generated files:**

| File | What to say |
|------|-------------|
| `config.json` | "Top-level configuration file — all cluster metadata in one place. This is what the Terragrunt modules read to self-configure." |
| `root.hcl` | "Root Terragrunt config — sets the remote state bucket prefix and environment." |
| `<account>/account.hcl` | "Account-level config — account name, ID, and profile. Shared by all clusters in this account." |
| `<account>/<region>/region.hcl` | "Region-level config — inherited by all VPCs and clusters in this region." |
| `<account>/<region>/<vpc>/vpc.hcl` | "VPC-level config — domain name and routing configuration for this cluster's network." |
| `<account>/<region>/<vpc>/<cluster>/cluster.hcl` | "The leaf node — cluster-specific values: name, node group sizes, mailing list, FinOps tags." |
| `README.md` | "Pre-populated with cluster name, environment, and links to relevant runbooks." |

> "Every value in every file was rendered from the form inputs. No copy-paste, no manual edits."

*Click over to the "Pull requests" tab*

> "And there's the PR: `repo-init` into `main`. The engineer reviews this — checks that the
> values look right, confirms the directory structure is correct — then merges. At that
> point their repository is production-ready and they can start running Terragrunt to
> deploy the actual AWS infrastructure."

### Step 6: Back to Service Catalog
*Switch back to the SC console*

> "The provisioned product shows AVAILABLE. The outputs are the repository URL and PR URL
> — so a team lead provisioning this from, say, a CI account or their sandbox account,
> gets these links back directly in the console without ever touching GitHub."

---

## 5. Infrastructure as Code Story (3 min)

> "The product itself is entirely defined as code — nothing was clicked into existence."
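
*A taste of that code: as the Q&A prep below notes, file generation is a `rendered_files` locals block of plain `templatefile` calls in `terraform-eks-deployment/main.tf`. A trimmed, hedged sketch; the template paths and keys are illustrative:*

```hcl
# Hedged sketch of the rendered_files pattern in terraform-eks-deployment/main.tf.
# Template paths and variable keys are illustrative; the templates are assumed
# to live alongside this file.
locals {
  rendered_files = {
    "config.json" = templatefile("${path.module}/templates/config.json.tftpl", {
      cluster_name = var.name # arrives as TF_VAR_name from the Lambda
      environment  = var.environment
      vpc_name     = var.vpc_name
    })
    "root.hcl" = templatefile("${path.module}/templates/root.hcl.tftpl", {
      environment = var.environment
    })
    # ...same pattern for the remaining HCL files and README.md
  }
}
```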
**Three repos, three layers:**

| Layer | Repo | What it deploys |
|-------|------|-----------------|
| Compute (prerequisite) | `lambda-template-repo-generator` | Lambda function, CodeBuild project, IAM roles, ECR image via a Packer pipeline |
| SC product + portfolio | `terraform-service-catalog-census` | SC portfolio, product, launch role StackSet, S3 template object |
| Runtime logic | `terraform-eks-deployment` | The Terraform that actually creates the GitHub repo + HCL files |

**Talking points:**
- `lambda-template-repo-generator/deploy/` is a standard Terraform root — `terraform apply` deploys the Lambda
- `terraform-service-catalog-census` is a Terragrunt monorepo — the EKS product is one of ~6 SC products managed there
- The CodeBuild buildspec is stored in `terraform-eks-deployment/buildspec.yml` and inlined into the CodeBuild
  project at `terraform apply` time — so updating the build steps is a normal PR workflow
- The launch role (`eks-terragrunt-launch-role.yaml`) is a CloudFormation template deployed via
  StackSet — so it exists in every shared account automatically when a new account joins the OU

---

## 6. Key Design Decisions (2 min — if asked)

**Why CodeBuild instead of Lambda for repo creation?**
> "The CSVD Terraform GitHub module handles all the complexity: branch protection rules,
> team access, file rendering, PR creation. Rewriting that in Python would be a significant
> additional maintenance burden. CodeBuild lets us run vanilla Terraform — same code,
> same modules, same provider versions as any other Terraform workspace."

**Why is the Lambda centralized in csvd-dev instead of per-account?**
> "The Lambda makes no AWS API calls in the provisioner's account — it only talks to
> GitHub Enterprise and CodeBuild, both of which are in csvd-dev. CloudFormation's
> `ServiceToken` mechanism supports cross-account Lambda invocation natively. Deploying
> the Lambda to every account would require ECR replication, per-account CodeBuild,
> per-account Secrets Manager secrets — real complexity with no benefit."

**Why Service Catalog instead of a plain API or CLI tool?**
> "Service Catalog gives us SSO-integrated access control, the provisioned product
> lifecycle (create/update/terminate), and a UI that non-engineers can use. It also
> lets the platform team define exactly what inputs are allowed — the form parameters
> are validated by CloudFormation before anything runs."

---

## 7. Cleanup / Termination

> "If the provisioned product is terminated in Service Catalog, CloudFormation deletes
> the stack. The GitHub repository itself is not deleted — archive-on-destroy is disabled.
> This is intentional: you don't want infrastructure code deleted automatically."

---

## 8. Q&A Prep

| Likely question | Answer |
|----------------|--------|
| Can I update the repository after it's created? | Not through SC — that's intentional. After provisioning, the repo is owned by the team. SC is for initial creation only. |
| What happens if CodeBuild fails? | The Lambda returns FAILED to CloudFormation, CFN rolls back the stack, and SC shows the product as FAILED with the error message. |
| Can we add more HCL files to what gets generated? | Yes — edit `terraform-eks-deployment/main.tf` (the `rendered_files` locals block). It's all Terraform `templatefile` calls. |
| How are GitHub tokens managed? | Two tokens in Secrets Manager: an App installation token for the Lambda's Python API calls, and a PAT for the Terraform GitHub provider (passed to CodeBuild). |
| +| How does this scale to many accounts? | The StackSet auto-deploys the launch role to every account that joins the OU. No manual setup per account. | +| Can we version the product? | Yes — bump `product_version` in `configurations/products/eks-terragrunt-repo/EKS_REPO.yaml.tftpl` and add a new CFN template. Old provisioned products keep their version. | + +--- + +## Appendix: Useful URLs for the Demo + +| Resource | URL | +|----------|-----| +| SC console | AWS console → Service Catalog → Products | +| CodeBuild builds | AWS console → CodeBuild → Build history → `eks-terragrunt-repo-creator` | +| Lambda logs | CloudWatch → Log groups → `/aws/lambda/eks-terragrunt-repo-gen-template-automation` | +| GHE org | `https://github.e.it.census.gov/SCT-Engineering` | +| template-eks-cluster | `https://github.e.it.census.gov/SCT-Engineering/template-eks-cluster` |
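
---

## Appendix: How the Generated Hierarchy Gets Consumed

*Optional Q&A backup if someone asks why the repo has four levels of HCL files. This is a hedged sketch of the standard Terragrunt consumption pattern; the specific keys are assumptions, and the real wrappers live in `template-eks-cluster`:*

```hcl
# Sketch of a module-level terragrunt.hcl in the generated repo. The functions
# (find_in_parent_folders, read_terragrunt_config) are standard Terragrunt;
# the exact keys pulled from each level are hypothetical.
include "root" {
  path = find_in_parent_folders("root.hcl")
}

locals {
  account = read_terragrunt_config(find_in_parent_folders("account.hcl"))
  region  = read_terragrunt_config(find_in_parent_folders("region.hcl"))
  vpc     = read_terragrunt_config(find_in_parent_folders("vpc.hcl"))
  cluster = read_terragrunt_config(find_in_parent_folders("cluster.hcl"))
}

inputs = {
  cluster_name = local.cluster.locals.cluster_name
  vpc_name     = local.vpc.locals.vpc_name
  account_id   = local.account.locals.aws_account_id
}
```

*Each wrapper pulls only the levels it needs, which is why a typo at repo-creation time used to cascade into every `terragrunt apply`.*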