# EKS Cluster Automation (ECA) — Demo Script

**Audience**: Platform Engineering, DevOps leads, team managers
**Duration**: ~20 minutes
**Goal**: Show that an engineer can go from zero to a fully-configured, review-ready GitHub repository for an EKS cluster in under 5 minutes, by filling out a form in the AWS console.

---

## 0. Pre-Demo Setup (before the audience arrives)

- [ ] Log into the AWS console in the account where the SC portfolio is shared (not csvd-dev)
- [ ] Have the Service Catalog console open at the "EKS Terragrunt Repository Creator Portfolio"
- [ ] Have a second browser tab ready with GitHub Enterprise (`github.e.it.census.gov/SCT-Engineering`)
- [ ] Have a third tab ready with the CodeBuild console in csvd-dev (`us-gov-west-1`)
- [ ] Have `awscreds` loaded in a terminal for CLI fallback
- [ ] Pre-stage demo values (fill in the table below with real values for the demo cluster):

| Parameter | Demo Value |
|-----------|------------|
| Repository Name (`project_name`) | `demo-eks-cluster-XX` |
| EKS Cluster Name | *(leave blank — defaults to repo name)* |
| Environment | `dev` |
| AWS Region | `us-gov-west-1` |
| Account Name | `csvd-dev-ew` |
| AWS Account ID | `229685449397` |
| Environment Abbr | `dev` |
| VPC Name | `csvd-dev-ew-vpc-01` |
| VPC Domain Name | `dev.inf.csp1.census.gov` |
| Owning Team | `tf-module-admins` |
| Creator Username | *(your GHE username for admin access)* |

---

## 1. Opening — The Problem (2 min)

> "Before this system existed, setting up the infrastructure side of a new EKS cluster
> meant manually creating a GitHub repository, copying files from a reference template,
> filling in account IDs, VPC names, region values, and cluster-specific configuration
> into seven different Terragrunt HCL files — by hand. That's error-prone, it takes
> significant time from a senior engineer, and it has to happen before any infrastructure
> work can even start."

**Talking points:**

- Every EKS cluster needs a dedicated GitHub repository with a precise directory structure
- The repo contains Terragrunt wrappers for ~15 Terraform modules (networking, compute, IAM, observability)
- Values like cluster name, VPC name, account ID, and environment must be consistent across all files
- A single typo at this stage cascades into failed `terragrunt apply` runs later
- Teams were blocked waiting for platform engineers to set this up

---

## 2. The Solution — Service Catalog Self-Service (1 min)

> "What we've built is a Service Catalog product. A team lead fills out one form — the same
> kind of form used to request a database or an EC2 instance — and the system takes care of
> everything: creates the repository, populates all the configuration files with the right
> values, and opens a pull request. The engineer reviews it, merges, and they're ready to start
> deploying their cluster."

**Talking points:**

- Self-service: no platform engineer required to set up a new cluster repo
- Runs from any AWS account in the org — teams use their own console
- Input validation on the form catches typos before anything is created
- Outputs a direct link to the repository and the open PR

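
If asked how the input validation works: CloudFormation parameter constraints reject bad values before the Lambda ever runs. A minimal sketch, with hypothetical parameter names and patterns (the real template's schema may differ):

```yaml
# Illustrative excerpt only -- parameter names and regexes are assumptions,
# not copied from the actual product template.
Parameters:
  ProjectName:
    Type: String
    Description: Repository / cluster name (lowercase letters, digits, hyphens)
    AllowedPattern: "^[a-z][a-z0-9-]{2,38}$"
    ConstraintDescription: must be lowercase alphanumeric with hyphens
  Environment:
    Type: String
    AllowedValues: [dev, test, prod]
  AwsAccountId:
    Type: String
    AllowedPattern: "^[0-9]{12}$"
```

A typo in the account ID fails the launch form immediately, instead of surfacing later as a broken `terragrunt apply`.
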
---

## 3. Architecture Overview (3 min)

*Draw or point to the ASCII diagram on screen:*

```
Any AWS Account (team's account)       csvd-dev (229685449397)
─────────────────────────────────      ──────────────────────────────────────
Service Catalog console
  → fill out form
  → click "Launch"
  → CloudFormation stack ──────────→  Lambda (cross-account invocation)
                                        ├─ reads GitHub PAT from Secrets Manager
                                        ├─ starts CodeBuild with cluster params as env vars
                                        │    ↓
                                        │  CodeBuild (terraform-eks-deployment)
                                        │    ├─ installs Terraform from S3
                                        │    ├─ installs Census CA cert
                                        │    ├─ clones terraform-eks-deployment from GHE
                                        │    └─ terraform apply
                                        │        └─ CSVD/terraform-github-repo module
                                        │             ├─ creates repo from template-eks-cluster
                                        │             ├─ renders 7 HCL files + config.json
                                        │             └─ opens PR (repo-init → main)
                                        └─ returns repo URL + PR URL
  ← CFN stack outputs (repo + PR URLs) ──
  ← SC product status: AVAILABLE
```

**Talking points:**

- The product is shared at the **OU level** — any account in the org can provision it
- All compute runs **centrally in csvd-dev** — no Lambda or CodeBuild needed in the team's account
- CloudFormation natively supports cross-account Lambda invocation via the `ServiceToken` ARN
- The launch role (`r-ent-servicecatalog-eksterragrunt-launch-role`) is deployed to every account automatically via a **CloudFormation StackSet** — zero manual setup per account
- The Lambda is a thin orchestrator — it just starts CodeBuild and polls for completion
- Actual repo creation is pure Terraform (`terraform-eks-deployment` workspace) — auditable, testable, repeatable

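
The cross-account hop is plain CloudFormation: the team's stack contains a Custom Resource whose `ServiceToken` points at the central Lambda. A sketch of the shape (resource type, property names, and ARN are illustrative assumptions, not the real template):

```yaml
# Hypothetical Custom Resource -- ServiceToken may target a Lambda in
# another account, provided that function's resource policy permits it.
Resources:
  EksRepoCreator:
    Type: Custom::EksTerragruntRepo
    Properties:
      ServiceToken: arn:aws-us-gov:lambda:us-gov-west-1:229685449397:function:eks-terragrunt-repo-gen-template-automation
      ProjectName: demo-eks-cluster-01
      Environment: dev

Outputs:
  RepositoryUrl:
    Value: !GetAtt EksRepoCreator.RepositoryUrl
  PullRequestUrl:
    Value: !GetAtt EksRepoCreator.PullRequestUrl
```
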
---

## 4. Live Demo — Provisioning the Product (8 min)

### Step 1: Open Service Catalog

*Navigate to Service Catalog → Products list (or the portfolio directly)*

> "Here's the EKS Terragrunt Repository Creator product. This is available to any team
> with the inf-admin SSO role in any org account."

### Step 2: Launch the product

*Click "Launch product" → enter a name for the provisioned product*

> "I'll give this provisioned product an instance name — this is just the SC record name,
> not the repo name."

### Step 3: Fill in the parameters

*Walk through the form sections:*

**Cluster Configuration:**

> "Repository name becomes the cluster name by default. We use lowercase and hyphens only
> because this name flows into Kubernetes resource names. Environment is dev — this
> controls which Terragrunt environment directory gets created."

**Account Configuration:**

> "These three fields — account name, account ID, and environment abbreviation — get
> stamped into multiple HCL files. In the old world, you'd copy-paste these by hand
> into each file. Here, you type them once."

**VPC Configuration:**

> "VPC Name and VPC Domain Name. Note: this is the VPC name string, not a VPC ID.
> The Terragrunt configuration uses names for readability and to avoid hardcoding
> account-specific IDs in source control."

**Contact & FinOps:**

> "Team information and cost allocation tags. These get embedded in the cluster
> configuration and inherited by all Terraform runs against this cluster."

**Creator Username (optional):**

> "If you provide your GitHub username here, you'll get direct admin access to the
> created repo in addition to the owning team."

### Step 4: Launch and watch

*Click "Launch" → switch to the CloudFormation tab*

> "CloudFormation creates a stack with a single Custom Resource. That Custom Resource
> invokes our Lambda function in csvd-dev, cross-account."

*Switch to the CodeBuild console tab in csvd-dev*

> "We can see the CodeBuild build starting. The Lambda passed all the cluster parameters
> as environment variables — TF_VAR_name, TF_VAR_region, TF_VAR_cluster_config as JSON,
> and so on. CodeBuild is now downloading Terraform from our internal S3 assets bucket,
> installing the Census CA cert so it can reach GitHub Enterprise, and running
> terraform init followed by terraform apply."

*[Wait ~2-3 minutes for CodeBuild to complete, or use a pre-completed build if time is short]*

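
If someone asks how the parameters travel from the form to Terraform: the Lambda translates them into CodeBuild environment variable overrides. A minimal sketch, assuming hypothetical parameter and function names (only `TF_VAR_name`, `TF_VAR_region`, and `TF_VAR_cluster_config` appear in the narration above):

```python
# Sketch of the orchestrator's parameter mapping -- names are illustrative,
# not copied from the real Lambda.
import json


def build_env_overrides(params: dict) -> list[dict]:
    """Convert Service Catalog form parameters into the
    environmentVariablesOverride list accepted by codebuild:StartBuild."""
    cluster_config = {
        "environment": params["Environment"],
        "account_id": params["AwsAccountId"],
        "vpc_name": params["VpcName"],
    }
    return [
        {"name": "TF_VAR_name", "value": params["ProjectName"], "type": "PLAINTEXT"},
        {"name": "TF_VAR_region", "value": params["AwsRegion"], "type": "PLAINTEXT"},
        # Nested settings travel as a single JSON-encoded string.
        {"name": "TF_VAR_cluster_config", "value": json.dumps(cluster_config), "type": "PLAINTEXT"},
    ]


# Inside the handler this would feed boto3, e.g.:
#   boto3.client("codebuild").start_build(
#       projectName="eks-terragrunt-repo-creator",
#       environmentVariablesOverride=build_env_overrides(event["ResourceProperties"]),
#   )
```

Terraform picks up any `TF_VAR_*` environment variable as an input variable, so no wrapper scripting is needed on the CodeBuild side.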
### Step 5: Show the results

*Switch to GitHub Enterprise → SCT-Engineering org*

> "The repository has been created from the `template-eks-cluster` template.
> Let's look at what's in it."

**Walk through the generated files:**

| File | What to say |
|------|-------------|
| `config.json` | "Top-level configuration file — all cluster metadata in one place. This is what the Terragrunt modules read to self-configure." |
| `root.hcl` | "Root Terragrunt config — sets the remote state bucket prefix and environment." |
| `<env>/account.hcl` | "Account-level config — account name, ID, and profile. Shared by all clusters in this account." |
| `<env>/<region>/region.hcl` | "Region-level config — inherited by all VPCs and clusters in this region." |
| `<env>/<region>/<vpc>/vpc.hcl` | "VPC-level config — domain name and routing configuration for this cluster's network." |
| `<env>/<region>/<vpc>/<cluster>/cluster.hcl` | "The leaf node — cluster-specific values: name, node group sizes, mailing list, FinOps tags." |
| `README.md` | "Pre-populated with cluster name, environment, and links to relevant runbooks." |

> "Every value in every file was rendered from the form inputs. No copy-paste, no manual edits."

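
For orientation when showing `config.json`, it carries roughly the demo values from the form. An illustrative shape only (field names are assumptions — the real schema is defined by the template repo):

```json
{
  "cluster_name": "demo-eks-cluster-01",
  "environment": "dev",
  "region": "us-gov-west-1",
  "account_name": "csvd-dev-ew",
  "account_id": "229685449397",
  "vpc_name": "csvd-dev-ew-vpc-01",
  "vpc_domain_name": "dev.inf.csp1.census.gov",
  "owning_team": "tf-module-admins"
}
```
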
*Click to the "Pull requests" tab*

> "And there's the PR: `repo-init` into `main`. The engineer reviews this — checks the
> values look right, confirms the directory structure is correct — then merges. At that
> point their repository is production-ready and they can start running Terragrunt to
> deploy the actual AWS infrastructure."

### Step 6: Back to Service Catalog

*Switch back to the SC console*

> "The provisioned product shows AVAILABLE. The outputs are the repository URL and PR URL
> — so a team lead provisioning this from, say, a CI account or their sandbox account,
> gets these links back directly in the console without ever touching GitHub."

---

## 5. Infrastructure as Code Story (3 min)

> "The product itself is entirely defined as code — nothing was clicked into existence."

**Three repos, three layers:**

| Layer | Repo | What it deploys |
|-------|------|-----------------|
| Compute (prerequisite) | `lambda-template-repo-generator` | Lambda function, CodeBuild project, IAM roles, ECR image via Packer pipeline |
| SC product + portfolio | `terraform-service-catalog-census` | SC portfolio, product, launch role StackSet, S3 template object |
| Runtime logic | `terraform-eks-deployment` | The Terraform that actually creates the GitHub repo + HCL files |

**Talking points:**

- `lambda-template-repo-generator/deploy/` is a standard Terraform root — `tf apply` deploys the Lambda
- `terraform-service-catalog-census` is a Terragrunt monorepo — the EKS product is one of ~6 SC products managed there
- The Lambda buildspec is stored in `terraform-eks-deployment/buildspec.yml` and inlined into the CodeBuild project at `tf apply` time — so updating the build steps is a normal PR workflow
- The launch role (`eks-terragrunt-launch-role.yaml`) is a CloudFormation template deployed via StackSet — so it exists in every shared account automatically when a new account joins the OU

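
The runtime layer's file rendering reduces to Terraform `templatefile` calls feeding the repo module. A rough sketch, assuming hypothetical template paths and variable names (the real locals block lives in `terraform-eks-deployment/main.tf`):

```hcl
# Illustrative only -- names and structure are assumptions, not the
# actual rendered_files block.
locals {
  rendered_files = {
    "config.json" = templatefile("${path.module}/templates/config.json.tftpl", {
      cluster_name = var.name
      environment  = var.cluster_config.environment
    })
    "${var.cluster_config.environment}/account.hcl" = templatefile(
      "${path.module}/templates/account.hcl.tftpl", {
        account_id   = var.cluster_config.account_id
        account_name = var.cluster_config.account_name
      }
    )
  }
}
```

Adding another generated file is then just another map entry plus a `.tftpl` template, which is why it's a normal PR workflow.
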
---

## 6. Key Design Decisions (2 min — if asked)

**Why CodeBuild instead of Lambda for repo creation?**

> "The CSVD Terraform GitHub module handles all the complexity: branch protection rules,
> team access, file rendering, PR creation. Rewriting that in Python would be a significant
> additional maintenance burden. CodeBuild lets us run vanilla Terraform — same code,
> same modules, same provider versions as any other Terraform workspace."

**Why is the Lambda centralized in csvd-dev instead of per-account?**

> "The Lambda makes no AWS API calls in the provisioner's account — it only talks to
> GitHub Enterprise and CodeBuild, both of which are in csvd-dev. CloudFormation's
> `ServiceToken` mechanism supports cross-account Lambda invocation natively. Deploying
> the Lambda to every account would require ECR replication, per-account CodeBuild,
> per-account Secrets Manager secrets — real complexity with no benefit."

**Why Service Catalog instead of a plain API or CLI tool?**

> "Service Catalog gives us SSO-integrated access control, the provisioned product
> lifecycle (create/update/terminate), and a UI that non-engineers can use. It also
> lets the platform team define exactly what inputs are allowed — the form parameters
> are validated by CloudFormation before anything runs."

---

## 7. Cleanup / Termination

> "If the provisioned product is terminated in Service Catalog, CloudFormation deletes
> the stack. The GitHub repository itself is not deleted — archive-on-destroy is disabled.
> This is intentional: you don't want infrastructure code deleted automatically."

---

## 8. Q&A Prep

| Likely question | Answer |
|----------------|--------|
| Can I update the repository after it's created? | Not through SC — that's intentional. After provisioning, the repo is owned by the team. SC is for initial creation only. |
| What happens if CodeBuild fails? | The Lambda returns FAILED to CloudFormation, CFN rolls back the stack, and SC shows the product as FAILED with the error message. |
| Can we add more HCL files to what gets generated? | Yes — edit `terraform-eks-deployment/main.tf` (the `rendered_files` locals block). It's all Terraform `templatefile` calls. |
| How are GitHub tokens managed? | Two tokens in Secrets Manager: an App installation token for the Lambda's Python API calls, and a PAT for the Terraform GitHub provider (passed to CodeBuild). |
| How does this scale to many accounts? | The StackSet auto-deploys the launch role to every account that joins the OU. No manual setup per account. |
| Can we version the product? | Yes — bump `product_version` in `configurations/products/eks-terragrunt-repo/EKS_REPO.yaml.tftpl` and add a new CFN template. Old provisioned products keep their version. |

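
For the CodeBuild-failure question, the status mapping the Lambda presumably performs can be sketched as a small pure function (function name and return shape are assumptions, not the real handler):

```python
# Illustrative sketch: translate a terminal CodeBuild build status into the
# status a CloudFormation custom resource must report back.
def cfn_status_for_build(build_status: str) -> tuple[str, str]:
    """Return (cfn_status, reason) for a terminal CodeBuild status."""
    if build_status == "SUCCEEDED":
        return ("SUCCESS", "repository created")
    # FAILED, FAULT, STOPPED, and TIMED_OUT all surface as a CFN failure,
    # which rolls back the stack and marks the SC product FAILED.
    return ("FAILED", f"CodeBuild finished with status {build_status}")
```

Because the failure reason is passed through to CloudFormation, the error message is visible directly in the SC console without digging into CloudWatch logs.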
---

## Appendix: Useful URLs for the Demo

| Resource | URL |
|----------|-----|
| SC console | AWS console → Service Catalog → Products |
| CodeBuild builds | AWS console → CodeBuild → Build history → `eks-terragrunt-repo-creator` |
| Lambda logs | CloudWatch → Log groups → `/aws/lambda/eks-terragrunt-repo-gen-template-automation` |
| GHE org | `https://github.e.it.census.gov/SCT-Engineering` |
| template-eks-cluster | `https://github.e.it.census.gov/SCT-Engineering/template-eks-cluster` |