docs: generalized architecture, ADR-001 accepted, ADR-002 Vault AWS Secrets Engine #1
Open: arnol377 wants to merge 27 commits into main from feature/template-repo-rendering
+6,322
−1,757
arnol377 (Collaborator) commented May 20, 2026
- Add template repo rendering, Bedrock discussion notes, and HOW-IT-WORKS docs
- Incorporate morga471 review feedback and address Teams follow-up questions
- docs: fix outdated sections in HOW-IT-WORKS.md
- review: address morga471 round-2 feedback
- fix: source tf-run toolchain from terraform/support, not scripts/
- feat: Enhance Terraform execution with proposer and executor templates
- feat: Add initial buildspec.yml for tf-run-executor configuration
- docs: ADR-001 webhook auto-apply on merge to main (proposed)
- docs: ADR-001 update — webhook payload details, commit status writeback, .sc-automation.yml lifecycle
- docs: Update ADR-001 to clarify webhook-triggered auto-apply process
- docs: generalized architecture, webhook auto-apply ADR, Vault ADR
…KS docs
- lambda/app.py: add template_repo + template_vars fields to TfRunRequest; merge field_validator to cover both extra_files and template_vars; pass both new fields in CodeBuild environmentVariablesOverride
- buildspec.yml: add TEMPLATE_REPO/TEMPLATE_VARS env defaults; new build step clones template repo, renders .j2 files via Jinja2 StrictUndefined, copies non-.j2 files verbatim; EXTRA_FILES step runs after and overrides
- service-catalog/product-template.yaml: add TemplateRepo + TemplateVars parameters and parameter group; wire to Lambda Custom Resource
- docs/HOW-IT-WORKS.md: full end-to-end documentation of the system
- .gitignore: exclude *.tfstate, *.tfvars, .terraform/, terraform_data_dirs/
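The render-then-override order described for buildspec.yml can be illustrated with a small sketch. This is not the buildspec code: it uses a toy regex renderer as a stdlib stand-in for Jinja2 with `StrictUndefined`, handling only bare `{{ name }}` placeholders. The function names and the choice to strip the `.j2` suffix from rendered files are assumptions made for illustration.

```python
import re

# Toy stand-in for Jinja2: matches only bare placeholders such as {{ bucket }}.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def render_strict(text: str, variables: dict) -> str:
    """Substitute placeholders and fail loudly on any undefined name,
    which mimics the effect of Jinja2's StrictUndefined."""
    def replace(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined template variable: {name}")
        return str(variables[name])

    return PLACEHOLDER.sub(replace, text)


def build_tree(template_files: dict, template_vars: dict, extra_files: dict) -> dict:
    """Render .j2 files, copy other files verbatim, then apply extra_files last."""
    out = {}
    for path, content in template_files.items():
        if path.endswith(".j2"):
            # Assumption: the rendered file drops the .j2 suffix.
            out[path[: -len(".j2")]] = render_strict(content, template_vars)
        else:
            out[path] = content
    out.update(extra_files)  # runs after rendering, so it overrides template output
    return out
```

Failing on an undefined variable means a typo in TEMPLATE_VARS stops the build instead of producing a Terraform file with an empty value.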
…tions
- buildspec.yml:
  - Clone terraform/support at build time for version governance (no more hardcoded 1.9.1/2.49.0 version strings; reads VERSION files from support repo)
  - S3 env vars are now prefixes (TF_BINARY_S3_PREFIX, GH_CLI_S3_PREFIX); filenames constructed dynamically from support repo VERSION files
  - Add SSH->HTTPS git URL rewrite so Terraform module SSH sources work via PAT
  - Add conditional cross-account assume-role step (TARGET_ACCOUNT_ID)
  - Add 169.254.170.2 to NO_PROXY (AL2023 ECS credential provider)
- deploy/codebuild.tf: upgrade image to amazonlinux2023-x86_64-standard:4.0
- deploy/variables.tf: rename tf_binary_s3/gh_cli_s3 to *_prefix with updated descriptions and defaults
- lambda/app.py: add optional target_account_id field; pass TARGET_ACCOUNT_ID to CodeBuild environmentVariablesOverride
- service-catalog/product-template.yaml: add optional TargetAccountId parameter with AllowedPattern validation
- docs/HOW-IT-WORKS.md:
  - Document version governance via terraform/support
  - Note AL2023
  - Replace 'No SSH Git' constraint with SSH->HTTPS rewrite explanation
  - Add cross-account section explaining TARGET_ACCOUNT_ID and required role
  - Add 'Moving This System to a Different Account' section (Teams Q)
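As a rough illustration of the SSH->HTTPS rewrite mentioned above, the sketch below shows the URL mapping itself. In a buildspec this kind of rewrite is commonly done with git's `url.<base>.insteadOf` setting; whether this PR does it that way is not stated here. The function name and the example host `ghe.example.com` are hypothetical.

```python
import re

# scp-style form: git@host:org/repo.git
SCP_STYLE = re.compile(r"^git@(?P<host>[^:/]+):(?P<path>.+)$")
# URL form: ssh://git@host[:port]/org/repo.git
SSH_URL = re.compile(r"^ssh://git@(?P<host>[^:/]+)(?::\d+)?/(?P<path>.+)$")


def ssh_to_https(url: str) -> str:
    """Map an SSH git URL to its HTTPS equivalent; leave other URLs untouched."""
    for pattern in (SCP_STYLE, SSH_URL):
        match = pattern.match(url)
        if match:
            return f"https://{match.group('host')}/{match.group('path')}"
    return url


if __name__ == "__main__":
    print(ssh_to_https("git@ghe.example.com:terraform/support.git"))
```

Once sources resolve over HTTPS, the PAT can authenticate module fetches without an SSH key in the build container.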
- Component overview BUILD phase: add missing step 6 (assume cross-account role) and renumber steps 7-8 accordingly
- Component overview POST_BUILD: correct description: Lambda calls GHE API directly; PR_URL= line in logs is informational only (not parsed by Lambda)
- Step 7 (Commit and Push): add git -c user.email/user.name config that is actually present in buildspec.yml
- Step 10 (Lambda Returns Results): note the informational-only nature of PR_URL= more clearly
- CFN outputs table: add snake_case aliases (pull_request_url, repository_url, branch_name) that Lambda actually emits alongside PascalCase variants
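One way to emit snake_case aliases alongside PascalCase outputs is sketched below. The PascalCase key names used here (`PullRequestUrl`, `RepositoryUrl`, `BranchName`) are assumed from the snake_case names listed above and are not confirmed by the commit message. Both helper names are hypothetical.

```python
import re


def to_snake_case(name: str) -> str:
    """Insert an underscore before each capital that is not at the start, then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def with_snake_case_aliases(outputs: dict) -> dict:
    """Return a copy of outputs that also carries a snake_case alias for every key."""
    aliased = dict(outputs)
    for key, value in outputs.items():
        # setdefault keeps an explicitly provided snake_case value if one already exists
        aliased.setdefault(to_snake_case(key), value)
    return aliased
```

Note that this simple conversion splits on every capital letter, so a key such as `PullRequestURL` would not map to `pull_request_url`.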
Code changes:
- lambda/app.py: add 'global' to valid region_dir values (line 546 comment)
- service-catalog/product-template.yaml: add 'global' to RegionDir AllowedValues
Doc changes (HOW-IT-WORKS.md):
- region_dir: document 'global' as valid value for non-regional resources (SSO, IAM)
- Step 8: add tf-run plan vs tf-plan distinction; tf-plan skips symlink setup
- Bootstrapping: rewrite section: remote_state.backend.tf is created by the REMOTE_STATE directive in tf-run.data; what must pre-exist is remote_state.yml. Running 'tf-run init' is the correct first-time setup path.
- git-secret: add note that CodeBuild cannot run 'git secret reveal' (no GPG key)
- Build Timeout: add EventBridge as future improvement over Lambda polling
- SSH section: add note about service-user SSH key as alternative approach
- Why csvd-dev: add note about future move to operations accounts
- Delete: add recommended decommission path (PR-based removal + manual apply)
- Rebuild Lambda: replace personal home path with generic, use tf apply instead of manual aws lambda update-function-code CLI call
- Deploy: remove personal ~/aws-creds and /home/a/arnol377 paths
The canonical versions of tf-run, tf-control.sh, and tf-directory-setup.py live in github.e.it.census.gov/terraform/support (local-app/ subtree). We already clone that repo during the INSTALL phase for VERSION governance, so sourcing the scripts from there costs nothing extra.
- buildspec.yml: cp from /tmp/tf-support/local-app/ instead of CODEBUILD_SRC_DIR/scripts/
- scripts/: remove tf-run, tf-control.sh, tf-directory-setup.py (bundled copies); tf-run.py and tf-run.data remain (project-specific)
- docs/HOW-IT-WORKS.md: update both references (component overview + Step 3)
- Updated `TfRunRequest` model to differentiate between propose and apply actions, adding relevant fields for each.
- Refactored `start_codebuild_build` function to handle environment variable overrides based on the action type.
- Implemented logic in `lambda_handler` to manage responses for both propose and apply actions.
- Added new CloudFormation templates for the proposer and executor products, enabling structured Terraform change proposals and applications.
- The proposer template handles rendering templates and opening pull requests, while the executor template applies changes after PR approval.
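The action-dependent environment overrides might look roughly like the sketch below. It uses a plain dataclass named `RunRequest` as a hypothetical stand-in for the Pydantic `TfRunRequest` model. Which fields belong to which action, and the JSON encoding of TEMPLATE_VARS, are assumptions. The `name`/`value`/`type` entry shape is the one CodeBuild's `environmentVariablesOverride` expects.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RunRequest:
    """Hypothetical stand-in for TfRunRequest; field split per action is assumed."""
    action: str  # "propose" or "apply"
    template_repo: str = ""
    template_vars: dict = field(default_factory=dict)
    target_account_id: str = ""
    cross_account_role: str = "r-inf-terraform"


def env_overrides(req: RunRequest) -> list:
    """Build the environmentVariablesOverride list for the given action."""
    if req.action == "propose":
        values = {
            "TEMPLATE_REPO": req.template_repo,
            # Assumption: template vars travel as a JSON string.
            "TEMPLATE_VARS": json.dumps(req.template_vars, sort_keys=True),
        }
    elif req.action == "apply":
        values = {"CROSS_ACCOUNT_ROLE": req.cross_account_role}
    else:
        raise ValueError(f"unknown action: {req.action!r}")
    if req.target_account_id:
        values["TARGET_ACCOUNT_ID"] = req.target_account_id
    return [{"name": k, "value": v, "type": "PLAINTEXT"} for k, v in values.items()]
```

Keeping the mapping in one function makes it easy to see that the apply build never receives template inputs.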
…ck, .sc-automation.yml lifecycle
- docs/README.md: high-level index with reading paths by use case
- docs/HOW-IT-WORKS.md: reframe from two-product to single Proposer + webhook auto-apply; remove executor SC product framing
- docs/decisions/001-webhook-auto-apply.md: status Proposed → Accepted; update context and consequences to reflect removal of executor SC product
- docs/decisions/002-vault-aws-secrets-engine.md: new ADR for Vault AWS Secrets Engine; dynamic cross-account credentials; per-product IAM scope via Proposer terraform apply; account baseline prerequisite pattern
- docs/generalized-terraform-product-architecture.md: new
- docs/template-management.md: Executor flow, .sc-automation.yml schema
- docs/repo-vars-and-secrets.md: CodeBuild environmentVariablesOverride pattern
- docs/workflow-flowcharts.md: Mermaid diagrams for propose/apply flows
- docs/fleet-governance-at-scale.md: new
- docs/service-catalog-census-integration.md: new
- docs/cross-account-visibility.md: new
- repo-vars-and-secrets.md: remove 'Later rollout (GHA)' callout block; the executor is CodeBuild triggered by webhook, not GitHub Actions
- fleet-governance-at-scale.md: remove 'GHA executor rollout phase' note; replace with accurate CodeBuild+webhook description
Adds a dedicated section clearly describing what each CodeBuild project does, how it is triggered, what env vars it receives, and what it does/does not do — so stakeholders have a single place to understand the two-build model.
…tory-setup.py)
- buildspec-proposer.yml: install tf-directory-setup.py from terraform/support in INSTALL phase; add python-dateutil + pyyaml pip deps. BUILD phase: after template rendering, run Python bootstrap step that:
  1. Processes REMOTE-STATE directives in tf-run.data files: derives workspace remote_state.yml from layer-level file (identical to tf-run.sh behavior)
  2. Runs tf-directory-setup.py --link none in each workspace with remote_state.yml: generates remote_state.backend.tf + .tf.s3/.local/.none variant files + symlink
- buildspec-executor.yml: add note that REMOTE-STATE and tf-directory-setup.py steps are idempotent: files already exist from Proposer PR, no new files created
- docs/HOW-IT-WORKS.md: expand BUILD phase step 5 to document the full file generation sequence including REMOTE-STATE and tf-directory-setup.py; add rationale explaining why all generation must happen in the Proposer
- docs/template-management.md: fix template repo structure diagram: workspace remote_state.yml.j2 files removed (wrong); layer-level remote_state.yml.j2 shown; workspace tf-run.data with REMOTE-STATE directive shown; add layout rules for auto.tfvars profile/region requirement and .j2 source file handling. Expand Proposer Build steps to cover REMOTE-STATE + tf-directory-setup.py. Add principle callout: PR diff is the complete truth.
…; plan gitignore
buildspec-executor.yml:
- INSTALL: create terraform_latest symlink -> terraform (account repos use TFCOMMAND=terraform_latest)
- INSTALL: mkdir /data/terraform/terraform.d/plugin-cache + providers (required by .tf-control.tfrc)
- BUILD: after tf-run apply, git add symlink re-link + .terraform.lock.hcl and push directly to main with [skip ci] to prevent webhook re-trigger
- Add CodeBuild cache block for /data/terraform/terraform.d/plugin-cache (persists provider archives across builds via S3)
- Add log note: logs/ is ephemeral, must be in .gitignore
docs/HOW-IT-WORKS.md:
- INSTALL phase: document terraform_latest alias and /data/terraform dir creation
- BUILD phase step 5: document symlink re-link + lock file commit-back with rationale
docs/template-management.md:
- Template structure: add .gitignore and .terraform.lock.hcl to workspace dirs
- Layout rules: add .gitignore required entries (logs/, .terraform/, tfstate*)
- Layout rules: explain .terraform.lock.hcl lifecycle (committed, Executor updates + pushes back)
- Layout rules: explain terraform_latest alias and plugin cache/.tf-control.tfrc behavior
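The [skip ci] guard against webhook re-trigger could be checked on the receiving side roughly as follows. Where the marker is actually honored (a webhook handler, a CodeBuild filter, or elsewhere) is not stated in this commit, so the function below is a hypothetical sketch. It relies only on the `ref` and `head_commit.message` fields of a GitHub push event payload.

```python
SKIP_MARKER = "[skip ci]"
MAIN_REF = "refs/heads/main"


def should_trigger_apply(push_event: dict) -> bool:
    """Return True when a push to main should start an apply build.

    Pushes whose head commit message carries the skip marker (for example the
    Executor's own lock-file commit-back) are ignored to avoid a loop.
    """
    if push_event.get("ref") != MAIN_REF:
        return False
    head_commit = push_event.get("head_commit") or {}
    return SKIP_MARKER not in head_commit.get("message", "")
```

Without such a check, the Executor's direct push to main would look like any other merge and start another apply.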
Core principle: account repos already carry .tf-control, .tf-control.tfrc, region.tf, credentials.d/, variables.d/ from initial setup. Template repos provide only the workload-specific delta (new .tf.j2 files + tf-run.data).
Changes:
- Rewrite template-management.md opening to explain delta-overlay model and why duplicating standard files would break reusability
- Minimal real example: template-s3-bucket is 3 files total
- New-layer case: layer-level remote_state.yml provided via EXTRA_FILES (Lambda Pydantic model builds it from SC form inputs), not from template
- Remove .tf-control, .tf-control.tfrc, region.tf, credentials.d/, variables.d/ from template structure diagram (wrong/environment-specific)
- Remove outdated Lambda template organization section (old EKS-only model)
- Replace stale Executor section (was: renders templates + opens PRs) with correct model (runs tf-run apply only, commits lock+symlink back)
- Fix Adding checklist: delta files only + EXTRA_FILES note for new layers
Template repos no longer encode layer/workspace as directory nesting. LAYER and REGION_DIR are already known env vars; the Proposer uses them to determine the destination path in the account repo.
buildspec-proposer.yml:
- Add TEMPLATE_SOURCE_PATH env var (selects subdirectory variant within repo)
- Rewrite template rendering: dst_root = LAYER/REGION_DIR/ instead of '.'
- Dotfiles at template root (e.g. .sc-automation.yml) go to account repo root
- Document flat layout convention in comments
docs/template-management.md:
- Rewrite What Belongs section: flat structure, show where files land
- template-s3-bucket example is now 3 flat files (not nested infrastructure/west/)
- TEMPLATE_SOURCE_PATH explained inline with multi-variant example
- Remove old Subdirectory Templates section (replaced with inline example)
- tf-run.data, .sc-automation.yml.j2, .terraform.lock.hcl notes updated
docs/HOW-IT-WORKS.md:
- BUILD phase step 3: document flat layout + dotfile root exception
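The flat-layout placement rule can be sketched as a small path function. This is an illustration under stated assumptions, not the buildspec-proposer.yml code. The function name is hypothetical, stripping the `.j2` suffix is assumed, and the dotfile rule is applied literally to every dotfile at the template root. TEMPLATE_SOURCE_PATH handling is left out.

```python
from pathlib import PurePosixPath


def destination(template_rel_path: str, layer: str, region_dir: str) -> str:
    """Map a path inside the flat template repo to its path in the account repo."""
    rel = PurePosixPath(template_rel_path)
    if rel.suffix == ".j2":
        rel = rel.with_suffix("")  # rendered output drops the .j2 suffix (assumed)
    # Dotfiles at the template root go to the account repo root.
    if len(rel.parts) == 1 and rel.name.startswith("."):
        return str(rel)
    # Everything else lands under dst_root = LAYER/REGION_DIR/.
    return str(PurePosixPath(layer) / region_dir / rel)


if __name__ == "__main__":
    for name in ("main.tf.j2", "tf-run.data", ".sc-automation.yml.j2"):
        print(name, "->", destination(name, "infrastructure", "west"))
```

Because the destination comes from LAYER and REGION_DIR, the same template repo can be reused for any layer or region without changing its directory structure.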
…emplate model
- Template repos are flat: just .tf.j2 + tf-run.data, no nested layer/region dirs
- LAYER and REGION_DIR are Proposer env vars; files are written to the correct
path at copy time, not encoded in template directory structure
- Remove lambda/templates/{product_type}/ tree (templates live in the template repo)
- Layer-level remote_state.yml built by Lambda Pydantic model extra_files()
from validated SC form inputs, not stored in template repo
- Pydantic model example updated with account_alias field + extra_files() method
- Onboarding checklist updated: no skeleton clone, no lambda/templates/ step
… OU sharing + StackSet
… buildspec reality
… it could either support or be changed to work better with this project
…s index; create ADR-003 for Vault cluster topology; create ADR-004 for account baseline IAM role; create ADR-005 for Service Catalog portfolio sharing strategy
… flow
- buildspec-executor.yml / buildspec.yml: default CROSS_ACCOUNT_ROLE=r-inf-terraform;
replace hardcoded role name with ${CROSS_ACCOUNT_ROLE} in sts:AssumeRole block
(interim scaffolding — will be replaced by vault read in CSC-1345)
- deploy/codebuild.tf: add CROSS_ACCOUNT_ROLE env var to executor project
- deploy/iam.tf: StsAssumeRoleCrossAccount allows r-inf-terraform, r-inf-terraform-eks,
sc-automation-codebuild-role (backwards compat)
- lambda/app.py: add TfRunRequest.cross_account_role field (default: r-inf-terraform);
pass CROSS_ACCOUNT_ROLE in CodeBuild env overrides for apply action
- docs/decisions/001-webhook-auto-apply.md: add cross_account_role to schema table
- design-docs/CHECKPOINT.md: update with Vault pivot and CSC-1344 blocked status
Jira: CSC-1344 (Blocked on CSC-1345)
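For reference, the role ARN that the conditional sts:AssumeRole step would target can be derived from TARGET_ACCOUNT_ID and CROSS_ACCOUNT_ROLE as sketched below. The function name is hypothetical. The 12-digit account ID check and the `aws` partition are assumptions, and a GovCloud deployment would use `aws-us-gov` instead.

```python
import re
from typing import Optional

DEFAULT_ROLE = "r-inf-terraform"  # default taken from this commit


def cross_account_role_arn(
    target_account_id: str,
    role: str = DEFAULT_ROLE,
    partition: str = "aws",  # assumption; adjust for other partitions
) -> Optional[str]:
    """Return the role ARN to assume, or None when no target account is set."""
    if not target_account_id:
        return None  # same-account run: skip the assume-role step
    if not re.fullmatch(r"\d{12}", target_account_id):
        raise ValueError(f"invalid account id: {target_account_id!r}")
    return f"arn:{partition}:iam::{target_account_id}:role/{role}"
```

Returning None for an empty account ID mirrors the conditional nature of the step: same-account builds keep the CodeBuild role's own credentials.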
Internal deck covering problem statement, architecture, security benefits (NIST 800-53), government/compliance considerations (BSL 1.1, OpenBao, FIPS 140-2), phased roadmap, and call to action. Jira: CSC-1345 CSC-1346