adding docs
Your Name committed Aug 27, 2025
1 parent 4f26e92 commit 3773704
Showing 5 changed files with 381 additions and 0 deletions.
83 changes: 83 additions & 0 deletions tf-native-v2.md
@@ -0,0 +1,83 @@
# Plan for Migrating to a Terraform-Native GitHub Repository Management Workflow (v2)

This document provides a corrected and more accurate plan to replace the current Python Lambda-based workflow with a Terraform-native approach for creating and managing GitHub repositories.

## 1. Corrected Analysis of the Current State

The current system uses a Python-based AWS Lambda function (`template-automation-lambda`) to automate repository creation. It does **not** use Ansible for repository configuration as previously assumed.

The workflow is as follows:
1. The Lambda function is invoked with details for a new repository (e.g., `project_name`, `template_settings`).
2. The Python script within the Lambda performs a series of GitHub API calls to:
    a. Create a new repository.
    b. Clone the contents of a base template repository into it.
    c. Add a custom `config.json` file.
    d. Assign team permissions.
    e. Create a pull request to merge the initial setup into the `main` branch.

This process, while effective, involves a custom-built Python application, a Lambda deployment, an ECR container registry, and associated IAM roles, making it complex to maintain.

## 2. Proposed Solution: A Purely Terraform-Native Workflow

We will replace the entire Lambda-based system with a declarative Terraform configuration that uses the `terraform-github-repo` module. This module natively supports the actions currently performed by the Python script.

The new process will be:
1. A developer defines a new repository by adding a `module` block to a central Terraform configuration file.
2. Running `terraform apply` performs all the necessary setup steps.
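
As a rough illustration of step 1, a repository definition might look like the sketch below. The module source path and every input name and shape (`name`, `template`, `files`, `teams`) are assumptions made for illustration; the real interface is whatever the `terraform-github-repo` module declares in its `variables.tf`.

```hcl
# Hypothetical root configuration. The source path and all input names
# below are assumptions, not the module's confirmed interface.
module "my_repo" {
  source = "../modules/terraform-github-repo" # assumed local path

  name = "my-repo-name"

  # Assumed shape: owner and name of the base template repository.
  template = {
    owner      = "my-org"
    repository = "base-template"
  }

  # Assumed shape: file path => file content.
  files = {
    "config.json" = jsonencode({
      project_name = "my-project"
    })
  }

  # Assumed shape: team slug => permission level.
  teams = {
    "tf-module-admins" = "admin"
  }
}
```

Each additional repository would be another block of this kind, so the set of managed repositories is visible in one place and reviewed through normal pull requests.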

### Mapping Lambda Actions to Terraform Resources

| Action (Current Python Lambda) | Terraform Equivalent (`terraform-github-repo` module) |
| :--- | :--- |
| 1. Create a new repository. | The `github_repository` resource. The module uses this internally. |
| 2. Clone a template repository. | The `template` block within the `github_repository` resource. This is a native feature for creating a repo from a template. |
| 3. Write a `config.json` file. | The `github_repository_file` resource. The module accepts a `files` variable to manage this. |
| 4. Assign team permissions. | The `github_team_repository` resource. The module has a `teams` input for this. |
| 5. Create a pull request. | The `github_repository_pull_request` resource. This can be defined outside the module to initialize the repository. It needs a head branch that differs from `main`. |
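
For orientation, rows 1 to 4 of the table correspond roughly to the following resources from the `integrations/github` provider. This is a minimal standalone sketch, not the module's actual internals; the organization, repository, and team names are placeholders, and the default branch is assumed to be `main`.

```hcl
terraform {
  required_providers {
    github = {
      source = "integrations/github"
    }
  }
}

# Owner and token are taken from GITHUB_OWNER / GITHUB_TOKEN when not set here.
provider "github" {}

# Rows 1 and 2: create the repository from a template.
resource "github_repository" "this" {
  name       = "my-repo-name"
  visibility = "private"

  template {
    owner      = "my-org"        # placeholder
    repository = "base-template" # placeholder
  }
}

# Row 3: commit config.json (assumes the default branch is "main").
resource "github_repository_file" "config" {
  repository          = github_repository.this.name
  branch              = "main"
  file                = "config.json"
  content             = jsonencode({ project_name = "my-project" })
  commit_message      = "Add config.json"
  overwrite_on_create = true
}

# Row 4: grant a team access to the repository.
data "github_team" "admins" {
  slug = "tf-module-admins"
}

resource "github_team_repository" "admins" {
  team_id    = data.github_team.admins.id
  repository = github_repository.this.name
  permission = "admin"
}
```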

## 3. Detailed Migration Plan

### Phase 1: Scoping and Proof of Concept

1. **Identify All Template Variables:** Document all the key-value pairs that are currently passed into the `template_settings` of the Lambda. These will become variables in our new Terraform module.

2. **Create a Wrapper Module:** Create a new, internal Terraform module that wraps the `terraform-github-repo` module. This wrapper will provide a simplified interface for our developers and contain the logic for creating the initial pull request.

3. **Develop the PoC:** In a test environment, write a Terraform configuration that uses this new wrapper module to create a single, non-critical repository. The configuration should:
    * Use the `template` feature to create the repository from the existing template repo.
    * Use the `files` feature to add the `config.json`.
    * Use the `teams` feature to grant permissions.
    * Define a `github_repository_pull_request` resource to create the initial PR.
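
The pull request part of the PoC could be sketched as follows. In the `integrations/github` provider the pull request resource is named `github_repository_pull_request`, and it needs a head branch containing at least one commit that `main` does not have. The branch name, variable names, and PR text are hypothetical.

```hcl
# Hypothetical inputs for the wrapper module.
variable "repository_name" {
  type = string
}

variable "config_json" {
  description = "Rendered content of config.json."
  type        = string
}

# A working branch for the initial setup (name is an assumption).
resource "github_branch" "initial_setup" {
  repository    = var.repository_name
  branch        = "initial-setup"
  source_branch = "main"
}

# Commit config.json to the working branch so the PR has a diff.
resource "github_repository_file" "config" {
  repository     = var.repository_name
  branch         = github_branch.initial_setup.branch
  file           = "config.json"
  content        = var.config_json
  commit_message = "Add config.json"
}

# Open the PR from the working branch into main.
resource "github_repository_pull_request" "initial_setup" {
  base_repository = var.repository_name
  base_ref        = "main"
  head_ref        = github_branch.initial_setup.branch
  title           = "Initial repository setup"
  body            = "Adds the generated config.json."

  depends_on = [github_repository_file.config]
}
```

The explicit `depends_on` is there because the PR does not reference the file resource directly, so Terraform would otherwise be free to open the PR before the commit exists.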

### Phase 2: Implementation and Import

1. **Build Out the Full Configuration:** Create a new Git repository to house the Terraform configuration for all repositories that will be managed this way.

2. **Import Existing Repositories:** For repositories previously created by the Lambda, use `terraform import` to bring them under Terraform's state management. Without this step, Terraform would try to create repositories that already exist and the apply would fail. Quote addresses that contain brackets so the shell does not interpret them, and note that the import ID for `github_team_repository` is `<team>:<repository>`.
    * `terraform import 'module.my_repo.github_repository.this[0]' my-repo-name`
    * `terraform import 'module.my_repo.github_team_repository.teams["tf-module-admins"]' tf-module-admins:my-repo-name`

3. **Parallel Run:** For a transition period, both systems can coexist. New repositories should be created using the Terraform method, while the Lambda is left in place to manage older ones if needed.
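
As an alternative to running the import commands from step 2 by hand, the same imports can be declared in configuration with `import` blocks, which require Terraform 1.5 or later. The resource addresses are the ones used in this plan and depend on how the module names its resources internally; the provider documents the team import ID as `<team id or slug>:<repository>`.

```hcl
# Requires Terraform 1.5 or later. Addresses assume the module's internal
# resource names are "this" (with count) and "teams" (with for_each).
import {
  to = module.my_repo.github_repository.this[0]
  id = "my-repo-name"
}

import {
  to = module.my_repo.github_team_repository.teams["tf-module-admins"]
  id = "tf-module-admins:my-repo-name"
}
```

With this approach the imports show up in `terraform plan` and go through code review like any other change. The blocks can be deleted once the apply has succeeded.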

### Phase 3: Testing and Validation

1. **Dry Run with `terraform plan`:** Before applying any changes to a production repository, run `terraform plan` and carefully review the output to ensure it matches expectations and doesn't plan any destructive changes.

2. **Full Application:** Once validated, apply the configuration to manage all target repositories.

### Phase 4: Decommissioning

Once all repositories are successfully managed by Terraform and the new workflow is stable, the old infrastructure can be safely removed.

1. **Disable the Lambda Trigger:** The first step is to disable the mechanism that invokes the Lambda function.
2. **Delete the Lambda Function:** Remove the `template-automation-lambda` function from AWS.
3. **Delete the ECR Repository:** Delete the ECR repository holding the Lambda's container images.
4. **Delete the Deployment Pipeline:** Remove the `template-repos-lambda-deployment` Terraform configuration and state.
5. **Archive Old Repositories:** Archive the `template-automation-lambda` and `template-repos-lambda-deployment` Git repositories to mark them as deprecated.

## 4. Rollback Plan

If issues arise, we can revert by:
1. Removing the problematic repository from Terraform's state using `terraform state rm`.
2. Re-enabling or re-deploying the Lambda function to take over management again.
3. Manually correcting any unintended changes made by Terraform.
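
Step 1 can also be expressed declaratively. The sketch below uses a `removed` block, available in Terraform 1.7 and later, as a hedged alternative to `terraform state rm`; `module.my_repo` is the placeholder name used earlier in this plan.

```hcl
# Requires Terraform 1.7 or later. Delete the corresponding
# module "my_repo" block from the configuration before applying.
removed {
  from = module.my_repo

  lifecycle {
    # Forget the resources in state without deleting them on GitHub.
    destroy = false
  }
}
```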
75 changes: 75 additions & 0 deletions tf-native-v3.md
@@ -0,0 +1,75 @@
# Plan for Migrating to a Terraform-Native EKS Deployment Workflow (v3)

This document outlines the plan to replace the current Lambda/Ansible-based system with a streamlined, Terraform-native workflow for creating and configuring repositories for EKS deployments.

## 1. Analysis of the Current State

The current process for provisioning a new EKS cluster repository involves multiple loosely coupled components:

1. **`template-automation-lambda`**: A Python Lambda function that creates a new GitHub repository from the `template-eks-cluster` template. It clones the template, adds a `config.json` file with user-provided settings, and opens a pull request.
2. **`generate_hcl_files.yml`**: An Ansible playbook inside the newly created repository that is run manually after the initial PR is merged. It reads the `config.json` and generates a set of Terragrunt HCL files (`root.hcl`, `account.hcl`, `region.hcl`, etc.).
3. **`terraform-eks-deployment`**: A Terraform module that is referenced by the generated Terragrunt configuration to deploy the actual EKS cluster.

This workflow is complex, involves manual steps, and relies on a mix of technologies (Python, Lambda, Ansible, Terraform).

## 2. Proposed Solution: A Unified, Terraform-Native Workflow

We will create a single, unified Terraform workflow that handles the entire process of repository creation and configuration declaratively. This eliminates the need for the Lambda function and the Ansible playbook.

The new process will be:
1. A developer defines a new EKS cluster by adding a single `module` block to a central Terraform configuration.
2. Running `terraform apply` will automatically:
    a. Create a new GitHub repository.
    b. Generate and commit all the necessary Terragrunt HCL files and `README.md`.
    c. Configure team permissions for the repository.
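
A cluster definition under this workflow might look like the sketch below. The module does not exist yet, so the source path, the `repository_name` input, and all values are hypothetical; `environment`, `region`, `cluster_name`, and `teams` follow the names used elsewhere in this plan.

```hcl
# Hypothetical call to the planned wrapper module.
module "payments_dev_cluster" {
  source = "../modules/terragrunt-eks-repo" # assumed path

  repository_name = "eks-payments-dev" # assumed input name

  # Values that the Ansible playbook currently reads from config.json.
  environment  = "dev"
  region       = "us-east-1"
  cluster_name = "payments-dev"

  # Assumed shape: team slug => permission level.
  teams = {
    "tf-module-admins" = "admin"
  }
}
```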

### Core Component: The New `terragrunt-eks-repo` Wrapper Module

The centerpiece of this new workflow is a new Terraform module, `terragrunt-eks-repo`. This module will be responsible for all the setup logic.

| Action (Old Workflow) | Terraform Equivalent (New `terragrunt-eks-repo` Module) |
| :--- | :--- |
| 1. Create a new repository from a template. | The module will call the `terraform-github-repo` module internally, using its `template` feature to clone from `template-eks-cluster`. |
| 2. Generate HCL files from `config.json`. | The module will contain HCL templates (`.tf.tpl` files). It will use Terraform's `templatefile()` function to render the final HCL content directly from its input variables. |
| 3. Write files to the repository. | The rendered file content will be passed to the `files` input of the underlying `terraform-github-repo` module, which uses the `github_repository_file` resource to commit them. |
| 4. Assign team permissions. | The module will accept a `teams` variable and pass it to the `terraform-github-repo` module to configure permissions using the `github_team_repository` resource. |
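
To make row 2 concrete, a template for `region.hcl` might look like the following. The file content is invented for illustration and is not taken from the existing playbook. Single-dollar placeholders are filled in by `templatefile()`, while the doubled dollar sign escapes a sequence that Terragrunt should evaluate later.

```hcl
# templates/region.hcl.tf.tpl (hypothetical content)
# Placeholders with a single dollar sign are replaced at render time.
locals {
  aws_region  = "${region}"
  environment = "${environment}"

  # The doubled dollar sign is emitted as a single one, so Terragrunt
  # receives the function call unevaluated.
  state_key = "$${path_relative_to_include()}/terraform.tfstate"
}
```

Comments inside a template are rendered too, so they should avoid interpolation sequences unless substitution is intended.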

## 3. Detailed Migration Plan

### Phase 1: Develop the `terragrunt-eks-repo` Module

1. **Create Module Scaffolding:** Create a new directory for the `terragrunt-eks-repo` module.

2. **Define Input Variables:** Create a `variables.tf` file. The variables will be derived directly from the `generate_hcl_files.yml` playbook's `config` object (e.g., `environment`, `region`, `cluster_name`, `account`, `vpc`, etc.).

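3. **Create HCL Templates:** Create a `templates` directory within the module. For each file generated by the Ansible playbook (`root.hcl`, `account.hcl`, `region.hcl`, `vpc.hcl`, `cluster.hcl`, and `README.md`), create a corresponding `.tf.tpl` template file. Convert Jinja2 `{{ ... }}` expressions to Terraform's `${...}` interpolation syntax and Jinja2 control blocks to `%{ if ... }` / `%{ for ... }` directives. Any literal `${...}` that Terragrunt must evaluate later has to be escaped as `$${...}` so that `templatefile()` leaves it intact.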

4. **Implement Module Logic (`main.tf`):**
    * Use `locals` to render the file content for each template using the `templatefile()` function.
    * Call the `terraform-github-repo` module.
    * Pass the repository name, template configuration, and team permissions to the module.
    * Map the rendered local variables to the `files` input of the `terraform-github-repo` module. This will instruct it to create the files in the new repository.
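
Steps 2 and 4 could be sketched as below. Only a subset of variables and templates is shown, and the `terraform-github-repo` input names (`name`, `template`, `files`, `teams`) and source path are assumptions about that module's interface.

```hcl
# variables.tf (subset; names follow the playbook's config keys)
variable "repository_name" {
  type = string
}

variable "environment" {
  type = string
}

variable "region" {
  type = string
}

variable "cluster_name" {
  type = string
}

variable "teams" {
  description = "Assumed shape: team slug => permission level."
  type        = map(string)
  default     = {}
}

# main.tf
locals {
  template_vars = {
    environment  = var.environment
    region       = var.region
    cluster_name = var.cluster_name
  }

  # Render every template into a map of file path => content.
  rendered_files = {
    for name in ["root.hcl", "region.hcl", "cluster.hcl", "README.md"] :
    name => templatefile("${path.module}/templates/${name}.tf.tpl", local.template_vars)
  }
}

module "github_repo" {
  source = "../terraform-github-repo" # assumed path

  name     = var.repository_name
  template = { owner = "my-org", repository = "template-eks-cluster" }
  files    = local.rendered_files
  teams    = var.teams
}
```

Rendering through a single `for` expression keeps the file list in one place; adding a generated file means adding a template and one entry in the list.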

### Phase 2: Implementation and Onboarding

1. **Create a Central Management Repository:** Set up a new Git repository (e.g., `terragrunt-environments`) that will contain the Terraform configuration for creating all new EKS cluster repositories.

2. **Onboard a Pilot Project:** In the new management repository, add a `main.tf` file. Add a module block that calls the newly created `terragrunt-eks-repo` module to provision a repository for a new test cluster.

3. **Execute and Validate:** Run `terraform apply` to create the repository. Verify that:
    * The repository is created on GitHub.
    * It is correctly initialized from the `template-eks-cluster` template.
    * All the Terragrunt HCL files and the `README.md` are present and correctly populated with the variable values.
    * Team permissions are correctly assigned.

### Phase 3: Decommissioning the Old Workflow

Since we are not concerned with migrating existing repositories, the decommissioning process is straightforward. Once the new workflow is validated and adopted for all new cluster provisioning:

1. **Disable the Lambda Function:** The Lambda trigger can be disabled in AWS.
2. **Archive Old Repositories:** The `template-automation-lambda` and `template-repos-lambda-deployment` Git repositories should be archived to prevent further use.
3. **Delete AWS Resources:** The old AWS resources (Lambda function, ECR repository, IAM roles) can be deleted via Terraform from the `template-repos-lambda-deployment` project.

## 4. Rollback Plan

As we are not migrating existing resources, a rollback is not applicable in the traditional sense. If the new workflow fails for a new repository, the state can be destroyed (`terraform destroy`), the module can be fixed, and the process can be re-run. The old Lambda-based system can be temporarily kept available for emergency use until the new workflow is fully proven.
79 changes: 79 additions & 0 deletions tf-native-v4.md
@@ -0,0 +1,79 @@
# Plan for Migrating to a Terraform-Native EKS Deployment Workflow (v4)

This document outlines the plan to replace the current Lambda/Ansible-based system with a streamlined, Terraform-native workflow by enhancing the `terraform-eks-deployment` module itself.

## 1. Analysis of the Current State

The current process for provisioning a new EKS cluster repository involves multiple components:

1. **`template-automation-lambda`**: A Python Lambda function that creates a new GitHub repository from the `template-eks-cluster` template.
2. **`generate_hcl_files.yml`**: An Ansible playbook inside the new repository that is run manually to generate a set of Terragrunt HCL files (`root.hcl`, `account.hcl`, etc.).
3. **`terraform-eks-deployment`**: The Terraform module that is referenced by the generated Terragrunt configuration to deploy the actual EKS cluster.

This workflow is complex, involves manual steps, and relies on a mix of technologies.

## 2. Proposed Solution: A Unified, All-in-One EKS Deployment Module

We will consolidate the entire workflow into the `terraform-eks-deployment` module. This module will be enhanced to handle not only the EKS deployment but also the initial GitHub repository creation and configuration. This eliminates the need for the Lambda function and the Ansible playbook.

The new, unified process will be:
1. A developer defines a new EKS cluster by adding a single `module "eks_deployment"` block to a central Terraform configuration.
2. By setting `create_repository = true`, the developer instructs the module to perform the initial setup.
3. Running `terraform apply` will automatically:
    a. Create a new GitHub repository using the `terraform-github-repo` module as a submodule.
    b. Generate and commit all the necessary Terragrunt HCL files and a `README.md`.
    c. Configure team permissions for the repository.

The same module, when referenced from within the newly created repository's Terragrunt files, will have `create_repository = false` and will proceed with deploying the EKS cluster as it does today.
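
A bootstrap call in the central configuration might then look like this sketch. The source path and all values are hypothetical; `create_repository`, `repository_name`, and `repository_teams` are the variables this plan proposes to add.

```hcl
# Hypothetical bootstrap call from the central configuration.
module "eks_deployment" {
  source = "../modules/terraform-eks-deployment" # assumed path

  # Run the repository creation logic for this cluster.
  create_repository = true
  repository_name   = "eks-payments-dev"
  repository_teams = {
    "tf-module-admins" = "admin" # assumed shape: team slug => permission
  }

  environment  = "dev"
  region       = "us-east-1"
  cluster_name = "payments-dev"
}
```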

### Core Component: The Enhanced `terraform-eks-deployment` Module

| Action (Old Workflow) | Terraform Equivalent (Inside `terraform-eks-deployment`) |
| :--- | :--- |
| 1. Create a new repository from a template. | A new submodule block calling `terraform-github-repo` will be added, controlled by a `create_repository` flag. |
| 2. Generate HCL files from `config.json`. | The module will contain a new `templates` directory with HCL templates (`.tf.tpl`). It will use `templatefile()` to render the final HCL content from its input variables. |
| 3. Write files to the repository. | The rendered file content will be passed to the `files` input of the `terraform-github-repo` submodule. |
| 4. Assign team permissions. | The module will accept a `teams` variable and pass it to the `terraform-github-repo` submodule. |

## 3. Detailed Migration Plan

### Phase 1: Enhance the `terraform-eks-deployment` Module

1. **Add Input Variables:** In `variables.tf`, add new variables:
    * `create_repository`: A boolean to control whether to execute the repository creation logic. Default to `false`.
    * `repository_name`: The name of the GitHub repository to create.
    * `repository_teams`: A map of teams and their permissions for the new repository.
    * Variables derived from the `generate_hcl_files.yml` playbook's `config` object (e.g., `environment`, `region`, `cluster_name`, `account`, `vpc`, etc.).

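2. **Create HCL Templates:** Create a `templates` directory within the module. For each file generated by the Ansible playbook (`root.hcl`, `account.hcl`, `region.hcl`, `vpc.hcl`, `cluster.hcl`, and `README.md`), create a corresponding `.tf.tpl` template file. Convert Jinja2 `{{ ... }}` expressions to Terraform's `${...}` interpolation syntax and Jinja2 control blocks to `%{ if ... }` / `%{ for ... }` directives. Any literal `${...}` that Terragrunt must evaluate later has to be escaped as `$${...}` so that `templatefile()` leaves it intact.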

3. **Implement Module Logic (`main.tf`):**
    * Use `locals` to render the file content for each template using the `templatefile()` function.
    * Add a `module "github_repo"` block that calls the `terraform-github-repo` module.
    * Set the `count` of this submodule to `var.create_repository ? 1 : 0` (module-level `count` requires Terraform 0.13 or later).
    * Pass the repository name, template configuration, and team permissions to the submodule.
    * Map the rendered local variables to the `files` input of the `github_repo` submodule.
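
The flag and the conditional submodule from steps 1 and 3 could be sketched as follows. The submodule's source path and input names (`name`, `files`, `teams`) are assumptions, and only one template is rendered to keep the example short.

```hcl
variable "create_repository" {
  description = "Whether to execute the repository creation logic."
  type        = bool
  default     = false
}

variable "repository_name" {
  description = "Name of the GitHub repository to create."
  type        = string
  default     = null
}

variable "repository_teams" {
  description = "Assumed shape: team slug => permission level."
  type        = map(string)
  default     = {}
}

variable "cluster_name" {
  type = string
}

locals {
  # Render templates only when the repository is being created, so the
  # normal EKS deployment path does not depend on the template files.
  rendered_files = var.create_repository ? {
    "cluster.hcl" = templatefile("${path.module}/templates/cluster.hcl.tf.tpl", {
      cluster_name = var.cluster_name
    })
  } : {}
}

module "github_repo" {
  source = "../terraform-github-repo" # assumed path
  count  = var.create_repository ? 1 : 0

  name  = var.repository_name
  files = local.rendered_files
  teams = var.repository_teams
}
```

Because the submodule uses `count`, any reference to it elsewhere in the module has to be indexed, for example `module.github_repo[0]`, and guarded for the case where the count is zero.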

### Phase 2: Implementation and Onboarding

1. **Create a Central Management Repository:** Set up a new Git repository (e.g., `terragrunt-environments`) that will contain the Terraform configuration for creating all new EKS cluster repositories.

2. **Onboard a Pilot Project:** In the new management repository, add a `main.tf` file. Add a module block that calls the enhanced `terraform-eks-deployment` module with `create_repository = true` to provision a repository for a new test cluster.

3. **Execute and Validate:** Run `terraform apply` to create the repository. Verify that:
    * The repository is created on GitHub.
    * It is correctly initialized from the `template-eks-cluster` template.
    * All the Terragrunt HCL files and the `README.md` are present and correctly populated.
    * Team permissions are correctly assigned.

### Phase 3: Decommissioning the Old Workflow

Since we are not migrating existing repositories, the decommissioning process is straightforward.

1. **Disable the Lambda Function:** The Lambda trigger can be disabled in AWS.
2. **Archive Old Repositories:** The `template-automation-lambda` and `template-repos-lambda-deployment` Git repositories should be archived.
3. **Delete AWS Resources:** The old AWS resources (Lambda, ECR, IAM roles) can be deleted.

## 4. Rollback Plan

A rollback is not applicable in the traditional sense. If the new workflow fails, the state can be destroyed (`terraform destroy`), the module can be fixed, and the process can be re-run. The old Lambda-based system can be kept available for emergency use.