
EKS 1.28 with doc and EKS 1.25 doc update #11

Merged
merged 8 commits into from
Jan 10, 2024
54 changes: 14 additions & 40 deletions examples/full-cluster-tf-upgrade/1.25/README.md
@@ -8,17 +8,20 @@ There are a number of steps to end up with a cluster.

1. From main repository, in the same `vpc/{region}/vpc{number}` directory
1. [Tag subnets](#subnet-tagging) in main repository (before creating nodegroup)
1. [Copy variables.vpc.*](#copy-variable-settings) from main repository in the same `vpc/{region}/vpc{number}`
1. Copy the [includes.d structure](#copy-includesd)
1. In the submodule repository, in the `vpc/{region}/vpc{number}/apps/{clustername}` directory
1. Update `settings.auto.tfvars`
1. Update `includes.d/parent_rs.tf`
1. Terraform [Automated Setup](#terraform-automated-setup)
1. Optionally, follow the Terraform Setup-by-Step, which is essentially the same as following the automated tf-run.data
1. Initialize [Cluster Main](#initialize-cluster-main) directory
1. Create [policies](#policies)
1. Create [EC2 Keypair](#keypair-creation)
1. Finish [cluster setup](#cluster-creation)
1. Setup [aws-auth](#setup-aws-auth)
1. Setup [EFS](#setup-efs)
1. Setup [aws-auth](#setup-aws-auth)
1. Setup [EFS](#setup-efs)
1. Setup [Common Services](#common-services)
1. [Access to the cluster](#access-to-the-cluster)
1. [Cluster Setup](#cluster-setup)

## Post-Setup Tasks

@@ -56,38 +59,6 @@ For creating a service which uses load balancers (ELB, ALB, or NLB), the last ta
to the subnet(s) for load balancing. A separate set of subnets exists for load balancing, with a name including `private-lb`.


## Copy Variable Settings when in a submodule repo

We need the `variables.vpc.tf` and `variables.vpc.auto.tfvars` from the main repository. These are not to be modified in
this submodule.

```shell
cd MAIN-REPOSITORY
MAINTOP=$(git rev-parse --show-toplevel)
cd applications/{APPNAME}
cd vpc/{region}/vpc{number}
for f in $(ls $MAINTOP/vpc/{region}/vpc{number}/variables.vpc*)
do
cp $f ./
done
```

Replace {region} and {number} and {APPNAME} with the correct values.

## Copy includes.d when in a submodule repo

This makes a copy of the entire `MAIN/includes.d` structure in the submodule, for use as soft links to bring in
application variables for tagging.

```shell
cd MAIN-REPOSITORY
MAINTOP=$(git rev-parse --show-toplevel)
cd applications/{APPNAME}
rsync -avRWH $MAINTOP/./includes.d ./
```

Replace {APPNAME} with the correct value.

## Update the settings.auto.tfvars file

Set the appropriate values in the `settings.auto.tfvars` file. An example starter file is at `settings.auto.tfvars.example`.
@@ -103,25 +74,28 @@ ns-0.awsdns-us-gov-00.com. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 8640
Here is a sample file:

```hcl
cluster_name = "org-project-env
cluster_name = "org-project-env"
cluster_version = "1.25"
region = "us-gov-east-1"
domain = "org-project-env.env.domain.census.gov"
eks_instance_disk_size = 40
eks_vpc_name = "*vpcshortname*"
eks_vpc_name = "vpc_full_name"
eks_instance_type = "t3.xlarge"
eks_ng_desire_size = 3
eks_ng_max_size = 15
eks_ng_min_size = 3
subnets_name = "*-subnet_label-*"
```

You need to change these values:

* cluster_name: put in the proper org, project, and environment. Cluster names should not be replicated across the environment.
These are tracked in the repo cloud-information/aws/documentation/containers/ (fix link).
These are tracked in the repo [cloud-information/aws/documentation/containers/](https://github.e.it.census.gov/terraform/cloud-information/blob/master/documentation/dns.md).
* region: include the correct region. This really is a duplicate of the `region` variable, so it may be removed in the future.
* domain: this is the domain name of the cluster, consisting of the cluster name and the proper domain name for the environment/VPC.
* eks_vpc_name: replace *vpcshortname* with the appropriate vpc name. This is used to find the vpc ID. This will be fixed at a later date.
* eks_instance_disk_size: this should default to 40 GB for most use cases; only change it if you have a special requirement and exception approval.
* eks_vpc_name: replace *vpc_full_name* with the appropriate vpc full name. This is used to find the vpc ID.
* subnets_name: replace *subnet_label* with the label of the subnets allocated to providing ENIs for the cluster node group and containers; this is often `container` or `task`.

All the others are subject to your configuration. They are a good starting point, but can vary.
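
As a hedged illustration of what the `eks_vpc_name` and `subnets_name` settings are for, the sketch below looks up a VPC and its subnets by their `Name` tags. The data source labels (`eks`), the output name, and the assumption that the lookup matches on the `Name` tag are illustrative only; the actual lookup used in this repository is not shown in the diff.

```hcl
# Assumption: both values are matched against the "Name" tag.
variable "eks_vpc_name" {
  type        = string
  description = "Full VPC name, used to find the VPC ID"
}

variable "subnets_name" {
  type        = string
  description = "Subnet name pattern, e.g. \"*-container-*\""
}

# Find the VPC whose Name tag equals eks_vpc_name.
data "aws_vpc" "eks" {
  filter {
    name   = "tag:Name"
    values = [var.eks_vpc_name]
  }
}

# Find the subnets in that VPC whose Name tag matches the pattern.
# EC2 filters accept "*" as a wildcard, which is why the sample
# value is written as "*-subnet_label-*".
data "aws_subnets" "eks" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.eks.id]
  }

  filter {
    name   = "tag:Name"
    values = [var.subnets_name]
  }
}

# Hypothetical output, for inspecting the result of the lookup.
output "eks_subnet_ids" {
  value = data.aws_subnets.eks.ids
}
```

With a lookup of this shape, a `subnets_name` that matches nothing yields an empty ID list rather than an error, so it is worth checking the output before creating the node group.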

@@ -49,3 +49,7 @@ module.cert
module.cert
ALL
ALL

COMMENT cd cluster-autoscaler and tf-run.sh apply
COMMENT come back to this directory
COMMENT cd cloudwatch-agent and tf-run.sh apply
2 changes: 1 addition & 1 deletion examples/full-cluster-tf-upgrade/1.25/efs/ecr.tf
@@ -1,4 +1,4 @@
local {
locals {
ecr_mapping_default = "602401143452"
ecr_mapping = {
"us-gov-east-1" = "151742754352"
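
The hunk above is cut off, so how `ecr_mapping` is consumed is not visible. A minimal sketch of one plausible use, assuming a `region` variable and two hypothetical locals (`ecr_account`, `ecr_registry`) that are not part of the diff:

```hcl
variable "region" {
  type = string
}

locals {
  # Values copied from the hunk above.
  ecr_mapping_default = "602401143452"
  ecr_mapping = {
    "us-gov-east-1" = "151742754352"
  }

  # Hypothetical: pick the per-region account ID, falling back to the default.
  ecr_account = lookup(local.ecr_mapping, var.region, local.ecr_mapping_default)

  # Hypothetical: build the registry hostname for image references.
  ecr_registry = format("%s.dkr.ecr.%s.amazonaws.com", local.ecr_account, var.region)
}
```

The block keyword has to be `locals`; Terraform rejects `local { ... }` as an unsupported block type, which is what the one-line change in this file corrects.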
2 changes: 0 additions & 2 deletions examples/full-cluster-tf-upgrade/1.25/irsa-roles/tf-run.data
@@ -11,5 +11,3 @@ LINK variables.application_tags.auto.tfvars

ALL
COMMAND tf-directory-setup.py -l s3

COMMENT cd cluster-autoscaler and tf-run.sh apply
23 changes: 12 additions & 11 deletions examples/full-cluster-tf-upgrade/1.25/main.tf
@@ -50,17 +50,18 @@ locals {

# The log group name format is /aws/eks/<cluster-name>/cluster
# Reference: https://docs.aws.amazon.com/eks/latest/userguide/control-plane-logs.html
resource "aws_cloudwatch_log_group" "eks_cluster" {
name = format("/aws/eks/%v/cluster", var.cluster_name)
retention_in_days = 180

tags = merge(
local.base_tags,
local.common_tags,
var.tags,
var.application_tags,
)
}
# Obsolete: EKS cluster automatically creates the Cloudwatch log group
#resource "aws_cloudwatch_log_group" "eks_cluster" {
# name = format("/aws/eks/%v/cluster", var.cluster_name)
# retention_in_days = 180
#
# tags = merge(
# local.base_tags,
# local.common_tags,
# var.tags,
# var.application_tags,
# )
#}
Comment on lines +53 to +64

This is not obsolete. Put it back. This is the only way you can put in a retention period via code.
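
A minimal sketch of the pattern the reviewer describes, assuming the cluster is declared directly as an `aws_eks_cluster` resource (the real `main.tf` may use a module instead). The resource label `example` and the variables `cluster_role_arn` and `subnet_ids` are hypothetical, and tags are omitted for brevity.

```hcl
variable "cluster_name" {
  type = string
}

# Hypothetical inputs, not taken from the diff.
variable "cluster_role_arn" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

# Managing the log group in code is what lets retention be set.
# The name must follow the format EKS expects.
resource "aws_cloudwatch_log_group" "eks_cluster" {
  name              = format("/aws/eks/%v/cluster", var.cluster_name)
  retention_in_days = 180
}

resource "aws_eks_cluster" "example" {
  name     = var.cluster_name
  role_arn = var.cluster_role_arn

  # Control plane logging writes into the log group above.
  enabled_cluster_log_types = ["api", "audit"]

  vpc_config {
    subnet_ids = var.subnet_ids
  }

  # Create the log group first so EKS does not create its own.
  depends_on = [aws_cloudwatch_log_group.eks_cluster]
}
```

If the log group does not exist when control plane logging is enabled, EKS creates it itself with no expiry, which is why both the explicit resource and the `depends_on` ordering matter.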


# we changed endpoint_public_access to false by default. This is so we can reach the EKS API through private IPs
# from on-prem and from the cloud. Otherwise, another account outside of where this is created will be unable to
@@ -12,3 +12,4 @@ eks_instance_type = "t3.xlarge"
eks_ng_desire_size = 3
eks_ng_max_size = 15
eks_ng_min_size = 3
subnets_name = "*-{subnet_label}-*"
8 changes: 6 additions & 2 deletions examples/full-cluster-tf-upgrade/1.25/tf-run.data
@@ -55,20 +55,24 @@ STOP Once applied in this subdirectory, come back here and continue with step %%

TAG setup-efs
COMMENT cd efs and tf-run.sh apply
STOP Once applied in this subdirectory, come back here and continue with step %%NEXT%% (tag:setup-ebs)
STOP Once applied in this subdirectory, come back here and continue with step %%NEXT%% (tag:setup-addons)

TAG setup-addons
COMMENT cd addons and tf-run.sh apply
STOP Once applied in this subdirectory, come back here and continue with step %%NEXT%% (tag:setup-irsa)

TAG setup-irsa
COMMENT cd irsa-roles and tf-run.sh apply
COMMENT Note: irsa-roles has other subdirectories to be applied, follow the directions from tf-run there
STOP Once applied in this subdirectory, come back here and continue with step %%NEXT%% (tag:setup-common-services)

TAG setup-common-services
COMMENT cd common-services and tf-run.sh apply
COMMENT Notes: this subdirectory is complicated, and it has a certificate step which is manual
COMMENT Notes: common-services also has other subdirectories to be applied, follow the directions from tf-run there
STOP Once applied in this subdirectory, come back here and continue with step %%NEXT%% (tag:setup-cluster-roles)

TAG setup-cluster-roles
COMMENT cd cluster-roles and tf-run.sh apply
STOP Once applied in this subdirectory, come back here and continue with step %%NEXT%% (tag:complete)

TAG complete
5 changes: 5 additions & 0 deletions examples/full-cluster-tf-upgrade/1.28/.gitignore
@@ -0,0 +1,5 @@
kube.config
ecr-login.txt
setup/ec2-ssh-eks-*
!setup/ec2-ssh-eks-*.pub
logs
20 changes: 20 additions & 0 deletions examples/full-cluster-tf-upgrade/1.28/.tf-control
@@ -0,0 +1,20 @@
# .tf-control
# allows for setting a specific command to be used for tf-* commands under this git repo
# see tf-control.sh help for more info

TFCONTROL_VERSION="1.0.5"

TFCOMMAND="terraform_latest"
# TF_CLI_CONFIG_FILE=PATH-TO-FILE/.tf-control.tfrc
# TFARGS=""
# TFNOLOG=""
# TFNOCOLOR=""

# Use the following to force a specific version. Upgrading an existing 0.12.31 configuration to 1.x
# requires cycling through 0.13.7, 0.14.11, and then latest (0.15.5 is not needed), with other
# steps in between. See https://github.e.it.census.gov/terraform/support/tree/master/docs/how-to/terraform-upgrade for details
#
#TFCOMMAND="terraform_0.12.31"
#TFCOMMAND="terraform_0.13.7"
#TFCOMMAND="terraform_0.14.11"
#TFCOMMAND="terraform_0.15.5"
24 changes: 24 additions & 0 deletions examples/full-cluster-tf-upgrade/1.28/.tf-control.tfrc
@@ -0,0 +1,24 @@
TFCONTROL_VERSION="1.0.5"

# https://www.terraform.io/docs/cli/config/config-file.html
plugin_cache_dir = "/data/terraform/terraform.d/plugin-cache"
#disable_checkpoint = true

provider_installation {
# filesystem_mirror {
# path = "/apps/terraform/terraform.d/providers"
# include = [ "*/*/*" ]
# }
filesystem_mirror {
path = "/data/terraform/terraform.d/providers"
include = [ "*/*/*" ]
}
# filesystem_mirror {
# path = "/apps/terraform/terraform.d/providers"
# include = [ "external.terraform.census.gov/*/*" ]
# }
direct {
include = [ "*/*/*" ]
}
}
