2.4.1: add examples/extras/schedule-asg and README
badra001 committed Feb 3, 2025
1 parent 84dec7a commit 99bf5d8
Showing 5 changed files with 286 additions and 0 deletions.
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -76,3 +76,6 @@
* 2.4.0 -- 2025-01-02
  - 1.31:
    - create new 1.31, updated charts and images

* 2.4.1 -- 2025-02-03
  - add examples/extras/schedule-asg and README
100 changes: 100 additions & 0 deletions examples/extras/schedule-asg/README.schedule-asg.md
@@ -0,0 +1,100 @@
# EKS Scheduling

To save money, you can scale a cluster down to 0 worker nodes by adding scheduled autoscaling actions to the cluster's
worker node group. Note that the control plane still exists and you are still charged for it, but that costs far less
than running idle worker nodes.

There are three things you can do here:

1. Shut down a cluster now (in practice, 5 minutes from now)
1. Set a one-time action to stop a cluster and a one-time action to start it
1. Set a recurring schedule based on cron expressions (see Links)

Please note that the shutdown-now and one-time options create scheduled actions that disappear once they have run,
even though they still show as being in Terraform state. After using either of these, and after the actions have fired,
we recommend setting the corresponding values back to `false` and applying with a refresh: `tf-apply -refresh-only=true`.
This brings the state file back in line with reality.
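
As a rough sketch of that two-step flow for the shutdown-now case, assuming the variable names used in `variables.schedule-asg.auto.tfvars`, the edits might look like this:

```hcl
# Step 1: request an immediate shutdown, then apply.
# The scheduled action fires about 5 minutes after the apply.
asg_shutdown_now_enabled = true

# Step 2 (after the nodes have scaled to 0): set the flag back to false
# and run a refresh-only apply so the state matches what exists in AWS.
# asg_shutdown_now_enabled = false
```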

The cron-based actions use EST5EDT as the timezone. To change that, set the variable `asg_timezone`
in `variables.schedule-asg.auto.tfvars`.
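
For illustration, a timezone override could look like the following. The value `America/Chicago` is only an example and not part of the original files; the variable is passed to the `time_zone` argument of `aws_autoscaling_schedule`, which expects an IANA timezone name.

```hcl
# Hypothetical override: evaluate the cron schedules in US Central time
asg_timezone = "America/Chicago"
```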

# Links

* [CRON expressions](https://docs.aws.amazon.com/autoscaling/application/userguide/scheduled-scaling-using-cron-expressions.html)

# Setup

Copy the most recent files from the `tf-upgrade` branch of the `aws-eks` [module](https://github.e.it.census.gov/terraform-modules/aws-eks/tree/tf-upgrade/examples/extras/schedule-asg)
into the main EKS directory (`eks-{cluster_name}`).

Update the file `variables.schedule-asg.auto.tfvars` with the appropriate values. An example:

```hcl
# asg_notification_groups = ["valid-email-list@census.gov"]
asg_shutdown_now_enabled = false
asg_onetime_enabled = false
asg_startup_time = null # "2024-12-31T19:00:00Z" # 14:00 EST
asg_shutdown_time = null # "2024-12-31T23:00:00Z" # 18:00 EST
asg_schedule_enabled = false
asg_startup_schedule = null # starts at 6am mon-fri: "0 6 * * MON-FRI"
asg_shutdown_schedule = null # stops at 9pm mon-fri: "0 21 * * MON-FRI"
```
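
As an illustration, a recurring weekday schedule could be enabled by filling in the cron values from the comments above. This is a sketch rather than a recommended configuration; the times are interpreted in `asg_timezone` (EST5EDT by default).

```hcl
asg_schedule_enabled  = true
asg_startup_schedule  = "0 6 * * MON-FRI"  # scale up at 06:00, Monday to Friday
asg_shutdown_schedule = "0 21 * * MON-FRI" # scale to 0 at 21:00, Monday to Friday
```

The one-time variables (`asg_startup_time`, `asg_shutdown_time`) can stay `null` in this case, since their resources are only created when `asg_onetime_enabled` is `true`.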

## Requirements

No requirements.

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | n/a |
| <a name="provider_time"></a> [time](#provider\_time) | n/a |

## Modules

No modules.

## Resources

| Name | Type |
|------|------|
| [aws_autoscaling_notification.notification](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/autoscaling_notification) | resource |
| [aws_autoscaling_schedule.now_shutdown](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/autoscaling_schedule) | resource |
| [aws_autoscaling_schedule.onetime_shutdown](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/autoscaling_schedule) | resource |
| [aws_autoscaling_schedule.onetime_startup](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/autoscaling_schedule) | resource |
| [aws_autoscaling_schedule.schedule_shutdown](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/autoscaling_schedule) | resource |
| [aws_autoscaling_schedule.schedule_startup](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/autoscaling_schedule) | resource |
| [aws_sns_topic.notification](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/sns_topic) | resource |
| [aws_sns_topic_subscription.notification](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/sns_topic_subscription) | resource |
| [time_offset.now_shutdown](https://registry.terraform.io/providers/hashicorp/time/latest/docs/resources/offset) | resource |
| [aws_autoscaling_group.cluster](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/autoscaling_group) | data source |
| [aws_autoscaling_groups.cluster](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/autoscaling_groups) | data source |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_asg_notification_groups"></a> [asg\_notification\_groups](#input\_asg\_notification\_groups) | List of email addresses to receive notification from SNS topic | `list(string)` | `[]` | no |
| <a name="input_asg_onetime_enabled"></a> [asg\_onetime\_enabled](#input\_asg\_onetime\_enabled) | Flag indicating the ASG is to be started at time asg\_startup\_time and stopped at asg\_shutdown\_time | `bool` | `false` | no |
| <a name="input_asg_schedule_enabled"></a> [asg\_schedule\_enabled](#input\_asg\_schedule\_enabled) | Flag indicating the ASG is stopped and started on a schedule with CRON using asg\_startup\_schedule and asg\_shutdown\_schedule | `bool` | `false` | no |
| <a name="input_asg_shutdown_now_enabled"></a> [asg\_shutdown\_now\_enabled](#input\_asg\_shutdown\_now\_enabled) | Flag indicating the ASG is to be scaled to 0 now (really, 5 minutes from now) | `bool` | `false` | no |
| <a name="input_asg_shutdown_schedule"></a> [asg\_shutdown\_schedule](#input\_asg\_shutdown\_schedule) | CRON string for time to shutdown ASG (scale to 0) | `string` | `null` | no |
| <a name="input_asg_shutdown_time"></a> [asg\_shutdown\_time](#input\_asg\_shutdown\_time) | ISO Datetime (with zone) YYYY-MM-DDTHH:MM:SSZ for time to shutdown ASG (scale to 0) | `string` | `null` | no |
| <a name="input_asg_startup_schedule"></a> [asg\_startup\_schedule](#input\_asg\_startup\_schedule) | CRON string for time to startup ASG (to what is defined in min/max/desired) | `string` | `null` | no |
| <a name="input_asg_startup_time"></a> [asg\_startup\_time](#input\_asg\_startup\_time) | ISO Datetime (with zone) YYYY-MM-DDTHH:MM:SSZ for time to startup ASG (to what is defined in min/max/desired) | `string` | `null` | no |
| <a name="input_asg_timezone"></a> [asg\_timezone](#input\_asg\_timezone) | Default timezone for start/stop schedules | `string` | `"EST5EDT"` | no |

## Outputs

No outputs.

# CHANGELOG

* 1.0.0 -- 2025-02-03
  - add documentation


114 changes: 114 additions & 0 deletions examples/extras/schedule-asg/schedule-asg.tf
@@ -0,0 +1,114 @@
# concept taken from:
# https://gitlab-for-eks.awsworkshop.io/090_appendices/097_cluster_ops.html

data "aws_autoscaling_groups" "cluster" {
  filter {
    name   = "tag:eks:cluster-name"
    values = [var.cluster_name]
  }
}

# Map the cluster name to its ASG name. zipmap needs lists of equal length,
# so this assumes exactly one ASG carries the cluster tag.
data "aws_autoscaling_group" "cluster" {
  for_each = zipmap(tolist(data.aws_autoscaling_groups.cluster.filter)[0].values, data.aws_autoscaling_groups.cluster.names)
  name     = each.value
}

#--
# now + 5m
#--
resource "time_offset" "now_shutdown" {
  count          = var.asg_shutdown_now_enabled ? 1 : 0
  offset_minutes = 5
}

resource "aws_autoscaling_schedule" "now_shutdown" {
  count                  = var.asg_shutdown_now_enabled ? 1 : 0
  scheduled_action_name  = "shutdown-now"
  min_size               = 0
  max_size               = 0
  desired_capacity       = 0
  time_zone              = var.asg_timezone
  start_time             = var.asg_shutdown_now_enabled ? time_offset.now_shutdown[0].rfc3339 : null
  autoscaling_group_name = data.aws_autoscaling_group.cluster[var.cluster_name].name
}

#--
# onetime
#--
resource "aws_autoscaling_schedule" "onetime_shutdown" {
  count                 = var.asg_onetime_enabled && var.asg_shutdown_time != null ? 1 : 0
  scheduled_action_name = "onetime-shutdown"
  min_size              = 0
  max_size              = 0
  desired_capacity      = 0
  # time_zone = var.asg_timezone
  start_time             = var.asg_shutdown_time
  autoscaling_group_name = data.aws_autoscaling_group.cluster[var.cluster_name].name
}

resource "aws_autoscaling_schedule" "onetime_startup" {
  count                 = var.asg_onetime_enabled && var.asg_startup_time != null ? 1 : 0
  scheduled_action_name = "onetime-startup"
  min_size              = var.eks_ng_min_size
  max_size              = var.eks_ng_max_size
  desired_capacity      = var.eks_ng_desire_size
  # time_zone = var.asg_timezone
  start_time             = var.asg_startup_time
  autoscaling_group_name = data.aws_autoscaling_group.cluster[var.cluster_name].name
}

#--
# schedule
#--
resource "aws_autoscaling_schedule" "schedule_shutdown" {
  count                  = var.asg_schedule_enabled && var.asg_shutdown_schedule != null ? 1 : 0
  scheduled_action_name  = "scheduled-shutdown"
  min_size               = 0
  max_size               = 0
  desired_capacity       = 0
  time_zone              = var.asg_timezone
  recurrence             = var.asg_shutdown_schedule
  autoscaling_group_name = data.aws_autoscaling_group.cluster[var.cluster_name].name
}

resource "aws_autoscaling_schedule" "schedule_startup" {
  count                  = var.asg_schedule_enabled && var.asg_startup_schedule != null ? 1 : 0
  scheduled_action_name  = "scheduled-startup"
  min_size               = var.eks_ng_min_size
  max_size               = var.eks_ng_max_size
  desired_capacity       = var.eks_ng_desire_size
  time_zone              = var.asg_timezone
  recurrence             = var.asg_startup_schedule
  autoscaling_group_name = data.aws_autoscaling_group.cluster[var.cluster_name].name
}

resource "aws_autoscaling_notification" "notification" {
  group_names = [data.aws_autoscaling_group.cluster[var.cluster_name].name]
  notifications = [
    "autoscaling:EC2_INSTANCE_LAUNCH",
    "autoscaling:EC2_INSTANCE_TERMINATE",
    "autoscaling:EC2_INSTANCE_LAUNCH_ERROR",
    "autoscaling:EC2_INSTANCE_TERMINATE_ERROR",
  ]
  topic_arn = aws_sns_topic.notification.arn
}

resource "aws_sns_topic" "notification" {
  name              = format("%v%v-asg-notification", local._prefixes["eks"], var.cluster_name)
  kms_master_key_id = "alias/aws/sns"

  tags = merge(
    local.base_tags,
    local.common_tags,
    var.tags,
    var.application_tags,
    { Name = format("%v%v-asg-notification", local._prefixes["eks"], var.cluster_name) },
  )
}

resource "aws_sns_topic_subscription" "notification" {
  for_each  = toset(var.asg_notification_groups)
  topic_arn = aws_sns_topic.notification.arn
  protocol  = "email"
  endpoint  = each.key
}
11 changes: 11 additions & 0 deletions examples/extras/schedule-asg/variables.schedule-asg.auto.tfvars
@@ -0,0 +1,11 @@
# asg_notification_groups = ["valid-email-list@census.gov"]

asg_shutdown_now_enabled = false

asg_onetime_enabled = false
asg_startup_time = null # "2024-12-31T19:00:00Z" # 14:00 EST
asg_shutdown_time = null # "2024-12-31T23:00:00Z" # 18:00 EST

asg_schedule_enabled = false
asg_startup_schedule = null # starts at 6am mon-fri: "0 6 * * MON-FRI"
asg_shutdown_schedule = null # stops at 9pm mon-fri: "0 21 * * MON-FRI"
58 changes: 58 additions & 0 deletions examples/extras/schedule-asg/variables.schedule-asg.tf
@@ -0,0 +1,58 @@
variable "asg_timezone" {
  description = "Default timezone for start/stop schedules"
  type        = string
  default     = "EST5EDT"
}

variable "asg_notification_groups" {
  description = "List of email addresses to receive notification from SNS topic"
  type        = list(string)
  default     = []

  validation {
    condition     = alltrue([for v in var.asg_notification_groups : endswith(v, "@census.gov")])
    error_message = "Each provided email address in the list must end in @census.gov to be valid."
  }
}

variable "asg_shutdown_now_enabled" {
  description = "Flag indicating the ASG is to be scaled to 0 now (really, 5 minutes from now)"
  type        = bool
  default     = false
}

variable "asg_onetime_enabled" {
  description = "Flag indicating the ASG is to be started at time asg_startup_time and stopped at asg_shutdown_time"
  type        = bool
  default     = false
}

variable "asg_startup_time" {
  description = "ISO Datetime (with zone) YYYY-MM-DDTHH:MM:SSZ for time to startup ASG (to what is defined in min/max/desired)"
  type        = string
  default     = null
}

variable "asg_shutdown_time" {
  description = "ISO Datetime (with zone) YYYY-MM-DDTHH:MM:SSZ for time to shutdown ASG (scale to 0)"
  type        = string
  default     = null
}

variable "asg_schedule_enabled" {
  description = "Flag indicating the ASG is stopped and started on a schedule with CRON using asg_startup_schedule and asg_shutdown_schedule"
  type        = bool
  default     = false
}

variable "asg_startup_schedule" {
  description = "CRON string for time to startup ASG (to what is defined in min/max/desired)"
  type        = string
  default     = null
}

variable "asg_shutdown_schedule" {
  description = "CRON string for time to shutdown ASG (scale to 0)"
  type        = string
  default     = null
}
