Morga471 cluster #20

Merged
merged 57 commits into from
Mar 17, 2025
57 commits
5d463cf
yep
morga471 Feb 22, 2025
7b98e0c
set back to normal
morga471 Feb 24, 2025
9225789
missed tempo
morga471 Feb 24, 2025
d4b0560
fix branch ref
morga471 Feb 25, 2025
b6cfc81
change branch ref to test provider-resolution
morga471 Feb 25, 2025
db25e11
fix min vals
morga471 Feb 25, 2025
afa8bb2
2 is the lowest
morga471 Feb 25, 2025
f649b29
docs and keycloak
morga471 Feb 26, 2025
5923146
use default for eks again
morga471 Feb 26, 2025
a6cb4f7
tempo and kiali updates while working on keycloak
morga471 Feb 26, 2025
2583599
missed a comma
morga471 Feb 26, 2025
ccf901d
almost
morga471 Feb 27, 2025
06a62e4
no v
morga471 Feb 27, 2025
398e434
cleanup
morga471 Feb 27, 2025
1cb9d51
namespaces
morga471 Feb 27, 2025
d34b6e2
use main
morga471 Feb 27, 2025
77e8f9d
fmt
morga471 Feb 27, 2025
8356128
namespace changes
morga471 Feb 27, 2025
b1c61af
update internal url ref
morga471 Feb 27, 2025
a3ace69
fmt
morga471 Feb 28, 2025
7ed43de
versions
morga471 Feb 28, 2025
2829581
more wip:
morga471 Feb 28, 2025
b689635
keycloak wip
morga471 Feb 28, 2025
e7b8b99
update prom internal url input value
morga471 Feb 28, 2025
09dbab4
test changes on prom
morga471 Feb 28, 2025
4d9a294
deleted old cluster platform-eng-eks-test and created new cluster pla…
nangu001 Feb 28, 2025
da84c26
testing more autoscaling stuffs
morga471 Feb 28, 2025
7c9a31e
wip
morga471 Mar 3, 2025
e3d15ce
wip
morga471 Mar 4, 2025
f3260cc
wip
morga471 Mar 5, 2025
4e37688
use my eks
morga471 Mar 5, 2025
221e219
isolate karpenter again for debug
morga471 Mar 5, 2025
a905497
1.3.0 is not ready
morga471 Mar 6, 2025
8f8a1ab
make grafana work again
morga471 Mar 6, 2025
5309908
increment grafana operator chart version
morga471 Mar 6, 2025
44e1884
otel added
morga471 Mar 7, 2025
0f9cbdd
fix gatekeeper chart version
morga471 Mar 7, 2025
efd9d3c
ordering
morga471 Mar 7, 2025
c133413
test branch
morga471 Mar 7, 2025
26fcef3
use newer image
morga471 Mar 7, 2025
5f95a17
update loki memcached
morga471 Mar 8, 2025
2a73cfa
vers
morga471 Mar 8, 2025
1a941d4
keycloak defaults
morga471 Mar 11, 2025
20943a0
put keycloak in keycloak namespace for debug
morga471 Mar 11, 2025
5d487c5
removed a few folders from workspace
morga471 Mar 11, 2025
bdcd452
update grafana tg
morga471 Mar 11, 2025
af1b60b
remove old module from workspace
morga471 Mar 11, 2025
7efb9b6
reset branches to default
morga471 Mar 11, 2025
e651e2d
missed one
morga471 Mar 11, 2025
0a7b279
fmt
morga471 Mar 12, 2025
0f0af4a
more fmt
morga471 Mar 12, 2025
908c0ad
use client id and secret
morga471 Mar 12, 2025
9a391c5
fix service name regex violation
morga471 Mar 12, 2025
5c85b17
updates
morga471 Mar 13, 2025
0d0fd5e
update from lukes pr
morga471 Mar 13, 2025
98a0a07
disable gatekeeper
morga471 Mar 13, 2025
d6bff73
updated
morga471 Mar 17, 2025
24 changes: 24 additions & 0 deletions .checkov.yml
@@ -0,0 +1,24 @@
branch: master
download-external-modules: true
evaluate-variables: true
external-checks-dir:
- security/custom_checks
framework:
- terraform
- kubernetes
output:
- cli
- json
- junitxml
skip-check:
- CKV_AWS_79 # Instance Metadata Service Version 1
- CKV_AWS_130 # Ensure VPC subnets are not assigned public IP by default
quiet: true
compact: true
directory:
- .
- modules/*
secrets-scan-file-type:
- tf
- yaml
- json
28 changes: 17 additions & 11 deletions .github/platform-tg-infra.code-workspace
@@ -2,20 +2,12 @@
   "folders": [
     {
       "name": "platform-tg-infra",
-      "path": "../"
+      "path": ".."
     },
     {
       "name": "tfmod-cert-mgr",
       "path": "../../tfmod-cert-mgr"
     },
-    {
-      "name": "tfmod-config-job",
-      "path": "../../tfmod-config-job"
-    },
-    {
-      "name": "tfmod-custom-iam-role-for-service-account-eks",
-      "path": "../../tfmod-custom-iam-role-for-service-account-eks"
-    },
     {
       "name": "tfmod-eks",
       "path": "../../tfmod-eks"
@@ -28,6 +20,10 @@
       "name": "tfmod-eks-dns",
       "path": "../../tfmod-eks-dns"
     },
+    {
+      "name": "tfmod-gogatekeeper",
+      "path": "../../tfmod-gogatekeeper"
+    },
     {
       "name": "tfmod-grafana",
       "path": "../../tfmod-grafana"
@@ -48,6 +44,10 @@
       "name": "tfmod-karpenter",
       "path": "../../tfmod-karpenter"
     },
+    {
+      "name": "tfmod-keycloak",
+      "path": "../../tfmod-keycloak"
+    },
     {
       "name": "tfmod-kiali",
       "path": "../../tfmod-kiali"
@@ -60,6 +60,10 @@
       "name": "tfmod-metrics-server",
       "path": "../../tfmod-metrics-server"
     },
+    {
+      "name": "tfmod-open-telemetry",
+      "path": "../../tfmod-open-telemetry"
+    },
     {
       "name": "tfmod-prometheus",
       "path": "../../tfmod-prometheus"
@@ -69,13 +73,15 @@
       "path": "../../tfmod-tempo"
     },
     {
+      "name": "terraform-aws-eks",
       "path": "../../terraform-aws-eks"
     },
     {
-      "path": "../../karpenter-provider-aws"
+      "name": "terragrunt",
+      "path": "../../terragrunt"
     },
     {
-      "path": "../../terragrunt"
+      "path": "../../tfmod-config-job"
     }
   ]
 }
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,5 +1,8 @@
 # Local .terraform directories
 **/.terraform/*
+**/apply.log
+**/plan.log
+**/destroy.log

 # terraform lock file.
 **/.terraform.lock.hcl
48 changes: 48 additions & 0 deletions configs/node-groups.yaml
@@ -0,0 +1,48 @@
nodeGroups:
  - name: general-purpose
    instanceTypes:
      - m6i.xlarge
      - m6a.xlarge
      - m5.xlarge
    minSize: 2
    maxSize: 10
    desiredSize: 2
    labels:
      node-type: general
    taints: []
    updateConfig:
      maxUnavailable: 1

  - name: compute-optimized
    instanceTypes:
      - c6i.2xlarge
      - c6a.2xlarge
      - c5.2xlarge
    minSize: 1
    maxSize: 20
    desiredSize: 2
    labels:
      node-type: compute
    taints:
      - key: workload
        value: batch
        effect: NoSchedule
    updateConfig:
      maxUnavailable: 2

  - name: memory-optimized
    instanceTypes:
      - r6i.2xlarge
      - r6a.2xlarge
      - r5.2xlarge
    minSize: 1
    maxSize: 10
    desiredSize: 2
    labels:
      node-type: memory
    taints:
      - key: workload
        value: memory-intensive
        effect: NoSchedule
    updateConfig:
      maxUnavailable: 1
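
Each node group above is only consistent if `minSize <= desiredSize <= maxSize`. The following is a minimal sketch of that check, with the sizes copied from `configs/node-groups.yaml` into a Python literal so no YAML parser is required. The helper name `check_sizes` is hypothetical and is not part of this PR.

```python
# Hypothetical helper, not part of this PR: sanity-check node group sizing.
# The values below are copied by hand from configs/node-groups.yaml.
NODE_GROUPS = [
    {"name": "general-purpose", "minSize": 2, "maxSize": 10, "desiredSize": 2},
    {"name": "compute-optimized", "minSize": 1, "maxSize": 20, "desiredSize": 2},
    {"name": "memory-optimized", "minSize": 1, "maxSize": 10, "desiredSize": 2},
]


def check_sizes(groups):
    """Return the names of groups that violate minSize <= desiredSize <= maxSize."""
    return [
        group["name"]
        for group in groups
        if not (group["minSize"] <= group["desiredSize"] <= group["maxSize"])
    ]


if __name__ == "__main__":
    # An empty list means every group is internally consistent.
    print(check_sizes(NODE_GROUPS))
```

A check like this could run in CI before the file is handed to Terragrunt, but how the file is actually consumed is not shown in this diff.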
36 changes: 36 additions & 0 deletions configs/resource-quotas.yml
@@ -0,0 +1,36 @@
apiVersion: v1
kind: ResourceQuota
metadata:
  name: default-quota
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "100"
    services: "50"
    secrets: "100"
    configmaps: "100"
    persistentvolumeclaims: "50"

---
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 256Mi
      max:
        cpu: "4"
        memory: 8Gi
      min:
        cpu: 50m
        memory: 64Mi
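
For the `LimitRange` above to be coherent, each resource should satisfy `min <= defaultRequest <= default <= max`. The sketch below checks that ordering for the values in `configs/resource-quotas.yml`. `parse_quantity` is a hypothetical, deliberately simplified parser that only understands the suffixes used in this file (`m`, `Ki`, `Mi`, `Gi`); it is not the Kubernetes quantity parser.

```python
from fractions import Fraction

# Only the suffixes that appear in configs/resource-quotas.yml are supported.
_SUFFIXES = {"m": Fraction(1, 1000), "Ki": 2**10, "Mi": 2**20, "Gi": 2**30}

# Values copied by hand from the LimitRange in this diff.
CONTAINER_LIMITS = {
    "min": {"cpu": "50m", "memory": "64Mi"},
    "defaultRequest": {"cpu": "100m", "memory": "256Mi"},
    "default": {"cpu": "500m", "memory": "512Mi"},
    "max": {"cpu": "4", "memory": "8Gi"},
}


def parse_quantity(text):
    """Convert a quantity such as '500m' or '8Gi' to an exact number."""
    for suffix, factor in _SUFFIXES.items():
        if text.endswith(suffix):
            return Fraction(text[: -len(suffix)]) * factor
    return Fraction(text)


def is_ordered(limits, resource):
    """True if min <= defaultRequest <= default <= max for the resource."""
    order = ["min", "defaultRequest", "default", "max"]
    values = [parse_quantity(limits[key][resource]) for key in order]
    return all(a <= b for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    for resource in ("cpu", "memory"):
        print(resource, is_ordered(CONTAINER_LIMITS, resource))
```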
88 changes: 88 additions & 0 deletions docs/ARCHITECTURE.md
@@ -0,0 +1,88 @@
# Platform Infrastructure Architecture

## Complete Platform Architecture

```mermaid
graph TD
%% Core Network Infrastructure
VPC[VPC Module] --> DNS[DNS Module]
VPC --> SUBNETS[Subnet Configuration]
SUBNETS --> PRIVATE[Private Subnets]
SUBNETS --> PUBLIC[Public Subnets]
%% EKS Cluster and Core Components
VPC --> EKS[EKS Cluster]
EKS --> IAM[IAM Roles Module]
EKS --> EKS_CONFIG[EKS Configuration]
EKS --> KARPENTER[Karpenter]
%% Security and Access Management
EKS --> CERT_MGR[Cert Manager]
EKS --> GATEKEEPER[GoGatekeeper]
%% Service Mesh
EKS_CONFIG --> ISTIO[Istio Service Mesh]
ISTIO --> KIALI[Kiali Dashboard]
ISTIO --> INGRESS[Service Ingress]
%% Monitoring and Observability
EKS --> MONITORING[Monitoring Stack]
MONITORING --> PROMETHEUS[Prometheus]
MONITORING --> GRAFANA[Grafana]
MONITORING --> LOKI[Loki Log Aggregation]
MONITORING --> TEMPO[Tempo Tracing]
%% Additional Services
EKS --> DASHBOARD[Kubernetes Dashboard]
EKS --> METRICS[Metrics Server]
EKS --> KEYCLOAK[Keycloak SSO]
%% Infrastructure Management
TERRAGRUNT[Terragrunt] --> VPC
TERRAGRUNT --> EKS
%% Database Layer
VPC --> RDS[RDS Database]
%% Styling
classDef core fill:#f9f,stroke:#333,stroke-width:2px
classDef security fill:#bbf,stroke:#333,stroke-width:2px
classDef monitoring fill:#bfb,stroke:#333,stroke-width:2px
class VPC,EKS,EKS_CONFIG core
class CERT_MGR,GATEKEEPER,IAM security
class PROMETHEUS,GRAFANA,LOKI,TEMPO monitoring
```
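
If the arrows in the diagram are read as "must exist before" (an assumption; the document does not state this), a possible deployment order can be derived with a topological sort. The sketch below uses Python's standard `graphlib` module. The edge list is copied from the diagram with the subnet nodes left out for brevity, and `deploy_order` is a hypothetical helper, not something defined in this repository.

```python
from graphlib import TopologicalSorter

# (parent, child) pairs copied from the Mermaid diagram; subnet nodes omitted.
EDGES = [
    ("TERRAGRUNT", "VPC"), ("TERRAGRUNT", "EKS"),
    ("VPC", "DNS"), ("VPC", "EKS"), ("VPC", "RDS"),
    ("EKS", "IAM"), ("EKS", "EKS_CONFIG"), ("EKS", "KARPENTER"),
    ("EKS", "CERT_MGR"), ("EKS", "GATEKEEPER"),
    ("EKS_CONFIG", "ISTIO"), ("ISTIO", "KIALI"), ("ISTIO", "INGRESS"),
    ("EKS", "MONITORING"),
    ("MONITORING", "PROMETHEUS"), ("MONITORING", "GRAFANA"),
    ("MONITORING", "LOKI"), ("MONITORING", "TEMPO"),
    ("EKS", "DASHBOARD"), ("EKS", "METRICS"), ("EKS", "KEYCLOAK"),
]


def deploy_order(edges):
    """Return the nodes in an order where every parent precedes its children."""
    sorter = TopologicalSorter()
    for parent, child in edges:
        # add(node, *predecessors): the child depends on the parent.
        sorter.add(child, parent)
    return list(sorter.static_order())


if __name__ == "__main__":
    print(" -> ".join(deploy_order(EDGES)))
```

The order among siblings (for example, the four monitoring components) is not fixed by the graph, so any tool consuming this should only rely on parents coming before children.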

## Component Descriptions

### Core Infrastructure
- **VPC Module**: Network foundation with public/private subnets
- **EKS Cluster**: Managed Kubernetes service
- **Karpenter**: Autoscaling node management
- **DNS Module**: Route53 DNS management

### Security Layer
- **Cert Manager**: Certificate lifecycle management
- **GoGatekeeper**: OIDC authentication proxy in front of protected services
- **IAM Roles**: AWS IAM integration

### Service Mesh
- **Istio**: Service mesh implementation
- **Kiali**: Service mesh visualization
- **Service Ingress**: External traffic management

### Monitoring Stack
- **Prometheus**: Metrics collection
- **Grafana**: Metrics visualization
- **Loki**: Log aggregation
- **Tempo**: Distributed tracing

### Additional Services
- **Kubernetes Dashboard**: Cluster management UI
- **Metrics Server**: Resource metrics
- **Keycloak**: Identity management

### Infrastructure Management
- **Terragrunt**: Infrastructure deployment orchestration
- **RDS**: Managed database services
56 changes: 56 additions & 0 deletions docs/DOCUMENTATION_STANDARDS.md
@@ -0,0 +1,56 @@
# Documentation Standards Guide

## README Structure
Each module must include a README.md with the following sections:

1. Overview
   - Purpose
   - Key features
   - Architecture diagram

2. Prerequisites
   - Required tooling
   - Required permissions
   - Dependencies

3. Usage
   - Basic example
   - Advanced examples
   - Input variables table
   - Output variables table

4. Architecture
   - Component diagram
   - Network flow
   - Security considerations

5. Operations
   - Deployment guide
   - Monitoring
   - Troubleshooting
   - Maintenance
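
The required structure above could be checked mechanically. The following is a minimal sketch, assuming each top-level section appears as a Markdown heading of any level with exactly the title listed; the guide itself does not specify heading levels. `missing_sections` is a hypothetical helper, and it does not skip `#` lines inside fenced code blocks.

```python
REQUIRED_SECTIONS = ("Overview", "Prerequisites", "Usage", "Architecture", "Operations")


def missing_sections(readme_text):
    """Return the required section titles that have no matching heading."""
    headings = set()
    for line in readme_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Drop the leading '#' characters and normalise the title.
            headings.add(stripped.lstrip("#").strip().lower())
    return [title for title in REQUIRED_SECTIONS if title.lower() not in headings]


if __name__ == "__main__":
    sample = "# tfmod-example\n\n## Overview\n\n## Usage\n"
    print(missing_sections(sample))
```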

## Changelog Format
Use Commitizen convention:

```
feat: New feature
fix: Bug fix
docs: Documentation changes
style: Formatting changes
refactor: Code restructure without behavior change
test: Test updates
chore: Maintenance tasks
```
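
A commit subject can be checked against the type prefixes listed above with a short pattern match. This is a sketch only: the optional `(scope)` part follows the common Conventional Commits form and is an assumption here, and `is_conventional` is a hypothetical helper rather than a Commitizen API.

```python
import re

# The type prefixes listed in this guide.
COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")

# "<type>: <subject>" or "<type>(<scope>): <subject>"; the scope is an assumption.
_PATTERN = re.compile(r"^(%s)(\([\w.-]+\))?: \S" % "|".join(COMMIT_TYPES))


def is_conventional(message):
    """True if the first line of the message starts with a listed type prefix."""
    first_line = message.splitlines()[0] if message else ""
    return bool(_PATTERN.match(first_line))


if __name__ == "__main__":
    for subject in ("fix: min vals", "feat(eks): add node groups", "wip"):
        print(subject, "->", is_conventional(subject))
```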

## Diagrams
- Use PlantUML for architecture diagrams
- Include source files in `docs/diagrams`
- Export PNG/SVG to `docs/images`
- Keep diagrams up to date with code changes

## Usage Examples
- Provide basic and advanced examples
- Include realistic variable values
- Document required permissions
- Include expected outputs