
[AWS] Stopping and Starting ECS Services with a Lambda Function

양눈 2025. 2. 11. 14:18

To save on AWS costs, this post tests starting ECS services during business hours and stopping them outside business hours.

Start time: 08:00 KST (Mon-Fri)
Stop time: 22:00 KST (Mon-Fri)
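
The schedule above can be expressed as a small predicate. This is only a sketch: the function name `is_business_hours` is made up for illustration, and it assumes the clock is handed in as a timezone-aware UTC datetime. The point it shows is that both the weekday and the hour have to be read in KST, because early-morning KST hours still fall on the previous day in UTC.

```python
from datetime import datetime, timedelta, timezone

# KST is UTC+9 with no daylight saving time, so a fixed offset is enough.
KST = timezone(timedelta(hours=9))


def is_business_hours(now_utc: datetime) -> bool:
    """Return True on Mon-Fri from 08:00 (inclusive) to 22:00 (exclusive) KST."""
    now_kst = now_utc.astimezone(KST)
    # weekday(): Monday is 0 and Friday is 4. Use the KST date, not the UTC one.
    return now_kst.weekday() <= 4 and 8 <= now_kst.hour < 22
```

For example, Sunday 23:00 UTC is already Monday 08:00 in KST, so the predicate treats it as inside business hours.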

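When ECS runs on EC2 nodes rather than Fargate, the nodes must also be shut down to save costs.
Also, while services still have tasks running, setting the Auto Scaling group's desired count to 0 may not bring the nodes down, so the services have to be scaled in first.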
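
One way to confirm that every service is really down before touching the Auto Scaling group is to check the task counts. The sketch below is an assumption, not part of the original setup: `all_services_stopped` is a hypothetical helper, and `ecs` is expected to be a boto3 ECS client (or anything with the same `get_paginator` and `describe_services` methods).

```python
def all_services_stopped(ecs, cluster: str) -> bool:
    """Return True when no service in the cluster has running or pending tasks."""
    arns = []
    for page in ecs.get_paginator("list_services").paginate(cluster=cluster):
        arns.extend(page.get("serviceArns", []))

    # describe_services accepts at most 10 services per call, so go in batches.
    for start in range(0, len(arns), 10):
        resp = ecs.describe_services(cluster=cluster, services=arns[start:start + 10])
        for svc in resp.get("services", []):
            if svc.get("runningCount", 0) > 0 or svc.get("pendingCount", 0) > 0:
                return False
    return True


# Usage (needs boto3 and AWS credentials; the cluster name is a placeholder):
#   import boto3
#   print(all_services_stopped(boto3.client("ecs"), "your-ecs-cluster"))
```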

The operations should run in the following order.
(Stop)
1. ECS service desired count => 0
2. ASG desired count => 0
(Start)
3. ASG desired count => 1
4. ECS service desired count => 1
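
The ordering above can be sketched as follows. The function `scale_cluster` and its return value are hypothetical and exist only to make the order visible. `ecs` and `asg` are assumed to be boto3 `ecs` and `autoscaling` clients. The sketch does not wait for tasks or instances to settle between the two steps.

```python
def scale_cluster(ecs, asg, cluster: str, asg_name: str, count: int) -> list:
    """Apply `count` to all services and to the ASG, ordered by direction."""
    steps = []

    def scale_services():
        for page in ecs.get_paginator("list_services").paginate(cluster=cluster):
            for arn in page.get("serviceArns", []):
                ecs.update_service(cluster=cluster, service=arn, desiredCount=count)
        steps.append("ecs")

    def scale_nodes():
        asg.set_desired_capacity(
            AutoScalingGroupName=asg_name,
            DesiredCapacity=count,
            HonorCooldown=False,
        )
        steps.append("asg")

    if count == 0:
        # Stop: services first, then nodes.
        scale_services()
        scale_nodes()
    else:
        # Start: nodes first, then services.
        scale_nodes()
        scale_services()
    return steps
```

Passing the clients in as arguments keeps the ordering logic testable with stand-in objects, without calling AWS.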

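The following pieces need to be set up.
1. A Lambda function that changes the ECS service count and the ASG node count depending on the current time
2. IAM settings that let the Lambda function act on ECS and ASG resources
3. EventBridge schedules that invoke the Lambda function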

AWSTemplateFormatVersion: '2010-09-09'
Description: "Lambda function to adjust ECS services and Auto Scaling Group desired count based on schedule"

Parameters:
  ClusterName:
    Type: String
    Description: "ECS Cluster Name"
    Default: "your-ecs-cluster"

  AutoScalingGroupName:
    Type: String
    Description: "Auto Scaling Group Name"
    Default: "your-asg-name"

Resources:
  ### 1. IAM role for Lambda execution ###
  LambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: LambdaECSASGScalingRole
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: 
                - lambda.amazonaws.com
            Action: 
              - sts:AssumeRole
      Policies:
        - PolicyName: LambdaECSASGScalingPolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - ecs:UpdateService
                  - ecs:DescribeServices
                  - ecs:ListServices
                Resource: "*"
              - Effect: Allow
                Action:
                  - autoscaling:SetDesiredCapacity
                  - autoscaling:DescribeAutoScalingGroups
                Resource: "*"
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: "*"

  ### 2. Lambda function ###
  ECSASGScalingLambda:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: ECSASGScalingLambda
      Runtime: python3.9
      Handler: index.lambda_handler
      Role: !GetAtt LambdaExecutionRole.Arn
      Timeout: 15
      Environment:
        Variables:
          CLUSTER_NAME: !Ref ClusterName
          ASG_NAME: !Ref AutoScalingGroupName
          ACTIVE_COUNT: "1"
          INACTIVE_COUNT: "0"
      Code:
        ZipFile: |
          import boto3
          import datetime
          import os

          ecs_client = boto3.client("ecs")
          asg_client = boto3.client("autoscaling")

          CLUSTER_NAME = os.environ['CLUSTER_NAME']
          ASG_NAME = os.environ['ASG_NAME']
          ACTIVE_COUNT = int(os.environ.get('ACTIVE_COUNT', 1))
          INACTIVE_COUNT = int(os.environ.get('INACTIVE_COUNT', 0))

          # KST is a fixed UTC+9 offset (no daylight saving time).
          KST = datetime.timezone(datetime.timedelta(hours=9))

          def set_service_counts(count):
              # list_services returns results in pages, so walk every page.
              updated = 0
              paginator = ecs_client.get_paginator("list_services")
              for page in paginator.paginate(cluster=CLUSTER_NAME):
                  for service_arn in page.get("serviceArns", []):
                      ecs_client.update_service(
                          cluster=CLUSTER_NAME,
                          service=service_arn,
                          desiredCount=count
                      )
                      updated += 1
              if updated == 0:
                  print("No ECS services found in the cluster.")
              print(f"{updated} ECS services in {CLUSTER_NAME} set to {count}")

          def set_asg_capacity(count):
              asg_client.set_desired_capacity(
                  AutoScalingGroupName=ASG_NAME,
                  DesiredCapacity=count,
                  HonorCooldown=False
              )
              print(f"Auto Scaling Group {ASG_NAME} set to {count}")

          def lambda_handler(event, context):
              # Take both the weekday and the hour from KST. The UTC weekday is
              # one day behind KST between 00:00 and 08:59 KST.
              now_kst = datetime.datetime.now(KST)
              is_active = now_kst.weekday() <= 4 and 8 <= now_kst.hour < 22
              desired_count = ACTIVE_COUNT if is_active else INACTIVE_COUNT

              try:
                  if is_active:
                      # Start: nodes first, then services.
                      set_asg_capacity(desired_count)
                      set_service_counts(desired_count)
                  else:
                      # Stop: services first, then nodes.
                      set_service_counts(desired_count)
                      set_asg_capacity(desired_count)

                  return {
                      "statusCode": 200,
                      "message": f"Updated ECS services and ASG {ASG_NAME} to {desired_count}"
                  }
              except Exception as e:
                  print(f"Error: {str(e)}")
                  return {
                      "statusCode": 500,
                      "error": str(e)
                  }

  ### 3. EventBridge rule (start at 08:00 KST on weekdays) ###
  EventBridgeRuleStart:
    Type: AWS::Events::Rule
    Properties:
      Name: ECSASGScalingStartWeekday
      ScheduleExpression: "cron(0 23 ? * SUN-THU *)"  # 23:00 UTC Sun-Thu = 08:00 KST Mon-Fri
      State: ENABLED
      Targets:
        - Arn: !GetAtt ECSASGScalingLambda.Arn
          Id: "1"

  ### 4. EventBridge rule (stop at 22:00 KST on weekdays) ###
  EventBridgeRuleStop:
    Type: AWS::Events::Rule
    Properties:
      Name: ECSASGScalingStopWeekday
      ScheduleExpression: "cron(0 13 ? * MON-FRI *)"  # 13:00 UTC = 22:00 KST on the same day
      State: ENABLED
      Targets:
        - Arn: !GetAtt ECSASGScalingLambda.Arn
          Id: "2"

  ### 5. Permissions that let EventBridge invoke the Lambda ###
  PermissionForEventBridgeStart:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !Ref ECSASGScalingLambda
      Action: lambda:InvokeFunction
      Principal: events.amazonaws.com
      SourceArn: !GetAtt EventBridgeRuleStart.Arn

  PermissionForEventBridgeStop:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !Ref ECSASGScalingLambda
      Action: lambda:InvokeFunction
      Principal: events.amazonaws.com
      SourceArn: !GetAtt EventBridgeRuleStop.Arn


Deploy the template above as a CloudFormation stack and fill in the two parameters.

- AutoScalingGroupName: name of the Auto Scaling group

- ClusterName: name of the ECS cluster
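
If the stack is created from a script instead of the console, the same two parameters are passed to `create_stack`. This is a minimal sketch under assumptions: `deploy_stack`, the default stack name, and the file name `template.yaml` are placeholders, and `cfn` is expected to be a boto3 CloudFormation client. Because the template sets an explicit `RoleName`, the `CAPABILITY_NAMED_IAM` capability has to be acknowledged.

```python
def deploy_stack(cfn, template_body: str, cluster_name: str, asg_name: str,
                 stack_name: str = "ecs-asg-scheduler") -> dict:
    """Create the stack, passing the two template parameters."""
    return cfn.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=[
            {"ParameterKey": "ClusterName", "ParameterValue": cluster_name},
            {"ParameterKey": "AutoScalingGroupName", "ParameterValue": asg_name},
        ],
        # Required because the IAM role in the template has a custom name.
        Capabilities=["CAPABILITY_NAMED_IAM"],
    )


# Usage (needs boto3 and AWS credentials; all names are placeholders):
#   import boto3
#   with open("template.yaml") as f:
#       deploy_stack(boto3.client("cloudformation"), f.read(),
#                    "your-ecs-cluster", "your-asg-name")
```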
 