Using systemd to orchestrate cloud-init with cfn-signal in AWS CloudFormation

Mark van Holsteijn

August 31, 2025

In this blog we will show you how to configure a systemd cfn-signal unit that signals the successful or failed completion of the cloud-init run.

We have been managing a dozen autoscaling groups for instances of ECS clusters using CloudFormation since 2017. To configure a virtual machine instance for a specific cluster, we use cloud-init. With cloud-init, we configure the ECS agent and direct logging to the correct log groups. This means that we have a cloud-init configuration embedded as user-data in our CloudFormation template. This works pretty well most of the time, but sometimes, when the cloud-init configuration contained an error, a success signal was still sent, leaving us with an incorrectly configured virtual machine.

With systemd we can configure a cfn-signal unit to report the status of cloud-init after cloud-init has run. You can do this too! We will show you:

  • how to determine the successful completion of cloud-init
  • how to make cfn-signal dependent on cloud-init
  • how to define the cfn-signal unit
  • how to integrate this into the CloudFormation template

Determining successful completion of cloud-init

To determine whether cloud-init completed without errors, we use the following command:

cloud-init status --wait

It prints one of the following statuses:

status: not started
status: running
status: done
status: error - done
status: error - running
status: degraded done
status: degraded running
status: disabled

So the command to get cfn-signal to report the correct completion status looks like this:

cloud-init status --wait | grep -q '^status: done$'
/opt/aws/bin/cfn-signal \
    --stack "$STACK_NAME" \
    --resource "$LOGICAL_RESOURCE_ID" \
    --region "$AWS_REGION" \
    --exit-code $?
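
Only the exact line `status: done` counts as success here; every other status, including the degraded and error variants, makes grep exit with a non-zero code. A minimal sketch of that mapping as a shell function that reads the status text from standard input, so it can be tried without a running cloud-init. The function name `cloud_init_exit_code` is our own and not part of cloud-init or cfn-signal:

```shell
#!/bin/sh
# Hypothetical helper: reads `cloud-init status` output on stdin and
# returns 0 only if one line is exactly "status: done".
cloud_init_exit_code() {
    grep -q '^status: done$'
}

# Example usage on an instance (not executed here):
#   cloud-init status --wait | cloud_init_exit_code
#   /opt/aws/bin/cfn-signal ... --exit-code $?
```

Because the function is the last command of the pipeline, `$?` holds its result when cfn-signal is invoked on the next line.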

How to make cfn-signal dependent on cloud-init

All systemd units that together provide the cloud-init functionality are grouped under cloud-init.target.

cloud-init.target
 ├─cloud-config.service
 ├─cloud-final.service
 ├─cloud-init-local.service
 └─cloud-init.service

To run cfn-signal after cloud-init has completed, we make the service dependent on cloud-init.target, using the Wants and After directives in the systemd service unit:

Description=Signal completion of cloud-init to CFN

Wants=cloud-init.target
After=cloud-init.target

We use Wants instead of Requires, because we want cfn-signal to run even if cloud-init failed.
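
systemd logs a warning and ignores directive names it does not recognise, so a misspelled `Wants` silently drops the dependency. A small sketch of a pre-flight check for the unit file follows. The function name `unit_has` and the file path in the usage note are assumptions for illustration; on a machine with systemd, `systemd-analyze verify` gives a more thorough check.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the unit file contains the exact
# line KEY=VALUE (for example Wants=cloud-init.target).
# It does not check which section the line appears in.
unit_has() {
    unit_file=$1
    key=$2
    value=$3
    grep -qxF "${key}=${value}" "$unit_file"
}

# Example usage (path is an assumption, not executed here):
#   unit_has /etc/systemd/system/cfn-signal.service Wants cloud-init.target &&
#   unit_has /etc/systemd/system/cfn-signal.service After cloud-init.target
```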

The cfn-signal service unit

The entire cfn-signal service unit now looks like this:

# 
# When cloud-init completed successfully, report this to CFN     
# using cfn-signal
#   
[Unit]
Description=Signal completion of cloud-init to CFN
Wants=cloud-init.target
After=cloud-init.target

[Service]
Type=oneshot
ExecStart=/bin/sh -xc '\
cloud-init status --wait | grep -q "^status: done$"; \
/opt/aws/bin/cfn-signal \
    --stack "$STACK_NAME" \
    --resource "$LOGICAL_RESOURCE_ID" \
    --region "$AWS_REGION" \
    --exit-code $? \
'

[Install]
WantedBy=default.target

You can bake this service into your own VM base image, or include it in your CloudFormation template.

Integration into the CloudFormation template

The cfn-signal service is integrated into the CloudFormation template by passing the stack name, region and logical resource id as environment variables. They are written to the systemd override configuration for the cfn-signal service, in /etc/systemd/system/cfn-signal.service.d/override.conf:

  ClusterASGLaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateName: ClusterASG
      LaunchTemplateData:
        ImageId: !Ref AL2023
        InstanceType: t3.micro
        IamInstanceProfile:
          Arn: !GetAtt 'InstanceProfile.Arn'
        UserData: !Base64
          Fn::Sub: |
            #cloud-config
            write_files:
              - path: /etc/systemd/system/cfn-signal.service.d/override.conf
                permissions: '0644'
                content: |
                  [Service]
                  Environment="AWS_REGION=${AWS::Region}"
                  Environment="STACK_NAME=${AWS::StackName}"
                  Environment="LOGICAL_RESOURCE_ID=ClusterASG"

            runcmd:
              - sudo systemctl daemon-reload
              - sudo systemctl enable --now --no-block cfn-signal

If cloud-init fails to create the file, cfn-signal will fail to report a successful completion and CloudFormation will time out waiting for the signal. If cloud-init succeeds in configuring cfn-signal but there were errors in any other part of the cloud-init run, a failure signal will be sent immediately after cloud-init completes.
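
The drop-in that write_files produces on the instance can also be reproduced by hand, which helps when testing the unit outside CloudFormation. A minimal sketch, in which the function name `render_override` and the example values in the usage note are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print a systemd drop-in that sets the three
# environment variables the cfn-signal unit reads.
render_override() {
    region=$1
    stack=$2
    resource=$3
    printf '[Service]\n'
    printf 'Environment="AWS_REGION=%s"\n' "$region"
    printf 'Environment="STACK_NAME=%s"\n' "$stack"
    printf 'Environment="LOGICAL_RESOURCE_ID=%s"\n' "$resource"
}

# Example usage with made-up values (not executed here):
#   render_override eu-west-1 my-stack ClusterASG |
#       sudo tee /etc/systemd/system/cfn-signal.service.d/override.conf
#   sudo systemctl daemon-reload
```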

Demonstration

To see a full-fledged demonstration, deploy the sample stack:

curl -sS -o sample.yaml \
    https://gist.github.com/mvanholsteijn/a85dc4355985477a0aec2aeb3c27eab1

aws cloudformation deploy \
    --stack-name cfn-signal \
    --template-file ./sample.yaml \
    --capabilities CAPABILITY_IAM \
    --parameter-overrides ExitCode=0

Now, mimic an update with an error in the runcmd scripts by setting ExitCode to 1:

aws cloudformation deploy \
    --stack-name cfn-signal \
    --template-file ./sample.yaml \
    --capabilities CAPABILITY_IAM \
    --parameter-overrides ExitCode=1

This will send a failure signal to CloudFormation, which aborts the update.
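
To see why an update was aborted, the stack events can be listed with the AWS CLI. A sketch, assuming the CLI is configured for the account and region of the stack; the function name `failed_stack_events` is our own:

```shell
#!/bin/sh
# Hypothetical helper: list the events of a stack whose resource status
# contains FAILED, together with the reason CloudFormation recorded.
failed_stack_events() {
    aws cloudformation describe-stack-events \
        --stack-name "$1" \
        --query "StackEvents[?contains(ResourceStatus, 'FAILED')].[LogicalResourceId, ResourceStatus, ResourceStatusReason]" \
        --output text
}

# Example usage (not executed here):
#   failed_stack_events cfn-signal
```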

Conclusion

By using systemd we can configure a cfn-signal unit to report the status of cloud-init after cloud-init has run. This allows us to notify CloudFormation to continue only if cloud-init initialized the instance successfully. This reduces the chance of incorrectly configured instances in our environment.


Image by WikimediaImages from Pixabay

Written by

Mark van Holsteijn

Mark van Holsteijn is a senior software systems architect at Xebia Cloud-native solutions. He is passionate about removing waste in the software delivery process and keeping things clear and simple.