Knowledge Required: Moderate

Tools required: Ansible

Ansible is a fantastic way to run remote commands in a scalable manner, such as commands for health check monitoring. One of the things you may wish to monitor is systemd services. Unless you’re a real hipster, the chances are that if you’re running Linux, your services are managed by systemd.

The code

Let’s take a look at how we can manage systemd services in Ansible, through an Ansible task:

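# Make sure the service is started
- name: Put service "{{ systemdname }}" in state running
  systemd:
    state: started
    name: "{{ systemdname }}"
  register: systemdstate    # keep the module result so we can test it below
  ignore_errors: true       # carry on with the other services on failure

# Print a greppable message only when the start attempt failed
- debug:
    msg: "[customerror]: {{ systemdname }} could not be started and was in stopped state"
  when: systemdstate is failed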

{{ name }} is how we reference a variable in Ansible. This task works with whatever value we pass to it in the variable systemdname.

If you’re somewhat familiar with Ansible, you may have noticed that we’re missing our typical Ansible values instructing which hosts to run the above task on, but don’t worry, we’ll come to that later. For now, let’s just put the above code in a file called systemd_actions.yaml and break down what’s happening:

Breakdown:

  • We use the built-in systemd module of Ansible to put the service in the state 'started'
  • We register the result of the systemd task in a new variable called systemdstate
  • We ignore errors, so that if one service cannot be started we don’t abort the whole Ansible playbook
  • We then print a custom debug message to the screen if the task failed. Note the customerror tag in the message; we’ll use it later to pick failures out of the output
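To see what the register step actually captures, a minimal sketch is to dump the variable with the debug module. This extra task is purely illustrative and not part of the original file. The exact keys in the result depend on the module and your Ansible version, but failed is the one our when condition tests.

```yaml
# Illustrative only: print everything the systemd task returned
- name: Show the registered result for "{{ systemdname }}"
  debug:
    var: systemdstate
```

Running this once is a handy way to discover which fields you could use in further conditions.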

Run the Ansible tasks

We’re going to create another file which calls the systemd_actions.yaml file we just made. Call your file whatever you wish; I’m going to call mine systemd_manager.yaml. In this file, we’re going to put the following:

- hosts: localhost
  become: true
  become_user: root
  vars:
    services:
      - smbd
      - sshd

  tasks:
    - include_tasks: "systemd_actions.yaml"
      loop: "{{ services }}"
      loop_control:
        loop_var: systemdname

This does the following:

  • We declare a list of services we wish to check. In this case, I just have smbd (Samba) and sshd (SSH). If you wish to add more services to check, just add to the list in the same format
  • We create a new task which, for each service in the list, includes the tasks in the systemd_actions.yaml file and runs them against that service.
  • Note that loop goes through the list, and loop_control names the current value systemdname. An equivalent for loop may look like for systemdname in services: ….
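For comparison, here is a sketch of the same task without loop_control. By default Ansible exposes the current element of a loop as item, so we would have to map it onto systemdname ourselves. This variant is an assumption about how you might write it, not something from the original playbook.

```yaml
tasks:
  - include_tasks: "systemd_actions.yaml"
    loop: "{{ services }}"
    vars:
      # Without loop_var, the current element is available as "item"
      systemdname: "{{ item }}"
```

Using loop_var saves that extra mapping and avoids a name clash if the included file ever contains a loop of its own.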

Now, we can run this with: ansible-playbook systemd_manager.yaml

If all your services are up and running, your playbook will fly through. However, if for some reason a service was down and cannot be started, you’ll get your custom error message:

ok: [localhost] => {
    "msg": "[customerror]: smbd could not be started and was in stopped state"
}

The benefits of this approach

Including custom task files has the advantage of keeping everything modular, so that if we ever need to make new checks or actions to run against systemd units, we can just add them to the systemd_actions.yaml file.
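As a sketch of what such an extra action could look like, the task below could be appended to systemd_actions.yaml to also make sure each service starts at boot. It uses the enabled parameter of the systemd module. The task itself, the registered variable name systemdenabled, and whether you want this behaviour at all are assumptions for illustration.

```yaml
# Hypothetical extra action: make sure the unit is enabled at boot
- name: Enable service "{{ systemdname }}" at boot
  systemd:
    name: "{{ systemdname }}"
    enabled: true
  register: systemdenabled   # hypothetical variable name
  ignore_errors: true
```

Because the playbook loops over the whole file, every service in the list picks up the new action without any change to systemd_manager.yaml.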

Including a task file also has the benefit that we iterate through a list, passing the current item as the variable systemdname. This means we can use it in task names, which makes the playbook output easier to read.

In the event we get errors, we can automate the handling of these. For example, you could run something like ansible-playbook systemd_manager.yaml | grep customerror and then initiate further actions if the grep command returns a result. The debug message also makes it easy to identify which service has issues, and you could easily customise its formatting.
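If you would rather handle failures inside Ansible than grep the output, one possible alternative is to collect the failing services in a list and fail the play at the end. This is a different technique from the grep approach, sketched under assumptions: the fact name failed_services and both tasks are hypothetical additions.

```yaml
# Append to systemd_actions.yaml: remember each service that failed to start
- name: Record "{{ systemdname }}" as failed
  set_fact:
    failed_services: "{{ failed_services | default([]) + [systemdname] }}"
  when: systemdstate is failed

# Add to systemd_manager.yaml, after the include_tasks loop:
# stop with a non-zero exit code if anything could not be started
- name: Fail the play if any service could not be started
  fail:
    msg: "[customerror]: could not start: {{ failed_services | join(', ') }}"
  when: failed_services | default([]) | length > 0
```

With this in place, ansible-playbook exits with a failure status when a service is down, which a wrapper script or scheduler can act on directly.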

Using the built-in systemd module is significantly better than issuing the shell commands you would typically use on the command line, such as sudo systemctl status smbd. A status command only reports the state, whereas Ansible goes above and beyond by starting the service if it’s in a stopped state, all within 2 lines of code.

The systemd documentation for Ansible can be found here
