Knowledge Required: Moderate
Tools required: Ansible
Ansible is a fantastic way to run remote commands in a scalable manner, such as commands for health check monitoring. One of the things you may wish to monitor is systemd services. Unless you’re a real hipster, the chances are that if you’re running Linux, your services are managed by systemd.
The code
Let’s take a look at how we can manage systemd services in Ansible, through an Ansible task:
# make service in start state
- name: Put service "{{ systemdname }}" in state running
  systemd:
    state: started
    name: "{{ systemdname }}"
  register: systemdstate
  ignore_errors: true

- debug:
    msg: "[customerror]: {{ systemdname }} could not be started and was in stopped state"
  when: systemdstate.failed == true
{{ name }} is how we reference a variable in Ansible (it is Jinja2 templating syntax). This task works with whatever service name we pass to it in the variable systemdname.
If you’re somewhat familiar with Ansible, you may have noticed that we’re missing our typical Ansible values instructing which hosts to run the above task on, but don’t worry, we’ll come to that later. For now, let’s just put the above code in a file called systemd_actions.yaml and break down what’s happening:
Breakdown:
- We use the built-in systemd module of Ansible to put the service in the state 'started'
- We register the result of the systemd task in a new variable called systemdstate
- We ignore errors, so that if one service cannot be started we don’t crash the whole Ansible playbook
- We then print a custom debug message to screen if the task failed. Note how we put customerror in the debug statement; I’ll get to that later
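If you want to see what actually ends up in systemdstate, a quick sketch like the following may help. It assumes these tasks sit in systemd_actions.yaml straight after the systemd task. The extra debug task and the `is failed` test are my additions rather than part of the file above, and `ansible.builtin.debug` is simply the fully qualified name of the debug module.

```yaml
# Optional extras for systemd_actions.yaml (assumed to run after the systemd task)

# Dump everything the systemd task registered; handy while developing
- name: Show the registered result for "{{ systemdname }}"
  ansible.builtin.debug:
    var: systemdstate

# Same check as `systemdstate.failed == true`, written with the `failed` test
- name: Report "{{ systemdname }}" if it could not be started
  ansible.builtin.debug:
    msg: "[customerror]: {{ systemdname }} could not be started and was in stopped state"
  when: systemdstate is failed
```

The `is failed` form reads a little more naturally and is the style linters tend to prefer, but both conditions behave the same way here.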
Run the Ansible tasks
We’re going to create another file which calls the systemd_actions.yaml file we just made. Call your file whatever you wish; I’m going to call mine systemd_manager.yaml. In this file, we’re going to put the following:
- hosts: localhost
  become: true
  become_user: root
  vars:
    services:
      - smbd
      - sshd
  tasks:
    - include_tasks: "systemd_actions.yaml"
      loop: "{{ services }}"
      loop_control:
        loop_var: systemdname
This does the following:
- We declare a list of services we wish to check. In this case, I just have smbd (Samba) and sshd (SSH). If you wish to add more services to check, just add to the list in the same format
- We create a new task which, for each service in the list, includes the tasks in the systemd_actions.yaml file and runs them against that service
- Note we use loop_control to name the loop variable, so the current value as we go through the list is called systemdname. An equivalent for loop may look like for systemdname in services: ….
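To make the loop a bit more concrete, here is a rough sketch of what it unrolls to for the two services in the list. This is only an illustration of the idea, not something you would want to maintain by hand; it assumes the same systemd_actions.yaml file and passes systemdname through the `vars` keyword of each include.

```yaml
# Hand-unrolled equivalent of the loop (illustration only)
- hosts: localhost
  become: true
  become_user: root
  tasks:
    # First iteration: systemdname is smbd
    - include_tasks: "systemd_actions.yaml"
      vars:
        systemdname: smbd

    # Second iteration: systemdname is sshd
    - include_tasks: "systemd_actions.yaml"
      vars:
        systemdname: sshd
```

With the loop version, adding a service is a one-line change to the services list instead of another copy of the include.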
Now, we can run this with: ansible-playbook systemd_manager.yaml
If all your services are up and running, your playbook will fly through. However, if for some reason a service was down and cannot be started, you’ll get your custom error message:
ok: [localhost] => {
    "msg": "[customerror]: smbd could not be started and was in stopped state"
}
The benefits of this approach
Including custom task files has the advantage of keeping everything modular, so that if we ever need to make new checks or actions to run against systemd units, we can just add them to the systemd_actions.yaml file.
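As an example of that kind of extension, the sketch below adds one more action to the end of systemd_actions.yaml: making sure the unit is enabled at boot. The task itself and the variable name systemdenabled are hypothetical additions of mine; the `enabled` parameter is a standard option of the systemd module.

```yaml
# Hypothetical extra task appended to systemd_actions.yaml
- name: Enable service "{{ systemdname }}" at boot
  ansible.builtin.systemd:
    name: "{{ systemdname }}"
    enabled: true
  register: systemdenabled   # hypothetical variable name
  ignore_errors: true        # keep going for the remaining services
```

Because the task lives in the included file, it automatically runs for every service in the list without touching systemd_manager.yaml.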
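Including a task file also has the benefit that we iterate through a list, passing the current item as the variable systemdname. This means we can use it in task names to make reading the state of the playbook easier. For example, when the loop reaches smbd, the first task is reported as Put service "smbd" in state running.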
In the event we get errors, we can automate the handling of these. For example, you can run ansible-playbook systemd_manager.yaml | grep customerror and then initiate further actions if the grep command actually returns a result. The debug message also makes it easy to identify which service has issues, and you could easily customise its formatting.
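If you would rather keep that handling inside Ansible instead of grepping the output, one possible approach is sketched below. It is an alternative I am suggesting rather than part of the setup above, and the variable failed_services is a hypothetical name. The first task would go at the end of systemd_actions.yaml, and the second would go in systemd_manager.yaml after the include_tasks loop.

```yaml
# In systemd_actions.yaml: remember every service that failed to start
- name: Record "{{ systemdname }}" as failed
  ansible.builtin.set_fact:
    failed_services: "{{ failed_services | default([]) + [systemdname] }}"
  when: systemdstate is failed

# In systemd_manager.yaml, after the include_tasks loop:
# stop with a summary if anything was recorded
- name: Fail the play if any service could not be started
  ansible.builtin.fail:
    msg: "Services that could not be started: {{ failed_services | join(', ') }}"
  when: failed_services | default([]) | length > 0
```

This way ansible-playbook exits with a non-zero status when something is wrong, which is often easier to hook into other tooling than parsing text.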
Using the built-in module for systemd is significantly better than issuing the shell commands you would typically use on the command line, such as sudo systemctl status smbd. Rather than only reporting the status, Ansible goes a step further and starts the service if it’s in a stopped state, all within 2 lines of code.
The Ansible documentation for the systemd module can be found under the module name ansible.builtin.systemd.
EOF