Notes on Ansible
By: Cam Wohlfeil
Published: 2019-06-28 1235 EDT
Category: DevOps
Tags: ansible
These are my notes to go along with Ansible training I have done; this is not a full walkthrough or best practices guide. Here are the files: https://gitlab.com/cwohlfeil/MasteringAnsible
Preparations
See topology.pdf and the Vagrantfile.
https://www.vagrantup.com/ https://docs.ansible.com/ansible/intro_installation.html https://rogerwelin.github.io/ansible/docker/2016/07/04/testing-ansible-playbooks-with-docker.html
Foundations
Inventory
Two types, static and dynamic.
Inventory lists hosts but can also provide additional details, such as how to connect.
Allows you to group together by role.
Can also pass in variables.
ansible --list-hosts all lists every host in the inventory.
Dummy hosts are included as a template in /etc/ansible/hosts.
Ansible will always try to connect via SSH; in the dev file we tell it to use a local connection for the control node.
Global config is located at /etc/ansible/ansible.cfg.
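As a rough sketch of what a static inventory like the dev file could look like, here is a YAML-format version. The group names loadbalancer, webserver, and control and the host app01 appear in these notes; the remaining host names and the database group are assumptions for illustration.

```yaml
# Hypothetical static inventory in YAML form (host names are assumptions)
all:
  children:
    control:
      hosts:
        control:
          # Run tasks on the control node directly instead of over SSH
          ansible_connection: local
    loadbalancer:
      hosts:
        lb01:
    webserver:
      hosts:
        app01:
        app02:
    database:
      hosts:
        db01:
```

Assuming the file is saved as dev.yml, something like ansible -i dev.yml --list-hosts all should print every host in it.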
https://docs.ansible.com/ansible/intro_inventory.html https://docs.ansible.com/ansible/intro_dynamic_inventory.html
Host Selection
ansible --list-hosts loadbalancer selects a group by name; patterns also support globbing and wildcards.
ansible --list-hosts webserver[0] supports array-style indexing (this will select the first result).
ansible --list-hosts \!control supports negation (the ! has to be escaped for bash).
https://docs.ansible.com/ansible/intro_patterns.html
Tasks
ansible -m ping all: ping is the module to run, and all is the host pattern to run it on.
All tasks have a return status, even on failure.
Non-zero exit codes are considered a failure.
This is for basic troubleshooting.
https://docs.ansible.com/ansible/intro_getting_started.html#your-first-commands https://docs.ansible.com/ansible/ping_module.html https://docs.ansible.com/ansible/command_module.html
Plays
A playbook is a YAML file with plays in it. A play is a set of target hosts and a set of tasks to run on them; hosts and tasks are the two YAML keys in the file. See ansible/playbooks/hostname.yml.
Compared with ad-hoc commands, you don't have to respecify targets, you can track changes, and you can add to or modify the tasks. Don't just worry about running commands and their results, but about the process as well.
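A minimal sketch of a single-play playbook, in the spirit of hostname.yml. The actual file in the repo may differ; the task name and command here are assumptions.

```yaml
---
# One play: a set of target hosts plus the tasks to run on them
- hosts: all
  tasks:
    - name: get server hostname
      command: hostname
```

It would be run with ansible-playbook followed by the path to the file.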
https://docs.ansible.com/ansible/playbooks_intro.html#playbook-language-example
Playbook Execution
First, Ansible does data gathering. Next, the command is executed. Rather than showing everything returned by the end host, the output shifts towards focusing only on errors. Last is the play recap. Even though we only ran commands, Ansible considers it a change.
https://docs.ansible.com/ansible/playbooks_intro.html#executing-a-playbook
Playbooks
Four major aspects:
* Packages needed
* Service handler
* System config
* App config files
Standard playbook creation loop: pick a module, implement what it needs, and run to test.
https://docs.ansible.com/ansible/modules_by_category.html
Packages: apt
Tasks:
* name: install nginx
* apt: name=nginx state=present update_cache=yes
This is the same as using apt on the command line. state is the package state: present will check if it's installed; you can also use latest to ensure it's up to date, or pin the package version. The third parameter, update_cache=yes, will run apt update.
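Written out as a full play, the task above might look like the following sketch. The loadbalancer target is an assumption, and become: true (covered in the next section) is included because installing packages needs root.

```yaml
---
- hosts: loadbalancer
  become: true              # package installation needs root privileges
  tasks:
    - name: install nginx
      apt:
        name: nginx
        state: present      # only checks that the package is installed
        update_cache: yes   # equivalent to running apt update first
```

The key=value form on one line and the expanded YAML form are interchangeable; the expanded form tends to be easier to read once a task has more than two or three parameters.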
https://docs.ansible.com/ansible/apt_module.html
Packages: become
Even though the ansible user is a sudoer, we need to tell Ansible to use those permissions. We can do so by adding become: true. This applies at the level where it is specified, i.e. at the top of a play it applies to the entire play. It used to be called sudo.
https://docs.ansible.com/ansible/become.html https://docs.ansible.com/ansible/YAMLSyntax.html#yaml-basics http://yaml.org/type/bool.html
Packages: with_items
Ansible includes the with_items loop to help reduce code repetition. Feed it a list of things to loop over, and use Jinja2 templating ({{ item }}) to reference the current element in the task.
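A small sketch of with_items in an apt task. The package list is an assumption, chosen to match the apache, mod_wsgi, and pip tooling mentioned elsewhere in these notes.

```yaml
---
- hosts: webserver
  become: true
  tasks:
    - name: install web components
      apt:
        name: "{{ item }}"    # replaced by each list entry in turn
        state: present
        update_cache: yes
      with_items:
        - apache2
        - libapache2-mod-wsgi
        - python-pip
        - python-virtualenv
```

The quotes around "{{ item }}" matter: a YAML value that starts with a curly brace is otherwise parsed as a mapping.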
https://docs.ansible.com/ansible/playbooks_loops.html#standard-loops http://jinja.pocoo.org/
Services: service
Ansible can manage services; it just needs to know they are there.
service: name=nginx state=started enabled=yes
For state, the most common values are started, stopped, restarted, and reloaded. enabled=yes means the service will start on boot.
https://docs.ansible.com/ansible/service_module.html
Support Playbook 1 - Stack Restart
The playbook will restart the entire stack to a known good config. Start by taking down the stack in order from the user-facing side (i.e. loadbalancer, then webserver), then restart the database, and finally bring them back up in reverse order.
Services: apache2_module, handlers, notify
Here we begin to prepare apache for our Python application. We will be doing this with mod_wsgi. This can be done with the apache2_module module. After enabling an apache module, apache must be restarted.
apache2_module is idempotent, so if the module is already enabled we'll skip right on, but a service task set to restarted will restart apache no matter what.
To solve this, we can define a handler; by default it will not fire unless a task that reports a change requests it with notify.
The nice thing about notify is that it will aggregate multiple calls and only run the handler once.
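The handler pattern described here can be sketched roughly as follows. The task and handler names are assumptions; the important part is that the notify string matches the handler's name exactly.

```yaml
---
- hosts: webserver
  become: true
  tasks:
    - name: ensure mod_wsgi enabled
      apache2_module:
        name: wsgi
        state: present
      notify: restart apache2   # only queued when this task reports "changed"

  handlers:
    - name: restart apache2
      service:
        name: apache2
        state: restarted
```

On a second run the module is already enabled, the task reports ok instead of changed, and the handler is never triggered.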
https://docs.ansible.com/ansible/playbooks_intro.html#handlers-running-operations-on-change https://docs.ansible.com/ansible/apache2_module_module.html
Files: copy
File location is relative to the playbook file.
A trailing / on the source copies only the directory's contents; without it, the directory itself is copied.
https://docs.ansible.com/ansible/copy_module.html
Application Modules: pip
Works as expected, pip: requirements=/var/www/demo/requirements.txt virtualenv=/var/www/demo/.venv
https://docs.ansible.com/ansible/pip_module.html
Files: file
Can be used to ensure files exist, do not exist, are symlinks, etc.
Here we will be ensuring the default sites-enabled conf is absent and our demo site conf is a link instead.
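A sketch of the two file tasks, assuming the stock Debian/Ubuntu apache layout. The exact file names (000-default.conf, demo.conf) are assumptions, not taken from the repo.

```yaml
---
- hosts: webserver
  become: true
  tasks:
    - name: de-activate default apache site
      file:
        path: /etc/apache2/sites-enabled/000-default.conf
        state: absent

    - name: activate demo apache site
      file:
        src: /etc/apache2/sites-available/demo.conf    # link target
        dest: /etc/apache2/sites-enabled/demo.conf     # the symlink itself
        state: link
```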
https://docs.ansible.com/ansible/file_module.html
Files: template
Templates allow you to use Jinja2, including many features from Python such as loops, to generate files based on variables.
template: src=templates/nginx.conf.j2 dest=/etc/nginx/sites-available/demo mode=0755
https://docs.ansible.com/ansible/template_module.html
Files: lineinfile
lineinfile allows you to ensure a particular line is present in a file on the host; in this case we specify the mysql config and use a regexp to set the bind-address.
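Roughly, the lineinfile task could look like this sketch. The config path and the 0.0.0.0 value are assumptions for illustration.

```yaml
---
- hosts: database
  become: true
  tasks:
    - name: ensure mysql listening on all interfaces
      lineinfile:
        path: /etc/mysql/my.cnf         # assumed location of the mysql config
        regexp: '^bind-address'         # the line to replace, if it exists
        line: 'bind-address = 0.0.0.0'  # the line that should end up in the file
```

If no line matches the regexp, the line is appended to the end of the file.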
https://docs.ansible.com/ansible/lineinfile_module.html
Application Modules: mysql_db, mysql_user
Ansible has several modules for working with mysql; here we use mysql_db and mysql_user to set the basic configuration.
This requires the python-mysqldb package to be installed on the host.
For mysql_user, priv=demo.*:ALL maps to standard mysql privileges (ALL on demo.*).
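Put together, the database setup might look like the sketch below. The database name demo comes from the priv string above; the user name, password, and host values are placeholders.

```yaml
---
- hosts: database
  become: true
  tasks:
    - name: install python bindings required by the mysql modules
      apt:
        name: python-mysqldb
        state: present

    - name: create demo database
      mysql_db:
        name: demo
        state: present

    - name: create demo user
      mysql_user:
        name: demo            # placeholder user name
        password: demo        # placeholder; a real secret belongs in a vault
        priv: 'demo.*:ALL'    # all privileges on every table in demo
        host: '%'             # allow connections from any host
        state: present
```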
https://docs.ansible.com/ansible/mysql_db_module.html https://docs.ansible.com/ansible/mysql_user_module.html
Support Playbook 2 - Stack Status: wait_for
It's helpful now to make a playbook to help us quickly check status of services.
We want this playbook to be read-only, and we can do this with status commands.
Additionally, we want to ensure the hosts are answering on the correct ports.
For this we can use wait_for with very short timeouts, since they should respond quickly.
Additionally, we can add wait_for hints to the restart playbook to handle draining.
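A sketch of both uses of wait_for. Port 80 and the nginx service are assumptions based on the loadbalancer setup earlier in these notes.

```yaml
---
# Status check: read-only, fails fast if nothing answers on the port
- hosts: loadbalancer
  gather_facts: false
  tasks:
    - name: verify nginx is listening on port 80
      wait_for:
        port: 80
        timeout: 1

# Restart helper: stop the service, then wait until the port is closed
- hosts: loadbalancer
  become: true
  gather_facts: false
  tasks:
    - name: stop nginx
      service:
        name: nginx
        state: stopped

    - name: wait for port 80 to close
      wait_for:
        port: 80
        state: stopped
```

wait_for also accepts state: drained, which waits for active connections on the port to finish rather than for the port to close.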
https://docs.ansible.com/ansible/wait_for_module.html
Support Playbook 2 - Stack Status: uri, register, fail, when
uri gives us an end-to-end web application test.
register creates a variable that holds the task's returned content and output.
fail checks the contents and fails under certain conditions.
when provides conditional logic, such as deciding when fail should fire.
By putting all this together, we can create an end-to-end test for our app.
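A minimal sketch of the four pieces working together. The URL, the variable name lb_index, and the expected string are all assumptions.

```yaml
---
- hosts: loadbalancer
  gather_facts: false
  tasks:
    - name: request the index page
      uri:
        url: http://localhost
        return_content: yes     # include the response body in the result
      register: lb_index        # store the result for later tasks

    - name: fail if the expected text is missing
      fail:
        msg: "index failed to return expected content"
      when: "'Hello' not in lb_index.content"
```

The uri task already fails on its own for an unexpected HTTP status, so the fail task only needs to inspect the body.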
https://docs.ansible.com/ansible/uri_module.html https://docs.ansible.com/ansible/playbooks_conditionals.html#register-variables https://docs.ansible.com/ansible/playbooks_conditionals.html#the-when-statement https://docs.ansible.com/ansible/playbooks_loops.html#standard-loops
Roles
At this point, we are finished with the basics, but we are making a lot of assumptions and not following best practices, especially security. We've made it work, now make it right. We can do this by going back, injecting roles and variables. We will focus on reusability, maintainability, and security. We should be able to change playbooks easily without much extra work, and roles allow this with encapsulation.
https://docs.ansible.com/ansible/playbooks_roles.html
Converting to Roles: tasks, handlers
Since the tasks have been moved to the role's tasks/main.yml, Ansible can make some implicit assumptions, so we no longer need to define these tasks in the plays and can instead reference them with the roles key.
Note: The database tasks had a possible deadlock condition; to fix this we put ensure mysql started after the configuration changes.
We can move handlers into roles as well.
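After the conversion, a play shrinks to something like this sketch. The role name nginx is an assumption; the directory layout in the comments is the standard one Ansible looks for.

```yaml
---
# Assumed layout next to this playbook:
#   roles/nginx/tasks/main.yml      tasks for the role
#   roles/nginx/handlers/main.yml   handlers for the role
- hosts: loadbalancer
  become: true
  roles:
    - nginx
```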
https://docs.ansible.com/ansible/playbooks_roles.html
Converting to Roles: files, templates
We're going to move nginx.conf.j2 to a location closer to the role file and change the relative path. We'll also move the demo app files under the apache2 role.
Site.yml: include
By creating a site.yml and adding all our playbooks with include, we can automate executing all of them.
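A sketch of what site.yml might contain. The playbook file names are assumptions based on the groups used in these notes.

```yaml
---
# site.yml: runs every playbook in order
# (newer Ansible releases prefer import_playbook over include at this level)
- include: control.yml
- include: database.yml
- include: webserver.yml
- include: loadbalancer.yml
```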
https://docs.ansible.com/ansible/playbooks_roles.html
Variables: facts
By using facts, we can create a dynamic variable based on host information, such as the IP address.
We can replace the mysql bind-address value with {{ ansible_eth0.ipv4.address }}.
stack_status.yml is now out of sync with our configuration and must be fixed.
https://docs.ansible.com/ansible/playbooks_variables.html#information-discovered-from-systems-facts
Variables: defaults
Now we can replace the default configurations in the database with variables, and instead put those default variables in the role's defaults/main.yml.
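A defaults file is just a flat YAML mapping. The variable names and values below are hypothetical, chosen only to illustrate the idea.

```yaml
---
# roles/mysql/defaults/main.yml (hypothetical names and values)
db_name: myapp
db_user_name: dbuser
db_user_pass: dbpass
db_user_host: localhost
```

The role's tasks would then reference these as "{{ db_name }}" and so on, and anything with higher precedence can override them.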
https://docs.ansible.com/ansible/playbooks_roles.html#role-default-variables
Variables: vars
There are several precedence levels for vars; generally, the closer you get to what is actually being run, the higher the precedence you want. Be careful not to have too many overrides.
vars are all in global scope, so organization is important.
Here we will use three levels: defaults, roles, and group_vars.
Pass variables in as a dictionary.
https://docs.ansible.com/ansible/playbooks_variables.html
Variables: with_dict
with_dict is how dictionary variables can be looped over. Since each item is a dictionary entry, we use {{ item.key }} to reference the key and {{ item.value }} to reference the value.
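A sketch of with_dict driving the nginx template. The sites variable name matches the one used later in these notes; its contents and the template file name are assumptions.

```yaml
---
- hosts: loadbalancer
  become: true
  vars:
    sites:
      myapp:            # item.key
        frontend: 80    # item.value.frontend
        backend: 80     # item.value.backend
  tasks:
    - name: configure nginx sites
      template:
        src: nginx.conf.j2
        dest: "/etc/nginx/sites-available/{{ item.key }}"
        mode: "0644"
      with_dict: "{{ sites }}"
```

Inside the template, item.value.frontend and item.value.backend would be available in the same way.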
https://docs.ansible.com/ansible/playbooks_loops.html#looping-over-hashes
Selective Removal: shell, register, with_items, when
Drift has happened in our config now: demo is still there, but Ansible no longer cares about it since it's no longer in the config.
To handle this, we will ensure nothing else is active either.
Use shell to run commands on the host to get what is already activated.
register will store the output in a variable, including a list of output lines.
with_items will loop through the list.
when will perform the action only when it finds a site not in the sites key.
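The four pieces can be combined roughly like this sketch. The sites-enabled path and the contents of sites are assumptions.

```yaml
---
- hosts: loadbalancer
  become: true
  vars:
    sites:
      myapp:
        frontend: 80
        backend: 80
  tasks:
    - name: get active sites
      shell: ls -1 /etc/nginx/sites-enabled
      register: active          # active.stdout_lines is a list of file names

    - name: de-activate sites that are no longer defined
      file:
        path: "/etc/nginx/sites-enabled/{{ item }}"
        state: absent
      with_items: "{{ active.stdout_lines }}"
      when: item not in sites   # membership test against the dictionary keys
```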
https://docs.ansible.com/ansible/shell_module.html https://docs.ansible.com/ansible/playbooks_conditionals.html#register-variables https://docs.ansible.com/ansible/playbooks_conditionals.html#the-when-statement https://docs.ansible.com/ansible/playbooks_loops.html#standard-loops
Variables: vars_files, group_vars
Ansible has a few ways to keep external variable files, such as inventory, vars_files, and group_vars.
We're not going to use inventory, so as not to overload it and mix in logic; we use group_vars/all.yml to keep everything in the same location.
https://docs.ansible.com/ansible/intro_inventory.html#splitting-out-host-and-group-specific-data https://docs.ansible.com/ansible/playbooks_variables.html#variable-file-separation
Variables: vault
Secrets (passwords, SSH keys, etc.) are very dangerous to leave in your configuration.
Vault will create an encrypted file to store secrets safely with a passphrase.
Since it will encrypt the entire file, it's best to separate out your secret variables from the rest of your variables, and that's what we will do.
group_vars/all will become group_vars/all/vars, and group_vars/all/vault will be the vault.
You can use ansible-playbook --ask-vault-pass to unlock the vault while you work.
You can also create a vault password file somewhere safe, like in your home directory, and tell ansible where to find the file by defining vault_password_file = <file path> in ansible.cfg.
These only support one vault password, so if you are managing multiple environments this may be tricky.
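One common way to lay out the split is sketched below. The variable names are hypothetical; the vault_ prefix is just a naming convention that makes it obvious where a value comes from.

```yaml
---
# group_vars/all/vars (plain text, safe to read and diff)
db_user_pass: "{{ vault_db_user_pass }}"

# group_vars/all/vault (shown here before encryption)
# vault_db_user_pass: "change-me"
```

The vault file itself would be created or edited with ansible-vault create and ansible-vault edit, so the secret never sits on disk unencrypted.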
https://docs.ansible.com/ansible/playbooks_vault.html
External Roles & Galaxy
Ansible Galaxy is a platform to share third-party roles. There are advantages and disadvantages, like with using any external libraries and tools.
Some considerations:
* Age
* Ratings
* App feature coverage
* Updates
* Dependence
* Modifications you must make
Quiz
Q: How could you define a variable value and be absolutely sure that it would not be overridden anywhere else by Ansible?
A: Pass the variable using the -e or --extra-vars parameter when running ansible-playbook.
Q: What ad-hoc command would you run to determine the facts available for a server?
A: ansible -m setup <hostname>
The setup module will query all facts on a host and return them.
Advanced Execution
Now that we are done with our configurations, we move on to making things faster.
This isn't critical if performance is good enough, and if the playbooks are actively being changed it's a waste of time.
First we start with a benchmark to refer to later: time ansible-playbook site.yml. This will be a no-op; no changes are made.
We will also benchmark a stack restart: time ansible-playbook playbooks/stack_restart.yml
Removing Unnecessary Steps: gather_facts
For any play where we do not need facts, we can simply add a key to skip the step: gather_facts: false.
This gives immediate gains with no downsides if done correctly.
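For example, a restart play that never references a fact could look like this sketch (the group and service names are assumptions).

```yaml
---
- hosts: loadbalancer
  become: true
  gather_facts: false     # skip the setup step; no facts are used below
  tasks:
    - name: restart nginx
      service:
        name: nginx
        state: restarted
```

A play that uses something like ansible_eth0 still needs fact gathering left on, which is the "if done correctly" caveat.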
https://docs.ansible.com/ansible/playbooks_variables.html#turning-off-facts
Extracting Repetitive Tasks: cache_valid_time
Updating the apt cache has a lot of overhead, so by setting cache_valid_time we can tell the playbook to only update when the cache is older than a reasonable period, not every single time.
This means when we do check we will pay slightly higher time costs, but we check far less often, with big long term gains.
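One way to extract the cache update is a single play that runs once for all hosts, sketched below. The 86400 second (one day) value is an assumption; pick whatever period suits your environment.

```yaml
---
- hosts: all
  become: true
  gather_facts: false
  tasks:
    - name: update apt cache
      apt:
        update_cache: yes
        cache_valid_time: 86400   # skip the update if the cache is under a day old
```

With this in place, the update_cache=yes parameters on the individual package tasks can be dropped.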
https://docs.ansible.com/ansible/apt_module.html
Limiting Execution by Hosts: limit
Now site.yml is taking about 1 minute, which isn't very long, but it can quickly add up.
If we only need to run a playbook on a subset of hosts, we can run it with --limit specified: ansible-playbook site.yml --limit app01.
This allows us to keep using the whole site.yml logic.
https://docs.ansible.com/ansible/intro_patterns.html
Limiting Execution by Tasks: tags
The tags key allows us to run only a specific subset of tasks and plays.
For example, setting tags: [ 'packages' ] on a task allows us to do this: ansible-playbook site.yml --tags "packages", and only the tagged tasks will execute.
We can also invert this logic: ansible-playbook site.yml --skip-tags "packages"
Now our runtime is down to 15 seconds.
Tags are flexible but can easily get out of hand.
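A sketch of tagged tasks; the packages tag is from the example above, while the service tag and the task names are assumptions.

```yaml
---
- hosts: loadbalancer
  become: true
  tasks:
    - name: install nginx
      apt:
        name: nginx
        state: present
      tags: [ 'packages' ]

    - name: ensure nginx started
      service:
        name: nginx
        state: started
      tags: [ 'service' ]
```

Running with --tags "packages" would execute only the first task, and --skip-tags "packages" only the second. Keeping to a small, fixed vocabulary of tags is one way to stop them getting out of hand.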
https://docs.ansible.com/ansible/playbooks_tags.html
Idempotence: changed_when, failed_when
By using these, we can define when a task reports changed or failed, based on conditions.
changed_when: false never reports the task as changed.
changed_when: "active.stdout_lines != sites.keys()" is a Jinja2 expression (which looks much like Python) evaluated for truthiness.
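A sketch for a read-only status command. The service name and the registered variable name are assumptions, and the failed_when line only spells out the default behaviour to show where a custom condition would go.

```yaml
---
- hosts: loadbalancer
  become: true
  gather_facts: false
  tasks:
    - name: check nginx status
      command: service nginx status
      register: nginx_status
      changed_when: false                  # a status query never changes the host
      failed_when: nginx_status.rc != 0    # explicit version of the default rule
```

Without changed_when, command and shell tasks always report changed, which is why the earlier playbook run counted plain commands as changes.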
https://docs.ansible.com/ansible/playbooks_error_handling.html#overriding-the-changed-result
Accelerated Mode and Pipelining
Ansible uses the installed OpenSSH, with a fallback to the paramiko library for compatibility at the cost of performance. Accelerated mode and pipelining take advantage of newer features as long as your environment meets the requirements. It's not considered best practice, but it is available if needed.
https://docs.ansible.com/ansible/playbooks_acceleration.html https://docs.ansible.com/ansible/intro_configuration.html#pipelining
Troubleshooting, Testing, & Validation
Troubleshooting Ordering Problems
Inevitably you will run into errors and playbooks will not execute; usually ordering problems are the cause. For example: a configuration change might not work, so you can't restart the service, and the service is expected to be up on the next run. There are many ways around this, but the best way is to think through logically what is happening and fix the source of the problem.
Jumping to Specific Tasks: list-tasks, step, start-at-task
While troubleshooting, you can use these options to limit the run to just the tasks you are working on.
ansible-playbook site.yml --step will go through the playbook step by step, requiring interaction to continue.
ansible-playbook site.yml --list-tasks will output every task that would be executed, so we can select one for --start-at-task.
https://docs.ansible.com/ansible/playbooks_startnstep.html
Retrying Failed Hosts
Not every host may fail; when hosts fail, ansible will output a file containing just those hosts. Pass it to the next run with --limit @/home/ansible/site.retry
Syntax-Check & Dry-Run: syntax-check, check
Static analysis and a no-op dry-run.
ansible-playbook --syntax-check site.yml
ansible-playbook --check site.yml
It may be helpful to run against specific playbooks rather than an overarching playbook like site.yml.
Dynamic data or reliance on previous tasks can't be checked; this is just a best guess.
You always have to provide some sort of inventory to run these.
https://docs.ansible.com/ansible/playbooks_checkmode.html
Debugging: debug
Spits out messages or variable values at the defined point in your playbook.
For example: - debug: var=active.stdout_lines
https://docs.ansible.com/ansible/debug_module.html
Quiz
Q: Which of the following is NOT an option presented after running ansible-playbook with '--step'?
A: c - cancel execution and return to the prompt. The "c" option continues the playbook as normal.
Q: Retry files take the form /home/<user>/<playbook>.retry. How would you use a file /home/ansible/site.retry to retry the last execution?
A: ansible-playbook site.yml --limit @/home/ansible/site.retry
The site.yml playbook should be executed, as normal. The .retry file is a list of hosts that can be used to limit the execution to only the failures from the previous run. The "@" instructs ansible-playbook to use the contents of the retry file for the limit, not the literal file path.