YAML Reference
This reference guide introduces YAML syntax. YAML (YAML Ain’t Markup Language) is essentially JSON with a few extra features, and is meant to be a little more human readable. We will use YAML-formatted configuration files in the Docker Compose and Kubernetes sections, so it is important to become familiar with the syntax.
YAML Basics
YAML data structures are similar to Python dictionaries, and we will usually see them written as key:value pairs. Values can include strings, numbers, booleans, null, lists, and other dictionaries.
Previously, we saw a simple JSON object in dictionary form like:
{
"key1": "value1",
"key2": "value2"
}
That same object in YAML looks like:
---
key1: value1
key2: value2
...
Notice that YAML documents conventionally start with three hyphens on the top
line (---), and end with an optional three dots (...) on the last line.
Key:value pairs are separated by colons, but consecutive key:value pairs are
NOT separated by commas.
We also mentioned that JSON supports list-like structures. YAML does too. So the following valid JSON block:
[
"thing1", "thing2", "thing3"
]
Appears like this in YAML:
---
- thing1
- thing2
- thing3
...
Elements of the same list all appear with a hyphen - at the same indent level.
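To check that the JSON and YAML forms above describe the same data, we can parse both and compare the results. This is a minimal sketch that assumes the PyYAML package is installed; the variable names are only for illustration.

```python
import json
import yaml  # provided by the PyYAML package

# The key:value example, in YAML and in JSON
yaml_mapping = """\
---
key1: value1
key2: value2
...
"""
json_mapping = '{"key1": "value1", "key2": "value2"}'

# The list example, in YAML and in JSON
yaml_list = """\
---
- thing1
- thing2
- thing3
...
"""
json_list = '["thing1", "thing2", "thing3"]'

mapping = yaml.safe_load(yaml_mapping)
items = yaml.safe_load(yaml_list)

# Both comparisons should be True if the two formats hold the same data
print(mapping == json.loads(json_mapping))
print(items == json.loads(json_list))
```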
We previously saw this complex data structure in JSON:
{
"department": "COE",
"number": 332,
"name": "Software Engineering and Design",
"inperson": true,
"finalgroups": null,
"instructors": ["Joe", "Charlie", "Joe"],
"prerequisites": [
{"course": "COE 322", "instructor": "Victor"},
{"course": "SDS 322", "instructor": "Victor"}
]
}
The same structure in YAML is:
---
department: COE
number: 332
name: Software Engineering and Design
inperson: true
finalgroups: null # can also use ~
instructors:
  - Joe
  - Charlie
  - Joe
prerequisites:
  - course: COE 322
    instructor: Victor
  - course: SDS 322
    instructor: Victor
...
The whole thing can be considered a dictionary. The key instructors contains a
value that is a list of names, and the key prerequisites contains a value that
is a list of dictionaries. Booleans are written as false and true (lowercase by
convention). Null / empty values appear as null or ~. And, as you can see
above, YAML also supports comments starting with a #.
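As a quick illustration of how these values arrive in Python, the sketch below parses the course structure from a string and inspects a few entries. It assumes PyYAML is installed; the indentation shown is one valid layout, and the variable names are hypothetical.

```python
import yaml  # provided by the PyYAML package

course_yaml = """\
---
department: COE
number: 332
name: Software Engineering and Design
inperson: true
finalgroups: null   # can also use ~
instructors:
  - Joe
  - Charlie
  - Joe
prerequisites:
  - course: COE 322
    instructor: Victor
  - course: SDS 322
    instructor: Victor
...
"""

course = yaml.safe_load(course_yaml)

print(type(course["number"]).__name__)        # numbers become Python numbers
print(course["inperson"] is True)             # true becomes the Python bool True
print(course["finalgroups"] is None)          # null becomes None
print(course["instructors"])                  # a list of strings
print(course["prerequisites"][1]["course"])   # a list of dictionaries
```

Nested values are reached the same way as in any Python dictionary: index by key, then by list position.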
One glaring thing that is missing from the YAML file is quotation marks. In
general, you don’t have to use quotes in YAML. You may use quotes to force a
number to be interpreted as a string (e.g. 10 will automatically be interpreted
as an integer, but "10" will be interpreted as a string).
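The effect of quoting can be seen by loading a small document and printing the Python type of each value. This is a sketch assuming PyYAML is installed; the keys are made up for the example.

```python
import yaml  # provided by the PyYAML package

doc = """\
unquoted: 10
quoted: "10"
decimal: 10.5
word: ten
"""

values = yaml.safe_load(doc)

# Show each key with the Python type the loader chose for its value
for key, value in values.items():
    print(key, type(value).__name__, repr(value))
```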
Note
Check out the list of meteorite landing sites we worked with in the JSON section, but now in YAML format here.
There is a lot more to YAML, most of which we will not use in this course. Just know that YAML files can contain:
Comments
Multi-line strings / text blocks
Multi-word keys
Complex objects
Special characters
Explicitly declared types
A mechanism to duplicate / inherit values across a document (“anchors”)
If we encounter a need for any of these, we can refer to the official YAML syntax reference.
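For a rough idea of what a few of these features look like, the sketch below loads a document that uses an anchor with a merge key, a multi-word key, an explicitly declared type, and a multi-line text block. It assumes PyYAML is installed; the keys and values (defaults, web, and so on) are invented for illustration and do not come from any real configuration.

```python
import yaml  # provided by the PyYAML package

doc = """\
# "&defaults" attaches an anchor name to this mapping
defaults: &defaults
  restart: always
  retries: 3

web:
  # "<<" merges the anchored mapping into this one; "*defaults" is the alias
  <<: *defaults
  port: 5000

multi word key: allowed
explicit string: !!str 332
description: |
  First line
  Second line
"""

config = yaml.safe_load(doc)

print(config["web"])                 # inherits restart and retries from defaults
print(config["multi word key"])
print(repr(config["explicit string"]))
print(repr(config["description"]))   # line breaks inside the block are preserved
```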
Read YAML from File
Warning
There is no YAML parser in the Python 3.6 standard library, so we need to install one with pip3:
[login-coe332]$ pip3 install --user pyyaml
Given the meteorite landing site data in YAML format, which you can download from this link, load it into a Python3 dictionary object using the following:
import yaml

data = {}

with open('Meteorite_Landings.yaml', 'r') as f:
    data = yaml.load(f, Loader=yaml.SafeLoader)
As with the JSON module, it takes only a few simple lines to get a dictionary
object to work with. The Loader=yaml.SafeLoader parameter ensures that no
arbitrary Python code is executed when loading the data, which is typically a
good choice for data from untrusted sources.
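To see what the safe loader protects against, the sketch below feeds it a document carrying a Python-specific tag that asks the loader to call a function. It assumes PyYAML is installed; the harmless os.getcwd call stands in for something more dangerous, and the variable names are hypothetical.

```python
import yaml  # provided by the PyYAML package

# A document whose tag requests a Python function call during loading
suspicious = "!!python/object/apply:os.getcwd []"

try:
    yaml.load(suspicious, Loader=yaml.SafeLoader)
    outcome = "loaded"
except yaml.YAMLError as err:
    # The safe loader has no constructor for Python-specific tags
    outcome = "rejected"
    print(type(err).__name__)

print(outcome)

# yaml.safe_load(stream) is shorthand for yaml.load(stream, Loader=yaml.SafeLoader)
print(yaml.safe_load("answer: 42"))
```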
Write YAML to File
In a new script create a dictionary object that we can write to a new YAML file.
import yaml

data = {}
data['class'] = 'COE332'
data['title'] = 'Software Engineering and Design'
data['subjects'] = []
data['subjects'].append( {'unit': 1, 'topic': ['linux', 'python3', 'git']} )
data['subjects'].append( {'unit': 2, 'topic': ['json', 'csv', 'xml', 'yaml']} )

with open('class.yaml', 'w') as o:
    yaml.dump(data, o)
Notice that most of the code in the script above was simply assembling a normal
Python3 dictionary. The yaml.dump() method only requires two arguments: the
object that should be written to file, and the filehandle.
Inspect the output file and paste the contents into an online YAML validator.
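Another way to check the output, without leaving Python, is a round trip: dump the dictionary to YAML text and load it back. The sketch below assumes PyYAML is installed and works on a string rather than a file to stay self-contained.

```python
import yaml  # provided by the PyYAML package

data = {
    'class': 'COE332',
    'title': 'Software Engineering and Design',
    'subjects': [
        {'unit': 1, 'topic': ['linux', 'python3', 'git']},
        {'unit': 2, 'topic': ['json', 'csv', 'xml', 'yaml']},
    ],
}

# With no stream argument, yaml.dump() returns the YAML text as a string
text = yaml.dump(data)
print(text)

# Reading the text back should reproduce the original dictionary
restored = yaml.safe_load(text)
print(restored == data)
```

Note that the keys in the dumped text may appear in a different order than they were assigned, since yaml.dump() sorts keys by default. The loaded dictionary still compares equal.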