During the PTG there was a discussion that the screen developer workflow wasn't nearly as useful as it once was. There were now too many services to see them all on one screen, and one of the most common service restart scenarios was not restarting one service, but a bunch to get code to take effect. This implements a 3rd way of running services instead of direct forking via bash, or running under screen, which is running as systemd units. Logging is adjusted because it's redundant to log datetime in oslo.log when journald has that. Swift needed to have services launched by absolute path to work. This is disabled by default, but with instructions on using it. The long term intent is to make this the way to run devstack, which would be the same between both the gate and local use. Some changes were also needed to run_process to pass the run User in. A hack around the keystone uwsgi launcher was done at the same time to remove a run_process feature that only keystone uwsgi uses. Change-Id: I836bf27c4cfdc449628aa7641fb96a5489d5d4e7
Using Systemd in DevStack
Note
This is an in-progress document as we work out the way forward with DevStack and systemd.
DevStack can be run with all the services as systemd unit files. Systemd is now the default init system for nearly every Linux distro, and systemd encodes and solves many of the problems related to poorly running processes.
Why this instead of screen?
The screen model for DevStack was invented when the number of services that a DevStack user was going to run was typically < 10. This made it easy to jump around with screen hot keys. However, the landscape has changed: not all services are stoppable in screen because some run under Apache, and there are typically at least 20 items.
There is also a common developer workflow of changing code in more than one service, and needing to restart a bunch of services for that to take effect.
To enable this, add the following to your local.conf:
USE_SYSTEMD=True
Unit Structure
Note
Originally we wanted to do this as user units, however there are issues with running this under non-interactive shells. For now, we'll be running as system units. Some user unit code is left in place in case we can switch back later.
All DevStack units are created as a part of the DevStack slice and given the name devstack@$servicename.service. This lets us do certain operations at the slice level.
Manipulating Units
The examples below assume the unit n-cpu, to keep them clear.
Enable a unit (allows it to be started):
sudo systemctl enable devstack@n-cpu.service
Disable a unit:
sudo systemctl disable devstack@n-cpu.service
Start a unit:
sudo systemctl start devstack@n-cpu.service
Stop a unit:
sudo systemctl stop devstack@n-cpu.service
Restart a unit:
sudo systemctl restart devstack@n-cpu.service
See status of a unit:
sudo systemctl status devstack@n-cpu.service
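The workflow described earlier, restarting a bunch of services after a code change, can be wrapped in a small helper. The following is only a sketch: devstack_unit, devstack_ctl and the DRY_RUN variable are hypothetical names, not part of DevStack, and the service names are just examples.

```shell
# Hypothetical helpers, not part of DevStack itself.

# Map a short service name (e.g. n-cpu) to its unit name.
devstack_unit() {
    printf 'devstack@%s.service\n' "$1"
}

# Run one systemctl action (start, stop, restart, ...) on several services.
# Set DRY_RUN=1 to print the command instead of running it.
devstack_ctl() {
    action=$1
    shift
    units=""
    for svc in "$@"; do
        units="$units $(devstack_unit "$svc")"
    done
    if [ -n "${DRY_RUN:-}" ]; then
        echo "sudo systemctl $action$units"
    else
        # Unit names contain no whitespace, so splitting $units is safe here.
        sudo systemctl "$action" $units
    fi
}
```

For example, devstack_ctl restart n-cpu n-cond restarts both units in one call. systemctl also accepts glob patterns for loaded units, so a quoted pattern such as "devstack@n-*" may be enough when every matching service should be restarted.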
Querying Logs
One of the other major things that comes with systemd is journald, a
consolidated way to access logs (including querying through structured
metadata). Logs are accessed with the journalctl command, which has
powerful query facilities. We'll start with some common options.
Follow logs for a specific service:
journalctl -f --unit devstack@n-cpu.service
Follow logs for multiple services simultaneously:
journalctl -f --unit devstack@n-cpu.service --unit devstack@n-cond.service
Use higher precision time stamps:
journalctl -f -o short-precise --unit devstack@n-cpu.service
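When following more than a couple of services, repeating --unit by hand gets tedious. A minimal sketch of a wrapper follows; devstack_logs and DRY_RUN are hypothetical names introduced here, and the wrapper assumes the devstack@NAME.service naming described above.

```shell
# Hypothetical helper: follow the logs of several DevStack services at once,
# using precise timestamps. Set DRY_RUN=1 to print the command instead.
devstack_logs() {
    args=""
    for svc in "$@"; do
        args="$args --unit devstack@$svc.service"
    done
    if [ -n "${DRY_RUN:-}" ]; then
        echo "journalctl -f -o short-precise$args"
    else
        # Unit names contain no whitespace, so splitting $args is safe here.
        journalctl -f -o short-precise $args
    fi
}
```

Calling devstack_logs n-cpu n-cond interleaves the output of both units in a single stream.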
Known Issues
Be careful about systemd python libraries. There are 3 of them on
pypi, and they are all very different. They unfortunately all install
into the systemd namespace, which can cause some issues.
- systemd-python - this is the upstream maintained library, it has a version number like systemd itself (currently 233). This is the one you want.
- systemd - a python 3 only library, not what you want.
- python-systemd - another library you don't want. Installing it on a system will break ansible's ability to run.
If we were using user units, the Group= parameter in the [Service]
section doesn't seem to work, even though the documentation says that
it should. This means that we would need to do an explicit /usr/bin/sg.
This has the downside of making the SYSLOG_IDENTIFIER be sg. We can
explicitly set that with SyslogIdentifier=, but it's really unfortunate
that we would need this workaround. This is currently not a problem
because we're only using system units.
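To make the workaround concrete, here is a rough sketch of what such a user unit might look like, printed by a shell function. This is an illustration under assumptions, not the unit DevStack writes: render_sg_unit is a hypothetical helper, and the group, command path and Description text in the usage below are made-up examples.

```shell
# Hypothetical helper: print a user unit that switches group via sg and
# overrides the syslog identifier so the journal does not record "sg".
# $1 = group, $2 = command to run, $3 = syslog identifier
render_sg_unit() {
    cat <<EOF
[Unit]
Description=DevStack sketch for $3

[Service]
ExecStart=/usr/bin/sg $1 -c "$2"
SyslogIdentifier=$3

[Install]
WantedBy=default.target
EOF
}
```

For example, render_sg_unit libvirt "/usr/local/bin/nova-compute" n-cpu prints a unit whose journal entries would be tagged n-cpu rather than sg.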
Future Work
oslo.log journald
Journald has an extremely rich mechanism for direct logging including structured metadata. We should enhance oslo.log to take advantage of that. It would let us do things like:
journalctl REQUEST_ID=......
journalctl INSTANCE_ID=......
And get all lines related to the request id or instance id.
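Field matching like this can be tried today without oslo.log, by writing an entry with a custom field by hand. The sketch below assumes a logger from util-linux built with journald support (its --journald option reads FIELD=value lines from standard input); journal_entry is a hypothetical helper, and the request id and devstack-demo identifier are made-up values.

```shell
# Hypothetical helper: format a journal entry, one FIELD=value per line.
# $1 = message, $2 = request id, $3 = syslog identifier (optional)
journal_entry() {
    printf 'MESSAGE=%s\nREQUEST_ID=%s\nSYSLOG_IDENTIFIER=%s\n' \
        "$1" "$2" "${3:-devstack-demo}"
}

# Usage (needs a running journald, so it is left commented out):
#   journal_entry "Booting instance" "req-demo-1234" | logger --journald
#   journalctl REQUEST_ID=req-demo-1234
```

This only demonstrates the query side; the services themselves would emit these fields only once oslo.log supports it.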
sub targets/slices
We might want to create per project slices so that it's easy to follow or restart all services of a single project (like swift) without impacting other services.
log colorizing
We lose log colorization through this process. We might want to build a custom colorizer that we could run journalctl output through optionally for people.
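As a starting point, such a colorizer could be a simple filter that journalctl output is piped through. The sketch below is not an existing DevStack tool: colorize_logs is a hypothetical name, and it assumes log lines carry the level as a space-delimited word (ERROR, WARNING, DEBUG), which depends on the configured log format.

```shell
# Hypothetical filter: wrap each line in an ANSI color based on its log level.
# Reads stdin and writes stdout; lines without a known level pass through.
colorize_logs() {
    awk '
        / ERROR /   { printf "\033[31m%s\033[0m\n", $0; next }  # red
        / WARNING / { printf "\033[33m%s\033[0m\n", $0; next }  # yellow
        / DEBUG /   { printf "\033[36m%s\033[0m\n", $0; next }  # cyan
        { print }
    '
}

# Usage:
#   journalctl -f --unit devstack@n-cpu.service | colorize_logs
```

Some awk implementations buffer piped input, which can delay output when following logs, so a real tool would probably need to handle that.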
user units
It would be great if we could run services as user units, so that there is a clear separation of code being run as non-root, to ensure that running as root never accidentally gets baked in as an assumption to services. However, user units interact poorly with devstack-gate and the way that commands are run as users with ansible and su.
Maybe someday we can figure that out.
References
- Arch Linux Wiki - https://wiki.archlinux.org/index.php/Systemd/User
- Python interface to journald - https://www.freedesktop.org/software/systemd/python-systemd/journal.html
- Systemd documentation on service files - https://www.freedesktop.org/software/systemd/man/systemd.service.html
- Systemd documentation on exec (can be used to impact service runs) - https://www.freedesktop.org/software/systemd/man/systemd.exec.html