Welcome to check_systemd’s documentation!

check_systemd is a Nagios / Icinga monitoring plugin to check systemd. This Python script will report a degraded system to your monitoring solution. It can also be used to monitor individual systemd services (with the -u, --unit parameter) and timer units (with the -t, --dead-timers parameter).

To learn more about the project, please visit the repository on GitHub.

Monitoring scopes

  • units: State of units

  • timers: Timers

  • startup_time: Startup time

  • performance_data: Performance data

Data sources

  • D-Bus (dbus)

  • Command line interface (cli)

This plugin is based on a Python package named nagiosplugin. nagiosplugin has a fine-grained class model to separate concerns. A Nagios / Icinga plugin must perform three steps: data acquisition, evaluation and presentation. nagiosplugin provides one class for each of these steps: Resource, Context and Summary. check_systemd extends these three model classes in the following subclasses:

Acquisition (Resource)

Evaluation (Context)

Presentation (Summary)
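The three steps can be pictured with a small, library-free sketch. It deliberately does not use nagiosplugin: the names Metric, acquire, evaluate and present are hypothetical stand-ins for the roles that Resource, Context and Summary play, and the sample unit states are made up.

```python
from __future__ import annotations

from typing import NamedTuple

OK, CRITICAL = 0, 2  # Nagios exit codes used in this sketch


class Metric(NamedTuple):
    """Hypothetical stand-in for a measured value."""

    name: str
    value: str


def acquire(unit_states: dict[str, str]) -> list[Metric]:
    """Acquisition: turn raw unit states into metrics."""
    return [Metric(name, state) for name, state in sorted(unit_states.items())]


def evaluate(metric: Metric) -> int:
    """Evaluation: map one metric to an exit code."""
    return CRITICAL if metric.value == "failed" else OK


def present(metrics: list[Metric]) -> tuple[int, str]:
    """Presentation: build the overall exit code and the status line."""
    failed = [m.name for m in metrics if evaluate(m) == CRITICAL]
    if failed:
        return CRITICAL, "SYSTEMD CRITICAL - failed: " + ", ".join(failed)
    return OK, "SYSTEMD OK - no failed units"


# Assumed sample data; a real plugin would query systemd instead.
code, line = present(acquire({"nginx.service": "active", "smartd.service": "failed"}))
print(code, line)
```

Keeping the steps apart means the evaluation rule can change without touching the code that talks to systemd.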

check_systemd.is_gi

True if the package PyGObject (gi) is available.

class check_systemd.OptionContainer

Bases: object

This class has the same attributes as the Namespace instance returned by the argparse package.

verbose: int
debug: int
ignore_inactive_state: bool
include_unit: str | None
include_type: list[str]
exclude_unit: list[str]
exclude_type: list[str]
expected_state: str | None
scope_timers: bool
timers_warning: float
timers_critical: float
scope_startup_time: bool
warning: float
critical: float
with_user_units: bool
performance_data: bool
include: list[str]
exclude: list[str]
data_source: Literal['dbus', 'cli'] | None
check_systemd.opts

We make this variable global so that the command line arguments can be accessed everywhere in the plugin. The result of parse_args() is stored in this variable. It is an instance of the argparse.Namespace class. The variable is initialized in the main function. It is intentionally not named args to avoid confusion with *args (non-keyword arguments).
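The pattern of a module-level options variable that is filled in by the main function might look roughly like the following sketch. The parser is a heavily reduced, hypothetical one: only the -u / --unit and -t / --dead-timers flags are taken from this documentation, and the mapping to the attribute names include_unit and scope_timers is an assumption.

```python
from __future__ import annotations

import argparse

# Module-level variable, filled in by main(). Simplified for illustration.
opts: argparse.Namespace | None = None


def get_parser() -> argparse.ArgumentParser:
    """Build a reduced, hypothetical argument parser."""
    parser = argparse.ArgumentParser(prog="check_systemd_sketch")
    # The dest names are assumptions made for this sketch.
    parser.add_argument("-u", "--unit", dest="include_unit", default=None)
    parser.add_argument(
        "-t", "--dead-timers", dest="scope_timers", action="store_true"
    )
    return parser


def describe() -> str:
    """Any function can read the global without receiving it as a parameter."""
    assert opts is not None, "main() must run first"
    return f"unit={opts.include_unit} timers={opts.scope_timers}"


def main(argv: list[str]) -> None:
    """Parse the arguments once and store them in the global variable."""
    global opts
    opts = get_parser().parse_args(argv)


main(["-u", "nginx.service", "-t"])
print(describe())
```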

class check_systemd.DbusManager

Bases: object

This class holds the main entry point object of the D-Bus systemd API. See the section The Manager Object in the systemd D-Bus API.

property manager: DBusProxy
check_systemd.dbus_manager

The systemd D-Bus API main entry point object, the so-called “manager”.

check_systemd.format_timespan_to_seconds(fmt_timespan: str) → float

Convert a timespan format string into seconds. Take a look at the systemd time-util.c source code.

Parameters:

fmt_timespan – for example 2.345s or 3min 45.234s or 34min left or 2 months 8 days

Returns:

The seconds
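A simplified sketch of such a conversion is shown below. It is not the plugin's implementation: the function name timespan_to_seconds and the unit table are assumptions, and the table covers only a subset of the spellings systemd accepts. The month and year lengths follow the averages systemd uses (30.44 days and 365.25 days).

```python
import re

# Seconds per unit; an assumed subset of the units systemd understands.
UNIT_SECONDS = {
    "us": 1e-6, "usec": 1e-6,
    "ms": 1e-3, "msec": 1e-3,
    "s": 1, "sec": 1, "second": 1, "seconds": 1,
    "min": 60, "minute": 60, "minutes": 60,
    "h": 3600, "hour": 3600, "hours": 3600,
    "d": 86400, "day": 86400, "days": 86400,
    "w": 604800, "week": 604800, "weeks": 604800,
    "month": 2629800, "months": 2629800,
    "y": 31557600, "year": 31557600, "years": 31557600,
}

# A number, optional whitespace, then a unit word. Trailing words that are
# not preceded by a number (such as "left" or "ago") are skipped.
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)\s*([a-z]+)")


def timespan_to_seconds(fmt_timespan: str) -> float:
    """Sum up all "<number><unit>" pairs found in the string."""
    total = 0.0
    for number, unit in _TOKEN.findall(fmt_timespan):
        if unit not in UNIT_SECONDS:
            raise ValueError(f"unknown time unit: {unit!r}")
        total += float(number) * UNIT_SECONDS[unit]
    return total


for example in ("2.345s", "3min 45.234s", "34min left", "2 months 8 days"):
    print(example, "=", timespan_to_seconds(example))
```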

check_systemd.execute_cli(args: str | Sequence[str]) → str | None

Execute a command on the command line (cli = command line interface) and capture the stdout. This is a wrapper around subprocess.Popen.

Parameters:

args – A list of program arguments.

Raises:

nagiosplugin.CheckError – If the command produces some stderr output or if an OSError exception occurs.

Returns:

The stdout of the command.
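The described behaviour (capture stdout, fail on stderr output or on an OSError) can be sketched with the standard library alone. The names run_cli and CliError are hypothetical; CliError merely stands in for nagiosplugin.CheckError so that the example has no third-party dependency.

```python
import subprocess
import sys
from typing import Sequence


class CliError(RuntimeError):
    """Hypothetical stand-in for nagiosplugin.CheckError."""


def run_cli(args: Sequence[str]) -> str:
    """Run a command and return its stdout as text."""
    try:
        process = subprocess.Popen(
            args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
        )
        stdout, stderr = process.communicate()
    except OSError as error:
        # For example: the executable does not exist.
        raise CliError(f"could not execute {args!r}: {error}") from error
    if stderr:
        raise CliError(f"{args!r} wrote to stderr: {stderr.strip()}")
    return stdout


# Use the running Python interpreter as a portable example command.
print(run_cli([sys.executable, "-c", "print('hello')"]).strip())
```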

class check_systemd.TableParser(stdout: str)

Bases: object

This class reads the text tables that some systemd commands like systemctl list-units or systemctl list-timers produce.

header_row: str
column_lengths: list[int]
columns: list[str]
body_rows: list[str]
property row_count

The number of rows. Only the body rows are counted. The header row is not taken into account.

check_header(column_header: Sequence[str]) → None

Check if the specified column names are present in the header row of the text table. Raise an exception if not.

Parameters:

column_headers – The expected column headers (for example ('UNIT', 'LOAD', 'ACTIVE'))

get_row(row_number: int) → dict[str, str]

Retrieve a table row as a dictionary. The keys are taken from the header row. The first row number is 0.

Parameters:

row_number – The index number of the table row starting at 0.

list_rows() → Generator[dict[str, str], None, None]

List all rows.
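How a fixed-width text table can be turned into dictionaries keyed by the header row is sketched below. This is an illustration of the idea, not the TableParser code: the function parse_table is hypothetical, the sample table is made up, and the sketch assumes single-word column titles and a table without legend or footer lines.

```python
from __future__ import annotations

import re


def parse_table(stdout: str) -> list[dict[str, str]]:
    """Split a fixed-width table into one dictionary per body row."""
    lines = [line for line in stdout.splitlines() if line.strip()]
    header, body = lines[0], lines[1:]
    # Each column starts where its title starts in the header row.
    starts = [match.start() for match in re.finditer(r"\S+", header)]
    names = header.split()
    rows = []
    for line in body:
        row = {}
        for index, name in enumerate(names):
            # The last column runs to the end of the line.
            end = starts[index + 1] if index + 1 < len(starts) else None
            row[name] = line[starts[index]:end].strip()
        rows.append(row)
    return rows


# Made-up sample data, padded to fixed column widths.
WIDTHS = (14, 7, 7, 8, 0)
CELLS = [
    ("UNIT", "LOAD", "ACTIVE", "SUB", "DESCRIPTION"),
    ("nginx.service", "loaded", "active", "running", "Web server"),
    ("cron.service", "loaded", "active", "exited", "Cron daemon"),
]
SAMPLE = "\n".join(
    "".join(cell.ljust(width) for cell, width in zip(row, WIDTHS))
    for row in CELLS
)

rows = parse_table(SAMPLE)
print(SAMPLE)
print(rows[0])
```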

exception check_systemd.CheckSystemdError

Bases: Exception

Base class for exceptions in this module. All exceptions are caught by the decorator @nagiosplugin.guarded() on the main function and printed out nicely.

exception check_systemd.CheckSystemdRegexpError

Bases: CheckSystemdError

Raised when an invalid regular expression is specified.

check_systemd.match_multiple(unit_name: str, regexes: str | Sequence[str]) → bool

Match multiple regular expressions against a unit name.

Parameters:
  • unit_name – The unit name to be matched.

  • regexes – A single regular expression (include='.*service') or a list of regular expressions (include=('.*service', '.*mount')).

Returns:

True if at least one regular expression matches.
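A minimal sketch of matching one unit name against one or several patterns follows. The names matches_any and RegexpError are hypothetical (RegexpError stands in for CheckSystemdRegexpError), and the use of re.match, which anchors the pattern at the start of the name, is an assumption about the matching mode.

```python
from __future__ import annotations

import re
from typing import Sequence


class RegexpError(ValueError):
    """Hypothetical stand-in for CheckSystemdRegexpError."""


def matches_any(unit_name: str, regexes: str | Sequence[str]) -> bool:
    """Return True as soon as one of the patterns matches the unit name."""
    if isinstance(regexes, str):
        regexes = [regexes]
    for regex in regexes:
        try:
            if re.match(regex, unit_name):
                return True
        except re.error as error:
            raise RegexpError(f"invalid regular expression: {regex!r}") from error
    return False


print(matches_any("nginx.service", ".*service"))
print(matches_any("home.mount", (".*service", ".*timer")))
```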

check_systemd.ActiveState

From the D-Bus interface of systemd documentation:

ActiveState contains a state value that reflects whether the unit is currently active or not. The following states are currently defined:

  • active,

  • reloading,

  • inactive,

  • failed,

  • activating, and

  • deactivating.

active indicates that unit is active (obviously…).

reloading indicates that the unit is active and currently reloading its configuration.

inactive indicates that it is inactive and the previous run was successful or no previous run has taken place yet.

failed indicates that it is inactive and the previous run was not successful (more information about the reason for this is available on the unit type specific interfaces, for example for services in the Result property, see below).

activating indicates that the unit has previously been inactive but is currently in the process of entering an active state.

Conversely deactivating indicates that the unit is currently in the process of deactivation.

alias of Literal['active', 'reloading', 'inactive', 'failed', 'activating', 'deactivating']
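Because the alias is a typing.Literal, its allowed values can also be read back at runtime. The helper is_active_state below is hypothetical and only illustrates how a string from systemd could be checked against the alias.

```python
from typing import Literal, get_args

ActiveState = Literal[
    "active", "reloading", "inactive", "failed", "activating", "deactivating"
]


def is_active_state(value: str) -> bool:
    """Check a string against the values listed in the Literal alias."""
    return value in get_args(ActiveState)


print(is_active_state("failed"))
print(is_active_state("running"))  # "running" is a sub state, not an active state
```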

check_systemd.SubState

From the D-Bus interface of systemd documentation:

SubState encodes states of the same state machine that ActiveState covers, but knows more fine-grained states that are unit-type-specific. Where ActiveState only covers six high-level states, SubState covers possibly many more low-level unit-type-specific states that are mapped to the six high-level states. Note that multiple low-level states might map to the same high-level state, but not vice versa. Not all high-level states have low-level counterparts on all unit types.

All sub states are listed in the file basic/unit-def.c of the systemd source code:

  • automount: dead, waiting, running, failed

  • device: dead, tentative, plugged

  • mount: dead, mounting, mounting-done, mounted, remounting, unmounting, remounting-sigterm, remounting-sigkill, unmounting-sigterm, unmounting-sigkill, failed, cleaning

  • path: dead, waiting, running, failed

  • scope: dead, running, abandoned, stop-sigterm, stop-sigkill, failed

  • service: dead, condition, start-pre, start, start-post, running, exited, reload, stop, stop-watchdog, stop-sigterm, stop-sigkill, stop-post, final-watchdog, final-sigterm, final-sigkill, failed, auto-restart, cleaning

  • slice: dead, active

  • socket: dead, start-pre, start-chown, start-post, listening, running, stop-pre, stop-pre-sigterm, stop-pre-sigkill, stop-post, final-sigterm, final-sigkill, failed, cleaning

  • swap: dead, activating, activating-done, active, deactivating, deactivating-sigterm, deactivating-sigkill, failed, cleaning

  • target: dead, active

  • timer: dead, waiting, running, elapsed, failed

alias of Literal['abandoned', 'activating-done', 'activating', 'active', 'auto-restart', 'cleaning', 'condition', 'deactivating-sigkill', 'deactivating-sigterm', 'deactivating', 'dead', 'elapsed', 'exited', 'failed', 'final-sigkill', 'final-sigterm', 'final-watchdog', 'listening', 'mounted', 'mounting-done', 'mounting', 'plugged', 'reload', 'remounting-sigkill', 'remounting-sigterm', 'remounting', 'running', 'start-chown', 'start-post', 'start-pre', 'start', 'stop-post', 'stop-pre-sigkill', 'stop-pre-sigterm', 'stop-pre', 'stop-sigkill', 'stop-sigterm', 'stop-watchdog', 'stop', 'tentative', 'unmounting-sigkill', 'unmounting-sigterm', 'unmounting', 'waiting']

check_systemd.LoadState

From the D-Bus interface of systemd documentation:

LoadState contains a state value that reflects whether the configuration file of this unit has been loaded. The following states are currently defined:

  • loaded,

  • error and

  • masked.

loaded indicates that the configuration was successfully loaded.

error indicates that the configuration failed to load, the LoadError field contains information about the cause of this failure.

masked indicates that the unit is currently masked out (i.e. symlinked to /dev/null or suchlike).

Note that the LoadState is fully orthogonal to the ActiveState (see below) as units without valid loaded configuration might be active (because configuration might have been reloaded at a time where a unit was already active).

alias of Literal['loaded', 'error', 'masked']

class check_systemd.Unit(**kwargs)

Bases: object

This class bundles all state-related information of a systemd unit in an object. The class DbusUnit inherits from this class and overrides the attributes with properties.

name: str

The name of the systemd unit, for example nginx.service. In the table printed by the command systemctl list-units, the column containing the unit names is titled “UNIT”.

active_state: Literal['active', 'reloading', 'inactive', 'failed', 'activating', 'deactivating']
sub_state: Literal['abandoned', 'activating-done', 'activating', 'active', 'auto-restart', 'cleaning', 'condition', 'deactivating-sigkill', 'deactivating-sigterm', 'deactivating', 'dead', 'elapsed', 'exited', 'failed', 'final-sigkill', 'final-sigterm', 'final-watchdog', 'listening', 'mounted', 'mounting-done', 'mounting', 'plugged', 'reload', 'remounting-sigkill', 'remounting-sigterm', 'remounting', 'running', 'start-chown', 'start-post', 'start-pre', 'start', 'stop-post', 'stop-pre-sigkill', 'stop-pre-sigterm', 'stop-pre', 'stop-sigkill', 'stop-sigterm', 'stop-watchdog', 'stop', 'tentative', 'unmounting-sigkill', 'unmounting-sigterm', 'unmounting', 'waiting']
load_state: Literal['loaded', 'error', 'masked']
convert_to_exitcode() → ServiceState

Convert the different systemd states into a Nagios compatible exit code.

Returns:

A Nagios compatible exit code: 0, 1, 2, 3
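One plausible way to map unit states to exit codes is sketched below. The rule shown (a load error or a failed unit is critical, everything else is ok) is an assumption made for illustration; the plugin's actual rules may differ and may also take command line options such as expected_state into account. The function to_exitcode is hypothetical and returns plain integers instead of ServiceState objects.

```python
# The four Nagios exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def to_exitcode(active_state: str, load_state: str) -> int:
    """Assumed rule: load errors and failed units are critical."""
    if load_state == "error" or active_state == "failed":
        return CRITICAL
    return OK


print(to_exitcode("active", "loaded"))
print(to_exitcode("failed", "loaded"))
print(to_exitcode("inactive", "error"))
```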

class check_systemd.SystemdUnitTypesList(*args)

Bases: MutableSequence

insert(index, unit_type) → None

S.insert(index, value) – insert value before index

convert_to_regexp()

class check_systemd.UnitNameFilter(unit_names: Sequence[str] = ())

Bases: object

This class stores all systemd unit names (e.g. nginx.service or fstrim.timer) and provides an interface to filter the names by regular expressions.

add(unit_name: str) → None

Add one unit name.

Parameters:

unit_name – The name of the unit, for example apt.timer.

get() → set[str]

Get all stored unit names.

list(include: str | Sequence[str] | None = None, exclude: str | Sequence[str] | None = None) → Generator[str, None, None]

List all unit names or apply filters (include or exclude) to the list of unit names.

Parameters:
  • include – If the unit name matches the provided regular expression, it is included in the list of unit names. A single regular expression (include='.*service') or a list of regular expressions (include=('.*service', '.*mount')).

  • exclude – If the unit name matches the provided regular expression, it is excluded from the list of unit names. A single regular expression (exclude='.*service') or a list of regular expressions (exclude=('.*service', '.*mount')).
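The combination of include and exclude filters can be sketched as a generator. The function filter_names is hypothetical, the sorted output order and the use of re.match are assumptions, and the unit names are sample data.

```python
from __future__ import annotations

import re
from typing import Iterable, Iterator, Sequence


def _as_list(regexes: str | Sequence[str]) -> list[str]:
    """Accept a single pattern or a sequence of patterns."""
    return [regexes] if isinstance(regexes, str) else list(regexes)


def filter_names(
    names: Iterable[str],
    include: str | Sequence[str] | None = None,
    exclude: str | Sequence[str] | None = None,
) -> Iterator[str]:
    """Yield the names that pass the include filter and miss the exclude filter."""
    for name in sorted(names):
        if include is not None and not any(
            re.match(regex, name) for regex in _as_list(include)
        ):
            continue
        if exclude is not None and any(
            re.match(regex, name) for regex in _as_list(exclude)
        ):
            continue
        yield name


names = {"nginx.service", "apt.timer", "fstrim.timer", "home.mount"}
print(list(filter_names(names, include=".*timer")))
print(list(filter_names(names, exclude=(".*service", ".*mount"))))
```

If both filters are given, a name has to match the include patterns and must not match any exclude pattern.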

class check_systemd.UnitCache

Bases: object

This class is a container class for systemd units.

add_unit(unit: Unit | None = None, name: str | None = None, active_state: Literal['active', 'reloading', 'inactive', 'failed', 'activating', 'deactivating'] | None = None, sub_state: Literal['abandoned', 'activating-done', 'activating', 'active', 'auto-restart', 'cleaning', 'condition', 'deactivating-sigkill', 'deactivating-sigterm', 'deactivating', 'dead', 'elapsed', 'exited', 'failed', 'final-sigkill', 'final-sigterm', 'final-watchdog', 'listening', 'mounted', 'mounting-done', 'mounting', 'plugged', 'reload', 'remounting-sigkill', 'remounting-sigterm', 'remounting', 'running', 'start-chown', 'start-post', 'start-pre', 'start', 'stop-post', 'stop-pre-sigkill', 'stop-pre-sigterm', 'stop-pre', 'stop-sigkill', 'stop-sigterm', 'stop-watchdog', 'stop', 'tentative', 'unmounting-sigkill', 'unmounting-sigterm', 'unmounting', 'waiting'] | None = None, load_state: Literal['loaded', 'error', 'masked'] | None = None) → Unit

get(name: str | None = None) → Unit | None

list(include: str | Sequence[str] | None = None, exclude: str | Sequence[str] | None = None) → Generator[Unit, None, None]

List all units or apply filters (include or exclude) to the list of units.

Parameters:
  • include – If the unit name matches the provided regular expression, it is included in the list of unit names. A single regular expression (include='.*service') or a list of regular expressions (include=('.*service', '.*mount')).

  • exclude – If the unit name matches the provided regular expression, it is excluded from the list of unit names. A single regular expression (exclude='.*service') or a list of regular expressions (exclude=('.*service', '.*mount')).

property count: int
count_by_states(states: Sequence[str], include: str | Sequence[str] | None = None, exclude: str | Sequence[str] | None = None) → dict

class check_systemd.CliUnitCache(with_user_units: bool = False)

Bases: UnitCache

class check_systemd.DbusUnitCache

Bases: UnitCache

check_systemd.unit_cache: UnitCache

An instance of DbusUnitCache or CliUnitCache

class check_systemd.UnitsResource

Bases: Resource

probe() → Generator[Metric, None, None]

Query system state and return metrics.

This is the only method called by the check controller. It should trigger all necessary actions and create metrics.

Returns:

list of Metric objects, or generator that emits Metric objects, or single Metric object

class check_systemd.UnitsContext

Bases: Context

evaluate(metric: Metric, resource: Resource) → Result

Determines state of a given metric.

Parameters:
  • metric – associated metric that is to be evaluated

  • resource – resource that produced the associated metric (may optionally be consulted)

Returns:

Result

class check_systemd.TimersResource

Bases: Resource

Resource that calls systemctl list-timers --all on the command line to get information about dead / inactive timers. There is one type of systemd “degradation” which is normally not detected: dead / inactive timers.

Parameters:

excludes (list) – A list of systemd unit names to exclude from the checks.

name
probe() → Generator[Metric, None, None]
Returns:

generator that emits Metric objects

class check_systemd.TimersContext

Bases: Context

evaluate(metric: Metric, resource: Resource)

Determines state of a given metric.

Parameters:
  • metric – associated metric that is to be evaluated

  • resource – resource that produced the associated metric (may optionally be consulted)

Returns:

Result

class check_systemd.StartupTimeResource

Bases: Resource

Resource that calls systemd-analyze on the command line to get information about the startup time.

probe() → Generator[Metric, None, None]

Query system state and return metrics.

Returns:

generator that emits Metric objects

class check_systemd.StartupTimeContext

Bases: ScalarContext

performance(metric: Metric, resource: Resource)

Derives performance data.

The metric’s attributes are combined with the local warning and critical ranges to get a fully populated Performance object.

Parameters:
  • metric – metric from which performance data are derived

  • resource – not used

Returns:

Performance object

class check_systemd.PerformanceDataResource

Bases: Resource

probe() → Generator[Metric, None, None]

Query system state and return metrics.

This is the only method called by the check controller. It should trigger all necessary actions and create metrics.

Returns:

list of Metric objects, or generator that emits Metric objects, or single Metric object

class check_systemd.PerformanceDataContext

Bases: Context

performance(metric: Metric, resource: Resource)

Derives performance data from a given metric.

Parameters:
  • metric – associated metric from which performance data are derived

  • resource – resource that produced the associated metric (may optionally be consulted)

Returns:

Perfdata object

class check_systemd.SystemdSummary

Bases: Summary

Format the different status lines. A subclass of nagiosplugin.Summary.

ok(results: Results) → str

Formats status line when overall state is ok.

Parameters:

results – Results container

Returns:

status line

problem(results: Results) → str

Formats status line when overall state is not ok.

Parameters:

results – Results container

Returns:

status line

verbose(results: Results) → list[str]

Provides extra lines if verbose plugin execution is requested.

Parameters:

results – Results container

Returns:

list of strings

check_systemd.convert_to_regexp_list(regexp: Sequence[str] | None = None, unit_names: str | Sequence[str] | None = None, unit_types: Sequence[str] | None = None) → set[str]

check_systemd.get_argparser() → ArgumentParser

check_systemd.normalize_argparser(opts: Namespace) → OptionContainer

check_systemd.main() → None

The main entry point of the monitoring plugin. First the command line arguments are read into the variable opts. The configuration of this opts object decides which instances of the Resource, Context and Summary subclasses are assembled in a list called tasks. This list is passed to the main class of the nagiosplugin library: the Check class.
