Welcome to check_systemd’s documentation!¶
check_systemd is a Nagios / Icinga monitoring plugin to check systemd. This Python script reports a degraded system to your monitoring solution. It can also be used to monitor individual systemd services (with the -u, --unit parameter) and timer units (with the -t, --dead-timers parameter).
To learn more about the project, please visit the repository on GitHub.
Monitoring scopes¶
units: State of units
timers: Timers
startup_time: Startup time
performance_data: Performance data
Data sources¶
D-Bus (dbus)
Command line interface (cli)
This plugin is based on a Python package named nagiosplugin. nagiosplugin has a fine-grained class model to separate concerns. A Nagios / Icinga plugin must perform three steps: data acquisition, evaluation and presentation. nagiosplugin provides one class for each of these steps: Resource, Context and Summary. check_systemd extends these three model classes with the following subclasses:
Acquisition (Resource)¶
UnitsResource (context=units)
TimersResource (context=timers)
StartupTimeResource (context=startup_time)
PerformanceDataResource (context=performance_data)
Evaluation (Context)¶
UnitsContext (context=units)
TimersContext (context=timers)
StartupTimeContext (context=startup_time)
PerformanceDataContext (context=performance_data)
Presentation (Summary)¶
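Taken together, acquisition, evaluation and presentation can be sketched in plain Python. The classes below (Metric, FailedUnitsResource, ThresholdContext, OneLineSummary) are hypothetical stand-ins invented for illustration; they are neither the nagiosplugin classes nor the check_systemd subclasses, and the hard-coded metric value is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    """Hypothetical stand-in for a measured value."""

    name: str
    value: float


class FailedUnitsResource:
    """Acquisition: produce raw metrics (hard-coded in this sketch)."""

    def probe(self) -> list:
        return [Metric("failed_units", 2)]


class ThresholdContext:
    """Evaluation: turn a metric into a monitoring state."""

    def __init__(self, critical: float) -> None:
        self.critical = critical

    def evaluate(self, metric: Metric) -> str:
        return "CRITICAL" if metric.value >= self.critical else "OK"


class OneLineSummary:
    """Presentation: format the status line."""

    def format(self, state: str, metric: Metric) -> str:
        return f"SYSTEMD {state} - {metric.name}={metric.value}"


def run_check() -> str:
    # Each step only sees the output of the previous one.
    metric = FailedUnitsResource().probe()[0]
    state = ThresholdContext(critical=1).evaluate(metric)
    return OneLineSummary().format(state, metric)


print(run_check())
```

Keeping the three steps in separate classes means a threshold or an output format can change without touching the code that gathers the data.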
- check_systemd.ActiveState¶
From the D-Bus interface of systemd documentation:
ActiveState contains a state value that reflects whether the unit is currently active or not. The following states are currently defined: active, reloading, inactive, failed, activating, and deactivating.
active indicates that unit is active (obviously…).
reloading indicates that the unit is active and currently reloading its configuration.
inactive indicates that it is inactive and the previous run was successful or no previous run has taken place yet.
failed indicates that it is inactive and the previous run was not successful (more information about the reason for this is available on the unit type specific interfaces, for example for services in the Result property, see below).
activating indicates that the unit has previously been inactive but is currently in the process of entering an active state.
Conversely deactivating indicates that the unit is currently in the process of deactivation.
alias of Literal['active', 'reloading', 'inactive', 'failed', 'activating', 'deactivating']
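A Literal alias like this one can also be used at runtime to validate a string. The following is a minimal sketch; the alias is redeclared so that the example is self-contained, and the helper is_active_state is a hypothetical name that is not part of check_systemd.

```python
from typing import Literal, get_args

# Redeclared here for the sketch; mirrors the documented alias.
ActiveState = Literal[
    "active", "reloading", "inactive", "failed", "activating", "deactivating"
]


def is_active_state(value: str) -> bool:
    """Return True if value is one of the six documented ActiveState strings."""
    return value in get_args(ActiveState)


print(is_active_state("failed"))
print(is_active_state("running"))  # "running" is a sub state, not an active state
```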
- check_systemd.SubState¶
From the D-Bus interface of systemd documentation:
SubState encodes states of the same state machine that ActiveState covers, but knows more fine-grained states that are unit-type-specific. Where ActiveState only covers six high-level states, SubState covers possibly many more low-level unit-type-specific states that are mapped to the six high-level states. Note that multiple low-level states might map to the same high-level state, but not vice versa. Not all high-level states have low-level counterparts on all unit types.
All sub states are listed in the file basic/unit-def.c of the systemd source code:
automount: dead, waiting, running, failed
device: dead, tentative, plugged
mount: dead, mounting, mounting-done, mounted, remounting, unmounting, remounting-sigterm, remounting-sigkill, unmounting-sigterm, unmounting-sigkill, failed, cleaning
path: dead, waiting, running, failed
scope: dead, running, abandoned, stop-sigterm, stop-sigkill, failed
service: dead, condition, start-pre, start, start-post, running, exited, reload, stop, stop-watchdog, stop-sigterm, stop-sigkill, stop-post, final-watchdog, final-sigterm, final-sigkill, failed, auto-restart, cleaning
slice: dead, active
socket: dead, start-pre, start-chown, start-post, listening, running, stop-pre, stop-pre-sigterm, stop-pre-sigkill, stop-post, final-sigterm, final-sigkill, failed, cleaning
swap: dead, activating, activating-done, active, deactivating, deactivating-sigterm, deactivating-sigkill, failed, cleaning
target: dead, active
timer: dead, waiting, running, elapsed, failed
alias of Literal['abandoned', 'activating-done', 'activating', 'active', 'auto-restart', 'cleaning', 'condition', 'deactivating-sigkill', 'deactivating-sigterm', 'deactivating', 'dead', 'elapsed', 'exited', 'failed', 'final-sigkill', 'final-sigterm', 'final-watchdog', 'listening', 'mounted', 'mounting-done', 'mounting', 'plugged', 'reload', 'remounting-sigkill', 'remounting-sigterm', 'remounting', 'running', 'start-chown', 'start-post', 'start-pre', 'start', 'stop-post', 'stop-pre-sigkill', 'stop-pre-sigterm', 'stop-pre', 'stop-sigkill', 'stop-sigterm', 'stop-watchdog', 'stop', 'tentative', 'unmounting-sigkill', 'unmounting-sigterm', 'unmounting', 'waiting']
- check_systemd.LoadState¶
From the D-Bus interface of systemd documentation:
LoadState contains a state value that reflects whether the configuration file of this unit has been loaded. The following states are currently defined: loaded, error and masked.
loaded indicates that the configuration was successfully loaded.
error indicates that the configuration failed to load, the LoadError field contains information about the cause of this failure.
masked indicates that the unit is currently masked out (i.e. symlinked to /dev/null or suchlike).
Note that the LoadState is fully orthogonal to the ActiveState (see below) as units without valid loaded configuration might be active (because configuration might have been reloaded at a time where a unit was already active).
alias of Literal['stub', 'loaded', 'not-found', 'bad-setting', 'error', 'merged', 'masked']
- class check_systemd.T¶
For UnitCache. Cannot be an inner TypeVar because of Pylance.
alias of TypeVar('T')
- class check_systemd.Logger[source]¶
Bases:
object
A wrapper around the Python logging module with 3 debug logging levels.
-d: info
-dd: debug
-ddd: verbose
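One way to map repeated -d flags to logging levels is argparse's count action. This is a sketch under assumptions, not the plugin's implementation: the --debug long option, the numeric value of the custom VERBOSE level, the WARNING default, and the names LEVELS and level_from_args are all hypothetical.

```python
import argparse
import logging

# "verbose" is not a standard logging level, so a custom level below
# DEBUG (10) is assumed here.
VERBOSE = 5
logging.addLevelName(VERBOSE, "VERBOSE")

# Hypothetical mapping from the number of -d flags to a logging level.
LEVELS = {0: logging.WARNING, 1: logging.INFO, 2: logging.DEBUG, 3: VERBOSE}


def level_from_args(argv: list) -> int:
    """Parse argv and return the logging level selected by -d / -dd / -ddd."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-d", "--debug", action="count", default=0)
    opts = parser.parse_args(argv)
    # Cap at 3 so that additional -d flags behave like -ddd.
    return LEVELS[min(opts.debug, 3)]


print(logging.getLevelName(level_from_args(["-dd"])))
```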
- class check_systemd.Source[source]¶
Bases:
object
- class BaseUnit[source]¶
Bases:
object
- name: str¶
The name of the system unit, for example nginx.service. In the table printed by the command systemctl list-units, this is the column titled “UNIT”.
- class Unit(name: str, active_state: object | None = None, sub_state: object | None = None, load_state: object | None = None)[source]¶
Bases:
BaseUnit
This class bundles all state-related information of a systemd unit in one object. The class DbusUnit inherits from this class and overrides the attributes with properties.
- active_state: Literal['active', 'reloading', 'inactive', 'failed', 'activating', 'deactivating']¶
- sub_state: Literal['abandoned', 'activating-done', 'activating', 'active', 'auto-restart', 'cleaning', 'condition', 'deactivating-sigkill', 'deactivating-sigterm', 'deactivating', 'dead', 'elapsed', 'exited', 'failed', 'final-sigkill', 'final-sigterm', 'final-watchdog', 'listening', 'mounted', 'mounting-done', 'mounting', 'plugged', 'reload', 'remounting-sigkill', 'remounting-sigterm', 'remounting', 'running', 'start-chown', 'start-post', 'start-pre', 'start', 'stop-post', 'stop-pre-sigkill', 'stop-pre-sigterm', 'stop-pre', 'stop-sigkill', 'stop-sigterm', 'stop-watchdog', 'stop', 'tentative', 'unmounting-sigkill', 'unmounting-sigterm', 'unmounting', 'waiting']¶
- load_state: Literal['stub', 'loaded', 'not-found', 'bad-setting', 'error', 'merged', 'masked']¶
- class Timer(name: str, last: int | None, next: int | None)[source]¶
Bases:
BaseUnit
# D-Bus documentation:
# readonly t NextElapseUSecRealtime = …;
# readonly t NextElapseUSecMonotonic = …;
# readonly t LastTriggerUSec = …;
# readonly t LastTriggerUSecMonotonic = …;
# NextElapseUSecRealtime contains the next elapsation point on the CLOCK_REALTIME clock in microseconds since the epoch, or 0 if this timer event does not include at least one calendar event.
# Similarly, NextElapseUSecMonotonic contains the next elapsation point on the CLOCK_MONOTONIC clock in microseconds since the epoch, or 0 if this timer event does not include at least one monotonic event.
# https://github.com/systemd/systemd/blob/e0270bab43a4c37028ee32ae853037df22999767/src/systemctl/systemctl-list-units.c#L668-L671
# TABLE_TIMESTAMP, t->next_elapse,
# TABLE_TIMESTAMP_LEFT, t->next_elapse,
# TABLE_TIMESTAMP, t->last_trigger.realtime,
# TABLE_TIMESTAMP_RELATIVE_MONOTONIC, t->last_trigger.monotonic,
# https://github.com/systemd/systemd/blob/e0270bab43a4c37028ee32ae853037df22999767/src/core/dbus-timer.c#L111
# SD_BUS_PROPERTY("NextElapseUSecRealtime", "t", bus_property_get_usec, offsetof(Timer, next_elapse_realtime), SD_BUS_VTABLE_PROPERTY_EMITS_CHANGE),
# SD_BUS_PROPERTY("NextElapseUSecMonotonic", "t", property_get_next_elapse_monotonic, 0, SD_BUS_VTABLE_PROPERTY_EMITS_CHANGE),
# BUS_PROPERTY_DUAL_TIMESTAMP("LastTriggerUSec", offsetof(Timer, last_trigger), SD_BUS_VTABLE_PROPERTY_EMITS_CHANGE),
- name: str¶
The name of the system unit, for example nginx.service. In the table printed by the command systemctl list-units, this is the column titled “UNIT”.
- last: int | None¶
Timestamp
- next: int | None¶
Timestamp
- class NameFilter(unit_names: Sequence[str] = ())[source]¶
Bases:
object
This class stores all system unit names (e.g. nginx.service or fstrim.timer) and provides an interface to filter the names by regular expressions.
- static match(unit_name: str, regexes: str | Sequence[str]) bool [source]¶
Match multiple regular expressions against a unit name.
- Parameters:
unit_name – The unit name to be matched.
regexes – A single regular expression (include='.*service') or a list of regular expressions (include=('.*service', '.*mount')).
- Returns:
True if one regular expression matches
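The matching rule described above can be sketched as follows. The function name match_any is hypothetical, and anchoring the pattern at the start of the name with re.match is an assumption; the documentation does not say whether re.match or re.search is used.

```python
import re
from typing import Sequence, Union


def match_any(unit_name: str, regexes: Union[str, Sequence[str]]) -> bool:
    """Return True if at least one regular expression matches unit_name."""
    # Accept a single pattern as well as a sequence of patterns.
    if isinstance(regexes, str):
        regexes = [regexes]
    # Assumption: patterns are anchored at the start of the name (re.match).
    return any(re.match(regex, unit_name) for regex in regexes)


print(match_any("nginx.service", ".*service"))
print(match_any("home.mount", (".*service", ".*mount")))
print(match_any("fstrim.timer", ".*service"))
```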
- add(unit_name: str) None [source]¶
Add one unit name.
- Parameters:
unit_name – The name of the unit, for example apt.timer.
- filter(include: str | Sequence[str] | None = None, exclude: str | Sequence[str] | None = None) Generator[str, None, None] [source]¶
List all unit names or apply filters (include or exclude) to the list of unit names.
- Parameters:
include – If the unit name matches the provided regular expression, it is included in the list of unit names. A single regular expression (include='.*service') or a list of regular expressions (include=('.*service', '.*mount')).
exclude – If the unit name matches the provided regular expression, it is excluded from the list of unit names. A single regular expression (exclude='.*service') or a list of regular expressions (exclude=('.*service', '.*mount')).
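A generator with include and exclude filters might look like the sketch below. The names filter_names and _matches are hypothetical, and two details are assumptions: the include filter is applied before the exclude filter, and patterns are anchored with re.match.

```python
import re
from typing import Iterable, Iterator, Optional, Sequence, Union

Patterns = Union[str, Sequence[str]]


def _matches(name: str, regexes: Patterns) -> bool:
    """Return True if at least one pattern matches the name."""
    if isinstance(regexes, str):
        regexes = [regexes]
    return any(re.match(regex, name) for regex in regexes)


def filter_names(
    names: Iterable[str],
    include: Optional[Patterns] = None,
    exclude: Optional[Patterns] = None,
) -> Iterator[str]:
    """Yield the names that pass the include filter and miss the exclude filter."""
    for name in names:
        if include is not None and not _matches(name, include):
            continue
        if exclude is not None and _matches(name, exclude):
            continue
        yield name


units = ["nginx.service", "fstrim.timer", "home.mount"]
print(list(filter_names(units, include=".*service")))
print(list(filter_names(units, exclude=(".*service", ".*mount"))))
```

Without any filter the generator simply yields every name, which corresponds to the "list all unit names" case.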
- class Cache[source]¶
Bases:
Generic[T]
This class is a container class for systemd units.
- filter(include: str | Sequence[str] | None = None, exclude: str | Sequence[str] | None = None) Generator[T, None, None] [source]¶
List all units or apply filters (include or exclude) to the list of units.
- Parameters:
include – If the unit name matches the provided regular expression, it is included in the list of unit names. A single regular expression (include='.*service') or a list of regular expressions (include=('.*service', '.*mount')).
exclude – If the unit name matches the provided regular expression, it is excluded from the list of unit names. A single regular expression (exclude='.*service') or a list of regular expressions (exclude=('.*service', '.*mount')).
- property count: int¶
- static get_interface_name_from_unit_name(unit_name: str) str [source]¶
- Parameters:
unit_name – for example apt-daily.service
- Returns:
org.freedesktop.systemd1.Service
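The mapping from a unit name to its D-Bus interface name can be sketched by capitalizing the unit type suffix. This is an illustration of the documented input and output only; the function name interface_name_from_unit_name and the rsplit/title approach are assumptions, not the plugin's code.

```python
def interface_name_from_unit_name(unit_name: str) -> str:
    """Derive the systemd D-Bus interface name from a unit name."""
    # The unit type is the part after the last dot, e.g. "service".
    unit_type = unit_name.rsplit(".", 1)[-1]
    # systemd names its unit interfaces after the capitalized unit type.
    return "org.freedesktop.systemd1." + unit_type.title()


print(interface_name_from_unit_name("apt-daily.service"))
```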
- static get_interface_name_from_object_path(object_path: str) str [source]¶
- Parameters:
object_path – for example /org/freedesktop/systemd1/unit/apt_2ddaily_2eservice
- Returns:
org.freedesktop.systemd1.Service
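In the object path, characters such as "-" and "." appear escaped as an underscore followed by two hexadecimal digits (_2d and _2e in the example above). The sketch below decodes the last path component and then derives the interface name. Both function names are hypothetical, and decoding each escape with chr() is a simplification that is only correct for ASCII unit names.

```python
import re


def unit_name_from_object_path(object_path: str) -> str:
    """Decode the escaped unit name in the last component of an object path."""
    escaped = object_path.rsplit("/", 1)[-1]
    # Replace every _XX escape (two hex digits) with the character it encodes.
    return re.sub(
        r"_([0-9a-fA-F]{2})",
        lambda match: chr(int(match.group(1), 16)),
        escaped,
    )


def interface_name_from_object_path(object_path: str) -> str:
    """Derive the systemd D-Bus interface name from a unit object path."""
    unit_type = unit_name_from_object_path(object_path).rsplit(".", 1)[-1]
    return "org.freedesktop.systemd1." + unit_type.title()


path = "/org/freedesktop/systemd1/unit/apt_2ddaily_2eservice"
print(unit_name_from_object_path(path))
print(interface_name_from_object_path(path))
```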
- static is_unit_type(unit_name_or_object_path: str, type_name: Literal['service', 'socket', 'target', 'device', 'mount', 'automount', 'timer', 'swap', 'path', 'slice', 'scope']) bool [source]¶
- abstract property startup_time: float | None¶
- class check_systemd.CliSource[source]¶
Bases:
Source
- class Table(stdout: str)[source]¶
Bases:
object
This class reads the text tables that some systemd commands like systemctl list-units or systemctl list-timers produce.
- header_row: str¶
- column_lengths: list[int]¶
- columns: list[str]¶
- body_rows: list[str]¶
- property row_count: int¶
The number of rows. Only the body rows are counted. The header row is not taken into account.
- check_header(column_header: Sequence[str]) None [source]¶
Check if the specified column names are present in the header row of the text table. Raise an exception if not.
- Parameters:
column_headers – The expected column headers (for example ('UNIT', 'LOAD', 'ACTIVE'))
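Reading such a fixed-width text table can be sketched by taking the column offsets from the header row. The function parse_table, the helper _row and the sample data are hypothetical. Real systemctl output also contains a leading status marker column and a footer, which this simplified sketch does not handle.

```python
import re


def parse_table(stdout: str) -> list:
    """Split a fixed-width text table into one dict per body row."""
    lines = [line for line in stdout.splitlines() if line.strip()]
    header, body = lines[0], lines[1:]
    # Column start offsets are taken from the words of the header row.
    starts = [match.start() for match in re.finditer(r"\S+", header)]
    names = header.split()
    rows = []
    for line in body:
        cells = {}
        for index, name in enumerate(names):
            # The last column extends to the end of the line.
            end = starts[index + 1] if index + 1 < len(starts) else None
            cells[name] = line[starts[index]:end].strip()
        rows.append(cells)
    return rows


def _row(unit: str, load: str, active: str, sub: str, description: str) -> str:
    """Build one aligned sample row (made-up column widths)."""
    return f"{unit:<14}{load:<7}{active:<7}{sub:<8}{description}"


SAMPLE = "\n".join(
    [
        _row("UNIT", "LOAD", "ACTIVE", "SUB", "DESCRIPTION"),
        _row("nginx.service", "loaded", "active", "running", "A web server"),
        _row("ssh.service", "loaded", "failed", "failed", "Secure Shell server"),
    ]
)

for row in parse_table(SAMPLE):
    print(row["UNIT"], row["ACTIVE"])
```

Slicing by header offsets rather than splitting on whitespace keeps multi-word cells such as the description intact.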
- property startup_time: float | None¶
- class check_systemd.GiSource[source]¶
Bases:
CliSource
Data source via D-Bus using the gi (GObject introspection) package.
TODO: Inherit from DataSource if the full D-Bus API is implemented.
This class holds the main entry point object of the D-Bus systemd API. See the section The Manager Object in the systemd D-Bus API.
- class UnitTuple(name, description, load_state, active_state, sub_state, followed_by, unit_object_path, job_id, job_type, job_object_path)[source]¶
Bases:
NamedTuple
- name: str¶
The primary unit name as string, for example
dbus.service
- description: str¶
The human readable description string, for example
D-Bus System Message Bus
- load_state: Literal['stub', 'loaded', 'not-found', 'bad-setting', 'error', 'merged', 'masked']¶
The load state (i.e. whether the unit file has been loaded successfully), for example
loaded
- active_state: Literal['active', 'reloading', 'inactive', 'failed', 'activating', 'deactivating']¶
The active state (i.e. whether the unit is currently started or not), for example
active
- sub_state: Literal['abandoned', 'activating-done', 'activating', 'active', 'auto-restart', 'cleaning', 'condition', 'deactivating-sigkill', 'deactivating-sigterm', 'deactivating', 'dead', 'elapsed', 'exited', 'failed', 'final-sigkill', 'final-sigterm', 'final-watchdog', 'listening', 'mounted', 'mounting-done', 'mounting', 'plugged', 'reload', 'remounting-sigkill', 'remounting-sigterm', 'remounting', 'running', 'start-chown', 'start-post', 'start-pre', 'start', 'stop-post', 'stop-pre-sigkill', 'stop-pre-sigterm', 'stop-pre', 'stop-sigkill', 'stop-sigterm', 'stop-watchdog', 'stop', 'tentative', 'unmounting-sigkill', 'unmounting-sigterm', 'unmounting', 'waiting']¶
The sub state (a more fine-grained version of the active state that is specific to the unit type, which the active state is not), for example
running
- followed_by: str¶
A unit that is being followed in its state by this unit, if there is any, otherwise the empty string, for example
''
- unit_object_path: str¶
The unit object path, for example
/org/freedesktop/systemd1/unit/dbus_2eservice
- job_id: str¶
If there is a job queued for the job unit, the numeric job id, 0 otherwise, for example
0
- job_type: str¶
The job type as string, for example
''
- job_object_path: str¶
The job object path, for example
/
- class Proxy(object_path: str, interface_name: str, user: bool = False)[source]¶
Bases:
object
- property object_path: str¶
- property interface_name: str¶
- class ManagerProxy(user: bool = False)[source]¶
Bases:
Proxy
- property default_target: str¶
- property userspace_timestamp_monotonic: int¶
- class UnitProxy(name: str | None = None, object_path: str | None = None, user: bool = False)[source]¶
Bases:
Proxy
- property active_state: str¶
- property sub_state: str¶
- property load_state: str¶
- property active_enter_timestamp_monotonic: int¶
- class TimerProxy(name: str | None = None, object_path: str | None = None, user: bool = False)[source]¶
Bases:
UnitProxy
- property last: int¶
Timestamp in microseconds
- property next: int¶
Timestamp in microseconds
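A microsecond timestamp can be converted into a datetime object as sketched below. The helper usec_to_datetime is hypothetical. The sketch assumes a realtime (CLOCK_REALTIME) value counted from the Unix epoch and treats 0 as "no such event", as the quoted D-Bus documentation describes for NextElapseUSecRealtime; monotonic values would need different handling.

```python
from datetime import datetime, timezone
from typing import Optional


def usec_to_datetime(usec: int) -> Optional[datetime]:
    """Convert microseconds since the Unix epoch to an aware UTC datetime."""
    # Assumption: 0 means that the timer has no such event.
    if usec == 0:
        return None
    return datetime.fromtimestamp(usec / 1_000_000, tz=timezone.utc)


print(usec_to_datetime(1_700_000_000_000_000))
print(usec_to_datetime(0))
```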
- classmethod get_manager(user: bool = False) ManagerProxy [source]¶
- property manager: ManagerProxy¶
- property startup_time: float | None¶
src/analyze/analyze-time-data.c <https://github.com/systemd/systemd/blob/1f901c24530fb9b111126381a6ea101af8040e65/src/analyze/analyze-time-data.c#L141-L197>
- class check_systemd.OptionContainer[source]¶
Bases:
object
This class has the same attributes as the Namespace instance returned by the argparse package.
- verbose: int¶
- debug: int¶
- ignore_inactive_state: bool¶
- include_unit: str | None¶
- include_type: list[str]¶
- exclude_unit: list[str]¶
- exclude_type: list[str]¶
- expected_state: str | None¶
- scope_timers: bool¶
- timers_warning: int¶
- timers_critical: int¶
- scope_startup_time: bool¶
- warning: int¶
-w, --warning
- critical: int¶
-c, --critical
- user: bool¶
--user
- performance_data: bool¶
- include: list[str]¶
- exclude: list[str]¶
- data_source: Literal['dbus', 'cli'] | None¶
- check_systemd.opts¶
This variable is global so that the command line arguments can be accessed everywhere in the plugin. It stores the result of parse_args(), an instance of the argparse.Namespace class, and is initialized in the main function. The variable is intentionally not named args to avoid confusion with *args (Non-Keyword Arguments).
- exception check_systemd.CheckSystemdError[source]¶
Bases:
Exception
Base class for exceptions in this module. All exceptions are caught by the decorator @nagiosplugin.guarded() on the main function and printed out nicely.
- exception check_systemd.CheckSystemdRegexpError[source]¶
Bases:
CheckSystemdError
Raised when an invalid regular expression is specified.
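An exception hierarchy like this lets callers catch every plugin error through the base class. The sketch below redeclares both exception classes so that it is self-contained; the helper compile_checked and the error message are hypothetical and only illustrate how an invalid regular expression could be translated into the plugin-specific exception.

```python
import re


class CheckSystemdError(Exception):
    """Redeclared for this sketch: base class for the plugin's exceptions."""


class CheckSystemdRegexpError(CheckSystemdError):
    """Redeclared for this sketch: an invalid regular expression was given."""


def compile_checked(pattern: str) -> "re.Pattern[str]":
    """Compile a pattern, translating re.error into the plugin exception."""
    try:
        return re.compile(pattern)
    except re.error as error:
        raise CheckSystemdRegexpError(
            f"Invalid regular expression: '{pattern}'"
        ) from error


try:
    compile_checked("(")  # unbalanced parenthesis
except CheckSystemdError as error:
    print(error)
```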
- class check_systemd.SystemdUnitTypesList(*args: str)[source]¶
Bases:
MutableSequence[str]
- unit_types: list[str]¶
- class check_systemd.TimersResource(source: Source)[source]¶
Bases:
Resource
Resource that calls systemctl list-timers --all on the command line to get information about dead / inactive timers. There is one type of systemd “degradation” which is normally not detected: dead / inactive timers.
- Parameters:
excludes (list) – A list of systemd unit names to exclude from the checks.
- name¶
- class check_systemd.StartupTimeResource(source: Source)[source]¶
Bases:
Resource
Resource that calls systemd-analyze on the command line to get information about the startup time.
- class check_systemd.StartupTimeContext[source]¶
Bases:
ScalarContext
- performance(metric: Metric, resource: Resource) Performance | None [source]¶
Derives performance data.
The metric’s attributes are combined with the local warning and critical ranges to get a fully populated Performance object.
- Parameters:
metric – metric from which performance data are derived
resource – not used
- Returns:
Performance object
- class check_systemd.PerformanceDataContext[source]¶
Bases:
Context
- performance(metric: Metric, resource: Resource) Performance [source]¶
Derives performance data from a given metric.
- Parameters:
metric – associated metric from which performance data are derived
resource – resource that produced the associated metric (may optionally be consulted)
- Returns:
Perfdata object
- class check_systemd.SystemdSummary[source]¶
Bases:
Summary
Format the different status lines. A subclass of nagiosplugin.Summary.
- ok(results: Results) str [source]¶
Formats status line when overall state is ok.
- Parameters:
results – Results container
- Returns:
status line
- check_systemd.convert_to_regexp_list(regexp: Sequence[str] | None = None, unit_names: str | Sequence[str] | None = None, unit_types: Sequence[str] | None = None) set[str] [source]¶
- check_systemd.normalize_argparser(opts: Namespace) OptionContainer [source]¶
- check_systemd.main() None [source]¶
The main entry point of the monitoring plugin. First the command line arguments are read into the variable opts. The configuration of this opts object decides which instances of the Resource, Context and Summary subclasses are assembled in a list called tasks. This list is passed to the main class of the nagiosplugin library: the Check class.