Welcome to check_systemd’s documentation!
check_systemd is a Nagios / Icinga monitoring plugin to check systemd. This Python script reports a degraded system to your monitoring solution. It can also be used to monitor individual systemd services (with the -u, --unit parameter) and timer units (with the -t, --dead-timers parameter).
To learn more about the project, please visit the repository on GitHub.
Monitoring scopes
- units: State of units
- timers: Timers
- startup_time: Startup time
- performance_data: Performance data
Data sources
- D-Bus (dbus)
- Command line interface (cli)
This plugin is based on a Python package named nagiosplugin. nagiosplugin has a fine-grained class model to separate concerns. A Nagios / Icinga plugin must perform three steps: data acquisition, evaluation and presentation. nagiosplugin provides one class for each of these steps: Resource, Context and Summary. check_systemd extends these three model classes with the following subclasses:
Acquisition (Resource)
- UnitsResource (context=units)
- TimersResource (context=timers)
- StartupTimeResource (context=startup_time)
- PerformanceDataResource (context=performance_data)
Evaluation (Context)
- UnitsContext (context=units)
- TimersContext (context=timers)
- StartupTimeContext (context=startup_time)
- PerformanceDataContext (context=performance_data)
Presentation (Summary)
- SystemdSummary
- check_systemd.is_gi
True if the package PyGObject (gi) is available.
- class check_systemd.OptionContainer[source]
Bases: object
This class has the same attributes as the Namespace instance returned by the argparse package.
- verbose: int
- debug: int
- ignore_inactive_state: bool
- include_unit: str | None
- include_type: list[str]
- exclude_unit: list[str]
- exclude_type: list[str]
- expected_state: str | None
- scope_timers: bool
- timers_warning: float
- timers_critical: float
- scope_startup_time: bool
- warning: float
- critical: float
- with_user_units: bool
- performance_data: bool
- include: list[str]
- exclude: list[str]
- data_source: Literal['dbus', 'cli'] | None
- check_systemd.opts
This variable is global so that the command line arguments can be accessed everywhere in the plugin. It stores the result of parse_args(), which is an instance of the argparse.Namespace class, and it is initialized in the main function. The variable is intentionally not named args to avoid confusion with *args (Non-Keyword Arguments).
- class check_systemd.DbusManager[source]
Bases: object
This class holds the main entry point object of the D-Bus systemd API. See the section The Manager Object in the systemd D-Bus API.
- property manager: DBusProxy
- check_systemd.dbus_manager
The systemd D-Bus API main entry point object, the so-called “manager”.
- check_systemd.format_timespan_to_seconds(fmt_timespan: str) → float [source]
Convert a timespan format string into seconds. Take a look at the systemd time-util.c source code.
- Parameters:
fmt_timespan – for example 2.345s, 3min 45.234s, 34min left or 2 months 8 days
- Returns:
The seconds
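The conversion described above can be illustrated with a minimal sketch. It is not the plugin's implementation: the function name timespan_to_seconds, the unit table and the regular expression are assumptions made for this example. The unit names follow the ones documented in systemd.time(7), and a month is taken as one twelfth of a 365.25-day year, as systemd does.

```python
import re

# Seconds per unit. The names follow systemd.time(7); the table itself is an
# assumption of this sketch, not copied from check_systemd.
_SECONDS_PER_UNIT = {
    "usec": 1e-6, "us": 1e-6,
    "msec": 1e-3, "ms": 1e-3,
    "seconds": 1, "second": 1, "sec": 1, "s": 1,
    "minutes": 60, "minute": 60, "min": 60, "m": 60,
    "hours": 3600, "hour": 3600, "hr": 3600, "h": 3600,
    "days": 86400, "day": 86400, "d": 86400,
    "weeks": 604800, "week": 604800, "w": 604800,
    # A month is 1/12 of a 365.25-day year.
    "months": 2629800, "month": 2629800, "M": 2629800,
    "years": 31557600, "year": 31557600, "y": 31557600,
}

# A number (optionally with a fraction) followed by a unit word.
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)\s*([A-Za-z]+)")


def timespan_to_seconds(fmt_timespan: str) -> float:
    """Sum up all "<number><unit>" pairs found in the timespan string.

    Trailing words without a number, such as "left" or "ago", are ignored
    because they never match the token pattern.
    """
    tokens = _TOKEN.findall(fmt_timespan)
    if not tokens:
        raise ValueError(f"No timespan found in {fmt_timespan!r}")
    total = 0.0
    for value, unit in tokens:
        if unit not in _SECONDS_PER_UNIT:
            raise ValueError(f"Unknown time unit {unit!r}")
        total += float(value) * _SECONDS_PER_UNIT[unit]
    return total
```

With this sketch, "34min left" yields 2040.0 seconds and "2 months 8 days" yields 5950800.0 seconds.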
- check_systemd.execute_cli(args: str | Sequence[str]) → str | None [source]
Execute a command on the command line (cli = command line interface) and capture the stdout. This is a wrapper around subprocess.Popen.
- Parameters:
args – A list of program arguments.
- Raises:
nagiosplugin.CheckError – If the command produces some stderr output or if an OSError exception occurs.
- Returns:
The stdout of the command.
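The behaviour described above can be sketched with the standard library alone. The names run_and_capture and CommandError are hypothetical: CommandError stands in for nagiosplugin.CheckError so that the example has no third-party dependency, and the decoding details are assumptions rather than the plugin's actual code.

```python
import subprocess
import sys
from typing import Optional, Sequence, Union


class CommandError(RuntimeError):
    """Hypothetical stand-in for nagiosplugin.CheckError."""


def run_and_capture(args: Union[str, Sequence[str]]) -> Optional[str]:
    """Run a command and return its stdout, or None if it printed nothing."""
    try:
        process = subprocess.Popen(
            args,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        stdout, stderr = process.communicate()
    except OSError as error:
        # For example: the executable does not exist.
        raise CommandError(str(error)) from error
    if stderr:
        # Any output on stderr is treated as a failure.
        raise CommandError(stderr.decode("utf-8", errors="replace"))
    if stdout:
        return stdout.decode("utf-8", errors="replace")
    return None


# Usage: run the current Python interpreter as a harmless test command.
print(run_and_capture([sys.executable, "-c", "print('ok')"]))
```

Treating any stderr output as an error is strict, but it matches the documented contract and keeps a monitoring check from silently reporting on partial data.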
- class check_systemd.TableParser(stdout: str)[source]
Bases: object
This class reads the text tables that some systemd commands like systemctl list-units or systemctl list-timers produce.
- header_row: str
- column_lengths: list[int]
- columns: list[str]
- body_rows: list[str]
- property row_count
The number of rows. Only the body rows are counted. The header row is not taken into account.
- check_header(column_header: Sequence[str]) → None [source]
Check if the specified column names are present in the header row of the text table. Raise an exception if not.
- Parameters:
column_header – The expected column headers (for example ('UNIT', 'LOAD', 'ACTIVE'))
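To show how such a fixed-width text table can be read, here is a simplified sketch. The class SimpleTableParser, its get_row method, the lower-casing of column names and the sample table are all assumptions made for illustration; only the attribute names header_row, column_lengths, columns, body_rows, row_count and check_header mirror the documentation above. Real systemctl output may contain extra markers that this sketch does not handle.

```python
import re
from typing import Dict, List, Sequence


class SimpleTableParser:
    """Hypothetical, simplified reader for fixed-width text tables."""

    def __init__(self, stdout: str) -> None:
        rows = stdout.splitlines()
        self.header_row: str = rows[0]
        # Each column spans a header word plus the padding that follows it.
        self.column_lengths: List[int] = [
            len(match.group(0))
            for match in re.finditer(r"\S+\s*", self.header_row)
        ]
        self.columns: List[str] = [
            name.lower() for name in self.header_row.split()
        ]
        # The table body ends at the first blank line (a footer may follow).
        self.body_rows: List[str] = []
        for row in rows[1:]:
            if not row.strip():
                break
            self.body_rows.append(row)

    @property
    def row_count(self) -> int:
        """Number of body rows; the header row is not counted."""
        return len(self.body_rows)

    def check_header(self, column_header: Sequence[str]) -> None:
        """Raise ValueError if an expected column is missing."""
        for name in column_header:
            if name.lower() not in self.columns:
                raise ValueError(
                    f"Column {name!r} not found in {self.header_row!r}"
                )

    def get_row(self, index: int) -> Dict[str, str]:
        """Slice one body row into cells using the header's column widths."""
        row = self.body_rows[index]
        result: Dict[str, str] = {}
        start = 0
        for position, name in enumerate(self.columns):
            is_last = position == len(self.columns) - 1
            # The last column takes the rest of the line.
            end = None if is_last else start + self.column_lengths[position]
            result[name] = row[start:end].strip()
            start += self.column_lengths[position]
        return result


def _row(*cells: str) -> str:
    """Build one fixed-width sample line (column widths 15, 7, 7, 8)."""
    return "{:<15}{:<7}{:<7}{:<8}{}".format(*cells)


# Made-up sample data in the layout of "systemctl list-units".
SAMPLE = "\n".join([
    _row("UNIT", "LOAD", "ACTIVE", "SUB", "DESCRIPTION"),
    _row("nginx.service", "loaded", "active", "running", "A web server"),
    _row("fstrim.timer", "loaded", "active", "waiting", "Discard unused blocks"),
    "",
    "2 loaded units listed.",
])

parser = SimpleTableParser(SAMPLE)
parser.check_header(("UNIT", "LOAD", "ACTIVE"))
print(parser.row_count, parser.get_row(0))
```

Deriving the column widths from the header row keeps the parser independent of the actual unit names, which may vary in length from system to system.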
- exception check_systemd.CheckSystemdError[source]
Bases: Exception
Base class for exceptions in this module. All exceptions are caught by the decorator @nagiosplugin.guarded() on the main function and printed out nicely.
- exception check_systemd.CheckSystemdRegexpError[source]
Bases: CheckSystemdError
Raised when an invalid regular expression is specified.
- check_systemd.match_multiple(unit_name: str, regexes: str | Sequence[str]) → bool [source]
Match multiple regular expressions against a unit name.
- Parameters:
unit_name – The unit name to be matched.
regexes – A single regular expression (include='.*service') or a list of regular expressions (include=('.*service', '.*mount')).
- Returns:
True if one regular expression matches
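A minimal sketch of this kind of matching follows. The function name matches_any is hypothetical, the use of re.match (anchored at the start of the name) is an assumption, and ValueError stands in for CheckSystemdRegexpError.

```python
import re
from typing import Sequence, Union


def matches_any(unit_name: str, regexes: Union[str, Sequence[str]]) -> bool:
    """Return True if at least one regular expression matches the unit name."""
    if isinstance(regexes, str):
        # Accept a single pattern as well as a sequence of patterns.
        regexes = [regexes]
    for regex in regexes:
        try:
            if re.match(regex, unit_name):
                return True
        except re.error as error:
            raise ValueError(f"Invalid regular expression: {regex!r}") from error
    return False
```

For example, matches_any('nginx.service', '.*service') is true, while matches_any('nginx.service', ('.*timer', '.*mount')) is false.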
- check_systemd.ActiveState
From the D-Bus interface of systemd documentation:
ActiveState contains a state value that reflects whether the unit is currently active or not. The following states are currently defined: active, reloading, inactive, failed, activating, and deactivating.
active indicates that unit is active (obviously…).
reloading indicates that the unit is active and currently reloading its configuration.
inactive indicates that it is inactive and the previous run was successful or no previous run has taken place yet.
failed indicates that it is inactive and the previous run was not successful (more information about the reason for this is available on the unit type specific interfaces, for example for services in the Result property, see below).
activating indicates that the unit has previously been inactive but is currently in the process of entering an active state.
Conversely deactivating indicates that the unit is currently in the process of deactivation.
alias of Literal[‘active’, ‘reloading’, ‘inactive’, ‘failed’, ‘activating’, ‘deactivating’]
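Because the state types are Literal aliases, the allowed values can also be inspected at runtime. The following sketch assumes Python 3.8 or later; the alias is redefined locally and the helper is_active_state is hypothetical and not part of check_systemd.

```python
from typing import Literal, get_args

# Local copy of the alias documented above.
ActiveState = Literal[
    "active", "reloading", "inactive", "failed", "activating", "deactivating"
]


def is_active_state(value: str) -> bool:
    """Check whether a string is one of the six defined ActiveState values."""
    return value in get_args(ActiveState)
```

A check like this can guard values parsed from command line output, where a sub state such as running could otherwise be confused with an active state.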
- check_systemd.SubState
From the D-Bus interface of systemd documentation:
SubState encodes states of the same state machine that ActiveState covers, but knows more fine-grained states that are unit-type-specific. Where ActiveState only covers six high-level states, SubState covers possibly many more low-level unit-type-specific states that are mapped to the six high-level states. Note that multiple low-level states might map to the same high-level state, but not vice versa. Not all high-level states have low-level counterparts on all unit types.
All sub states are listed in the file basic/unit-def.c of the systemd source code:
- automount: dead, waiting, running, failed
- device: dead, tentative, plugged
- mount: dead, mounting, mounting-done, mounted, remounting, unmounting, remounting-sigterm, remounting-sigkill, unmounting-sigterm, unmounting-sigkill, failed, cleaning
- path: dead, waiting, running, failed
- scope: dead, running, abandoned, stop-sigterm, stop-sigkill, failed
- service: dead, condition, start-pre, start, start-post, running, exited, reload, stop, stop-watchdog, stop-sigterm, stop-sigkill, stop-post, final-watchdog, final-sigterm, final-sigkill, failed, auto-restart, cleaning
- slice: dead, active
- socket: dead, start-pre, start-chown, start-post, listening, running, stop-pre, stop-pre-sigterm, stop-pre-sigkill, stop-post, final-sigterm, final-sigkill, failed, cleaning
- swap: dead, activating, activating-done, active, deactivating, deactivating-sigterm, deactivating-sigkill, failed, cleaning
- target: dead, active
- timer: dead, waiting, running, elapsed, failed
alias of Literal
[‘abandoned’, ‘activating-done’, ‘activating’, ‘active’, ‘auto-restart’, ‘cleaning’, ‘condition’, ‘deactivating-sigkill’, ‘deactivating-sigterm’, ‘deactivating’, ‘dead’, ‘elapsed’, ‘exited’, ‘failed’, ‘final-sigkill’, ‘final-sigterm’, ‘final-watchdog’, ‘listening’, ‘mounted’, ‘mounting-done’, ‘mounting’, ‘plugged’, ‘reload’, ‘remounting-sigkill’, ‘remounting-sigterm’, ‘remounting’, ‘running’, ‘start-chown’, ‘start-post’, ‘start-pre’, ‘start’, ‘stop-post’, ‘stop-pre-sigkill’, ‘stop-pre-sigterm’, ‘stop-pre’, ‘stop-sigkill’, ‘stop-sigterm’, ‘stop-watchdog’, ‘stop’, ‘tentative’, ‘unmounting-sigkill’, ‘unmounting-sigterm’, ‘unmounting’, ‘waiting’]
- check_systemd.LoadState
From the D-Bus interface of systemd documentation:
LoadState contains a state value that reflects whether the configuration file of this unit has been loaded. The following states are currently defined: loaded, error and masked.
loaded indicates that the configuration was successfully loaded.
error indicates that the configuration failed to load, the LoadError field contains information about the cause of this failure.
masked indicates that the unit is currently masked out (i.e. symlinked to /dev/null or suchlike).
Note that the LoadState is fully orthogonal to the ActiveState (see below) as units without valid loaded configuration might be active (because configuration might have been reloaded at a time where a unit was already active).
alias of Literal[‘loaded’, ‘error’, ‘masked’]
- class check_systemd.Unit(**kwargs)[source]
Bases: object
This class bundles all state-related information of a systemd unit in one object. It is inherited by the class DbusUnit, which overrides the attributes with properties.
- name: str
The name of the systemd unit, for example nginx.service. In the table printed by the command systemctl list-units, the column containing the unit names is titled “UNIT”.
- active_state: ActiveState
- sub_state: SubState
- load_state: LoadState
- class check_systemd.UnitNameFilter(unit_names: Sequence[str] = ())[source]
Bases: object
This class stores all systemd unit names (e. g. nginx.service or fstrim.timer) and provides an interface to filter the names by regular expressions.
- add(unit_name: str) → None [source]
Add one unit name.
- Parameters:
unit_name – The name of the unit, for example apt.timer.
- list(include: str | Sequence[str] | None = None, exclude: str | Sequence[str] | None = None) → Generator[str, None, None] [source]
List all unit names or apply filters (include or exclude) to the list of unit names.
- Parameters:
include – If the unit name matches the provided regular expression, it is included in the list of unit names. A single regular expression (include='.*service') or a list of regular expressions (include=('.*service', '.*mount')).
exclude – If the unit name matches the provided regular expression, it is excluded from the list of unit names. A single regular expression (exclude='.*service') or a list of regular expressions (exclude=('.*service', '.*mount')).
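The include / exclude filtering can be sketched as follows. The class NameFilter is a hypothetical, simplified stand-in for UnitNameFilter: storing the names in a set, yielding them in sorted order and matching with re.match are assumptions of this example, not documented behaviour.

```python
import re
from typing import Generator, Optional, Sequence, Union

Patterns = Optional[Union[str, Sequence[str]]]


def _matches_any(name: str, patterns: Union[str, Sequence[str]]) -> bool:
    """True if at least one pattern matches the start of the name."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return any(re.match(pattern, name) for pattern in patterns)


class NameFilter:
    """Hypothetical, simplified stand-in for UnitNameFilter."""

    def __init__(self, unit_names: Sequence[str] = ()) -> None:
        self._names = set(unit_names)

    def add(self, unit_name: str) -> None:
        self._names.add(unit_name)

    def list(
        self, include: Patterns = None, exclude: Patterns = None
    ) -> Generator[str, None, None]:
        for name in sorted(self._names):
            # Skip names that do not match any include pattern.
            if include and not _matches_any(name, include):
                continue
            # Skip names that match an exclude pattern.
            if exclude and _matches_any(name, exclude):
                continue
            yield name


# Usage with made-up unit names.
names = NameFilter(["nginx.service", "fstrim.timer", "home.mount"])
names.add("apt.timer")
print(list(names.list(include=".*timer")))
```

Applying the include filter before the exclude filter means that an exclude pattern can still remove a name that an include pattern selected.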
- class check_systemd.UnitCache[source]
Bases: object
This class is a container class for systemd units.
- add_unit(unit: Unit | None = None, name: str | None = None, active_state: ActiveState | None = None, sub_state: SubState | None = None, load_state: LoadState | None = None) → Unit [source]
- list(include: str | Sequence[str] | None = None, exclude: str | Sequence[str] | None = None) → Generator[Unit, None, None] [source]
List all units or apply filters (include or exclude) to the list of units.
- Parameters:
include – If the unit name matches the provided regular expression, the unit is included in the list of units. A single regular expression (include='.*service') or a list of regular expressions (include=('.*service', '.*mount')).
exclude – If the unit name matches the provided regular expression, the unit is excluded from the list of units. A single regular expression (exclude='.*service') or a list of regular expressions (exclude=('.*service', '.*mount')).
- property count: int
- check_systemd.unit_cache: UnitCache
An instance of DbusUnitCache or CliUnitCache
- class check_systemd.TimersResource[source]
Bases: Resource
Resource that calls systemctl list-timers --all on the command line to get information about dead / inactive timers. There is one type of systemd “degradation” which is normally not detected: dead / inactive timers.
- Parameters:
excludes (list) – A list of systemd unit names to exclude from the checks.
- name
- class check_systemd.StartupTimeResource[source]
Bases: Resource
Resource that calls systemd-analyze on the command line to get information about the startup time.
- class check_systemd.StartupTimeContext[source]
Bases: ScalarContext
- performance(metric: Metric, resource: Resource)[source]
Derives performance data.
The metric’s attributes are combined with the local warning and critical ranges to get a fully populated Performance object.
- Parameters:
metric – metric from which performance data are derived
resource – not used
- Returns:
Performance object
- class check_systemd.SystemdSummary[source]
Bases: Summary
Format the different status lines. A subclass of nagiosplugin.Summary.
- ok(results: Results) → str [source]
Formats status line when overall state is ok.
- Parameters:
results – Results container
- Returns:
status line
- check_systemd.convert_to_regexp_list(regexp: Sequence[str] | None = None, unit_names: str | Sequence[str] | None = None, unit_types: Sequence[str] | None = None) → set[str] [source]
- check_systemd.normalize_argparser(opts: Namespace) → OptionContainer [source]
- check_systemd.main() → None [source]
The main entry point of the monitoring plugin. First the command line arguments are read into the variable opts. The configuration of this opts object decides which instances of the Resource, Context and Summary subclasses are assembled in a list called tasks. This list is passed to the main class of the nagiosplugin library: the Check class.