Action chains implementation
This page describes the implementation of action chains for regular Salt minions and for SSH minions.
The action chain implementation for Salt uses the same database tables used by the traditional clients.
+------------------------------+ +----------------------------------------+
| | | |
| rhnActionChain | | rhnActionChainEntry |
| | | |
-------------------------------- ------------------------------------------
| | | |
| id | | actionchain_id -> rhnActionChain(id) |
| label | 1 * | action_id -> rhnAction(id) |
| user_id -> web_contact(id) <-------------------- server_id -> rhnServer(id) |
| created | | sort_order |
| modified | | created |
| | | modified |
| | | |
| | | |
| | | |
| | | |
+------------------------------+ +----------------------------------------+
An action chain has one or more entries. Each entry points to an `Action` and to a `Server` entity. One action is created for each server, even if the actions are added from SSM and target multiple servers. The `Action` doesn't have any corresponding `ServerAction` until it's executed. The target server for the action is stored in the `ActionChainEntry`.

The entries are ordered according to the field `sort_order`.
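The relationship between the two tables can be sketched in Python. This is only an illustration of the schema shown above: the class names and the helper `entries_for_server` are assumptions made for this example and are not taken from the Uyuni code base.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ActionChainEntry:
    # Mirrors the columns of rhnActionChainEntry shown in the diagram.
    actionchain_id: int
    action_id: int
    server_id: int
    sort_order: int


@dataclass
class ActionChain:
    # Mirrors rhnActionChain; entries is the "many" side of the relation.
    id: int
    label: str
    user_id: int
    entries: List[ActionChainEntry] = field(default_factory=list)

    def entries_for_server(self, server_id: int) -> List[ActionChainEntry]:
        """Hypothetical helper: entries for one server, ordered by sort_order."""
        own = [e for e in self.entries if e.server_id == server_id]
        return sorted(own, key=lambda e: e.sort_order)


chain = ActionChain(id=45, label="example chain", user_id=1)
# One Action per server, even when added from SSM for several servers.
chain.entries += [
    ActionChainEntry(45, action_id=103, server_id=1000, sort_order=1),
    ActionChainEntry(45, action_id=101, server_id=1000, sort_order=0),
    ActionChainEntry(45, action_id=102, server_id=1001, sort_order=0),
]
ordered_ids = [e.action_id for e in chain.entries_for_server(1000)]
```

The target server lives on the entry rather than on the action, which is why the lookup filters entries by `server_id` before sorting.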
When the action chain is executed by Taskomatic:
- `ServerAction`s are created to store the result of the execution
- for each minion an `.sls` file is generated that contains states for each action in the chain. This `.sls` file is then applied to the target minion by a custom module, `mgractionchains`. Under the hood this module does a `state.sls <generated sls>`.
- the action chain is deleted from the database

The steps are the following:
1. The user creates the action chain by adding one or more actions to it. `ActionChainEntry` objects are stored in the database, one for each target server. No `ServerAction`s are created yet.
2. The user schedules the execution at a certain date and time.
3. Taskomatic executes the action chain at the scheduled time by invoking `MinionActionChainExecutor`:

   1. For each minion, the `ServerAction` and the `LocalCall` objects are created.

   2. The `LocalCall` objects are converted into `SaltState` objects. Reboot actions are converted into `SaltSystemReboot` objects while all other actions are converted into `SaltModuleRun` objects.

   3. For `salt-ssh` minions, include additional files in the state tarball that gets copied over to the SSH minion.

      By default, `salt-ssh` tries to figure out what files to include in the state tarball, besides the `.sls` to be applied. However, Uyuni typically uses `mgrcompat.module_run` + `state.apply` to apply the `.sls` file that corresponds to a particular action type. E.g.:

      ```yaml
      mgr_actionchain_131_action_1_chunk_1:
        mgrcompat.module_run:
          - name: state.apply
          - mods: remotecommands
          . . .
      ```

      Any state applied this way can't be inspected by Salt for things like includes, `salt://...` URLs, etc. Therefore any additional files referenced by the state applied via `mgrcompat.module_run` + `state.apply` need to be included explicitly.

   4. The `SaltState` objects are rendered into `.sls` files. The resulting `.sls` files contain states for each `Action`, with one state per `Action`.

      If one of the actions is a reboot action or a package upgrade action that touches the `salt-minion` package, the resulting `.sls` is split into two or more chunks depending on the number of reboot/package upgrade actions.

      The resulting files are written to `/srv/susemanager/salt/actionchains/`. The filename has the format `actionchain_<CHAIN_ID>_<MACHINE_ID>_<CHUNK>.sls`.

      Example 1: An action on two minions is added from SSM to an action chain. The resulting files are:

      ```
      /srv/susemanager/salt/actionchains/actionchain_45_df00a3b56f3aa159746b8c835eaaeede_1.sls
      /srv/susemanager/salt/actionchains/actionchain_45_dec27e1bc3cffa7749c965c15eaca15c_1.sls
      ```

      Example 2: A simple action chain:

      - Action 1: Run script
      - Action 2: Apply highstate

      is rendered to:

      ```yaml
      mgr_actionchain_131_action_1_chunk_1:
        mgrcompat.module_run:
          - name: state.apply
          - mods: remotecommands
          - kwargs:
              pillar:
                mgr_remote_cmd_runas: joe
                mgr_remote_cmd_script: salt://scripts/script_1.sh
      mgr_actionchain_131_action_2_chunk_1:
        mgrcompat.module_run:
          - name: state.apply
          - require:
            - mgrcompat: mgr_actionchain_131_action_1_chunk_1
      ```

      Example 3: An action chain containing a reboot action:

      - Action 1: Run script
      - Action 2: Reboot
      - Action 3: Apply highstate

      is rendered to:

      Chunk 1 (`actionchain_45_df00a3b56f3aa159746b8c835eaaeede_1.sls`):

      ```yaml
      mgr_actionchain_131_action_1_chunk_1:
        mgrcompat.module_run:
          - name: state.apply
          - mods: remotecommands
          - kwargs:
              pillar:
                mgr_remote_cmd_runas: foobar
                mgr_remote_cmd_script: salt://scripts/script_1.sh
      mgr_actionchain_131_action_2_chunk_1:
        mgrcompat.module_run:
          - name: system.reboot
          - at_time: 1
          - require:
            - mgrcompat: mgr_actionchain_131_action_1_chunk_1
      schedule_next_chunk:
        mgrcompat.module_run:
          - name: mgractionchains.next
          - actionchain_id: 131
          - chunk: 2
          - next_action_id: 3
          - require:
            - mgrcompat: mgr_actionchain_131_action_2_chunk_1
      ```

      Chunk 2 (`actionchain_45_df00a3b56f3aa159746b8c835eaaeede_2.sls`):

      ```yaml
      mgr_actionchain_131_action_3_chunk_2:
        mgrcompat.module_run:
          - name: state.apply
      ```

   5. Split the minions into regular minions and `salt-ssh` minions.

   6. Invoke Salt to execute the action chain by calling the custom module `mgractionchains.start`. This calculates the name of the chunk based on the action chain id and the machine id of the minion and does a `state.sls <CHUNK>` to apply the file generated in step 4.

      For regular minions the execution happens asynchronously.

      For `salt-ssh` minions the execution is synchronous.

   7. The action chain is deleted from the database.

      For regular minions this happens once all the Salt calls are made. There is no waiting for the Salt jobs to return because regular minions operate asynchronously.

      For SSH minions the delete happens once all the `salt-ssh` calls complete and the response is returned. This is because the `salt-api` works synchronously for SSH minions.

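The splitting of a chain into chunks and the file naming described above can be sketched roughly as follows. This is a minimal illustration, not the actual rendering code: the function names are hypothetical, and the sketch assumes that an action which forces a split (a reboot, or an upgrade touching `salt-minion`) is the last action of its chunk, as in Example 3. How a chain that ends with such an action is handled is also an assumption here (no empty trailing chunk is produced).

```python
from typing import List, Tuple

# Directory and file name format as documented above.
ACTIONCHAIN_DIR = "/srv/susemanager/salt/actionchains/"


def chunk_filename(chain_id: int, machine_id: str, chunk: int) -> str:
    """Build actionchain_<CHAIN_ID>_<MACHINE_ID>_<CHUNK>.sls."""
    return f"actionchain_{chain_id}_{machine_id}_{chunk}.sls"


def split_into_chunks(actions: List[Tuple[int, bool]]) -> List[List[int]]:
    """Split (action_id, ends_chunk) pairs into lists of action ids.

    ends_chunk is True for a reboot or a salt-minion package upgrade;
    such an action closes the current chunk (assumption, see lead-in).
    """
    chunks: List[List[int]] = [[]]
    for action_id, ends_chunk in actions:
        chunks[-1].append(action_id)
        if ends_chunk:
            chunks.append([])
    if not chunks[-1]:
        chunks.pop()  # no empty trailing chunk (also covers an empty chain)
    return chunks


# Example 3: run script, reboot, apply highstate.
chunks = split_into_chunks([(1, False), (2, True), (3, False)])
files = [
    ACTIONCHAIN_DIR + chunk_filename(45, "df00a3b56f3aa159746b8c835eaaeede", n)
    for n, _ in enumerate(chunks, start=1)
]
```

With the Example 3 input, the script and the reboot land in the first chunk and the highstate in the second, each chunk getting its own file.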
As mentioned above, the generated `.sls` file is split into multiple chunks if the action chain contains reboot actions or, for regular minions, upgrade actions that affect the `salt-minion` package.

The resume mechanism is different for regular minions and for SSH minions.

For regular minions:

1. The first chunk is executed/applied by `mgractionchains.start`.

2. If there are multiple chunks, the next chunk to be applied is saved in a local file on the minion (`/etc/salt/minion.d/_mgractionchains.conf`). This is done by adding a call to `mgractionchains.next` as the last state in the `.sls` file of the chunk:

   ```yaml
   ...
   schedule_next_chunk:
     mgrcompat.module_run:
       - name: mgractionchains.next
       - actionchain_id: <CHAIN_ID>
       - chunk: <NEXT_CHUNK>
       - next_action_id: <FIRST_ACTION_ID_IN_THE_NEXT_CHUNK>
       - require:
         - mgrcompat: <LAST_ACTION_IN_THIS_CHUNK>
   ```

3. After a reboot or `salt-minion` package upgrade, the minion service is started and the `minion/start` event is fired. This triggers a reactor on the master which calls the `mgractionchains.resume` module on the minion.

   The `mgractionchains.resume` module reads the file `/etc/salt/minion.d/_mgractionchains.conf` to get the `<NEXT_CHUNK>`, deletes the file and then does a `state.sls <NEXT_CHUNK>`.

   The reactor is configured in `/etc/salt/master.d/susemanager.conf`:

   ```yaml
   ...
   reactor:
     - 'salt/minion/*/start':
       - /usr/share/susemanager/reactor/resume_action_chain.sls
   ...
   ```

If the next chunk again contains a reboot, steps 2 and 3 are repeated.

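The save-then-resume cycle for regular minions can be illustrated with a small, self-contained Python sketch. It is not the code of the `mgractionchains` module: the function names, the JSON file format and the `apply_chunk` callback (standing in for `state.sls <NEXT_CHUNK>`) are all assumptions made for this example, and a temporary directory replaces `/etc/salt/minion.d/`.

```python
import json
import os
import tempfile
from typing import Callable, Optional


def save_next_chunk(conf_path: str, actionchain_id: int, chunk: int,
                    next_action_id: int) -> None:
    """Record which chunk to apply once the minion is back (hypothetical format)."""
    with open(conf_path, "w") as fh:
        json.dump({"actionchain_id": actionchain_id, "chunk": chunk,
                   "next_action_id": next_action_id}, fh)


def resume(conf_path: str,
           apply_chunk: Callable[[int, int], None]) -> Optional[dict]:
    """Read the pending chunk, delete the file, then apply the chunk."""
    if not os.path.exists(conf_path):
        return None  # nothing pending, e.g. an ordinary minion restart
    with open(conf_path) as fh:
        pending = json.load(fh)
    os.remove(conf_path)  # deleted before applying, as described above
    apply_chunk(pending["actionchain_id"], pending["chunk"])
    return pending


applied = []
with tempfile.TemporaryDirectory() as tmp:
    conf = os.path.join(tmp, "_mgractionchains.conf")
    save_next_chunk(conf, actionchain_id=131, chunk=2, next_action_id=3)
    first = resume(conf, lambda chain, chunk: applied.append((chain, chunk)))
    second = resume(conf, lambda chain, chunk: applied.append((chain, chunk)))
```

Because the file is removed before the chunk is applied, a second start event without a newly saved chunk does nothing, which is what lets the reactor fire on every minion start.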
For SSH minions, the generated `.sls` is split into chunks only if there's a reboot action present in the chain. Upgrading the `salt-minion` package doesn't affect an SSH minion, so there's no need to split in that case.

1. Check first if there are any pending action chains to be resumed by calling `mgractionchains.get_pending_resume`.
2. If there's no pending action chain to be resumed, execute the first chunk by calling `mgractionchains.start` synchronously. This may trigger a reboot. If there's already an action chain to be resumed on a minion, set the `ServerAction`s of that minion to failed to avoid concurrent execution.
3. Handle the execution result and update the corresponding `ServerAction`s. The reboot action is set to `STATUS_PICKED_UP`.
4. The `SSHPushWorkerSalt` checks periodically for SSH minions that have any of these:
   - reboot actions older than 4 minutes
   - queued actions that have a completed reboot action as a prerequisite
5. Call `mgractionchains.get_pending_resume` on each of the minions found in the previous step to get the information on which action chain and chunk to resume.
6. Call `actionchains.resumessh` synchronously and handle the results, updating the corresponding `ServerAction`s.

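The periodic selection of SSH minions to check for a pending resume can be sketched as below. This is only an illustration: the function and parameter names are hypothetical, and the sketch assumes that the age of a reboot action is measured from a single timestamp passed in by the caller, which the text above does not specify.

```python
from datetime import datetime, timedelta
from typing import Optional

# "Reboot actions older than 4 minutes", as stated in the list above.
REBOOT_MIN_AGE = timedelta(minutes=4)


def needs_resume_check(now: datetime,
                       reboot_action_time: Optional[datetime],
                       has_queued_after_completed_reboot: bool) -> bool:
    """Return True if the minion matches either selection criterion."""
    old_reboot = (
        reboot_action_time is not None
        and now - reboot_action_time > REBOOT_MIN_AGE
    )
    return old_reboot or has_queued_after_completed_reboot


now = datetime(2024, 1, 1, 12, 0)
recent = needs_resume_check(now, now - timedelta(minutes=1), False)
stale = needs_resume_check(now, now - timedelta(minutes=5), False)
queued = needs_resume_check(now, None, True)
```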
Independent of steps 4-6, the `SSHPushWorkerSalt` always updates system information like the kernel version and the uptime. It also sets reboot actions to `STATUS_COMPLETED` if one of the following holds:

- the `ServerAction` is in `STATUS_PICKED_UP` and the boot time is after the action pickup time
- the `ServerAction` is in `STATUS_PICKED_UP` but the pickup time is missing and the boot time is after the schedule time of the `Action.earliestAction`
- the action is in `STATUS_QUEUED` and the boot time is after `Action.earliestAction`

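The three conditions for marking a reboot action as completed can be written as a single predicate. This is a sketch of the rule as stated above, not the code of `SSHPushWorkerSalt`: the function name, the string status values and the parameter names are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional

# Placeholder values; the real status constants are not shown in this page.
STATUS_QUEUED = "QUEUED"
STATUS_PICKED_UP = "PICKED_UP"


def reboot_completed(status: str, boot_time: datetime,
                     pickup_time: Optional[datetime],
                     earliest_action: datetime) -> bool:
    """Return True if the reboot action should be set to STATUS_COMPLETED."""
    if status == STATUS_PICKED_UP:
        if pickup_time is not None:
            return boot_time > pickup_time
        # Pickup time missing: fall back to the scheduled time.
        return boot_time > earliest_action
    if status == STATUS_QUEUED:
        return boot_time > earliest_action
    return False


scheduled = datetime(2024, 1, 1, 12, 0)
picked_up = scheduled + timedelta(minutes=1)
booted = scheduled + timedelta(minutes=3)

done_after_pickup = reboot_completed(STATUS_PICKED_UP, booted, picked_up, scheduled)
not_yet_rebooted = reboot_completed(STATUS_PICKED_UP, scheduled, picked_up, scheduled)
done_without_pickup = reboot_completed(STATUS_PICKED_UP, booted, None, scheduled)
done_while_queued = reboot_completed(STATUS_QUEUED, booted, None, scheduled)
```

In all three cases the check comes down to the system having booted after the relevant reference time: the pickup time when it is known, the scheduled time otherwise.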
For regular minions, the generated `.sls` files are removed after the job return event is processed.

For SSH minions, the generated `.sls` files are removed after the synchronous calls finish and the result is processed.

- Asynchronous execution for SSH minions. One option would be to create a new job type to execute `mgractionchains.start` and to schedule a job execution for each SSH minion instead of making a synchronous call to the `salt-api`. A similar approach is used for executing regular actions on SSH minions: instead of making a sync call, an `ssh-minion-action-executor` job is scheduled for each SSH minion and the number of parallel jobs is controlled with the `taskomatic.com.redhat.rhn.taskomatic.task.SSHMinionActionExecutor.parallel_threads` parameter.

- Improve error handling. See issue https://github.com/SUSE/spacewalk/issues/12826.

  One solution would be to keep the action chain in the DB but add a flag to the `rhnActionChain` table and set this flag to true for the action chains that have already been started. This way only the action chains not yet scheduled can be shown in the UI while allowing for better error handling.

- Rename the `SSHPushWorkerSalt` to something more appropriate (see issue https://github.com/SUSE/spacewalk/issues/12914).