This repository contains a Python application that monitors job executions on a KNIME Server through its REST API. For each executed job, the application creates a backup folder containing the job information, the workflow summary, and a .knwf file that can be imported into the KNIME Server to recreate the executed workflow. The main idea is to keep track of who executed each job, even if the user later deletes the job or the workflow from the server. In addition to creating the backup folder, the application sends an XML document containing the job information to an ActiveMQ queue for auditing purposes.
The code can be summarized in the following steps:
- Create a thread that tails the KNIME Server Tomcat logs.
- The thread extracts the job_id from the logs and puts it on a thread-safe FIFO queue.
- The main thread retrieves the job_id from the queue and performs the following steps:
- Use GET https://<serverurl>:<port>/knime/rest/v4/jobs/{job_id} to retrieve job information.
- Use GET https://<serverurl>:<port>/knime/rest/v4/jobs/{job_id}/workflow-summary?format=JSON&includeExecutionInfo=true to retrieve the workflow information.
- Use GET https://<serverurl>:<port>/knime/rest/v4/repository/{workflow_path}:data to download the workflow .knwf file.
- Unzip the .knwf file to get the settings.xml information and filter the intermediate data we don't want.
- Store the job information, the workflow summary, and the filtered .knwf data into a backup folder.
- Generate an XML with the job information required and send it to the ActiveMQ queue for auditing.
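The log-tailing thread and the FIFO queue described in the steps above could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the repository's actual code: the function names, the polling interval, and above all the assumption that a job_id appears in the Tomcat log as a UUID-shaped string are hypothetical.

```python
import os
import queue
import re
import threading
import time

# ASSUMPTION: job ids show up in the log as UUID-like strings.
# The real log format is not described in this README.
JOB_ID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def follow(path, stop_event, poll_interval=0.5):
    """Yield lines appended to the file at `path`, similar to `tail -f`."""
    with open(path, "r") as handle:
        handle.seek(0, os.SEEK_END)  # skip what is already in the log
        while not stop_event.is_set():
            line = handle.readline()
            if line:
                yield line
            else:
                time.sleep(poll_interval)  # nothing new yet, wait a bit


def extract_job_ids(lines, job_queue):
    """Put every job id found in `lines` on the thread-safe queue."""
    for line in lines:
        match = JOB_ID_PATTERN.search(line)
        if match:
            job_queue.put(match.group(0))


def start_tail_thread(log_path, job_queue, stop_event):
    """Start the producer thread that tails the log file."""
    thread = threading.Thread(
        target=extract_job_ids,
        args=(follow(log_path, stop_event), job_queue),
        daemon=True,
    )
    thread.start()
    return thread


def drain(job_queue, handle_job, stop_event, timeout=1.0):
    """Main-thread loop: process job ids until `stop_event` is set."""
    while not stop_event.is_set():
        try:
            job_id = job_queue.get(timeout=timeout)
        except queue.Empty:
            continue  # re-check the stop flag, then wait again
        handle_job(job_id)
        job_queue.task_done()
```

Here `handle_job` stands for whatever callable performs the per-job steps (the REST calls, the backup, and the audit message). Keeping `extract_job_ids` independent of the file handling makes it possible to exercise the extraction with a plain list of lines.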
The code was designed for Python 3.6. It requires the following pip packages:
- requests, to perform the API calls.
- python-qpid-proton, to send the XML to the ActiveMQ queue.
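As a rough illustration of how these two packages might fit together, the sketch below fetches the job information through a session object (for example a `requests.Session` with authentication already configured), builds a small XML document, and sends it with the blocking API of python-qpid-proton. The function names, the XML layout, and the job fields (`id`, `owner`, `workflow`, `state`) are assumptions made for this example; the actual audit format used by the application is not described here.

```python
import xml.etree.ElementTree as ET


def job_url(base_url, job_id):
    """Build the job endpoint URL from the server base URL."""
    return "{}/knime/rest/v4/jobs/{}".format(base_url.rstrip("/"), job_id)


def fetch_job(session, base_url, job_id):
    """GET the job information; `session` is e.g. a requests.Session."""
    response = session.get(job_url(base_url, job_id))
    response.raise_for_status()  # fail loudly on 4xx/5xx answers
    return response.json()


def build_audit_xml(job, fields=("id", "owner", "workflow", "state")):
    """Serialize selected job fields to XML.

    ASSUMPTION: the field names and the <job> layout are hypothetical.
    """
    root = ET.Element("job")
    for name in fields:
        ET.SubElement(root, name).text = str(job.get(name, ""))
    return ET.tostring(root, encoding="unicode")


def send_audit_xml(xml_text, broker_url, queue_name):
    """Send the XML to a queue using Qpid Proton's blocking API."""
    # Imported here so the rest of the sketch runs without the package.
    from proton import Message
    from proton.utils import BlockingConnection

    connection = BlockingConnection(broker_url)
    try:
        sender = connection.create_sender(queue_name)
        sender.send(Message(body=xml_text))
    finally:
        connection.close()
```

Passing the session in as a parameter, instead of calling `requests.get` directly, keeps authentication and TLS settings in one place and lets the function be tried out with a stand-in object.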
Because the Qpid Proton client is based on C, it requires some additional packages in order to work (listed below for RPM-based systems; for other systems, check the Qpid Proton documentation):
python36-devel
gcc
gcc-c++
make
cmake
libuuid-devel
openssl-devel
cyrus-sasl-devel
cyrus-sasl-plain
cyrus-sasl-md5
optionally if you need SSL
With root privileges do the following:
- Clone or copy this repository into the KNIME Server.
- Ensure the bash script is executable:
chmod u+x knime_audit.sh
- Edit the knime_audit_config.json accordingly.
- Ensure you have a valid Python 3 environment with the requirements mentioned above installed.
- Enable the service:
cp knime_audit.service /etc/systemd/system/
systemctl daemon-reload
systemctl enable knime_audit.service