Similar to the PCAP processing pipeline described above, new tools can plug into Malcolm's automatic file extraction and scanning to examine file transfers carved from network traffic.
When Zeek extracts a file it observes being transferred in network traffic, the `file-monitor` container picks up the extracted file and publishes it to a ZeroMQ topic that any other process wanting to analyze the file can subscribe to. At the time of this writing (as of the [v5.0.0 release]({{ site.github.repository_url }}/releases/tag/v5.0.0)), the file scanners implemented in Malcolm are ClamAV, YARA, capa and VirusTotal, all of which are managed by the `file-monitor` container. The scripts involved in this code are:

- [shared/bin/zeek_carve_watcher.py]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/shared/bin/zeek_carve_watcher.py) - watches the directory to which Zeek extracts files and publishes information about those files to the ZeroMQ ventilator on port 5987
- [shared/bin/zeek_carve_scanner.py]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/shared/bin/zeek_carve_scanner.py) - subscribes to `zeek_carve_watcher.py`'s topic, performs file scanning for the ClamAV, YARA, capa and VirusTotal engines, and sends "hits" to another ZeroMQ sink on port 5988
- [shared/bin/zeek_carve_logger.py]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/shared/bin/zeek_carve_logger.py) - subscribes to `zeek_carve_scanner.py`'s topic and logs hits to a "fake" Zeek signatures.log file that is parsed and ingested by Logstash
- [shared/bin/zeek_carve_utils.py]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/shared/bin/zeek_carve_utils.py) - various variables and classes related to carved file scanning

Additional file scanners could be added to the `file-monitor` service. Alternatively, to avoid coupling with Malcolm's code, users could define a new service as instructed in the Adding a new service section and write custom scripts that subscribe and publish to the topics described above. While this description is admittedly high-level, these general steps take care of the plumbing around extracting the file and notifying a new tool, as well as logging the "hits": users should not need to edit any existing code to add a new carved file scanner.
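
As a rough illustration of such a custom script, the sketch below subscribes to the watcher's topic on port 5987 and forwards "hits" to the sink on port 5988 using pyzmq. Several things here are assumptions rather than Malcolm's actual behavior: the socket types (`SUB` for the topic, `PUSH` for the sink), the JSON message encoding, the `file-monitor` host name, the message field names, and the toy `scan_file` check. Consult `zeek_carve_utils.py` and `zeek_carve_scanner.py` for the real message format before relying on any of it.

```python
import json

# Ports come from the text above; the host name is a hypothetical placeholder.
VENTILATOR_ADDR = "tcp://file-monitor:5987"
SINK_ADDR = "tcp://file-monitor:5988"


def scan_file(file_info):
    """Hypothetical scanner: flag files whose name ends in ".exe".

    A real scanner would open the extracted file and run an analysis engine.
    The field names used here are illustrative, not Malcolm's message schema.
    """
    name = file_info.get("name", "")
    hit = name.lower().endswith(".exe")
    return {
        "scanner": "example-scanner",
        "name": name,
        "hit": hit,
        "description": "executable file extension" if hit else "",
    }


def main():
    # pyzmq is imported here so that scan_file can be used and tested without it.
    import zmq

    context = zmq.Context()

    # Assumption: the watcher's topic can be read with a SUB socket.
    subscriber = context.socket(zmq.SUB)
    subscriber.connect(VENTILATOR_ADDR)
    subscriber.setsockopt_string(zmq.SUBSCRIBE, "")  # no prefix filter

    # Assumption: the sink accepts results pushed from a PUSH socket.
    sink = context.socket(zmq.PUSH)
    sink.connect(SINK_ADDR)

    while True:
        # Assumption: each message is a JSON-encoded description of one file.
        result = scan_file(json.loads(subscriber.recv_string()))
        if result["hit"]:
            sink.send_string(json.dumps(result))
```

Calling `main()` from the new service's entry point would start the loop. Keeping the scanning logic in a plain function, separate from the socket handling, makes it possible to test the scanner without a running Malcolm instance.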
The `EXTRACTED_FILE_PIPELINE_VERBOSITY` environment variable can be set to `-v`, `-vv`, etc., to increase the verbosity of debug logging from the containers involved in the carved file processing pipeline.
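
A custom scanner script could honor the same variable. The following is a minimal sketch of one way to turn a value such as `-vv` into a Python logging level. The function name `verbosity_to_level` and the specific mapping (no flag gives `WARNING`, `-v` gives `INFO`, `-vv` or more gives `DEBUG`) are assumptions made for illustration and may not match the levels Malcolm's own scripts use.

```python
import argparse
import logging
import os
import shlex


def verbosity_to_level(flags):
    """Map a flag string such as "-vv" to a logging level (hypothetical mapping)."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-v", "--verbose", action="count", default=0)
    # parse_known_args ignores anything in the string that is not a -v flag.
    args, _unknown = parser.parse_known_args(shlex.split(flags))
    # 0 flags -> WARNING (30), 1 -> INFO (20), 2 or more -> DEBUG (10)
    return max(logging.DEBUG, logging.WARNING - 10 * args.verbose)


# Read the variable named in the text; an unset variable means no extra verbosity.
level = verbosity_to_level(os.environ.get("EXTRACTED_FILE_PIPELINE_VERBOSITY", ""))
logging.basicConfig(level=level)
```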