Carved file scanners

Similar to the PCAP processing pipeline described above, new tools can plug into Malcolm's automatic file extraction and scanning to examine file transfers carved from network traffic.

When Zeek extracts a file it observes being transferred in network traffic, the file-monitor container picks up the extracted file and publishes information about it to a ZeroMQ topic that any other process can subscribe to in order to analyze the file. As of the [v5.0.0 release]({{ site.github.repository_url }}/releases/tag/v5.0.0), the file scanners implemented in Malcolm are ClamAV, YARA, capa and VirusTotal, all of which are managed by the file-monitor container. The scripts involved are:

  • [shared/bin/zeek_carve_watcher.py]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/shared/bin/zeek_carve_watcher.py) - watches the directory to which Zeek extracts files and publishes information about those files to the ZeroMQ ventilator on port 5987
  • [shared/bin/zeek_carve_scanner.py]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/shared/bin/zeek_carve_scanner.py) - subscribes to zeek_carve_watcher.py's topic, performs file scanning for the ClamAV, YARA, capa and VirusTotal engines, and sends "hits" to another ZeroMQ sink on port 5988
  • [shared/bin/zeek_carve_logger.py]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/shared/bin/zeek_carve_logger.py) - subscribes to zeek_carve_scanner.py's topic and logs hits to a "fake" Zeek signatures.log file that is parsed and ingested by Logstash
  • [shared/bin/zeek_carve_utils.py]({{ site.github.repository_url }}/blob/{{ site.github.build_revision }}/shared/bin/zeek_carve_utils.py) - various variables and classes related to carved file scanning
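
The connections between these scripts can be sketched roughly as follows, assuming the pyzmq library. Only the two port numbers come from the description above. The socket types (SUB for the watcher's topic, PUSH toward the sink), the `localhost` host name, the JSON encoding, and the helper names are assumptions made for illustration; the actual definitions should be checked in zeek_carve_utils.py.

```python
import json

VENTILATOR_PORT = 5987  # port the watcher publishes on (from the text above)
SINK_PORT = 5988        # port scanner "hits" are sent to (from the text above)


def endpoint(host: str, port: int) -> str:
    """Build a ZeroMQ TCP endpoint string."""
    return f"tcp://{host}:{port}"


def encode_message(payload: dict) -> bytes:
    """Serialize a message as UTF-8 JSON (assumed wire format)."""
    return json.dumps(payload).encode("utf-8")


def decode_message(raw: bytes) -> dict:
    """Inverse of encode_message."""
    return json.loads(raw.decode("utf-8"))


def connect_scanner_sockets(host: str = "localhost"):
    """Open the two sockets a scanner-like process would need.

    Socket types are an assumption, not confirmed by the text above.
    pyzmq is imported lazily so the helpers above work without it.
    """
    import zmq

    context = zmq.Context.instance()

    # Receive notifications about newly carved files.
    new_files = context.socket(zmq.SUB)
    new_files.connect(endpoint(host, VENTILATOR_PORT))
    new_files.setsockopt(zmq.SUBSCRIBE, b"")  # empty prefix: receive everything

    # Send scan results onward to the logger.
    hits = context.socket(zmq.PUSH)
    hits.connect(endpoint(host, SINK_PORT))

    return new_files, hits
```

A process would call `connect_scanner_sockets()` once at startup, then loop on `new_files.recv()` and `hits.send(...)`.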

Additional file scanners could be added to the file-monitor service. Alternatively, to avoid coupling with Malcolm's code, users could define a new service as instructed in the Adding a new service section and write custom scripts that subscribe and publish to the topics described above. While this description glosses over some details, these general steps take care of the plumbing around extracting the file and notifying a new tool, as well as logging "hits": users should not need to edit any existing code to add a new carved file scanner.
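
As an illustration of the second approach, here is a minimal sketch of what a custom scanner script might look like. Everything in it is hypothetical: the `file_name` message field, the shape of the "hit" message, the scanner name, and the SHA-256 blocklist check that stands in for a real scanning engine. The socket arguments are expected to behave like pyzmq sockets (`recv_string` / `send_string`), and the message fields the existing scripts actually exchange should be taken from zeek_carve_utils.py.

```python
import hashlib
import json
import os

# Hypothetical blocklist of SHA-256 digests standing in for a real engine.
BLOCKLIST = {hashlib.sha256(b"malicious sample").hexdigest()}


def scan_file(path, blocklist=BLOCKLIST):
    """Return a list of hit descriptions for the file at path (may be empty)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large carved files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    hexdigest = digest.hexdigest()
    return [f"sha256:{hexdigest}"] if hexdigest in blocklist else []


def handle_message(raw, scanner_name="example-hash-scanner"):
    """Scan the file named in one notification; return a hit message or None."""
    info = json.loads(raw)
    path = info["file_name"]  # assumed field name
    hits = scan_file(path) if os.path.isfile(path) else []
    if not hits:
        return None
    return json.dumps({"scanner": scanner_name, "file_name": path, "hits": hits})


def run(new_files_socket, hits_socket, max_messages=None):
    """Receive notifications, scan each file, and forward any hits.

    max_messages bounds the loop for testing; None means run until interrupted.
    """
    handled = 0
    while max_messages is None or handled < max_messages:
        result = handle_message(new_files_socket.recv_string())
        if result is not None:
            hits_socket.send_string(result)
        handled += 1
```

Keeping `scan_file` separate from the messaging loop means the engine can be swapped out or tested without any ZeroMQ sockets present.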

The EXTRACTED_FILE_PIPELINE_VERBOSITY environment variable can be set to -v, -vv, etc., to increase the verbosity of debug logging from the containers involved in the carved file processing pipeline.
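
A custom scanner script could honor the same convention. The sketch below shows one possible way to turn a value such as -v or -vv into a Python logging level. The mapping (no flag is WARNING, -v is INFO, -vv or more is DEBUG) and the function name are assumptions for illustration, not necessarily what Malcolm's own scripts do.

```python
import argparse
import logging
import os
import shlex


def level_from_verbosity(value):
    """Translate a '-v' / '-vv' style string into a logging level."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-v", "--verbose", action="count", default=0)
    # parse_known_args ignores anything other than the verbosity flag.
    args, _ = parser.parse_known_args(shlex.split(value or ""))
    # Each 'v' lowers the threshold by one level, bottoming out at DEBUG.
    return max(logging.DEBUG, logging.WARNING - 10 * args.verbose)


if __name__ == "__main__":
    verbosity = os.environ.get("EXTRACTED_FILE_PIPELINE_VERBOSITY", "")
    logging.basicConfig(level=level_from_verbosity(verbosity))
    logging.debug("only shown when the variable is -vv or higher")
```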