"If you know the enemy and know yourself, you need not fear the result of a hundred battles." - Quote by Sun Tzu
An Open Source Big Data Forensics/Data Packet Analysis tool that passively analyzes information using Apache Pig, Hadoop, and Mongo. The community version does not do live packet captures, and certain scripts have been removed because they can be misused. If you need analytics, forensics, or security services (penetration testing, intrusion detection systems), contact us and we may be able to help if the request is lawful.
DoD approved contractor
corporate@duasamericasgroup.com

Created by Aslan Varoqua - Duas Americas Group Inc.

If Markdown is painful in your text editor, run lib/scripts/readme.py from this directory and it will generate a README.html for you. You'll need the markdown Python module installed.

To run the Pig scripts, you have to set the pcap parameter to the pcap you want to use. There is a small test pcap file, data/web.pcap, that you can try before running on your own pcaps. You can run locally:
or with a cluster setup:
You'll need to put files into HDFS to leverage the cluster setup. Also edit pig/include-hdfs.pig to specify your HDFS URI.

A frontend to lib/scripts/tcp.py which gives you a record per TCP connection, along with src, dst, end state, timestamps of each packet, and intervals between packets.

A lightweight wrapper around Pig. It is a handy tool for switching between local and mapreduce mode without having to change many arguments, e.g. HDFS paths. It also has a basic set of sane default arguments to save retyping them all the time. An example usage of pigrun:
This will generate the command:
The available arguments are listed when running
Specify -i to get an interactive Pig shell on the EMR cluster. Check -h for full options or refer to X for examples. The following environment variables will configure the EMR credentials for you:
Upload a single file into a predetermined location in HDFS. Set the environment variable HDFS_MASTER to specify the destination. The environment variable PREFIX determines the path in which to place the uploaded file. Uses

The visualisations are pure HTML and JavaScript. You'll need to run a dumb web server that just serves files. Python does this well with a one-liner:
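One way to do this, as a minimal sketch assuming Python 3, is the standard library's http.server module. The port 8888 is taken from the example URL in this README. The helper names `make_server` and `serve` are illustrative and not part of this project.

```python
import functools
import http.server


def make_server(port=8888, directory="."):
    """Build a static-file HTTP server rooted at `directory` (illustrative helper)."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Bind to the loopback interface only: this is a dumb file server for
    # local use and is not meant to be exposed to a network.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


def serve(port=8888, directory="."):
    """Serve files from `directory` until interrupted with Ctrl-C."""
    with make_server(port, directory) as httpd:
        print(f"Serving {directory} on http://localhost:{port}/")
        httpd.serve_forever()
```

Calling `serve()` from the project root starts the server. From a shell, `python3 -m http.server 8888` does the same job in one line.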
Run this in the root of the project, then access a visualisation via, for example, http://localhost:8888/vis/cube/cube.html. The WebGL visualisations require a browser that supports WebGL.

The globe is a WebGL visualisation which displays the Earth with lines extruding out from it. The colour of the lines represents the average severity of Snort attacks, and the height of the lines represents the number of attacks. It expects the format to be

It's in

Trigram cube is a WebGL visualisation displaying 3 dimensions of data, designed for visualising trigrams. The

First, run
The generated

This is a visualisation that uses Ubigraph. It links domains by their subdomain parts. Download Ubigraph from http://ubietylab.net/ubigraph/content/Downloads/ then extract and run
These charts allow you to compare different sets of data together. Use either histogram or timeseries data in this format:
or
The vis is in

The Unigram pig script at

Note: If you inspect the output of

Drag the combined-tweaked into the visualisation at

Choropleth is a map of the Earth with countries shaded a particular colour based on some data. The input expected for the Choropleth is "country code,value". Drag it into the drop zone and the countries get highlighted. It's in

pig/examples/ contains various examples for you to try out.

For each web-related Snort alert found in a set of captures, the attacker's User-Agent header is discovered. Arguments:
Sum IP packet length per time bin. Arguments:
Collect packets into bins of $time seconds. Additionally group by TCP, UDP, and bandwidth. Arguments:
Shows DNS queries and responses. Arguments:
Show DNS response TTLs. Arguments:
Create a packet length histogram. Arguments:
Create an ngram. Arguments:
Find p0f fingerprints of Snort attackers. Arguments:
Histogram for packets, ordered by the packet volume on dport. Arguments:
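The time-binning examples above, such as summing IP packet length per time bin, can be sketched in plain Python. This is only an illustration of the idea, not the Pig script itself. The function name `sum_length_per_bin` and the (timestamp, length) input shape are assumptions.

```python
from collections import defaultdict


def sum_length_per_bin(packets, bin_seconds):
    """Sum packet lengths per time bin.

    `packets` is an iterable of (timestamp_seconds, length_bytes) pairs.
    This input shape is an assumption made for illustration only.
    """
    totals = defaultdict(int)
    for ts, length in packets:
        # Floor the timestamp to the start of the bin it falls in.
        bin_start = int(ts // bin_seconds) * bin_seconds
        totals[bin_start] += length
    # Return the bins in chronological order.
    return dict(sorted(totals.items()))


packets = [(0.5, 60), (3.2, 1500), (61.0, 40)]
print(sum_length_per_bin(packets, 60))  # {0: 1560, 60: 40}
```

Each key is the start time of a bin in seconds, and each value is the total number of bytes seen in that bin.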