heiDGAF - a machine learning based DNS inspector to detect DGAs in the wild!

stefanDeveloper/heiDGAF

heiDGAF - Domain Generation Algorithms Finder

Machine learning-based DNS classifier for detecting Domain Generation Algorithms (DGAs), tunneling, and data exfiltration by malicious actors.
Explore the docs »

View Demo · Report Bug · Request Feature

Caution

The project is under active development right now. Everything might change, break, or move around quickly.

About the Project

Pipeline overview

Getting Started

To try heiDGAF, use the provided Docker Compose file to quickly bootstrap your environment:

docker compose -f docker/docker-compose.yml up

Terminal example

Developing

Important

More information will be added soon! Go and watch the repository for updates.

Install all Python requirements:

python -m venv .venv
source .venv/bin/activate

pip install -r requirements/requirements-dev.txt -r requirements/requirements.detector.txt -r requirements/requirements.logcollector.txt -r requirements/requirements.prefilter.txt -r requirements/requirements.inspector.txt

Now, you can start each stage, e.g. the inspector:

python src/inspector/main.py

Train your own models

Important

More information will be added soon! Go and watch the repository for updates.

Currently, we provide two trained models: XGBoost and RandomForest.

python -m venv .venv
source .venv/bin/activate

pip install -r requirements/requirements.train.txt

For training our models, we rely on the following data sets:

However, we compute all features separately and rely only on the domain and class from these data sets. Currently, we are only interested in binary classification, so the class is either benign or malicious.
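As a rough illustration of deriving features from nothing but the domain and its class, here is a minimal sketch. The specific features (length, digit ratio, character entropy) and the helper names `extract_features`, `to_label`, and `build_row` are assumptions made for this example; they are not taken from the heiDGAF code base, whose actual feature set is not listed here.

```python
import math
from collections import Counter

# Hypothetical label encoding for the binary task described above.
LABELS = {"benign": 0, "malicious": 1}


def shannon_entropy(text):
    """Return the Shannon entropy (in bits) of the characters in text."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def extract_features(domain):
    """Compute a few illustrative lexical features from a domain name."""
    digits = sum(ch.isdigit() for ch in domain)
    return {
        "length": len(domain),
        "digit_ratio": digits / len(domain) if domain else 0.0,
        "entropy": shannon_entropy(domain),
    }


def to_label(cls):
    """Map the class string to 0 (benign) or 1 (malicious)."""
    return LABELS[cls]


def build_row(domain, cls):
    """Turn one (domain, class) pair into a (feature vector, label) pair."""
    features = extract_features(domain)
    return [features["length"], features["digit_ratio"], features["entropy"]], to_label(cls)
```

Rows built this way could then be passed to any binary classifier; which features actually work well for DGA detection is a separate question that this sketch does not answer.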

Data

Important

We support custom log line schemes. Each field is declared with a name, a type, and optional type-specific arguments:

loglines:
  fields:
    - [ "timestamp", RegEx, '^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z$' ]
    - [ "status_code", ListItem, [ "NOERROR", "NXDOMAIN" ], [ "NXDOMAIN" ] ]
    - [ "client_ip", IpAddress ]
    - [ "dns_server_ip", IpAddress ]
    - [ "domain_name", RegEx, '^(?=.{1,253}$)((?!-)[A-Za-z0-9-]{1,63}(?<!-)\.)+[A-Za-z]{2,63}$' ]
    - [ "record_type", ListItem, [ "A", "AAAA" ] ]
    - [ "response_ip", IpAddress ]
    - [ "size", RegEx, '^\d+b$' ]
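To make the scheme above more concrete, the following sketch checks a single log line against the same patterns and allowed values. It assumes that fields are whitespace-separated and appear in the order listed; that assumption, along with the names `FIELDS` and `validate_logline`, is made for this example only and may not match how heiDGAF parses log lines internally.

```python
import ipaddress
import re

# Patterns copied from the scheme above.
TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z$")
DOMAIN_RE = re.compile(
    r"^(?=.{1,253}$)((?!-)[A-Za-z0-9-]{1,63}(?<!-)\.)+[A-Za-z]{2,63}$"
)
SIZE_RE = re.compile(r"^\d+b$")


def is_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


# Hypothetical (name, validator) pairs mirroring the field order in the scheme.
FIELDS = [
    ("timestamp", lambda v: bool(TIMESTAMP_RE.match(v))),
    ("status_code", lambda v: v in {"NOERROR", "NXDOMAIN"}),
    ("client_ip", is_ip),
    ("dns_server_ip", is_ip),
    ("domain_name", lambda v: bool(DOMAIN_RE.match(v))),
    ("record_type", lambda v: v in {"A", "AAAA"}),
    ("response_ip", is_ip),
    ("size", lambda v: bool(SIZE_RE.match(v))),
]


def validate_logline(line):
    """Check a whitespace-separated log line against every field validator."""
    values = line.split()
    if len(values) != len(FIELDS):
        return False
    return all(check(value) for (_, check), value in zip(FIELDS, values))
```

For example, `validate_logline("2024-07-28T14:45:30.123Z NXDOMAIN 192.168.0.1 10.0.0.53 example.com A 93.184.216.34 150b")` passes every check under these assumptions, while a line with an unlisted status code or a domain label starting with a hyphen is rejected.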

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

License

Distributed under the EUPL License. See LICENSE.txt for more information.
