Anomaly Detection Case

Anomaly Detection Case

Anomaly Detection Case

Requirements

git
docker
docker-compose
make

Show Me the Demo

Clone the Repository:
```
git clone <link>
```
Navigate to the Working Directory:
```
cd anomaly_case
```
Run the Project:
```
make run
```
This will start containers for data population and anomaly detection. The dashboard will be available at http://localhost:5006/dashboard. It may take up to 5 minutes to display detected anomalies. To see changes in data refresh the page.

Data Simulation

What Is Our Data?

I aimed to work with a different dataset than classical seasonal data like Wi-Fi sessions or yearly sales. I chose bear activity data due to their hibernation patterns, which could reveal interesting insights.

I discovered a research article on the effects of human interaction on bears' stress levels and used it as inspiration. My virtual motivation was to create data to detect sudden changes in an endangered animal's activity and alert authorities for protection.

Generating the Data

I found an incomplete yearly hibernation dataset. To complete it, I extended the data by following the trend of initial hibernation. I added white noise to maintain the original entropy and generated the remaining data. The generated data is shown in green.

After generating the data for one season, I smoothed it using a windowed mean. I added white noise for natural deviations and occasional large values to simulate anomalies. An example of the data stream is shown below.

Anomaly Detection

I have decided to go with auto-encoding to implement anomaly detection in this project. In simple terms, our model tries to reconstruct the data we provided in training. And the loss is how far is newly introduced data from the guess of our model. If loss is larger than a certain threshold (setup manually since we only have non-anomaly data for training) the data point that is responsible for that loss is flagged as anomaly. Below you can see effectiveness of loss-threshold model in detecting introduced anomalies. First graph shows the loss vs threshold during time. The second graph shows, introduced anomalies in generated data.

Implementation and Used Technologies

C:.
│   .env
│   docker-compose.yml
│   Makefile
│   Readme.md
├───dashboard
│       dashboard.py
│       Dockerfile
│       requirements.txt
├───data_sim
│       data_simulator.py
│       df_filtered_extended_final.csv
│       Dockerfile
│       requirements.txt
├───detector
│       anomaly_detection.py
│       Dockerfile
│       requirements.txt
├───influxdb
│       Dockerfile
│       stream.json
└───preprocessing
        ad_from_experimentdata.ipynb
        anomaly_detection.ipynb
        data_gen.ipynb
        preprocess.ipynb

I aimed to prepare the project to be ready to run on any system that can run docker. The file hierarchy can be seen above. data_sim simulates the data stream and pushes the data to the timeseries database.

influxdb is the timeseries database used to push and read data. data_sim and detector pushes to influxdb, dashboard and detector reads from influxdb.

detector is the program where anomaly detection is done. It pulls first year of data from influxdb. Then trains the autoencoder lstm model with first years data. Then, it starts testing the stream data constantly being pushed by data_sim. Data flagged by detector is later pushed to influxdb.

dashboard, pulls stream data and anomaly flagged data from influxdb and exports the dashboard.

Preprocessing folder was used as playground.

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
.vscode		.vscode
dashboard		dashboard
data_sim		data_sim
detector		detector
docs		docs
influxdb		influxdb
preprocessing		preprocessing
.env		.env
.env.influxdb2-admin-password		.env.influxdb2-admin-password
.env.influxdb2-admin-token		.env.influxdb2-admin-token
.env.influxdb2-admin-username		.env.influxdb2-admin-username
.gitignore		.gitignore
Makefile		Makefile
Readme.md		Readme.md
docker-compose.yml		docker-compose.yml
image-1.png		image-1.png
image-2.png		image-2.png
image-3.png		image-3.png
image-4.png		image-4.png
image.png		image.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Anomaly Detection Case

Requirements

Show Me the Demo

Data Simulation

What Is Our Data?

Generating the Data

Anomaly Detection

Implementation and Used Technologies

About

Releases

Packages

Languages

fthdrmzzz/anomaly-case

Folders and files

Latest commit

History

Repository files navigation

Anomaly Detection Case

Requirements

Show Me the Demo

Data Simulation

What Is Our Data?

Generating the Data

Anomaly Detection

Implementation and Used Technologies

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages