This repository contains the source code for the research article "DACAD: Domain Adaptation Contrastive Learning for Anomaly Detection in Multivariate Time Series" by
- Zahra Z. Darban [Monash University],
- Yiyuan Yang [University of Oxford],
- Geoffrey I. Webb [Monash University],
- Charu C. Aggarwal [IBM],
- Qingsong Wen [Squirrel AI], and
- Mahsa Salehi [Monash University]
The code is implemented in Python using the PyTorch framework.
In time series anomaly detection (TSAD), the scarcity of labeled data poses a challenge to the development of accurate models. Unsupervised domain adaptation (UDA) offers a solution by leveraging labeled data from a related domain to detect anomalies in an unlabeled target domain. However, existing UDA methods assume consistent anomalous classes across domains. To address this limitation, we propose a novel Domain Adaptation Contrastive learning model for Anomaly Detection in multivariate time series (DACAD), combining UDA with contrastive learning. DACAD utilizes an anomaly injection mechanism that enhances generalization across unseen anomalous classes, improving adaptability and robustness. Additionally, our model employs supervised contrastive loss for the source domain and self-supervised contrastive triplet loss for the target domain, ensuring comprehensive feature representation learning and domain-invariant feature extraction. Finally, an effective Centre-based Entropy Classifier (CEC) accurately learns normal boundaries in the source domain. Extensive evaluations on multiple real-world datasets and a synthetic dataset highlight DACAD's superior performance in transferring knowledge across domains and mitigating the challenge of limited labeled data in TSAD.
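As a rough illustration of the triplet idea described above, the sketch below builds a negative sample by injecting a synthetic spike into a normal window and scores the triplet with a standard margin loss. It is not the repository's implementation: `inject_spike`, `embed`, and `triplet_loss` are hypothetical names, the "encoder" is a toy summary of per-channel statistics, and plain Python is used so the example runs without dependencies (the actual code uses PyTorch).

```python
import math
import random


def inject_spike(window, magnitude=5.0, rng=None):
    """Return a copy of `window` with one value shifted by `magnitude`.

    `window` is a list of time steps, each a list of channel values.
    This is one simple kind of synthetic anomaly, assumed for illustration.
    """
    rng = rng or random.Random(0)
    corrupted = [list(step) for step in window]  # copy, leave input untouched
    t = rng.randrange(len(corrupted))            # random time step
    c = rng.randrange(len(corrupted[0]))         # random channel
    corrupted[t][c] += magnitude
    return corrupted


def embed(window):
    """Toy stand-in for an encoder: per-channel mean and max over time."""
    channels = list(zip(*window))
    return [sum(ch) / len(ch) for ch in channels] + [max(ch) for ch in channels]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Anchor and positive are similar normal windows (4 time steps, 2 channels);
# the negative is the anchor with an injected spike.
anchor_w = [[0.0, 1.0], [0.1, 1.1], [0.0, 0.9], [0.1, 1.0]]
positive_w = [[0.1, 1.0], [0.0, 1.0], [0.1, 1.1], [0.0, 0.9]]
negative_w = inject_spike(anchor_w, magnitude=5.0, rng=random.Random(0))

loss = triplet_loss(embed(anchor_w), embed(positive_w), embed(negative_w))
print(loss)
```

With this toy embedding the injected window lies far from the anchor, so the loss is zero; swapping the positive and negative arguments yields a positive loss, which is the case a trained encoder would be pushed to avoid.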
Figure: DACAD model overview.
To illustrate the effectiveness of UDA within DACAD, we examine the feature representations learned on the MSL dataset, visualized as t-SNE 2D embeddings of the DACAD feature representations.
To use the code, follow these steps:
- Clone the repository:
git clone https://github.com/zamanzadeh/DACAD.git
cd DACAD
- Create a virtual environment and activate it:
python -m venv dacad-env
source dacad-env/bin/activate # On Windows use `dacad-env\Scripts\activate`
- Install the required packages:
pip install -r requirements.txt
The repository is organised as follows:
DACAD
|
|- datasets   [datasets should be placed here]
|   |- MSL
|   |- SMD
|   |- ...
|- results    [find the results here]
|   |- MSL
|   |- SMD
|   |- ...
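Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, assuming only the folder names shown in the tree; `missing_dataset_dirs` and `ensure_results_dirs` are hypothetical helpers and are not part of the repository. Whether the training scripts create the `results` folders themselves is not stated here, so the second helper is purely a convenience.

```python
import tempfile
from pathlib import Path

EXPECTED = ("MSL", "SMD")  # only the folders named in the tree above


def missing_dataset_dirs(root, names=EXPECTED):
    """Return the dataset folder names that are absent under <root>/datasets."""
    datasets = Path(root) / "datasets"
    return [name for name in names if not (datasets / name).is_dir()]


def ensure_results_dirs(root, names=EXPECTED):
    """Create <root>/results/<name> for each dataset and return the paths."""
    paths = [Path(root) / "results" / name for name in names]
    for path in paths:
        path.mkdir(parents=True, exist_ok=True)
    return paths


# Demo in a throwaway directory so nothing in a real checkout is touched.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "datasets" / "MSL").mkdir(parents=True)
    print(missing_dataset_dirs(tmp))                    # folders still to be added
    print([p.name for p in ensure_results_dirs(tmp)])   # result folders created
```

In a real checkout, pass the repository root (for example `"."` after `cd DACAD`) instead of the temporary directory.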
To train and evaluate the model, use the following commands:
python main/main_MSL.py
python main/main_SMD.py
python main/main_Boiler.py
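To run the three scripts back to back, a small driver along these lines may be convenient. This is a sketch under the assumption that the scripts take no required arguments and are started from the repository root; `build_command` and `run_all` are hypothetical names, and `dry_run` defaults to `True` so that nothing is launched by accident.

```python
import subprocess
import sys

DATASETS = ["MSL", "SMD", "Boiler"]  # script suffixes from the commands above


def build_command(dataset):
    """Return the argv list for one entry script, e.g. main/main_MSL.py."""
    return [sys.executable, f"main/main_{dataset}.py"]


def run_all(datasets=DATASETS, dry_run=True):
    """Run each script in turn; with dry_run=True only return the commands."""
    commands = [build_command(name) for name in datasets]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError and stops at the first failure
            subprocess.run(command, check=True)
    return commands


for command in run_all():
    print(" ".join(command))
```

Calling `run_all(dry_run=False)` from the repository root would execute the scripts sequentially using the currently active interpreter, which keeps the run inside the virtual environment created earlier.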
We evaluate the proposed model and compare results across the seven most commonly used real-world benchmark datasets for time series anomaly detection (TSAD).
- NASA Datasets: Mars Science Laboratory (MSL) and Soil Moisture Active Passive (SMAP) are collected from NASA spacecraft and contain anomaly information from incident reports for a spacecraft monitoring system.
- Server Machine Dataset (SMD) is gathered from 28 servers over 10 days, with normal data observed for the first 5 days and anomalies sporadically injected in the last 5 days.
- Boiler Fault Detection Dataset includes sensor information from three separate boilers, with each boiler representing an individual domain. The objective of the learning process is to identify the malfunctioning blowdown valve in each boiler.
If you use this code in your research, please cite the following article:
@article{darban2024dacad,
  title={DACAD: Domain Adaptation Contrastive Learning for Anomaly Detection in Multivariate Time Series},
  author={Darban, Zahra Zamanzadeh and Yang, Yiyuan and Webb, Geoffrey I. and Aggarwal, Charu C. and Wen, Qingsong and Salehi, Mahsa},
  journal={arXiv preprint arXiv:2404.11269},
  year={2024}
}
This project is licensed under the MIT License - see the LICENSE file for details.