LogDP is a semi-supervised log anomaly detection approach that uses the dependency relationships among log events and the proximity among log sequences to detect anomalies in massive volumes of unlabeled log data.
LogDP divides log events into dependent and independent events. It then learns the normal patterns of dependent events from their dependency relationships, and the normal patterns of independent events from their proximity. Events that violate any normal pattern are identified as anomalies.
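The idea above can be illustrated with a small, dependency-free sketch. It is not the LogDP implementation: it assumes each sequence is a row of per-event counts, treats an event as "dependent" when a single other event predicts it well with a straight line (R^2 above a hypothetical `r2_threshold`), and measures proximity as Euclidean distance to the mean of the independent events. The function names, the threshold, and the tolerance `tol` are all illustrative assumptions.

```python
from math import sqrt
from statistics import fmean


def _fit_line(xs, ys):
    """Least-squares line ys ~ slope * xs + intercept, plus its R^2."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:  # constant column: no usable linear relation
        return 0.0, my, 0.0
    slope = sxy / sxx
    return slope, my - slope * mx, (sxy * sxy) / (sxx * syy)


def fit_normal_patterns(rows, r2_threshold=0.8, tol=1e-6):
    """Learn normal patterns from count vectors of (assumed) normal sequences."""
    cols = list(zip(*rows))
    d = len(cols)
    dependent = {}
    for j in range(d):
        others = [k for k in range(d) if k != j]
        if not others:
            continue
        # Pick the single event that best predicts event j (simplifying assumption).
        best = max(others, key=lambda k: _fit_line(cols[k], cols[j])[2])
        slope, intercept, r2 = _fit_line(cols[best], cols[j])
        if r2 >= r2_threshold:
            # Largest training residual (plus tolerance) bounds the normal pattern.
            limit = max(abs(y - (slope * x + intercept))
                        for x, y in zip(cols[best], cols[j])) + tol
            dependent[j] = (best, slope, intercept, limit)
    independent = [j for j in range(d) if j not in dependent]
    center = [fmean(cols[j]) for j in independent]
    # Largest training distance to the center (plus tolerance) bounds proximity.
    radius = max(sqrt(sum((row[j] - c) ** 2 for j, c in zip(independent, center)))
                 for row in rows) + tol
    return {"dependent": dependent, "independent": independent,
            "center": center, "radius": radius}


def is_anomaly(model, row):
    """Flag a sequence that violates any learned normal pattern."""
    for j, (k, slope, intercept, limit) in model["dependent"].items():
        if abs(row[j] - (slope * row[k] + intercept)) > limit:
            return True  # dependency pattern violated
    dist = sqrt(sum((row[j] - c) ** 2
                    for j, c in zip(model["independent"], model["center"])))
    return dist > model["radius"]  # proximity pattern violated


# Toy data: event 1 always occurs twice as often as event 0; event 2 is unrelated.
train = [[1, 2, 5], [2, 4, 5], [3, 6, 6], [4, 8, 5], [5, 10, 6], [6, 12, 5]]
model = fit_normal_patterns(train)
```

On this toy data, events 0 and 1 are treated as dependent and event 2 as independent. A row such as `[3, 9, 5]` breaks the learned dependency, while `[3, 6, 20]` lies far from the center of the independent event, so both would be flagged.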
By combining dependency and proximity, LogDP is able to achieve high detection accuracy. Extensive experiments have been conducted on real-world datasets, and the results show that LogDP outperforms six state-of-the-art methods.
Our experiments use three public log datasets, HDFS, BGL, and Spirit, which are available from LOGPAI. From these three datasets, we generate seven datasets using different log grouping strategies: the HDFS dataset is grouped by session, while BGL and Spirit are each grouped using 1-hour, 100-log, and 20-log windows. For LogDP, the first 2/3 of the sequences in the training set are used for training, and the remaining 1/3 are used as a validation set.
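The grouping strategies and the 2/3 to 1/3 split can be sketched as follows. This is a minimal illustration rather than the project's preprocessing code: the function names, the `(key, event)` and `(timestamp, event)` record shapes, the use of seconds for timestamps, and the choice to keep a trailing partial window are all assumptions.

```python
from typing import Hashable, List, Sequence, Tuple


def session_windows(records: Sequence[Tuple[Hashable, str]]) -> List[List[str]]:
    """Group events by a session key, keeping sessions in first-seen order."""
    sessions = {}
    for key, event in records:
        sessions.setdefault(key, []).append(event)
    return list(sessions.values())


def time_windows(records: Sequence[Tuple[float, str]],
                 width_seconds: int = 3600) -> List[List[str]]:
    """Group events into fixed time windows (1 hour by default)."""
    buckets = {}
    for timestamp, event in records:
        buckets.setdefault(int(timestamp // width_seconds), []).append(event)
    return [buckets[index] for index in sorted(buckets)]


def count_windows(events: Sequence[str], size: int) -> List[List[str]]:
    """Group events into consecutive, non-overlapping windows of `size` logs."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [list(events[i:i + size]) for i in range(0, len(events), size)]


def split_train_validation(sequences: Sequence) -> Tuple[list, list]:
    """Use the first 2/3 of the sequences for training, the rest for validation."""
    cut = (2 * len(sequences)) // 3  # integer arithmetic avoids float rounding
    return list(sequences[:cut]), list(sequences[cut:])
```

For example, `count_windows(events, 100)` and `count_windows(events, 20)` would correspond to the 100-log and 20-log settings, and `split_train_validation` would then be applied to the resulting training sequences.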
Six state-of-the-art log-based anomaly detection methods are selected as the benchmark methods.
Contributions of any kind, including new features and enhancements, are welcome.
This project is licensed under the MIT License. See LICENSE for more details.
If you use this code for your research, please cite our paper.
@article{xie2021LogDP,
title={LogDP: Combining Dependency and Proximity for Log-based Anomaly Detection},
author={Yongzheng Xie and Hongyu Zhang and Bo Zhang and Muhammad Ali Babar and Sha Lu},
journal={arXiv preprint arXiv:2110.01927},
year={2021}
}