Volume 29, Issue 1
On Density-Based Data Streams Clustering Algorithms: A Survey

Amini, Amineh, Teh, Ying Wah & Saboohi, Hadi

Journal of Computer Science and Technology, 29 (2014), pp. 116-141.

Published online: 2020-08

  • Abstract

Clustering data streams has drawn considerable attention in the last few years because of their ever-growing presence. Data streams pose additional challenges for clustering, such as limited time and memory and the need for one-pass processing. Furthermore, discovering clusters of arbitrary shape is very important in data stream applications. Data streams are infinite and evolve over time, and the number of clusters is not known in advance. In a data stream environment, noise also appears occasionally owing to various factors. Density-based methods are a notable class of data stream clustering techniques: they can discover arbitrarily shaped clusters, detect noise, and do not need the number of clusters in advance. Because of the characteristics of data streams, traditional density-based clustering is not directly applicable. Recently, many density-based clustering algorithms have been extended to data streams. The main idea in these algorithms is to use density-based methods in the clustering process while overcoming the constraints imposed by the nature of data streams. The purpose of this paper is to shed light on some of the algorithms in the literature on density-based clustering over data streams. We not only summarize the main density-based clustering algorithms for data streams and discuss their uniqueness and limitations, but also explain how they address the challenges of clustering data streams. Moreover, we investigate the evaluation metrics used to validate cluster quality and measure the algorithms' performance. It is hoped that this survey will serve as a stepping stone for researchers studying data stream clustering, particularly density-based algorithms.

  • Copyright

COPYRIGHT: © Springer

@Article{JCST-29-116,
  author = {Amini, Amineh and Teh, Ying Wah and Saboohi, Hadi},
  title = {On Density-Based Data Streams Clustering Algorithms: A Survey},
  journal = {Journal of Computer Science and Technology},
  year = {2020},
  volume = {29},
  number = {1},
  pages = {116--141},
  issn = {1860-4749},
  doi = {https://doi.org/10.1007/s11390-013-1416-3},
  url = {http://global-sci.org/intro/article_detail/jcst/18169.html}
}
TY - JOUR
T1 - On Density-Based Data Streams Clustering Algorithms: A Survey
AU - Amini, Amineh
AU - Teh, Ying Wah
AU - Saboohi, Hadi
JO - Journal of Computer Science and Technology
VL - 29
IS - 1
SP - 116
EP - 141
PY - 2020
DA - 2020/08
SN - 1860-4749
DO - http://doi.org/10.1007/s11390-013-1416-3
UR - https://global-sci.org/intro/article_detail/jcst/18169.html

Amini, Amineh, Teh, Ying Wah & Saboohi, Hadi. (2020). On Density-Based Data Streams Clustering Algorithms: A Survey. Journal of Computer Science & Technology. 29 (1). 116-141. doi:10.1007/s11390-013-1416-3