Fast Parameterless Density-Based Clustering via Random Projections


Reference

Vlachos, M., & Schneider, J. (2013). Fast Parameterless Density-Based Clustering via Random Projections. Paper presented at the Conference on Information and Knowledge Management (CIKM).

Publication type

Paper in Conference Proceedings

Abstract

Clustering offers significant insights in data analysis. Density-based algorithms have emerged as flexible and efficient techniques, able to discover high-quality (and potentially irregularly shaped) clusters. We present two fast density-based clustering algorithms based on random projections. Both algorithms demonstrate a speedup of one to two orders of magnitude compared to equivalent state-of-the-art density-based techniques, even for modest-size datasets. We give a comprehensive analysis of both algorithms and show a runtime of O(dN log^2 N) for a d-dimensional dataset of N points. Our first algorithm can be viewed as a fast variant of the OPTICS density-based algorithm, but using a softer definition of density combined with sampling. The second algorithm is parameterless and identifies areas separating clusters.
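
As a rough illustration of the random-projection idea the abstract mentions, the sketch below recursively partitions a point set by projecting it onto a random direction and cutting the sorted projections at a random position. It is a generic building block written under stated assumptions, not the algorithm from the paper: the function names, the min_size parameter, and the splitting rule are all hypothetical and chosen only for illustration.

```python
import random


def random_direction(dim, rng):
    # Independent Gaussian components give an unnormalised random direction.
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def project(point, direction):
    # Dot product: the position of the point along the random line.
    return sum(p * d for p, d in zip(point, direction))


def split_by_projections(points, indices, min_size, rng):
    """Recursively split `indices` until every group has at most `min_size` points.

    Hypothetical helper for illustration only (min_size must be >= 1).
    Points that land in the same small group were close along every random
    direction used on the way down, which is a cheap hint that they are neighbours.
    """
    if len(indices) <= min_size:
        return [indices]
    direction = random_direction(len(points[0]), rng)
    # Order the points by their one-dimensional projected value.
    order = sorted(indices, key=lambda i: project(points[i], direction))
    # Cut at a random position; both sides are guaranteed to be non-empty.
    cut = rng.randint(1, len(order) - 1)
    left = split_by_projections(points, order[:cut], min_size, rng)
    right = split_by_projections(points, order[cut:], min_size, rng)
    return left + right


# Usage: two well-separated Gaussian blobs in two dimensions.
data_rng = random.Random(0)
points = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(30)]
points += [[data_rng.gauss(10, 1), data_rng.gauss(10, 1)] for _ in range(30)]
groups = split_by_projections(points, list(range(len(points))), 5, random.Random(1))
```

Repeating such a partitioning with fresh random directions and counting how often two points share a group is one plausible way to turn the groups into a neighbourhood or density estimate; whether the paper does exactly this is not stated in the abstract.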

Persons

Organizational Units

  • Institute of Information Systems
  • Hilti Chair of Business Process Management