May 2016     Issue 1
News
Computer Science and Engineering duo won Best Paper at SIGMOD 2015


Mr. GAN Junhao and Prof. TAO Yufei of the Department of Computer Science and Engineering won the Best Paper Award at the ACM Conference on Management of Data (SIGMOD) 2015.

Title: DBSCAN Revisited: Mis-Claim, Un-Fixability, and Approximation
Authors: Mr. GAN Junhao and Prof. TAO Yufei
Paper link: http://dl.acm.org/citation.cfm?doid=2723372.2737792
Abstract:
DBSCAN is a popular method for clustering multi-dimensional objects. Just as notable as the method's vast success is the research community's quest for its efficient computation. The original KDD'96 paper claimed an algorithm with O(n log n) running time, where n is the number of objects. Unfortunately, this is a mis-claim, and that algorithm actually requires O(n^2) time. There has been a fix in 2D space, where a genuine O(n log n)-time algorithm has been found. Looking for a fix for dimensionality d ≥ 3 is currently an important open problem.

In this paper, we prove that for d ≥ 3, the DBSCAN problem requires Ω(n^(4/3)) time to solve, unless very significant breakthroughs, ones widely believed to be impossible, could be made in theoretical computer science. This (i) explains why the community's search for fixing the aforementioned mis-claim has been futile for d ≥ 3, and (ii) indicates (sadly) that all DBSCAN algorithms must be intolerably slow even on moderately large n in practice. Surprisingly, we show that the running time can be dramatically brought down to O(n) in expectation regardless of the dimensionality d, as soon as slight inaccuracy in the clustering results is permitted. We formalise our findings into the new notion of ρ-approximate DBSCAN, which we believe should replace DBSCAN on big data due to the latter's computational intractability.
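
Illustration (not from the paper): to make the quadratic cost mentioned in the abstract concrete, the sketch below shows a textbook-style, brute-force DBSCAN in Python. It is neither the KDD'96 algorithm nor the authors' method, only a minimal sketch under simple assumptions: Euclidean distance, a point counting as its own neighbour, and the hypothetical names naive_dbscan, eps and min_pts. Every point triggers exactly one neighbourhood query, and each query scans all n points, so the function performs n * n distance computations.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def naive_dbscan(points, eps, min_pts):
    """Brute-force DBSCAN sketch.

    Returns (labels, distance_count). Label -1 marks noise; clusters are
    numbered from 0. distance_count is the number of distance computations.
    """
    n = len(points)
    distance_count = 0

    def neighbours(i):
        # Linear scan over all points: this is the step that costs O(n).
        nonlocal distance_count
        distance_count += n
        return [j for j in range(n) if dist(points[i], points[j]) <= eps]

    labels = [None] * n  # None means "not yet visited"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1  # i is a core point, so it starts a new cluster
        labels[i] = cluster
        pending = list(nbrs)
        while pending:
            j = pending.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_nbrs = neighbours(j)
            if len(j_nbrs) >= min_pts:
                pending.extend(j_nbrs)  # j is also core: keep expanding
    return labels, distance_count


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    labels, count = naive_dbscan(pts, eps=1.5, min_pts=3)
    print(labels, count)
```

On this toy input the two tight groups form two clusters and the isolated point is reported as noise. Doubling the number of points roughly quadruples the number of distance computations, which is the growth the abstract describes as impractical for large n.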

Copyright © 2024.
All Rights Reserved. The Chinese University of Hong Kong.