Machine Learning

Clustering

Clustering algorithms aim to group points so that the points in the same group are more similar to each other than to points in other groups. Clustering serves many purposes: it can work as an efficient exploratory data analysis tool in fields such as physics and bioinformatics, or as a preprocessing step for other algorithms.

In our Machine Learning group, we have studied many types of clustering methods, including K-means, agglomerative, divisive and density-based clustering. We have experience in clustering high- and low-dimensional vectorial data, strings, GPS data, categorical data, graphs/networks, images and user profiles. Any type of data can be clustered as long as there is a way of measuring distance or similarity between data objects.
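As a minimal illustration of that last point, the sketch below clusters plain strings using nothing but a pairwise distance function. It is not the group's own software: the function names `edit_distance` and `agglomerative`, the choice of single linkage, and the example words are all assumptions made for this illustration, and the quadratic pair search is kept simple rather than efficient.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def agglomerative(items, k, dist):
    """Single-linkage agglomerative clustering down to k clusters.

    Only `dist` ever looks at the data, so `items` can be any object type
    (hypothetical helper written for this example).
    """
    clusters = [[x] for x in items]
    while len(clusters) > k:
        # Pick the two clusters whose closest members are nearest to each other.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: min(
                dist(x, y) for x in clusters[p[0]] for y in clusters[p[1]]
            ),
        )
        clusters[i].extend(clusters[j])  # i < j, so deleting j keeps i valid
        del clusters[j]
    return clusters


words = ["kitten", "sitten", "mitten", "cluster", "clusters", "bluster"]
result = agglomerative(words, 2, edit_distance)
print(result)
```

With these example words, the three "-itten" strings end up in one cluster and the three "-luster" strings in the other. Swapping `edit_distance` for another distance function (for example one defined on GPS points or sets) would cluster a different data type without changing `agglomerative`.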

Highlighted publications:

S. Sieranoja and P. Fränti, "Fast and general density peaks clustering", Pattern Recognition Letters, 128, 551-558, December 2019. (pdf) IF=2.81, JF=2
P. Fränti and S. Sieranoja, "How much k-means can be improved by using better initialization and repeats?", Pattern Recognition, 93, 95-112, 2019. (pdf) IF=3.39, JF=3
P. Fränti and S. Sieranoja, "K-means properties on six clustering benchmark datasets", Applied Intelligence, 48 (12), 4743-4759, December 2018. (pdf) IF=1.98, JF=1
P. Fränti, "Efficiency of random swap clustering", Journal of Big Data, 5:13, 1-29, 2018. (pdf) JF=1
M. Rezaei and P. Fränti, "Set matching measures for external cluster validity", IEEE Trans. on Knowledge and Data Engineering, 28 (8), 2173-2186, August 2016. (pdf) IF=2.47, JF=3

Software

https://github.com/uef-machine-learning

Apps

Clusterator
Animator
More...

Courses:

Clustering Methods