Online clustering with variable sized clusters

J. Csirik, Leah Epstein, Csanád Imreh, Asaf Levin

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

In online clustering problems, the classification of points into sets (called clusters) is done in an online fashion. Points arrive one by one at arbitrary locations and must be assigned to clusters at the time of arrival. A point can be assigned to an existing cluster, or a new cluster can be opened for it. We study a one-dimensional variant on a line, where there is no restriction on the length of a cluster, and the cost of a cluster is the sum of a fixed set-up cost and its diameter. The goal is to minimize the sum of costs of the clusters used by the algorithm. We study several variants, all maintaining the essential property that a point which was assigned to a given cluster must remain assigned to this cluster, and clusters cannot be merged. In the strict variant, the diameter and the exact location of the cluster must be fixed when it is initialized. In the flexible variant, the algorithm can shift the cluster or expand it, as long as it contains all points assigned to it. In an intermediate model, the diameter is fixed in advance while the exact location can be modified. We give tight bounds on the competitive ratio of any online algorithm in each of these variants. In addition, for each of the models, we also consider the semi-online case, where points are presented sorted by their location.
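The cost model in the abstract can be illustrated with a small sketch. This is not the algorithm analyzed in the paper: `greedy_flexible` is a naive heuristic invented here for the flexible variant, the function names are hypothetical, and the set-up cost is assumed to be normalized to 1. The offline baseline relies on the observation that, on a line, an optimal solution can use clusters covering contiguous runs of the sorted points, so it is worth splitting only at gaps larger than the set-up cost.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (left, right) endpoints of one cluster


def clustering_cost(clusters: List[Interval], setup: float = 1.0) -> float:
    """Each cluster pays a fixed set-up cost plus its diameter."""
    return sum(setup + (right - left) for left, right in clusters)


def offline_optimum(points: List[float], setup: float = 1.0) -> float:
    """Offline baseline: sort the points and split at gaps above the set-up cost."""
    if not points:
        return 0.0
    xs = sorted(points)
    cost = setup  # the first cluster is always needed
    for a, b in zip(xs, xs[1:]):
        # Bridging a gap costs its length; splitting costs one more set-up.
        cost += min(b - a, setup)
    return cost


def greedy_flexible(points: List[float], setup: float = 1.0) -> List[Interval]:
    """Naive online heuristic (assumption, illustrative only).

    Extend the cheapest existing cluster if the extension costs at most
    the set-up cost; otherwise open a new zero-diameter cluster.
    Points are never reassigned and clusters are never merged.
    """
    clusters: List[Interval] = []
    for p in points:
        best_i, best_ext = None, 0.0
        for i, (left, right) in enumerate(clusters):
            ext = max(0.0, left - p, p - right)  # growth needed to cover p
            if ext <= setup and (best_i is None or ext < best_ext):
                best_i, best_ext = i, ext
        if best_i is None:
            clusters.append((p, p))
        else:
            left, right = clusters[best_i]
            clusters[best_i] = (min(left, p), max(right, p))
    return clusters


arrivals = [0.0, 1.5, 0.75]  # arrival order matters for the online heuristic
online = clustering_cost(greedy_flexible(arrivals))
optimum = offline_optimum(arrivals)
print(online, optimum, online / optimum)
```

On this input the heuristic opens two clusters before the middle point arrives, so it pays more than the offline solution that covers all three points with a single cluster. The ratio of the two costs on the worst input is what the competitive ratio measures.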

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 282-293
Number of pages: 12
Volume: 6281 LNCS
DOIs: https://doi.org/10.1007/978-3-642-15155-2_26
Publication status: Published - 2010
Event: 35th International Symposium on Mathematical Foundations of Computer Science, MFCS 2010 - Brno, Czech Republic
Duration: Aug 23 2010 to Aug 27 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6281 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 35th International Symposium on Mathematical Foundations of Computer Science, MFCS 2010
Country: Czech Republic
City: Brno
Period: 8/23/10 to 8/27/10

Fingerprint

Clustering
Costs
Setup Cost
Time of Arrival
Competitive Ratio
Online Algorithms
Expand
Restriction
Minimise
Line
Arbitrary
Model

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Csirik, J., Epstein, L., Imreh, C., & Levin, A. (2010). Online clustering with variable sized clusters. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6281 LNCS, pp. 282-293). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6281 LNCS). https://doi.org/10.1007/978-3-642-15155-2_26

@inproceedings{637fc6c2fa414269803f7e639421a0a0,
title = "Online clustering with variable sized clusters",
abstract = "In online clustering problems, the classification of points into sets (called clusters) is done in an online fashion. Points arrive one by one at arbitrary locations, to be assigned to clusters at the time of arrival. A point can be assigned to an existing cluster, or a new cluster can be opened for it. We study a one dimensional variant on a line, where there is no restriction on the length of a cluster, and the cost of a cluster is the sum of a fixed set-up cost and its diameter. The goal is to minimize the sum of costs of the clusters used by the algorithm. We study several variants, all maintaining the essential property that a point which was assigned to a given cluster must remain assigned to this cluster, and clusters cannot be merged. In the strict variant, the diameter and the exact location of the cluster must be fixed when it is initialized. In the flexible variant, the algorithm can shift the cluster or expand it, as long as it contains all points assigned to it. In an intermediate model, the diameter is fixed in advance while the exact location can be modified. We give tight bounds on the competitive ratio of any online algorithm in each of these variants. In addition, for each one of the models, we consider also the semi-online case, where points are presented sorted by their location.",
author = "J. Csirik and Leah Epstein and Csan{\'a}d Imreh and Asaf Levin",
year = "2010",
doi = "10.1007/978-3-642-15155-2_26",
language = "English",
isbn = "364215154X",
volume = "6281 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "282--293",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Online clustering with variable sized clusters

AU - Csirik, J.

AU - Epstein, Leah

AU - Imreh, Csanád

AU - Levin, Asaf

PY - 2010

Y1 - 2010

N2 - In online clustering problems, the classification of points into sets (called clusters) is done in an online fashion. Points arrive one by one at arbitrary locations, to be assigned to clusters at the time of arrival. A point can be assigned to an existing cluster, or a new cluster can be opened for it. We study a one dimensional variant on a line, where there is no restriction on the length of a cluster, and the cost of a cluster is the sum of a fixed set-up cost and its diameter. The goal is to minimize the sum of costs of the clusters used by the algorithm. We study several variants, all maintaining the essential property that a point which was assigned to a given cluster must remain assigned to this cluster, and clusters cannot be merged. In the strict variant, the diameter and the exact location of the cluster must be fixed when it is initialized. In the flexible variant, the algorithm can shift the cluster or expand it, as long as it contains all points assigned to it. In an intermediate model, the diameter is fixed in advance while the exact location can be modified. We give tight bounds on the competitive ratio of any online algorithm in each of these variants. In addition, for each one of the models, we consider also the semi-online case, where points are presented sorted by their location.

UR - http://www.scopus.com/inward/record.url?scp=78349254592&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78349254592&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-15155-2_26

DO - 10.1007/978-3-642-15155-2_26

M3 - Conference contribution

SN - 364215154X

SN - 9783642151545

VL - 6281 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 282

EP - 293

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -