An algorithm for matching run-length coded strings

H. Bunke, J. Csirik

Research output: Contribution to journal (Article)

17 Citations (Scopus)

Abstract

An algorithm for the computation of the edit distance of run-length coded strings is given. In run-length coding, not all individual symbols in a string are explicitly listed. Instead, one run of identical consecutive symbols is coded by giving one representative symbol together with its multiplicity. The algorithm determines the minimum cost sequence of edit operations transforming one string into another. In the worst case, the algorithm has a time complexity of O(n·m), where n and m give the lengths of the strings to be compared. In the best case, the time complexity is O(k·l), where k and l are the numbers of runs of identical symbols in the two strings under comparison.
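To make the two complexity bounds concrete, here is a minimal Python sketch of run-length coding alongside the classical unit-cost edit distance computed on the uncompressed strings. It is not the algorithm from the paper: the function names (`run_length_encode`, `run_length_decode`, `edit_distance`) and the unit costs for insertion, deletion and substitution are assumptions chosen for illustration. The dynamic program fills one cell per pair of symbols, which corresponds to the O(n·m) figure quoted above.

```python
from itertools import groupby


def run_length_encode(s):
    """Return one (symbol, multiplicity) pair per run of identical symbols."""
    return [(symbol, len(list(run))) for symbol, run in groupby(s)]


def run_length_decode(runs):
    """Expand (symbol, multiplicity) pairs back into the plain string."""
    return "".join(symbol * count for symbol, count in runs)


def edit_distance(a, b):
    """Classical edit distance on uncompressed strings (illustrative baseline).

    Assumes unit cost for insertion, deletion and substitution.
    Fills len(a) * len(b) cells, keeping only two rows in memory.
    """
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


a, b = "aaabbbbcc", "aabbbccc"
runs_a, runs_b = run_length_encode(a), run_length_encode(b)
print(runs_a)               # [('a', 3), ('b', 4), ('c', 2)]
print(runs_b)               # [('a', 2), ('b', 3), ('c', 3)]
print(edit_distance(a, b))  # 2
```

In this example n = 9 and m = 8, so the baseline evaluates 72 cells, while each string has only k = l = 3 runs. An algorithm working directly on the run-length representation, as described in the abstract, would in the best case need work proportional to the 9 pairs of runs.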

Original language: English
Pages (from-to): 297-314
Number of pages: 18
Journal: Computing
Volume: 50
Issue number: 4
DOIs: 10.1007/BF02243873
Publication status: Published - Dec 1993

Fingerprint

Run Length
Strings
Time Complexity
Edit Distance
Consecutive
Multiplicity
Coding
Costs

Keywords

  • AMS Subject Classifications: 68Q20, 90C39
  • longest common subsequence
  • run-length coding
  • string edit distance
  • String matching

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computational Theory and Mathematics

Cite this

An algorithm for matching run-length coded strings. / Bunke, H.; Csirik, J.

In: Computing, Vol. 50, No. 4, 12.1993, p. 297-314.

Bunke, H.; Csirik, J. / An algorithm for matching run-length coded strings. In: Computing. 1993; Vol. 50, No. 4. pp. 297-314.
@article{585799e69e604bb5b1e0f8b3e8833490,
title = "An algorithm for matching run-length coded strings",
abstract = "An algorithm for the computation of the edit distance of run-length coded strings is given. In run-length coding, not all individual symbols in a string are explicitly listed. Instead, one run of identical consecutive symbols is coded by giving one representative symbol together with its multiplicity. The algorithm determines the minimum cost sequence of edit operations transforming one string into another. In the worst case, the algorithm has a time complexity of O(n·m), where n and m give the lengths of the strings to be compared. In the best case, the time complexity is O(k·l), where k and l are the numbers of runs of identical symbols in the two strings under comparison.",
keywords = "AMS Subject Classifications: 68Q20, 90C39, longest common subsequence, run-length coding, string edit distance, String matching",
author = "H. Bunke and J. Csirik",
year = "1993",
month = "12",
doi = "10.1007/BF02243873",
language = "English",
volume = "50",
pages = "297--314",
journal = "Computing (Vienna/New York)",
issn = "0010-485X",
publisher = "Springer Wien",
number = "4",

}

TY - JOUR

T1 - An algorithm for matching run-length coded strings

AU - Bunke, H.

AU - Csirik, J.

PY - 1993/12

Y1 - 1993/12

N2 - An algorithm for the computation of the edit distance of run-length coded strings is given. In run-length coding, not all individual symbols in a string are explicitly listed. Instead, one run of identical consecutive symbols is coded by giving one representative symbol together with its multiplicity. The algorithm determines the minimum cost sequence of edit operations transforming one string into another. In the worst case, the algorithm has a time complexity of O(n·m), where n and m give the lengths of the strings to be compared. In the best case, the time complexity is O(k·l), where k and l are the numbers of runs of identical symbols in the two strings under comparison.

KW - AMS Subject Classifications: 68Q20, 90C39

KW - longest common subsequence

KW - run-length coding

KW - string edit distance

KW - String matching

UR - http://www.scopus.com/inward/record.url?scp=0042651526&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0042651526&partnerID=8YFLogxK

U2 - 10.1007/BF02243873

DO - 10.1007/BF02243873

M3 - Article

VL - 50

SP - 297

EP - 314

JO - Computing (Vienna/New York)

JF - Computing (Vienna/New York)

SN - 0010-485X

IS - 4

ER -