Edit distance of run-length coded strings

H. Bunke, J. Csirik

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Citations (Scopus)

Abstract

We give an algorithm for measuring the similarity of run-length coded strings. In run-length coding, the symbols of a string are not listed individually. Instead, each run of identical consecutive symbols is coded by one representative symbol together with its multiplicity. If the strings under consideration consist of long runs of identical symbols, run-length coding can significantly reduce memory and access time. Our algorithm determines the minimum-cost sequence of edit operations needed to transform one string into another. As its basic data structure it uses an edit matrix, as in the classical algorithm of Wagner and Fischer. However, depending on the particular pair of strings being compared, usually only part of this edit matrix needs to be computed. In the worst case, the algorithm runs in O(n·m) time, where n and m are the lengths of the strings being compared. In the best case, it runs in O(k·l) time, where k and l are the numbers of runs of identical symbols in the two strings.
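The two building blocks named in the abstract, run-length coding and the classical Wagner-Fischer edit matrix, can be sketched as follows. This is not the authors' algorithm: it is only the baseline that fills the full (n+1) × (m+1) matrix on the decoded strings, which is the O(n·m) worst case the abstract refers to. The function names and the unit costs for insertion, deletion, and substitution are assumptions made for illustration.

```python
from itertools import groupby


def run_length_encode(s):
    """Return s as a list of (symbol, multiplicity) runs."""
    return [(symbol, len(list(group))) for symbol, group in groupby(s)]


def run_length_decode(runs):
    """Expand a list of (symbol, multiplicity) runs back into a string."""
    return "".join(symbol * count for symbol, count in runs)


def edit_distance(a, b):
    """Classical Wagner-Fischer edit distance (unit costs assumed)."""
    n, m = len(a), len(b)
    # d[i][j] = cost of transforming a[:i] into b[:j]
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i  # delete all i symbols of a[:i]
    for j in range(1, m + 1):
        d[0][j] = j  # insert all j symbols of b[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = d[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, substitution)
    return d[n][m]


x_runs = run_length_encode("aaaabbbcc")   # [('a', 4), ('b', 3), ('c', 2)]
y_runs = run_length_encode("aabbbbbc")    # [('a', 2), ('b', 5), ('c', 1)]
distance = edit_distance(run_length_decode(x_runs), run_length_decode(y_runs))
print(x_runs, y_runs, distance)
```

Here the strings have n = 9 and m = 8 symbols but only k = l = 3 runs each, so the baseline fills a 10 × 9 matrix. The saving described in the abstract comes from working on the runs and skipping parts of that matrix.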

Original language: English
Title of host publication: Applied Computing
Subtitle of host publication: Technological Challenges of the 1990's
Publisher: ACM
Pages: 137-143
Number of pages: 7
ISBN (Print): 089791502X
Publication status: Published - Dec 1 1992
Event: Proceedings of the 1992 ACM/SIGAPP Symposium on Applied Computing - SAC '92 - Kansas City, KS, USA
Duration: Mar 1 1992 to Mar 3 1992

Publication series

Name: Applied Computing: Technological Challenges of the 1990's

Other

Other: Proceedings of the 1992 ACM/SIGAPP Symposium on Applied Computing - SAC '92
City: Kansas City, KS, USA
Period: 3/1/92 to 3/3/92

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Bunke, H., & Csirik, J. (1992). Edit distance of run-length coded strings. In Applied Computing: Technological Challenges of the 1990's (pp. 137-143). ACM.