### Abstract

We give an algorithm for measuring the similarity of run-length coded strings. In run-length coding, the individual symbols of a string are not all listed. Instead, each run of identical consecutive symbols is coded by one representative symbol together with its multiplicity. If the strings under consideration consist of long runs of identical symbols, run-length coding can significantly reduce memory use and access time. Our algorithm determines the minimum-cost sequence of edit operations needed to transform one string into another. Like the classical algorithm of Wagner and Fischer, it uses an edit matrix as its basic data structure. However, depending on the particular pair of strings to be compared, usually only a part of this edit matrix needs to be computed. In the worst case, our algorithm has a time complexity of O(n·m), where n and m are the lengths of the strings to be compared. In the best case, the time complexity is O(k·l), where k and l are the numbers of runs of identical symbols in the two strings.
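As a rough illustration of the two ingredients named in the abstract, the following Python sketch shows run-length coding and the classical full-matrix Wagner-Fischer edit distance. It is not the run-length-aware algorithm of the paper: it always fills the entire (n+1)×(m+1) matrix, which corresponds to the O(n·m) worst case. The function names and the unit costs for insertion, deletion, and substitution are assumptions made for this example only.

```python
from itertools import groupby


def run_length_encode(s):
    """Return one (symbol, multiplicity) pair per run of identical symbols."""
    return [(symbol, len(list(group))) for symbol, group in groupby(s)]


def run_length_decode(runs):
    """Expand (symbol, multiplicity) pairs back into the plain string."""
    return "".join(symbol * count for symbol, count in runs)


def wagner_fischer(a, b):
    """Classical edit distance over the full edit matrix.

    Unit costs for insert, delete, and substitute are an assumption
    of this sketch, not something stated in the abstract.
    """
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i  # delete the first i symbols of a
    for j in range(1, m + 1):
        d[0][j] = j  # insert the first j symbols of b
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = d[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, substitution)
    return d[n][m]


if __name__ == "__main__":
    a, b = "aaaabbbb", "aaabbbbb"
    runs_a, runs_b = run_length_encode(a), run_length_encode(b)
    print("n, m =", len(a), len(b))            # string lengths
    print("k, l =", len(runs_a), len(runs_b))  # numbers of runs
    print("distance =", wagner_fischer(a, b))
```

For strings made of long runs, k and l are much smaller than n and m, which is where the gap between the O(k·l) best case and the O(n·m) worst case in the abstract comes from.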

Original language | English |
---|---|
Title of host publication | Applied Computing |
Subtitle of host publication | Technological Challenges of the 1990's |
Publisher | Publ by ACM |
Pages | 137-143 |
Number of pages | 7 |
ISBN (Print) | 089791502X |
Publication status | Published - Dec 1 1992 |
Event | Proceedings of the 1992 ACM/SIGAPP Symposium on Applied Computing - SAC '92 - Kansas City, KS, USA; Duration: Mar 1 1992 → Mar 3 1992 |

### Publication series

Name | Applied Computing: Technological Challenges of the 1990's |
---|---|

### Other

Other | Proceedings of the 1992 ACM/SIGAPP Symposium on Applied Computing - SAC '92 |
---|---|
City | Kansas City, KS, USA |
Period | 3/1/92 → 3/3/92 |

### ASJC Scopus subject areas

- Engineering(all)

### Cite this

*Applied Computing: Technological Challenges of the 1990's* (pp. 137-143). (Applied Computing: Technological Challenges of the 1990's). Publ by ACM.