Substring Counting with Insertions

Bibliographic Details
Main Authors: Janez Brank, Tomaž Hočevar
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Algorithms
Online Access: https://www.mdpi.com/1999-4893/18/6/371
Description
Summary: Substring counting is a classical algorithmic problem with numerous linear-time solutions. In this paper, we address a variation of the problem: given three strings p, t, and s, we count the occurrences of p in every string that results from inserting t into s at each possible position. Essentially, we are solving several substring counting problems for the same substring p in related strings. We give a detailed description of several conceptually different approaches to solving this problem and conclude with an algorithm that runs in linear time. The solution is based on a recent result from the field of substring search in compressed sequences and exploits the periodicity of strings. We also provide a self-contained implementation of the algorithm in C++ and experimentally verify its behavior, chiefly to demonstrate that its running time is linear in the lengths of all three input strings.
ISSN: 1999-4893
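Illustration: the problem stated in the summary can be made concrete with a brute-force baseline. The sketch below is not the linear-time algorithm from the article. It simply builds every string obtained by inserting t into s and counts occurrences of p in each one. The function names `countOccurrences` and `countWithInsertions` are hypothetical, and two conventions are assumptions of this sketch rather than statements from the article: overlapping occurrences are counted, and an empty p yields a count of 0.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Count occurrences of p in text, including overlapping ones.
// Assumption of this sketch: an empty pattern has zero occurrences.
std::size_t countOccurrences(const std::string& text, const std::string& p) {
    if (p.empty()) return 0;
    std::size_t count = 0;
    for (std::size_t pos = text.find(p); pos != std::string::npos;
         pos = text.find(p, pos + 1)) {
        ++count;
    }
    return count;
}

// Brute-force baseline: result[k] is the number of occurrences of p in the
// string formed by inserting t into s before position k, for k = 0..|s|.
// Each of the |s| + 1 candidate strings is built and scanned from scratch,
// so the running time is far from linear.
std::vector<std::size_t> countWithInsertions(const std::string& s,
                                             const std::string& t,
                                             const std::string& p) {
    std::vector<std::size_t> result;
    result.reserve(s.size() + 1);
    for (std::size_t k = 0; k <= s.size(); ++k) {
        const std::string combined = s.substr(0, k) + t + s.substr(k);
        result.push_back(countOccurrences(combined, p));
    }
    return result;
}
```

For example, with s = "ab", t = "a", and p = "aa", the three candidate strings are "aab", "aab", and "aba", so `countWithInsertions("ab", "a", "aa")` returns the counts 1, 1, 0. A baseline of this kind is mainly useful as a reference for checking a faster implementation on small inputs.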