I’m curious in the worst-case running time of the Knuth Morris Pratt Algorithm for String Matching:
Assume that T = AAAAB and P = AB.
The prefix table in this situation is: #P Sk-D 0 A 0 1 B 0

As I go over T, I compare P values to it: AB AAAAB At Cell#0, there is a partial match. We transfer P to the next cell since the skip distance for Cell#0 for P is zero: AB AAAAB

As a result, the identical situation will be repeated three times (6 times) until T matches the substring “AB” in P. (2 more comparisons). There were a total of eight comparisons: Length of P * Length of T - Length of T = 5 * 2 - 2 = 8 = O(mn) - O(m) = O (mn). This is obviously not O(m+n), the worst-case constraint for the KMP method.

What am I doing incorrectly? For your convenience, I read this post by scaler topics, but I believe my prefix table is inaccurate. However, if I follow the definition of “longest suffix in P that is also a prefix in P,” I cannot see any additional values in the prefix table.

I would appreciate any assistance.
Thank you very much!

In your analysis, it seems that you are calculating the number of comparisons made in the worst-case scenario using the Knuth-Morris-Pratt (KMP) algorithm. However, there is a slight misunderstanding in your calculation.

The worst-case time complexity of the KMP algorithm is indeed O(m + n), where m is the length of the pattern P and n is the length of the text T. It is not correct to consider it as O(mn).

The prefix table you provided is as follows: #P Sk-D
0 A 0
1 B 0

When you compare the pattern P = “AB” against the text T = “AAAAB”, you correctly identified that there is a partial match at Cell#0. However, the skip distance (Sk-D) for Cell#0 is not zero. It should be 0 for Cell#0 and 1 for Cell#1.

The correct process of matching the pattern against the text would be as follows:

T = AAAAB
P = AB

At Cell#0, there is a partial match. The skip distance for Cell#0 is 0, so we transfer P to the next cell:
AB AAAAB

At Cell#1, there is a complete match. The pattern P has been found in the text T.

In this case, there are a total of two comparisons made, which is equal to the length of the pattern P. Therefore, the worst-case time complexity is O(m + n), not O(mn).

It’s important to note that the worst-case time complexity of the KMP algorithm arises when there are no partial matches, and the pattern needs to be compared against the entire text.