Worst-Case Analysis of the KMP Algorithm

fammyW · January 17, 2023, 10:31am

I’m curious in the worst-case running time of the Knuth Morris Pratt Algorithm for String Matching:
Assume that T = AAAAB and P = AB.
The prefix table in this situation is:
#P Sk-D
0 A 0
1 B 0

As I go over T, I compare P values to it: AB AAAAB At Cell#0, there is a partial match. We transfer P to the next cell since the skip distance for Cell#0 for P is zero: AB AAAAB

As a result, the identical situation will be repeated three times (6 times) until T matches the substring “AB” in P. (2 more comparisons). There were a total of eight comparisons: Length of P * Length of T - Length of T = 5 * 2 - 2 = 8 = O(mn) - O(m) = O (mn). This is obviously not O(m+n), the worst-case constraint for the KMP method.

What am I doing incorrectly? For your convenience, I read this post by scaler topics, but I believe my prefix table is inaccurate. However, if I follow the definition of “longest suffix in P that is also a prefix in P,” I cannot see any additional values in the prefix table.

I would appreciate any assistance.
Thank you very much!

shreya12 · June 8, 2023, 6:12am

In your analysis, it seems that you are calculating the number of comparisons made in the worst-case scenario using the Knuth-Morris-Pratt (KMP) algorithm. However, there is a slight misunderstanding in your calculation.

The worst-case time complexity of the KMP algorithm is indeed O(m + n), where m is the length of the pattern P and n is the length of the text T. It is not correct to consider it as O(mn).

The prefix table you provided is as follows:
#P Sk-D
0 A 0
1 B 0

When you compare the pattern P = “AB” against the text T = “AAAAB”, you correctly identified that there is a partial match at Cell#0. However, the skip distance (Sk-D) for Cell#0 is not zero. It should be 0 for Cell#0 and 1 for Cell#1.

The correct process of matching the pattern against the text would be as follows:

T = AAAAB
P = AB

At Cell#0, there is a partial match. The skip distance for Cell#0 is 0, so we transfer P to the next cell:
AB AAAAB

At Cell#1, there is a complete match. The pattern P has been found in the text T.

In this case, there are a total of two comparisons made, which is equal to the length of the pattern P. Therefore, the worst-case time complexity is O(m + n), not O(mn).

It’s important to note that the worst-case time complexity of the KMP algorithm arises when there are no partial matches, and the pattern needs to be compared against the entire text.

Topic		Replies	Views
Intermediate Algorithm Scripting: Missing letters - critique of my solution? JavaScript	7	1112	February 2, 2021
I need help. (Basic Algorithm Scripting, etc.)	16	3840	January 16, 2021
No repeats please, from brute force to super pattern! Code Feedback	9	1682	June 1, 2021
Need help estimating the performance of this algorithm JavaScript	34	2297	November 14, 2021
Exercice in java but the logic apply need help	9	1174	January 17, 2021

Worst-Case Analysis of the KMP Algorithm

Related topics