Calculate the longest substring

Please excuse me if this seems too silly, but I’m not sure why this doesn’t work. I’ve been trying to finish this for a while.
I have to find the longest substring where the the alphabets are arranged in alphabetical order, I should’ve been clearer when I asked.

s = input("Enter a string s: ")
length = len(s)

original_string = s
s_list = [char for char in s]
s_list.sort()
s = ''.join(s_list)

possible_entries = []

while length != 0:
    if s[:length] in original_string:
        possible_entries.append(s[:length])
    length -= 1

max_length = 0
longest_entry = ""
for word in possible_entries:
    if len(word) > max_length:
        max_length = len(word)
        longest_entry = word
    
print('Longest substring in alphabetical order is: {}'.format(longest_entry))

What is this doing for you?

The .sort() method works only on lists, so s_list = [char for char in s] would return a list of the characters in the string. (A list is the python equivalent of an array just in case you’re confused, since I’m using list comprehension here, which is mostly a pythonic thing)

But why is sorting the list of characters helpful?

I have to find the longest substring where the the alphabets are arranged in alphabetical order, I should’ve been clearer when I asked.

Sure, but if you sort all of the characters in the string, then you’re destroying all of the substrings. You end up with a completely different string with completely different substrings.

I know, could you just read the whole of the source code, so that it would be easier to understand why.
This part might make it somewhat clearer

while length != 0:
    if s[:length] in original_string:
        possible_entries.append(s[:length])
    length -= 1

So you are trying to find the longest possible string that can be created in alphabetical order by taking any characters out of your string?

That is completely different than trying to find the longest substring in alphabetical order.

I’m trying to find the longest substring in alphabetical order, so, first I’m taking all of the possible ordered substrings within this(just realised I’ve been doing this in a not-so-efficient way), and then later on in the code, I calculate which of these is the longest.

But you are not considering all possible substrings of the original string. You are considering all possible substrings of a scrambled version of the original string.

If you have "ababcd" then the longest substring should be "abcd", but your scrambled string is "aabbcd", which does not let you check for this possibility.

You’ll also run into trouble with "abaebcd", as the longest alphabetical substring is "bcd", but you will only check for "aabbcde", "aabbcd", "aabbc", "aabb", "aab", "aa", or "a".

Yes, I think I’ll use itertool.product() for this

Just go back to the start and ask yourself: How would you, as a human, look for the longest substring with alphabetical order? How would you go if the string was 1000 characters long? Would you sort it? I highly doubt it.

You would go letter by letter, count how many are in alphabetical order and once they stop being in order, compare the current count to the best you found so far (the best starts at 1 for obvious reasons) - keep the bigger of the two, then with whatever letter you are at now, start counting and checking again.

When running into a problem, first ask yourself how you would do it by hand. Don’t go out and just throw random algorithms at it to see what sticks. Especially itertools.product() is a horrible algorithm - it has a massive time- and space-complexity.

2 Likes