biomg
September 28, 2020, 4:26pm
1
I have a list of items:
list_of_ids = ['abc','xyz','123']
And another very big list containing long strings with IDs:
big_list = ['>abc\nssssssdadadddddddddddda','>qwe\nsaddwwwwwwwwwwwww',...,'>uio\nasdf','123\ndaswdwwwwwwwwww']
I need to pull out the long strings that contain IDs from the first list.
I’ve tried this:
new_list = []
for ID in list_of_ids:
    for i in big_list:
        if ID in i:
            new_list.append(i)
But something is wrong with my twisted logic. I’m a biologist, not a programmer.
I’d appreciate any help with this.
All the best
Can you explain what is going wrong? I’m a bit rusty on my Python, but that should be working.
Could you make a minimal example on https://repl.it/ that demonstrates the issue?
My copy-pasting of your code seems ok, though you might have issues with duplicate entries
my repl link
biomg
September 28, 2020, 5:36pm
5
If a string in really_big_list contains multiple IDs, then it will get added to new_list more than once.
For example, if you had '>abc\nlkjoihohohojohnohn123lkjohhohopihnohjo' in your example, then it would get added to new_list once when 'abc' was found in it and once when '123' was found in it.
My suspicion is that in order to do what you really want to, you’re going to want to use regular expressions.
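Here is a minimal sketch of the duplication described above and two possible ways around it. The sample data is made up to resemble the strings in this thread, and the regex variant assumes each header line is exactly '>' followed by the ID, which may not hold for your real data.

```python
import re

# Example data modelled on the thread; the first entry contains two IDs.
list_of_ids = ['abc', 'xyz', '123']
really_big_list = [
    '>abc\nlkjoihohohojohnohn123lkjohhohopihnohjo',
    '>qwe\nsaddwwwwwwwwwwwww',
    '>123\ndaswdwwwwwwwwww',
]

# Original nested loop: the first string is appended once per matching ID.
nested = []
for ID in list_of_ids:
    for s in really_big_list:
        if ID in s:
            nested.append(s)

# Loop over the strings once instead; any() stops at the first matching ID,
# so each string is kept at most once and the original order is preserved.
unique = [s for s in really_big_list if any(ID in s for ID in list_of_ids)]

# Regex variant: only accept an ID that makes up the whole header line.
# Assumption: the header is exactly '>' + ID (not confirmed in the thread).
pattern = re.compile(
    r'^>(?:' + '|'.join(map(re.escape, list_of_ids)) + r')$',
    re.MULTILINE,
)
by_header = [s for s in really_big_list if pattern.search(s)]

print(len(nested), len(unique), len(by_header))  # 3 2 2
```

The single-pass version is probably enough if an ID appearing anywhere in the string counts as a match. The header-only regex is stricter and avoids hits where an ID happens to occur inside the sequence itself.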
biomg
September 28, 2020, 5:43pm
7
ArielLeslie:
If a string in really_big_list contains multiple IDs, then it will get added to new_list more than once.
For example, if you had '>abc\nlkjoihohohojohnohn123lkjohhohopihnohjo' in your example, then it would get added to new_list once when 'abc' was found in it and once when '123' was found in it.
My suspicion is that in order to do what you really want to, you’re going to want to use regular expressions.
OK, thanks. I will consider regex.
Is it possible that every string accidentally matches?
Your code seems to not run:
repl.it link
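On the question of every string matching by accident: one way this can happen is an empty string in the ID list, since '' is a substring of every string in Python. The sketch below is only an illustration. The empty entry is a hypothetical stand-in for something like a blank line picked up while reading IDs from a file, which nothing in this thread confirms.

```python
# Hypothetical ID list: the trailing '' stands in for a blank line picked up
# while reading IDs from a file (an assumption, not shown in the thread).
list_of_ids = ['abc', 'xyz', '']
big_list = ['>abc\nssssss', '>qwe\nsaddww', '>uio\nasdf']

# '' is a substring of every string, so every entry matches.
matches_all = [s for s in big_list if any(ID in s for ID in list_of_ids)]

# Dropping empty or whitespace-only IDs first restores the intended behaviour.
clean_ids = [ID.strip() for ID in list_of_ids if ID.strip()]
matches_clean = [s for s in big_list if any(ID in s for ID in clean_ids)]

print(len(matches_all), len(matches_clean))  # 3 1
```

Printing list_of_ids with repr() before filtering is a quick way to spot empty or whitespace-only entries.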