I tried to use BeautifulSoup, but the text I want is not between tags; it is between two specific strings of text in the source code.
The code below will print the entirety of the website's source code. How could I get Python to print just the string between two specific strings within the source code?
Let’s say I was trying to get the product description from the source code. When I look at the source code, the product description is not inside an HTML tag. The only way to locate it seems to be to tell Python the 10 characters that come before the text we need, and then the 10 characters that come after it. The product description will be different for each product, but it will always be located between the same unique strings of characters, regardless of the product.
All text displayed on a web page is inside some HTML element, although the product description may not be the only text in that element. Does the element that contains it have a specific id and/or class attribute? If so, use a parser library to get that specific element, then search the text in that element with a regular expression whose capture group matches whatever lies between the two sets of characters that mark the start and end of the description.
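As a rough sketch of the regular-expression step, assuming the element's text has already been pulled out with a parser: the helper name `text_between`, the sample text, and both marker strings below are made up for illustration and would need to be replaced with the real ones from the page.

```python
import re


def text_between(text, before, after):
    """Return the text found between two literal marker strings, or None."""
    # re.escape makes the markers match literally, even if they contain
    # characters such as "." or "{" that are special in a regex.
    # (.*?) is a non-greedy capture group, so it stops at the first "after".
    pattern = re.escape(before) + r"(.*?)" + re.escape(after)
    # re.DOTALL lets the description span multiple lines.
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1).strip() if match else None


# Hypothetical element text and markers, not taken from any real site.
element_text = 'var product = {"desc":"A sturdy oak table.","price":199};'
description = text_between(element_text, '"desc":"', '","price"')
print(description)
```

If the markers never appear, the function returns `None` rather than raising, so the caller can skip products whose page layout differs.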