Web Scraping Job Adverts

biancambali1999 · May 16, 2022, 8:29pm

Hi,

I’m learning how to do web scraping and following along with this tutorial provided by freeCodeCamp: Web Scraping with Python - Beautiful Soup Crash Course - YouTube.

I have run into an issue, though: when I run the Python code in PyCharm, it takes forever to run. Is there a way to overcome this and make it run faster?

I have attached my code below:

from bs4 import BeautifulSoup
import requests

# Pull jobs from a job advertising website that were posted today only.

html_text = requests.get('https://www.totaljobs.com/jobs/python?radius=10').text
soup = BeautifulSoup(html_text, 'lxml')
jobs = soup.find_all('div', class_='ResultsSectionContainer-sc-gdhf14-0 kteggz')
print(jobs)

Any help or advice to make the code run would be much appreciated!
Bianca

kinome79 · May 16, 2022, 8:46pm

Are you sure that the URL you’re using will respond to a standard GET request?

Before trying to parse it with BS, have you tried just printing the response to the console to see what you are getting? If it’s hanging, my guess is that the URL isn’t responding to your GET request and the program is just waiting. In my limited experience, most major pages don’t respond to unsolicited GET requests. Usually, if you want to collect data from a website, you need to see whether they have an API and a set of instructions for downloading data.
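
One way to run that check, as a rough sketch: it assumes the requests library from your code, and the helper name summarize_response and the 10-second timeout are just illustrative choices. The timeout makes a silent server raise an error instead of leaving the program waiting.

```python
import requests


def summarize_response(url, timeout=10):
    """Fetch url and return a one-line summary instead of parsing anything."""
    try:
        # Without timeout=..., requests can wait indefinitely for a reply.
        response = requests.get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        return f"No response within {timeout} seconds"
    except requests.exceptions.RequestException as exc:
        # Covers connection errors, invalid URLs, and similar failures.
        return f"Request failed: {exc}"
    return f"Status {response.status_code}, {len(response.text)} characters received"
```

Calling print(summarize_response('https://www.totaljobs.com/jobs/python?radius=10')) should then show either a status line or a timeout message rather than hanging.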

biancambali1999 · May 17, 2022, 7:19pm

Hi,

Thank you for the help. I printed html_text and didn’t get anything back, so it must be, as you suggested, that an API is required.

Quick question: is there a way to identify whether a website has an API or not?

kinome79 · May 17, 2022, 7:24pm

Not sure, I’m still kinda new. One thing I noticed about the URL you were using was that it didn’t just take me to a page when I typed it into a browser. It took me to a loading prompt, meaning it’s running scripts and not just returning HTML. If you use the URL they use in the lesson you were watching, your code does work, so you don’t necessarily need an API, but you do need a simple page that returns HTML, not a script-driven site. I did something similar with the freeCodeCamp Python cert and found I could read my own pages and other basic pages, but google.com and other large sites denied such simple requests.
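
A quick way to tell the two cases apart is to count the elements you are after alongside the script tags in whatever HTML comes back. This is only a sketch: the function name inspect_html and the two sample pages are made up for illustration, and it uses Python's built-in html.parser so lxml is not needed.

```python
from bs4 import BeautifulSoup


def inspect_html(html, tag, class_name):
    """Count matching elements and <script> tags in an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        "matches": len(soup.find_all(tag, class_=class_name)),
        "scripts": len(soup.find_all("script")),
    }


# Hypothetical page that already contains the data in its HTML.
static_page = '<html><body><div class="job">Python Developer</div></body></html>'

# Hypothetical page that only ships a placeholder and loads data via a script.
scripted_page = '<html><body><div id="app"></div><script src="app.js"></script></body></html>'

print(inspect_html(static_page, "div", "job"))
print(inspect_html(scripted_page, "div", "job"))
```

Zero matches together with script tags and an almost empty body suggests the content is filled in by JavaScript after the page loads, which a plain GET request will not see.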

biancambali1999 · May 17, 2022, 8:00pm

Got you, thank you for sharing that with me.

system · November 16, 2022, 8:01am

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.
