Hi, I was experimenting with web scraping and got stuck with an error and no output. Your help is appreciated.
I'm getting no data (None) back.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from csv import writer

# executable_path is deprecated in Selenium 4; wrap the path in a Service object
service = Service(r'C:\Users\Desktop\webdrivers\chromedriver_win32\chromedriver.exe')
driver = webdriver.Chrome(service=service)
driver.get('https://')
#driver.maximize_window()

# find_elements (plural) returns a list that can be iterated over;
# find_element returns a single WebElement, which cannot
lists = driver.find_elements(By.XPATH, '//*[@id="main-content"]/div[3]/div/div[2]/div[1]/div[1]/div')
#print(lists)

with open(r'C:\Users\Desktop\list.csv', 'w', encoding='utf8', newline='') as f:
    thewriter = writer(f)
    header = ['col 3', 'col 4', 'col 5', 'col 6', 'col 7', 'col 8', 'col 9', 'col 10']
    thewriter.writerow(header)
    # loop variable renamed from "list" (which shadows the built-in); spaces
    # are not allowed in variable names, so col 3 becomes col_3, etc.
    for row in lists:
        col_3 = driver.find_element(By.XPATH, '//*[@id="main-content"]/div[3]/div/div[2]/div[1]/table/tbody/tr[2]/td[4]').text.replace('\n', '')
        col_4 = driver.find_element(By.XPATH, '//*[@id="main-content"]/div[3]/div/div[2]/div[1]/table/tbody/tr[1]/td[5]').text.replace('\n', '')
        col_5 = driver.find_element(By.XPATH, '//*[@id="main-content"]/div[3]/div/div[2]/div[1]/table/tbody/tr[1]/td[6]').text.replace('\n', '')
        col_6 = driver.find_element(By.XPATH, '//*[@id="main-content"]/div[3]/div/div[2]/div[1]/table/tbody/tr[1]/td[7]').text.replace('\n', '')
        col_7 = driver.find_element(By.XPATH, '//*[@id="main-content"]/div[3]/div/div[2]/div[1]/table/tbody/tr[1]/td[8]').text.replace('\n', '')
        col_8 = driver.find_element(By.XPATH, '//*[@id="main-content"]/div[3]/div/div[2]/div[1]/table/tbody/tr[1]/td[9]').text.replace('\n', '')
        col_9 = driver.find_element(By.XPATH, '//*[@id="main-content"]/div[3]/div/div[2]/div[1]/table/tbody/tr[1]/td[10]').text.replace('\n', '')
        col_10 = driver.find_element(By.XPATH, '//*[@id="main-content"]/div[3]/div/div[2]/div[1]/table/tbody/tr[1]/td[11]').text.replace('\n', '')
        info = [col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10]
        thewriter.writerow(info)  # stray "]" removed
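As a side note, the CSV-writing part can be tested on its own, without a browser. This is a minimal sketch with dummy values standing in for the scraped `.text` results; it also shows why `col 3` has to become `col_3` (Python identifiers cannot contain spaces):

```python
import csv
import io

# Dummy cell text standing in for the scraped values (hypothetical data,
# not from the real page). "col 3" is not a valid variable name, so the
# variables are col_3, col_4, ...
col_3 = "some cell text"
col_4 = "another cell"

buf = io.StringIO()            # in-memory file, so no Windows path is needed
thewriter = csv.writer(buf)
thewriter.writerow(["col 3", "col 4"])   # header row
thewriter.writerow([col_3, col_4])       # one data row

print(buf.getvalue())
```

Once this pattern works with dummy data, swapping the dummy strings for the `driver.find_element(...).text` calls isolates any remaining problem to the selectors themselves.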
Console output (a deprecation warning, and the print shows a WebElement object rather than its text):
C:\Users\.spyder-py3\scrape_selenium.py:3: DeprecationWarning: executable_path has been deprecated, please pass in a Service object
  driver=webdriver.Chrome(r'C:\Users\Desktop\webdrivers\chromedriver_win32\chromedriver.exe')
[<selenium.webdriver.remote.webelement.WebElement (session="916bd566098750ef2801e7b129aec8dd", element="324f660d-8a4e-4a1f-8446-2da8eab3632d")>]
I have since revised the code from XPath to By.CLASS_NAME and modified it as below.
However, I see no result, even though there is no error.
I also tried adding further classes ("col-7", "col-8", "col-9"), but the output is empty or I get an indentation error. Any thoughts or tweaks to the code are appreciated.
import xlsxwriter
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By

element_list = []
# create the driver once, up front, instead of launching a new browser on every page
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

for page in range(1, 3):
    # note: this is only a query string; the site's base URL needs to be prepended
    page_url = "OcrText=false&searchType=quickSearch&viewType=list" + str(page)
    driver.get(page_url)
    title = driver.find_elements(By.CLASS_NAME, "col-3")
    price = driver.find_elements(By.CLASS_NAME, "col-4")
    description = driver.find_elements(By.CLASS_NAME, "col-5")
    rating = driver.find_elements(By.CLASS_NAME, "col-6")
    for i in range(len(title)):
        element_list.append([title[i].text, price[i].text, description[i].text, rating[i].text])

with xlsxwriter.Workbook(r'C:\Users\Desktop\list.xlsx') as workbook:
    worksheet = workbook.add_worksheet()
    for row_num, data in enumerate(element_list):
        worksheet.write_row(row_num, 0, data)

driver.close()
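On the inner loop: `zip()` is a slightly safer pattern than indexing with `range(len(title))`, because it stops at the shortest list instead of raising an IndexError when one selector matches fewer elements. A sketch with dummy data (hypothetical values, no browser needed):

```python
# Dummy parallel lists standing in for the .text of each column's matched
# elements. zip() pairs items positionally and stops at the shortest list,
# so a length mismatch cannot raise an IndexError.
titles = ["Book A", "Book B"]
prices = ["10", "12"]
ratings = ["4.5", "3.9"]

element_list = [[t, p, r] for t, p, r in zip(titles, prices, ratings)]
print(element_list)
```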
What is each selector returning — did you log them out before looping?
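To illustrate the logging point with dummy data (the lists below stand in for what `find_elements` would return; no browser involved): if any selector's result list is empty, the loop body never runs, which would explain empty output with no error.

```python
# Hypothetical stand-ins for driver.find_elements(...) results.
title = ["Book A", "Book B"]   # "col-3" matched two elements
price = []                     # "col-4" matched nothing on the page

# Log each selector's match count before looping over the results.
for name, found in [("col-3", title), ("col-4", price)]:
    print(name, "matched", len(found), "elements")
```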
Shouldn’t it just be By.CLASS_NAME and not by=By.CLASS_NAME? I don’t really use Python, and I’ve never tried web scraping with it (well, I might have for fun at some point, but I don’t remember).
I can’t really help you with the selectors, since I don’t know what the page structure is. I’d suggest copying the selector using the browser’s dev tools if you haven’t already (check the Using XPaths and CSS Selectors link I gave).