I'm working on a web scraper in Python 3. I have a list of URLs, and each URL contains more than 100 article URLs (spread across paginated pages).
I added the script to an AWS Lambda function. First, it reads the URLs from a CSV file and sends each main URL to a function that selects the href of every article on the page.
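The href-collecting step might look like the following sketch. It uses the stdlib `html.parser` so it runs without a network call; in the real scraper the HTML would come from an HTTP request to the main URL, and the sample page and function names here are illustrative assumptions:

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def collect_article_hrefs(html: str) -> list:
    """Return all hrefs found in the given HTML (one main URL's page)."""
    parser = HrefCollector()
    parser.feed(html)
    return parser.hrefs


# Illustrative page content; the real scraper would fetch this per URL.
sample_page = (
    '<div class="articles">'
    '<a href="/article/1">One</a>'
    '<a href="/article/2">Two</a>'
    "</div>"
)
print(collect_article_hrefs(sample_page))  # ['/article/1', '/article/2']
```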
All of the above works fine.
The issue is that Lambda has a 15-minute timeout, so I need to rerun the job from the point where it stopped: the page number the scraper had reached and the remaining URLs.
The CSV file is read with pandas and converted to a dictionary. I select an entry from the dictionary and pass it to the function, along with a regex for the page number and the number of pages to scrape (for pagination).
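The CSV-to-dictionary step could be sketched as below. The question uses pandas (e.g. `read_csv(...).to_dict()`); this sketch uses the stdlib `csv` module to stay dependency-free and produces a similar index-to-URL mapping. The column name `url` is an assumption:

```python
import csv
import io


def read_urls(csv_text: str) -> dict:
    """Read the CSV and return a dict mapping row index -> main URL,
    similar in shape to pandas' to_dict() on a 'url' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {i: row["url"] for i, row in enumerate(reader)}


# Illustrative CSV content; the real file would be read from disk.
csv_text = "url\nhttps://example.com/section-a\nhttps://example.com/section-b\n"
urls = read_urls(csv_text)
print(urls[0])  # https://example.com/section-a
```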
Can I store the current page number and the remaining URLs in a local folder inside Lambda, and rerun from that point? If so, how?