Lesson E vs F in 'Python for Everybody - Networking'

I’ve just gone through chapter 12 of the ‘Python for Everybody’ course, which is about networking. I’m a little confused about the difference between what was covered in lesson E and what was covered in lesson F.

This is the basic code pattern learnt in E (rewritten slightly so it looks consistent with F):

import urllib.request, urllib.parse, urllib.error

url = input('Enter url: ')
html = urllib.request.urlopen(url)

for line in html:
    print(line.decode().strip())

I understand that this “opens” the webpage, reads its HTML line by line, decodes each line from bytes to text, strips the leading and trailing white-space from it, then displays it.
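To check that reading without touching the network, here is a minimal sketch. It assumes `io.BytesIO` is a fair stand-in for the object `urlopen` returns (both yield one bytes line per iteration), and the HTML in it is made up for illustration:

```python
import io

# Stand-in for urllib.request.urlopen(url): like that response object,
# BytesIO yields one bytes line per loop iteration.
# The HTML here is invented for the example, not taken from a real page.
fake_response = io.BytesIO(b"<h1>The First Page</h1>\n  <p>Some text.</p>  \n")

cleaned = []
for line in fake_response:
    # decode() converts bytes to str (UTF-8 by default);
    # strip() trims white-space only at the two ends of the line.
    cleaned.append(line.decode().strip())

for text in cleaned:
    print(text)
```

The space inside "Some text." survives, so `strip()` is tidying line endings and indentation rather than removing white-space in general.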

We then have lesson F:

import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup

url = input('Enter url: ')
html = urllib.request.urlopen(url).read()
soup = BeautifulSoup(html, 'html.parser')


tags = soup('a')
for tag in tags:
    print(tag.get('href', None))

Aside from retrieving the tags at the end, wouldn’t this do the same thing as in E if I got rid of that bit and wrote print(soup)? I guess what I’m asking is: what’s special about BeautifulSoup?
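One way to explore that question is a small sketch like the one below. It assumes the `bs4` package is installed, and it uses an invented HTML string (with hypothetical `example.com` links) in place of the bytes returned by `urlopen(url).read()`:

```python
from bs4 import BeautifulSoup

# Invented HTML standing in for urllib.request.urlopen(url).read();
# the links are hypothetical and only there for illustration.
html = """<h1>The First Page</h1>
<p>Go to the <a href="http://example.com/page2.htm">Second Page</a>
or <a href="/about">About</a>.</p>"""

soup = BeautifulSoup(html, 'html.parser')

# print(soup) would show roughly the same markup as the loop in lesson E.
# The difference is that soup is a parsed tree, so it can be queried
# by tag name and attribute instead of being searched as plain text.
links = [tag.get('href', None) for tag in soup('a')]
texts = [tag.get_text() for tag in soup('a')]
heading = soup.find('h1').get_text()

print(links)
print(texts)
print(heading)
```

So if the goal is only to display the page, the two versions look much the same. The parsed tree starts to matter when you want specific pieces, such as every link target or the text of one heading, which would otherwise need string searching or regular expressions on the raw lines.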

I’ve edited your post for readability. When you enter a code block into a forum post, please precede it with a separate line of three backticks and follow it with a separate line of three backticks to make it easier to read.

You can also use the “preformatted text” tool in the editor (</>) to add backticks around text.

See this post to find the backtick on your keyboard.
Note: Backticks (`) are not single quotes (’).

I used http://www.dr-chuck.com/page1.html, the URL Chuck provides in the lessons, as the test URL.