Python Web Scraping Example

Now that you've used your browser's dev tools to get a good idea of the structure of the page you want to scrape, you're ready to start your web scraper by writing some code that fetches the page's HTML with Python's requests library.

Install Requests

First, you'll need to create a virtual environment and install the requests library since it's an external package:

python3 -m venv venv
source venv/bin/activate
python3 -m pip install requests

Executing these three commands in succession will create and activate a new virtual environment named venv and install requests into that virtual environment.

What is the `requests` Library?

The requests library is a widely used external Python library for sending HTTP requests, which is how your code interacts with web pages. Its user-friendly interface is reflected in its tagline:

HTTP for Humans

You've used the requests library before. Often, you won't need more than a few lines of code to get what you want, and that holds true for this web scraping example as well:

import requests

BASE_URL = "https://codingnomads.github.io/recipes/"
page = requests.get(BASE_URL)
print(page.text)

This short code snippet will fetch the content of the main page of the CodingNomads recipe collection and print the HTML content to your console.
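The snippet above assumes that the request succeeds. As a minimal sketch of a slightly more defensive version, the hypothetical helper `fetch_html()` below adds a timeout and raises an error when the server answers with a 4xx or 5xx status code. The function name and the 10-second timeout are assumptions made for illustration and aren't part of the original example:

```python
import requests

BASE_URL = "https://codingnomads.github.io/recipes/"


def fetch_html(url, timeout=10):
    """Return the HTML of `url` as a string, or raise on an HTTP error.

    Hypothetical helper for illustration; the timeout value is an assumption.
    """
    # Without a timeout, requests can wait indefinitely for a slow server.
    response = requests.get(url, timeout=timeout)
    # Raises requests.exceptions.HTTPError for 4xx and 5xx status codes.
    response.raise_for_status()
    return response.text
```

Calling `print(fetch_html(BASE_URL))` should give you the same HTML as the shorter snippet, but a missing page now fails loudly instead of silently handing you the HTML of an error page.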

Info: When you see the output in your console, you'll understand why it's much more user-friendly to inspect the HTML structure using your dev tools in your browser :)

There's your page content! With just a bit of code, you've got access to all the HTML of the page inside of your Python script. So, what type of content are you working with here?

print(type(page.text))  # OUTPUT: <class 'str'>

Looks like this is one big str! Well, that's kind of hard to work with! You could use the Python string methods that you learned about at the beginning of this course to identify the interesting parts of this soup of HTML code, or you could learn to use regular expressions to pick information out of the text.
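To get a feel for how tedious that is, here's a minimal sketch that pulls the page title out of an HTML string twice, once with string methods and once with a regular expression. The `sample_html` string is a made-up stand-in for `page.text`, and both helper functions are hypothetical:

```python
import re

# Made-up stand-in for page.text, so the example runs without a network call.
sample_html = (
    "<html><head><title>Recipes</title></head>"
    "<body><h1>All recipes</h1></body></html>"
)


def title_with_string_methods(html):
    """Find the text between <title> and </title> with str.find() and slicing."""
    start_tag, end_tag = "<title>", "</title>"
    start = html.find(start_tag)
    if start == -1:
        return None  # no opening tag found
    start += len(start_tag)
    end = html.find(end_tag, start)
    if end == -1:
        return None  # no closing tag found
    return html[start:end].strip()


def title_with_regex(html):
    """Find the same text with a non-greedy regular expression."""
    match = re.search(r"<title>(.*?)</title>", html, flags=re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


print(title_with_string_methods(sample_html))
print(title_with_regex(sample_html))
```

Both calls print `Recipes` for this sample, but either version stops working as soon as the markup changes a little. For example, an opening tag with an attribute, such as `<title lang="en">`, makes both functions return `None`.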

But there's an easier way! As is so often the case, someone else has already done the work for you, and you can rely on Python's extensive package ecosystem to provide a well-tested solution for your needs.

In the next lesson, you'll use the Beautiful Soup package to parse the HTML soup that you gathered using requests.

Additional Resources

Requests Documentation: Requests: HTTP for Humans™

Summary: Python Web Scraping

Python's requests library allows you to fetch the HTML content of a static website with a single line of code
The requests library is one part of your Python web scraper
The requests.models.Response object returned by requests.get() gives you access to the HTML of the page
The .text attribute of the response object provides the HTML of the page as a string
