This is a small implementation of web scraping data from a real website using BeautifulSoup and then formatting it using Pandas. I use a Jupyter Notebook instance to execute this code.

Site used:

List of largest companies in the United States by revenue

To begin, I import the libraries and set the variables that I’ll be using to get the data:
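Since the original screenshot is not available here, this is a minimal sketch of what that setup might look like. It assumes the `requests` and `beautifulsoup4` packages are installed; the URL is my best guess at the address of the Wikipedia page named above, and the `fetch_soup` helper and the User-Agent string are hypothetical names added for illustration.

```python
import requests
from bs4 import BeautifulSoup

# Assumed address of the page named above; adjust if the article has moved.
URL = "https://en.wikipedia.org/wiki/List_of_largest_companies_in_the_United_States_by_revenue"


def fetch_soup(url: str, timeout: float = 10.0) -> BeautifulSoup:
    """Download a page and parse it into a BeautifulSoup tree (hypothetical helper)."""
    # Some sites reject requests without a User-Agent; this value is only a placeholder.
    headers = {"User-Agent": "scraping-tutorial/0.1"}
    response = requests.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return BeautifulSoup(response.text, "html.parser")
```

In a notebook cell, `soup = fetch_soup(URL)` would then give a parsed tree to work with in the following steps.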


I inspect the elements on the webpage to find the table class:
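The inspection itself happens in the browser's developer tools, but the same information can be cross-checked from code. The sketch below lists the class attribute of every table in a parsed page. It runs on a tiny made-up HTML snippet (`SAMPLE_HTML`) rather than the live page, and `table_classes` is a hypothetical helper name, so treat it as an illustration rather than the exact cell I ran.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the real page, so the example runs offline.
SAMPLE_HTML = """
<table class="wikitable sortable"><tr><th>Rank</th></tr></table>
<table class="infobox"><tr><td>x</td></tr></table>
"""


def table_classes(soup: BeautifulSoup) -> list:
    """Return the list of CSS classes for each <table> in the document."""
    # BeautifulSoup exposes `class` as a list because it is a multi-valued attribute.
    return [table.get("class", []) for table in soup.find_all("table")]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
classes = table_classes(soup)
for index, names in enumerate(classes):
    print(index, names)
```

On the real page, the class names printed here should match what the element inspector shows for the table of interest.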


I use the find method to locate the ‘table’ element in the page HTML:
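As a rough sketch of this step and the formatting that follows it, the code below uses `find` to select a table by class, reads the header and data cells, and loads them into a Pandas DataFrame. It assumes the target table carries a `wikitable` class (an assumption about the page, not something shown above), uses made-up sample rows so it runs offline, and `table_to_dataframe` is a hypothetical helper name.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Made-up stand-in for the real table; the company names are placeholders.
SAMPLE_HTML = """
<table class="wikitable sortable">
  <tr><th>Rank</th><th>Name</th></tr>
  <tr><td>1</td><td>Company A</td></tr>
  <tr><td>2</td><td>Company B</td></tr>
</table>
"""


def table_to_dataframe(soup: BeautifulSoup, class_name: str = "wikitable") -> pd.DataFrame:
    """Find the first table with the given class and convert it to a DataFrame."""
    table = soup.find("table", class_=class_name)  # first match only
    if table is None:
        raise ValueError(f"no <table> with class {class_name!r} found")

    rows = table.find_all("tr")
    # The first row is assumed to hold the column titles in <th> cells.
    headers = [th.get_text(strip=True) for th in rows[0].find_all("th")]

    records = []
    for row in rows[1:]:
        cells = [cell.get_text(strip=True) for cell in row.find_all(["td", "th"])]
        if cells:  # skip empty rows
            records.append(cells)
    return pd.DataFrame(records, columns=headers)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
df = table_to_dataframe(soup)
print(df)
```

All values come out as strings at this point; converting columns such as rank or revenue to numbers would be a separate cleaning step.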
