Question

Use Python (hint: Beautiful Soup, Selenium, or Scrapy) to web scrape (clean and parse) an HTML data site. You can also use other modules or libraries to clean and manipulate the data. Identify and explain any inconsistencies in the dataset.

The website link: https://www.iii.org/table-archive/23284

Get only the 2019 and 2020 wildfire tables from the website.

Solutions

Expert Solution

Structure of my answer:

1. Images of the code

2. Code

3. Explanation

[Images: screenshots of the code and of the output table]

Explanation:

1. The website divides the data into two sections, current and archives. The current section holds the most recent table (2019 at the time of writing, i.e. the previous year), and the archives hold the data from 2010 to 2018.

Steps:

1. First, we parse the HTML page using Beautiful Soup.

2. Get the div elements.

3. Then split the div elements into current and archives.

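from bs4 import BeautifulSoup
import requests
import pandas as pd
import re

url = "https://www.iii.org/table-archive/23284"

# Download the page and stop early on an HTTP error
wild_fire = requests.get(url)
wild_fire.raise_for_status()

# The "html5lib" parser has to be installed separately (pip install html5lib)
site = BeautifulSoup(wild_fire.content, "html5lib")

# The page is expected to contain two "view-content" divs: current first, archives second
divs = site.find_all('div', class_="view-content")

current, archives = divs[0], divs[1]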

Extracting the Current table data (2019)

def parse_table(table):
    """Build a DataFrame from a table whose cells come in groups of three."""

    cols = ['State', 'Number of Fires', 'Number of acres burned']

    # Collect the text of every cell, trimmed of surrounding whitespace
    cells = [td.get_text().strip() for td in table.find_all('td')]

    # Group consecutive cells into rows of three; an incomplete trailing group is dropped
    data = [cells[i:i + 3] for i in range(0, len(cells) - 2, 3)]

    df = pd.DataFrame(data, columns=cols)
    df.set_index('State', inplace=True)

    return df

    
    
df = parse_table(current.find_all("table")[-1])
df.head()    
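The question also asks to clean the data and point out inconsistencies, and the cells returned by parse_table are still strings. Below is a minimal sketch of one way to handle that, assuming the numeric cells use commas as thousands separators and that anything else (a footnote marker, "n/a", an empty cell) should be flagged rather than guessed. The helper name to_int and the sample rows are made up for illustration and are not values from the site.

```python
import re

def to_int(cell):
    """Convert a scraped cell such as ' 8,194 ' to an int, or None if it is not a plain number."""
    digits = re.sub(r"[,\s]", "", cell)  # drop thousands separators and whitespace
    return int(digits) if re.fullmatch(r"\d+", digits) else None

# Hypothetical rows in the shape parse_table collects: [state, fires, acres]
rows = [
    ["State A", "1,107", " 22,158 "],
    ["State B", "720", "2,498,159"],
    ["State C", "(1)", "n/a"],  # footnote marker and missing value
]

cleaned = [[state.strip(), to_int(fires), to_int(acres)] for state, fires, acres in rows]

# Rows with at least one cell that could not be converted are candidates for manual review
suspect = [row[0] for row in cleaned if None in row[1:]]
```

With pandas, the same helper could be applied to a column with something like df['Number of Fires'].map(to_int); the rows listed in suspect are the ones worth describing as inconsistencies.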
    

Extracting the table data from the archives

The table for any archived year can be extracted by changing the year_toget variable.

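from bs4.element import Tag

archives = list(archives.children)

year_toget = 2018
df = None  # stays None if no table for year_toget is found

for table in archives:

    # .children also yields bare text nodes, which contain no tags to search
    if not isinstance(table, Tag):
        continue

    res = table.find('span')
    if res is not None:

        # Accept the heading only if it names exactly one four-digit year
        years = re.findall(r"\d{4}", res.get_text())

        if len(years) == 1 and int(years[0]) == year_toget:
            df = parse_table(table.find_all('table')[-1])
            break

df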

Output:

I extracted the 2019 and 2018 tables because the 2020 table was not yet available. Once it is published, the 2020 table will be in the current section and the 2019 table will move to the archives, so the extraction process will not change.
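Because the tables move between sections over time, it may help to look up the requested years by heading rather than by position. The following is only a sketch of that idea: the function names extract_year and select_years, the (heading, payload) pair layout, and the sample headings are all assumptions for illustration, not the site's actual markup.

```python
import re

def extract_year(heading):
    """Return the year if the heading contains exactly one four-digit number, else None."""
    years = re.findall(r"\b\d{4}\b", heading)
    return int(years[0]) if len(years) == 1 else None

def select_years(sections, wanted):
    """Map each wanted year to the first matching payload; years not found map to None."""
    found = {year: None for year in wanted}
    for heading, payload in sections:
        year = extract_year(heading)
        if year in found and found[year] is None:
            found[year] = payload
    return found

# Hypothetical (heading, payload) pairs; in the scraper the payload would be a parsed table
sections = [
    ("Wildfires By State, 2019", "table-2019"),
    ("Wildfires By State, 2018", "table-2018"),
    ("Wildfires By State, 2014-2018", "summary"),  # two years, so it is skipped
]

result = select_years(sections, wanted=[2019, 2020])
missing = [year for year, payload in result.items() if payload is None]
```

Here result holds the 2019 payload and None for 2020, and missing lists 2020, which makes the absence of a requested year explicit instead of silently returning a different table.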


