Assignment 4

Getting to know some butterflies


Common Buckeye (Junonia coenia)

Due Date: Friday, April 1st, 11:59:59 PM

Value: 150 points

GitHub Invite: Click Here

Collaboration: For Assignment 4, collaboration is not allowed; you must work individually. You may still see your TA and come to office hours for help, but you may not work with any other CMSC 210 students. You may post questions on Discord, but you may not post code.

Objectives:

Instructions

The state of Maryland is home to many beautiful species of butterfly. The website marylandbutterflies.com is designed to aid in the identification of these insects and includes information on, and photos of, hundreds of native species. Your task is to generate a CSV file containing data about these butterflies by scraping a mirrored version of the site. An example of what the CSV file should contain is included in the GitHub assignment repository, in the file example_output.csv.

Your code should access the detail page for every Maryland butterfly. Use BeautifulSoup to extract the detail page links from the site's home page. An example detail page is this one for the pink-edged sulphur. On that page we see the following facts:
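As a rough illustration of the link-extraction step, here is a minimal sketch that runs on a small inline HTML string rather than the real site. The markup, the `butterflies/` path prefix, the `BASE_URL` value, and the `detail_links` helper are all assumptions made for the example; inspect the actual home page to see how its links are structured.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical home-page markup; the real mirror's structure may differ.
HOME_HTML = """
<html><body>
  <a href="butterflies/pink-edged-sulphur.html">Pink-edged Sulphur</a>
  <a href="butterflies/common-buckeye.html">Common Buckeye</a>
  <a href="about.html">About this site</a>
</body></html>
"""

# Placeholder base URL, used only to resolve relative links.
BASE_URL = "https://example.com/mirror/"


def detail_links(html, base_url):
    """Return absolute URLs for every link whose href starts with butterflies/."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        href = anchor["href"]
        if href.startswith("butterflies/"):  # assumed naming convention
            links.append(urljoin(base_url, href))
    return links


links = detail_links(HOME_HTML, BASE_URL)
for link in links:
    print(link)
```

In a real scraper the HTML string would come from something like `requests.get(url).text` instead of an inline constant.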

Name: Pink-edged Sulphur
Latin Name: Colias interior
Size: Wingspan ranges from 1.5" - 2.6"
Occurrence Level: Rare
Flight Period: July
Larval Host Plant: Velvetleaf blueberry

This is the information your scraper should extract for each butterfly.
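One possible way to turn label/value pairs like these into a dictionary is sketched below. It assumes, purely for illustration, that each label sits in a `<dt>` element followed by its value in a `<dd>` element; the real detail pages may use different tags, so the selectors (and the hypothetical `parse_facts` helper) would need adjusting.

```python
from bs4 import BeautifulSoup

# Hypothetical detail-page markup: each label in a <dt>, each value in a <dd>.
DETAIL_HTML = """
<dl>
  <dt>Name:</dt><dd>Pink-edged Sulphur</dd>
  <dt>Latin Name:</dt><dd>Colias interior</dd>
  <dt>Size:</dt><dd>Wingspan ranges from 1.5" - 2.6"</dd>
  <dt>Occurrence Level:</dt><dd>Rare</dd>
  <dt>Flight Period:</dt><dd>July</dd>
  <dt>Larval Host Plant:</dt><dd>Velvetleaf blueberry</dd>
</dl>
"""


def parse_facts(html):
    """Map each label (without its trailing colon) to its value text."""
    soup = BeautifulSoup(html, "html.parser")
    facts = {}
    for label in soup.find_all("dt"):
        value = label.find_next_sibling("dd")
        if value is None:
            continue  # label with no value; skip it rather than guess
        key = label.get_text(strip=True).rstrip(":")
        facts[key] = value.get_text(strip=True)
    return facts


facts = parse_facts(DETAIL_HTML)
for key, value in facts.items():
    print(f"{key}: {value}")
```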

One final note: the home page contains a handful of sections of butterflies:

The butterflies in the last, non-Maryland section should not be scraped or appear in your output CSV.
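Skipping the final section could look something like the sketch below. The `<section>` wrappers, the group headings, and the `maryland_hrefs` helper are placeholders invented for the example; the real home page may delimit its sections differently (for instance with headings or tables), in which case the same slicing idea applies to whatever elements mark the sections.

```python
from bs4 import BeautifulSoup

# Hypothetical layout: one <section> per group, the last one non-Maryland.
SECTIONED_HTML = """
<section><h2>Group A</h2><a href="a1.html">A1</a><a href="a2.html">A2</a></section>
<section><h2>Group B</h2><a href="b1.html">B1</a></section>
<section><h2>Non-Maryland</h2><a href="x1.html">X1</a></section>
"""


def maryland_hrefs(html):
    """Collect hrefs from every section except the last one."""
    soup = BeautifulSoup(html, "html.parser")
    sections = soup.find_all("section")
    hrefs = []
    for section in sections[:-1]:  # slice drops the final section
        for anchor in section.find_all("a", href=True):
            hrefs.append(anchor["href"])
    return hrefs


hrefs = maryland_hrefs(SECTIONED_HTML)
print(hrefs)
```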

Pre-defined Python Library Usage

You may use the requests and BeautifulSoup libraries. The final CSV file can be generated with the csv.writer function or the csv.DictWriter class from the csv module in the standard library.
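For reference, a minimal sketch of writing rows with csv.DictWriter follows. The column names are taken from the labels on the example detail page and are an assumption; check example_output.csv for the header the assignment actually expects. The sketch writes to an in-memory buffer so it runs without touching the file system.

```python
import csv
import io

# Assumed column names, based on the labels shown on the detail page.
FIELDNAMES = [
    "Name",
    "Latin Name",
    "Size",
    "Occurrence Level",
    "Flight Period",
    "Larval Host Plant",
]

rows = [
    {
        "Name": "Pink-edged Sulphur",
        "Latin Name": "Colias interior",
        "Size": 'Wingspan ranges from 1.5" - 2.6"',
        "Occurrence Level": "Rare",
        "Flight Period": "July",
        "Larval Host Plant": "Velvetleaf blueberry",
    },
]


def write_rows(rows, stream):
    """Write a header line followed by one CSV line per dictionary."""
    writer = csv.DictWriter(stream, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_rows(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, pass a handle opened with `open(path, "w", newline="")` in place of the buffer. The csv module handles quoting of the inch marks in the Size column on its own.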

Grading Rubric

Following coding standards: 30 points
CSV includes data about all butterflies: 60 points
CSV columns contain correct data: 60 points
Extra credit question: Which page or pages contain the largest number of butterfly photos? (Note: your assignment must include Python code to answer this question.) 20 points
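Because the extra credit question asks for the "page or pages", ties need to be handled. One possible approach is sketched below on made-up page contents. It assumes every `<img>` tag counts as a butterfly photo, which may overcount on real pages that also contain logos or icons. The `PAGES` data and the `pages_with_most_photos` helper are hypothetical.

```python
from bs4 import BeautifulSoup

# Made-up page contents keyed by a placeholder page name.
PAGES = {
    "page-a.html": '<img src="1.jpg"><img src="2.jpg">',
    "page-b.html": '<img src="3.jpg">',
    "page-c.html": '<p>text</p><img src="4.jpg"><img src="5.jpg">',
}


def pages_with_most_photos(pages):
    """Return (sorted list of pages tied for the most <img> tags, that count)."""
    counts = {
        name: len(BeautifulSoup(html, "html.parser").find_all("img"))
        for name, html in pages.items()
    }
    if not counts:
        return [], 0
    top = max(counts.values())
    winners = sorted(name for name, n in counts.items() if n == top)
    return winners, top


winners, top_count = pages_with_most_photos(PAGES)
print(winners, top_count)
```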