What it does

Our project scrapes data points from Walmart.com and sorts them into 9 categories: Food, Electronics, Personal Care, Toys, Home, Movies, Sports Equipment, Clothing, and Pharmacy. Each category has subcategories; for example, Food contains Fruits, Dairy, Bakery, and Frozen.
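A minimal sketch of how scraped records could be organized into this two-level hierarchy. The `Product` class, its `name` and `price` fields, and the `group_by_category` helper are hypothetical names chosen for illustration; only the category and subcategory labels come from the project.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    """One scraped data point (field names are illustrative assumptions)."""
    name: str
    category: str      # e.g. "Food"
    subcategory: str   # e.g. "Dairy"
    price: float


def group_by_category(products):
    """Build a nested mapping: category -> subcategory -> list of products."""
    tree = defaultdict(lambda: defaultdict(list))
    for product in products:
        tree[product.category][product.subcategory].append(product)
    return tree


# Made-up sample records, only to show the shape of the data.
sample = [
    Product("Bananas", "Food", "Fruits", 0.58),
    Product("Whole Milk", "Food", "Dairy", 3.12),
    Product("Greek Yogurt", "Food", "Dairy", 4.98),
]
tree = group_by_category(sample)
```

With this layout, `tree["Food"]["Dairy"]` returns every record filed under Food and then Dairy.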

K-Means clustering is then used to group the data points into clusters that can be queried.
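To make the clustering step concrete, here is a small pure-Python sketch of K-Means (Lloyd's algorithm). It is an illustration only, not the project's notebook code: the `kmeans` function, the `(price, rating)` features, and the sample values are all assumptions.

```python
import random
from math import dist


def kmeans(points, k, max_iterations=100, seed=0):
    """Cluster `points` (tuples of floats) into k groups.

    Returns (centroids, labels), where labels[i] is the cluster index
    assigned to points[i].
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct input points
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(column) / len(members) for column in zip(*members)
                )
    return centroids, labels


# Hypothetical (price, rating) pairs: three cheap items, three expensive ones.
points = [(1.0, 4.5), (1.5, 4.0), (2.0, 4.2),
          (300.0, 3.9), (320.0, 4.1), (310.0, 4.4)]
centroids, labels = kmeans(points, k=2)


def query_cluster(point):
    """Return the index of the cluster whose centroid is closest to `point`."""
    return min(range(len(centroids)), key=lambda c: dist(point, centroids[c]))
```

Querying then reduces to a nearest-centroid lookup: `query_cluster((305.0, 4.0))` returns the same label as the three expensive items.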

How we built it

We used Scrapy (Python) for the web scraping and Google Colab for the K-Means clustering in Python.

Challenges we ran into

We were unable to collect the required 20,000 data points because scraping took too long and the security features of Walmart.com prevented us from scraping.
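The time cost is easy to see with a rough estimate. The sketch below shows the kind of throttling options a Scrapy `settings.py` can carry, plus a back-of-the-envelope helper. All values are assumptions rather than the settings we used, and `estimated_hours` is a hypothetical helper that assumes one request per data point and ignores response time.

```python
# Scrapy settings (illustrative values, not the project's configuration)
ROBOTSTXT_OBEY = True                # respect the site's robots.txt
DOWNLOAD_DELAY = 2.0                 # seconds to wait between requests
CONCURRENT_REQUESTS_PER_DOMAIN = 1   # one request at a time per domain
AUTOTHROTTLE_ENABLED = True          # adapt the delay to server latency
AUTOTHROTTLE_START_DELAY = 2.0
AUTOTHROTTLE_MAX_DELAY = 30.0


def estimated_hours(requests, delay_seconds=DOWNLOAD_DELAY):
    """Lower bound on crawl time: requests issued one by one with a fixed delay."""
    return requests * delay_seconds / 3600


hours_needed = estimated_hours(20_000)
```

Under these assumptions, 20,000 requests at a 2-second delay take at least about 11 hours, before any retries or blocked requests.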

Accomplishments that we're proud of

We learned a lot about web crawling and about working within time constraints on a group project.
