What it does
Our project scrapes datapoints from Walmart.com and sorts them into 9 categories: Food, Electronics, Personal Care, Toys, Home, Movies, Sports Equipment, Clothing, and Pharmacy. Each category has subcategories; for example, Food is divided into Fruits, Dairy, Bakery, and Frozen.
The datapoints are then grouped with K-Means clustering, and the resulting clusters can be queried.
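As a rough illustration of the category layout described above, one scraped datapoint could be represented like this. The `DataPoint` class, its field names, and the sample values are assumptions made for this sketch, not the project's actual schema. Only the Food subcategories are listed here because those are the only ones named.

```python
from dataclasses import dataclass

# The nine top-level categories named in the description.
CATEGORIES = [
    "Food", "Electronics", "Personal Care", "Toys", "Home",
    "Movies", "Sports Equipment", "Clothing", "Pharmacy",
]

# Only Food's subcategories are known; other categories are left unspecified.
SUBCATEGORIES = {"Food": ["Fruits", "Dairy", "Bakery", "Frozen"]}


@dataclass(frozen=True)
class DataPoint:
    """One scraped product record (hypothetical field names)."""
    name: str
    price: float
    category: str
    subcategory: str

    def __post_init__(self):
        # Reject categories outside the nine listed above.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        # Check the subcategory only when the category's list is known.
        known = SUBCATEGORIES.get(self.category)
        if known is not None and self.subcategory not in known:
            raise ValueError(f"unknown subcategory: {self.subcategory}")


# Made-up sample values, for illustration only.
point = DataPoint(name="Whole Milk", price=3.48, category="Food", subcategory="Dairy")
```

Validating at construction time keeps mislabeled rows out of the dataset before clustering.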
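A minimal sketch of the clustering and querying step with scikit-learn's `KMeans`, assuming each datapoint has been reduced to numeric features. The two features (price and rating), the toy values, and `n_clusters=2` are assumptions chosen for illustration, not the project's real data or settings.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up feature rows: [price in dollars, average rating]. Not real Walmart data.
X = np.array([
    [1.0, 4.5], [1.5, 4.0], [2.0, 4.2],        # low-priced items
    [300.0, 4.1], [320.0, 3.9], [310.0, 4.4],  # high-priced items
])

# n_clusters=2 suits this toy data only; random_state makes the run repeatable.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X)

# "Querying" here means asking which cluster a new point falls into.
query = np.array([[305.0, 4.0]])
query_label = model.predict(query)[0]

# Rows in the matched cluster can then be returned as similar items.
similar = X[labels == query_label]
print(similar)
```

Because the features are unscaled, price dominates the distance here; on real data, standardizing the features first would likely give more balanced clusters.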
How we built it
We used Scrapy, a Python framework, for the web scraping and Google Colab to run the K-Means clustering in Python.
Challenges we ran into
We were unable to collect the required 20,000 datapoints, both because scraping takes a long time and because Walmart.com's security features blocked our scraper.
Accomplishments that we're proud of
We learned a lot about web crawling and about working under the time constraints of a group project.
Built With
- scikit-learn
- scrapy

