Inspiration
I was at home one day, deeply conflicted. I didn't want to work on my university work, but I also didn't want to waste my life and not improve myself. I conceived this idea as a system that could hold my generation's short attention span while improving my knowledge base.
What it does
Sample builds a database by web-scraping thousands of scientific articles from sources such as nature.com and science.org. It runs a cosine similarity test between the articles in this database to assign each article to a scientific topic (currently space, nature, physiology and sociology) and presents the categorised articles to the user through a web interface. Because the content of each article is tokenised and lemmatised, the user can search for the articles most relevant to a chosen topic, or find related articles based on keywords.
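A minimal sketch of the categorisation step, using only the Python standard library. The `TOPIC_TERMS` vocabularies, the function names and the simple regex tokeniser are assumptions made for illustration; the real project compares scraped articles against each other and lemmatises the tokens, which this sketch does not do.

```python
import math
import re
from collections import Counter

# Hypothetical reference vocabularies, one per topic (an assumption for this
# sketch; the project derives its comparisons from scraped articles).
TOPIC_TERMS = {
    "space": "planet star galaxy orbit telescope rocket",
    "nature": "forest species climate ocean animal plant",
    "physiology": "muscle heart cell blood organ nerve",
    "sociology": "society culture community inequality social group",
}


def tokenise(text):
    """Lowercase the text and split it into alphabetic tokens (no lemmatisation)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors held as Counters."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def categorise(article_text):
    """Return the topic whose vocabulary is most similar to the article."""
    article_vector = Counter(tokenise(article_text))
    scores = {
        topic: cosine_similarity(article_vector, Counter(tokenise(terms)))
        for topic, terms in TOPIC_TERMS.items()
    }
    return max(scores, key=scores.get)
```

Calling `categorise()` on an article body returns one of the four topic names. Raw term counts keep the sketch short; weighting terms (for example with TF-IDF) would be a natural refinement.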
How we built it
The project is built on a Flask (Python) backend. We used JavaScript, Java and Python to scrape thousands of articles from reputable sources, storing them in an SQL database together with their metadata and most prevalent key terms. We then use the cosine similarity test as a comparison model to assign each article to its most similar topic. We display the results to users through a combination of HTML, CSS (Bootstrap) and JavaScript.
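The storage step could look roughly like the sketch below. It uses Python's built-in `sqlite3` module with an in-memory database; the table layout, column names, stop-word list and function names are all assumptions for illustration, not the project's actual schema or SQL engine.

```python
import re
import sqlite3
from collections import Counter

# Small illustrative stop-word list (an assumption, not the project's list).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "on", "for"}


def create_schema(conn):
    """Create hypothetical tables for article metadata and key terms."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS articles (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            source TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS key_terms (
            article_id INTEGER NOT NULL REFERENCES articles(id),
            term TEXT NOT NULL,
            frequency INTEGER NOT NULL
        );
        """
    )


def store_article(conn, title, source, body, top_n=5):
    """Insert an article and its top_n most frequent non-stop-word terms."""
    tokens = [t for t in re.findall(r"[a-z]+", body.lower()) if t not in STOP_WORDS]
    cursor = conn.execute(
        "INSERT INTO articles (title, source) VALUES (?, ?)", (title, source)
    )
    article_id = cursor.lastrowid
    conn.executemany(
        "INSERT INTO key_terms (article_id, term, frequency) VALUES (?, ?, ?)",
        [(article_id, term, count) for term, count in Counter(tokens).most_common(top_n)],
    )
    conn.commit()
    return article_id


def search_by_keyword(conn, keyword):
    """Return titles of articles that list the keyword among their key terms."""
    rows = conn.execute(
        """
        SELECT a.title
        FROM articles a
        JOIN key_terms k ON k.article_id = a.id
        WHERE k.term = ?
        ORDER BY k.frequency DESC, a.id
        """,
        (keyword.lower(),),
    ).fetchall()
    return [title for (title,) in rows]
```

A Flask route could call `search_by_keyword()` and pass the resulting titles to a template; that wiring is left out here to keep the sketch dependency-free.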
Challenges we ran into
We ran into several challenges. Because the project has many connected parts (backend, frontend, database), we had some difficulty communicating and often had to wait for one another to finish. We initially planned to include featured articles in our carousel; however, the low resolution of the images made this difficult without compromising quality.
Accomplishments that we're proud of
We are proud to have made a fully functioning website, hosted on a remote server, that expresses our dreams both visually and technically. Despite all the challenges, we have learnt a great deal from each other, as we each have our own strengths and weaknesses.
What we learned
We learned how to work as a team, communicating well despite working on different parts. We each specialised in different aspects and taught each other things we may not have known. The 24-hour limit also pushed us to our physical and mental limits, training us to handle new problems more easily.
What's next for Sample
Sample is easily scalable. A further step would be to add user keys, so that each user has their own personalised ID. Building on this, we plan to use AI to personalise the experience to a greater degree. The website could also generate revenue, either through sponsorship from the news sources, advertising or a subscription-based system.