The app does the following:
- Create a supervised training set with the Pocket App:
a. Install Pocket Extension on Chrome (https://chrome.google.com/webstore/search/pocket)
b. Set up Pocket API (https://getpocket.com/developer/apps/new)
- Save articles to a personal repository with a tag of our choosing (‘y’ for interesting articles, ‘n’ for uninteresting ones) to build the labeled (supervised) training dataset
- Fetch the JSON data using the API
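As a rough sketch of the fetch step, the request body and the parsing of the returned JSON might look like the following. The endpoint and the `state`, `tag`, and `detailType` parameters are taken from the Pocket v3 retrieve API as best understood here and should be checked against the Pocket developer documentation; `build_payload` and `extract_urls` are hypothetical helper names, and the sample response is made up for illustration.

```python
import json

# Assumed endpoint of the Pocket v3 "retrieve" call; verify before use.
POCKET_GET_URL = "https://getpocket.com/v3/get"


def build_payload(consumer_key, access_token, tag):
    """Build the JSON body for fetching all saved items carrying one tag."""
    return {
        "consumer_key": consumer_key,
        "access_token": access_token,
        "tag": tag,            # 'y' or 'n' in this project
        "state": "all",        # include archived as well as unread items
        "detailType": "simple",
    }


def extract_urls(response_text):
    """Return the article URLs found in a retrieve response."""
    data = json.loads(response_text)
    # "list" maps item ids to items; fall back to {} when it is empty or missing.
    items = data.get("list") or {}
    urls = []
    for item in items.values():
        url = item.get("resolved_url") or item.get("given_url")
        if url:
            urls.append(url)
    return urls


# Made-up response used only to exercise the parser offline.
sample = json.dumps({
    "status": 1,
    "list": {
        "1": {"resolved_url": "http://example.com/a"},
        "2": {"given_url": "http://example.com/b"},
    },
})
urls = extract_urls(sample)
```

The payload would be sent as a POST to `POCKET_GET_URL` once per tag, so the ‘y’ and ‘n’ URLs can be kept in separate lists and used as labels later.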
- Download story bodies using Embed.ly API from URLs generated in Step 1
a. Set up Embed.ly (https://app.embed.ly/signup)
b. Get story bodies using Embed.ly API
- Transform the articles into a TF-IDF matrix so that machine learning algorithms can be applied
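The TF-IDF transformation could be sketched as follows, assuming scikit-learn is the library in use (the text above does not name one). The three short strings are invented stand-ins for the downloaded story bodies.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented stand-ins for article bodies fetched in the previous step.
docs = [
    "Stock markets rallied after the central bank held rates steady.",
    "The new phone ships with a faster chip and a better camera.",
    "Central bank officials signalled that rates may rise next year.",
]

# Each row of X is one article, each column one vocabulary term.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

print(X.shape)  # (number of articles, vocabulary size)
```

`X` is a sparse matrix, which is what keeps this practical when the vocabulary grows to tens of thousands of terms.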
- Apply linear SVM to our data
a. SVM attempts to linearly separate data points into classes using a maximum-margin hyperplane (H3 in the figure below is the maximum-margin line). A hard-margin SVM, however, does not work when the classes overlap; this can be addressed in two ways:
- Soft-margin SVM: this formulation still maximizes the margin, but adds a penalty for points that fall on the wrong side of the margin
- Kernel trick: this method implicitly maps the data into a higher-dimensional space where it can be linearly separated (https://www.cs.utah.edu/~piyush/teaching/15-9-print.pdf)
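A minimal sketch of fitting a linear, soft-margin SVM on TF-IDF features, again assuming scikit-learn. `LinearSVC` is one possible choice rather than necessarily the one used by the app; its `C` parameter sets how heavily margin violations are penalized. The texts and labels are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Invented training data: 'y' = interesting, 'n' = not interesting.
texts = [
    "Central bank raises interest rates to curb inflation",
    "Startup funding rounds slow as investors turn cautious",
    "Celebrity spotted at beach resort over the weekend",
    "Ten photos of cats that will make you smile",
]
labels = ["y", "y", "n", "n"]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)

# Smaller C tolerates more margin violations; larger C penalizes them harder.
clf = LinearSVC(C=1.0)
clf.fit(X, labels)

# New text must go through the same fitted vectorizer before prediction.
pred = clf.predict(vectorizer.transform(["Investors react to interest rates"]))
print(pred[0])
```

With only four training examples the prediction itself means little; the point is the shape of the workflow: fit the vectorizer once, train on its output, and reuse both objects at prediction time.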
- IFTTT Integration with Feeds, Google Sheets & Email:
a. Search for Feed and click Connect
b. Do the same for Google Drive and allow IFTTT to access the Drive account
c. Create a new applet; on ‘this’, select New Feed Item, add the following URL: http://feeds2.feedburner.com/business and click Create Trigger
d. Next, click on ‘that’, search for Google Drive, select Add row to spreadsheet, click Create Action, and then Finish
e. Install the gspread library (pip install gspread) to download the articles from Google Drive
- After installing gspread, follow the OAuth2 setup instructions at http://gspread.readthedocs.io/en/latest/oauth2.html
- Dump the model using Pickle
- Use IFTTT to send emails for each new digest
a. Load the dumped pickle model
b. Use IFTTT to automatically send an email for each new digest
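Dumping the model and loading it back before building a digest can be sketched with the standard-library `pickle` module. `dump_model` and `load_model` are hypothetical helper names, and the dictionary below is only a placeholder for the fitted vectorizer and classifier.

```python
import os
import pickle
import tempfile


def dump_model(model, path):
    """Serialize a trained model (or any picklable object) to disk."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a previously dumped model. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Placeholder standing in for the fitted vectorizer and classifier.
model = {"vectorizer": "fitted-vectorizer", "classifier": "fitted-svm"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "news_model.pkl")
    dump_model(model, path)
    restored = load_model(path)
```

Saving the vectorizer together with the classifier matters: new articles have to be transformed with the same vocabulary the classifier was trained on.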