Asymmetric-OG/news-class


NEWS CATEGORY CLASSIFICATION USING RECURRENT NEURAL NETWORKS

A comparison of two RNN architectures, a bidirectional GRU and a bidirectional LSTM, on how they train on a text-categorization task.
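
The two compared architectures can be sketched in PyTorch roughly as follows; the class and parameter names are illustrative, not the notebook's actual code:

```python
import torch
import torch.nn as nn

class RNNClassifier(nn.Module):
    """Bidirectional RNN text classifier; rnn_type selects GRU or LSTM."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, num_classes, rnn_type="gru"):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        rnn_cls = nn.GRU if rnn_type == "gru" else nn.LSTM
        self.rnn = rnn_cls(embed_dim, hidden_dim,
                           batch_first=True, bidirectional=True)
        # Bidirectional: forward and backward hidden states are concatenated.
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, tokens):
        embedded = self.embedding(tokens)   # (batch, seq, embed_dim)
        output, _ = self.rnn(embedded)      # (batch, seq, 2 * hidden_dim)
        # Classify from the representation at the final time step.
        return self.fc(output[:, -1, :])
```

Swapping `rnn_type` is the only difference between the two models, which keeps the comparison controlled: embedding size, hidden size, and classifier head stay identical.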


Technologies

  • PyTorch
  • Natural Language Toolkit (NLTK)
  • scikit-learn
  • Pandas, NumPy
  • Netron.app
  • Matplotlib

Model Architectures

GRU (https://netron.app/?url=https://github.com/Asymmetric-OG/NewsClass/raw/refs/heads/master/grumodel.onnx)


LSTM (https://netron.app/?url=https://github.com/Asymmetric-OG/NewsClass/raw/refs/heads/master/lstm.onnx)


Observations (Training-Validation Curves)

Evidently, the GRU overfits early and heavily, likely due to the vanishing-gradient problem, whereas the LSTM trains more stably thanks to its better handling of longer text sequences.

LR=1e-3

[Training-validation curves at LR=1e-3]

Peak GRU validation accuracy: 66.2%
Peak LSTM validation accuracy: 71.5%
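
The validation accuracies above are the fraction of correctly predicted labels on the held-out set. A minimal sketch of such an evaluation helper (name and loader format are illustrative, not from the notebook):

```python
import torch

@torch.no_grad()
def validation_accuracy(model, loader):
    """Fraction of correctly classified examples over a validation loader.

    `loader` yields (tokens, labels) batches; `model` returns class logits.
    """
    model.eval()
    correct, total = 0, 0
    for tokens, labels in loader:
        preds = model(tokens).argmax(dim=1)          # predicted class per example
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total
```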

LR=1e-5

[Training-validation curves, epochs 1-25]

[Training-validation curves, epochs 25-50]

Peak GRU validation accuracy: 60+ (overfits)
Peak LSTM validation accuracy: 60+ (generalises well)

This highlights the LSTM's ability to regulate its gradients efficiently over 50 epochs, whereas the GRU's gradients explode or vanish.
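
A common mitigation for exploding gradients in either architecture is to clip the global gradient norm before each optimizer step. A minimal sketch (helper name is illustrative; the notebook's actual training loop may differ):

```python
import torch

def train_step(model, tokens, labels, optimizer, loss_fn, max_norm=1.0):
    """One optimization step with global gradient-norm clipping."""
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(tokens), labels)
    loss.backward()
    # Rescale all gradients so their combined L2 norm is at most max_norm.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return loss.item()
```

Clipping bounds the update magnitude without changing the gradient direction, which is why it helps against explosion; it does nothing for vanishing gradients, where the LSTM's gating is the relevant remedy.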


File Overview

  • Dataset.json : News category classification dataset.
  • classifier.ipynb : The entire workflow.
  • grumodel.onnx : Post-training GRU model for visualisation.
  • lstm.onnx : Post-training LSTM model for visualisation.

About

Comparative study of two RNN architectures at news category classification.
