📊 Social Media Analytics

A collection of course materials, tutorials, and projects covering the full pipeline of social media analytics — from data collection to advanced NLP modelling — with a focus on Indonesian-language content and the Twitter/X platform.

📁 Repository Structure

social-media-analytics/
├── Tutorial1_TextMining/          # Text mining fundamentals
├── Tutorial2_Topic Modelling/     # Topic modelling with LDA & variants
├── Tutorial3_Data Collection/     # Twitter data collection (API & Twint)
├── tugas_1/                       # Assignment 1: basic data analysis
├── proyek tengah semester/        # Mid-term project: user profiling
├── proyek akhir semester/         # Final project: stance detection
├── graph1.ipynb                   # Graph/network analysis basics
├── graph2.ipynb                   # Extended graph analysis
├── pagerank.ipynb                 # PageRank algorithm on social networks
└── script.py                      # Twitter API utility script (Tweepy)

📚 Tutorials

Tutorial 1 — Text Mining

Covers the core NLP pipeline applied to social media text:

Text preprocessing (tokenization, stopword removal, normalization)
Word Embeddings (Word2Vec, GloVe)
Transformer-based Language Models (BERT, IndoBERT)
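
The preprocessing steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the tutorial's actual code: the `STOPWORDS` and `SLANG` tables are tiny hypothetical stand-ins for the repository's stopword and colloquial-lexicon files, and `preprocess` is an assumed helper name.

```python
import re

# Hypothetical mini-resources for illustration only; the real lists would be
# loaded from files such as stopwordsID.csv and colloquial-indonesian-lexicon.csv.
STOPWORDS = {"yang", "di", "dan", "ini", "itu"}
SLANG = {"gk": "tidak", "bgt": "banget", "yg": "yang"}

URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
TOKEN = re.compile(r"[a-z]+")


def preprocess(tweet: str) -> list[str]:
    """Lowercase, strip URLs and mentions, tokenize, normalize slang, drop stopwords."""
    text = URL_OR_MENTION.sub(" ", tweet.lower())
    tokens = TOKEN.findall(text)                  # tokenization
    tokens = [SLANG.get(t, t) for t in tokens]    # normalization of colloquial forms
    return [t for t in tokens if t not in STOPWORDS]  # stopword removal


print(preprocess("Filmnya bagus bgt yg ini @budi https://t.co/abc"))
# ['filmnya', 'bagus', 'banget']
```

Normalization runs before stopword removal so that a slang form such as "yg" is first mapped to "yang" and then filtered out.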

Tutorial 2 — Topic Modelling

Unsupervised discovery of latent topics from Twitter corpora:

Latent Dirichlet Allocation (LDA)
Indonesian-language datasets (e.g., trending topics on Twitter)
Preprocessing with colloquial lexicon & abbreviation dictionaries
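
To show what LDA does under the hood, here is a from-scratch sketch of collapsed Gibbs sampling on already-tokenized tweets. It is an assumption-laden illustration: the tutorial notebooks most likely rely on a library such as `gensim` instead, and the function names, hyperparameter defaults, and toy documents below are all hypothetical.

```python
import random


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (doc_topic, topic_word) count tables.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]                       # tokens of doc d in topic k
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                                     # tokens assigned to topic k

    def update(d, w, k, delta):
        doc_topic[d][k] += delta
        topic_word[k][w] += delta
        topic_total[k] += delta

    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        assign.append([rng.randrange(num_topics) for _ in doc])
        for w, k in zip(doc, assign[d]):
            update(d, w, k, +1)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                update(d, w, assign[d][i], -1)  # remove the token's current assignment
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                update(d, w, k, +1)
    return doc_topic, topic_word


def top_words(topic_word, n=3):
    """Most frequent words per topic, by assigned token count."""
    return [sorted(counts, key=counts.get, reverse=True)[:n] for counts in topic_word]


docs = [
    ["bola", "gol", "timnas"],
    ["nasi", "goreng", "sambal"],
    ["gol", "timnas", "bola", "bola"],
    ["sambal", "nasi", "pedas"],
]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
print(top_words(topic_word))
```

Each token is resampled in proportion to how common its topic is in the document and how common the word is in that topic, which is what pulls co-occurring words into the same topic over many iterations.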

Tutorial 3 — Data Collection

Methods for collecting social media data:

Twitter API v2 via tweepy
Twint for scraping without API rate limits
Structured storage of collected tweets
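
One possible approach to structured storage is a JSON Lines file with one tweet per line, deduplicated by tweet id. The sketch below uses only the standard library; `append_tweets` and `load_tweets` are hypothetical helper names, and the only assumption about the collected records is that each is a dict with an `"id"` key.

```python
import json
import os
import tempfile
from pathlib import Path


def append_tweets(path, tweets):
    """Append tweets to a JSON Lines file, skipping ids that are already stored.

    Returns the number of newly written tweets.
    """
    path = Path(path)
    seen = set()
    if path.exists():
        with path.open(encoding="utf-8") as f:
            seen = {json.loads(line)["id"] for line in f if line.strip()}
    written = 0
    with path.open("a", encoding="utf-8") as f:
        for tweet in tweets:
            if tweet["id"] in seen:
                continue
            # ensure_ascii=False keeps emoji and non-Latin text readable on disk.
            f.write(json.dumps(tweet, ensure_ascii=False) + "\n")
            seen.add(tweet["id"])
            written += 1
    return written


def load_tweets(path):
    """Read every stored tweet back as a list of dicts."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    # Demo in a temporary directory so nothing is written into the repository.
    demo_path = os.path.join(tempfile.mkdtemp(), "tweets.jsonl")
    append_tweets(demo_path, [{"id": "1", "text": "halo"}, {"id": "2", "text": "apa kabar"}])
    append_tweets(demo_path, [{"id": "2", "text": "apa kabar"}])  # duplicate, skipped
    print(len(load_tweets(demo_path)))
```

Appending line by line means an interrupted collection run loses at most the tweet being written, and repeated runs over overlapping queries do not create duplicates.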

🎓 Projects

Mid-Term Project — User Profiling

Predicting user attributes from tweet content and profile metadata:

Gender classification using TF-IDF, LSTM, and Transformer models
Occupation classification using large Transformer models
Exploratory Data Analysis (EDA) and error analysis included
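
The TF-IDF representation used by the baseline classifier can be illustrated without any library. This sketch uses the textbook weighting (relative term frequency times `log(N / df)`); a library implementation such as scikit-learn's `TfidfVectorizer` applies smoothing and normalization by default, so its numbers will differ. The function name and toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


docs = [["aku", "suka", "bola"], ["aku", "suka", "masak"]]
for vector in tfidf(docs):
    print(vector)
```

Terms that occur in every document ("aku", "suka") receive a weight of zero, so only the distinguishing terms ("bola", "masak") carry signal into the classifier.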

Final Project — Stance Detection & Network Analysis

End-to-end analysis of opinion and influence on Twitter:

Tweet collection via Twint
Stance detection (e.g., pro/against a topic) using fine-tuned Transformers
Network analysis: retweet/like graphs, PageRank-based influence scoring
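
PageRank-based influence scoring on a retweet graph can be sketched with plain power iteration. The conventions here are assumptions for illustration: an edge `(a, b)` means "a retweeted b", so rank flows toward the retweeted account; the damping factor 0.85 is the commonly used default; and the account names are made up. The project notebooks may use `networkx` for this instead.

```python
def pagerank(edges, damping=0.85, iters=100, tol=1e-10):
    """Power-iteration PageRank over directed (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    n = len(nodes)
    out = {node: [] for node in nodes}
    for src, dst in edges:
        out[src].append(dst)  # repeated edges act as heavier weights

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iters):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out[node])
        new = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            for dst in out[src]:
                new[dst] += damping * rank[src] / len(out[src])
        delta = sum(abs(new[node] - rank[node]) for node in nodes)
        rank = new
        if delta < tol:
            break
    return rank


retweets = [("ani", "budi"), ("citra", "budi"), ("dedi", "budi"), ("budi", "ani")]
scores = pagerank(retweets)
for user, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{user}: {score:.3f}")
```

In this toy graph "budi" ranks highest because three accounts retweet him, and "ani" comes second because the highly ranked "budi" retweets her.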

🛠️ Technologies & Libraries

| Category | Tools |
| --- | --- |
| Data Collection | `tweepy`, `twint` |
| Data Processing | `pandas`, `numpy` |
| NLP | `nltk`, `scikit-learn`, `gensim` |
| Deep Learning | `transformers` (HuggingFace), `tensorflow` / `pytorch` |
| Network Analysis | `networkx` |
| Visualization | `matplotlib`, `seaborn` |
| Environment | Python 3, Jupyter Notebook |

🚀 Getting Started

Clone the repository

git clone https://github.com/nichsedge/social-media-analytics.git
cd social-media-analytics

Set up a virtual environment

python -m venv .env
source .env/bin/activate        # Linux/macOS
.env\Scripts\activate.bat       # Windows

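Install dependencies (per tutorial/project folder as needed). The command below covers the core libraries only; Twint and a deep learning backend (TensorFlow or PyTorch) are not included and need to be installed separately for the notebooks that use them.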

pip install tweepy pandas numpy nltk scikit-learn gensim transformers networkx matplotlib seaborn

Open notebooks
```
jupyter notebook
```

⚠️ Notes

Some notebooks use Indonesian-language datasets and lexicons (e.g., `colloquial-indonesian-lexicon.csv`, `stopwordsID.csv`).
Twitter API credentials in `script.py` are for reference only; replace them with your own keys before running.
Large dataset files (`.csv`) may not be included in the repository due to size constraints.
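
One way to supply your own credentials without editing them into the source is to read them from an environment variable. This is a sketch under assumptions: `TWITTER_BEARER_TOKEN` is a hypothetical variable name, `load_bearer_token` is a hypothetical helper, and how `script.py` actually consumes its keys may differ.

```python
import os


def load_bearer_token(env=os.environ, name="TWITTER_BEARER_TOKEN"):
    """Return the token stored under `name`, or fail with a clear message."""
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {name} environment variable before running.")
    return token


# Typical use (not executed here because it needs real credentials):
#   import tweepy
#   client = tweepy.Client(bearer_token=load_bearer_token())
```

Keeping keys in the environment also avoids committing them to the repository by accident.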

📄 License

This repository is intended for educational purposes as part of a Social Media Analytics course.
