- Apache Airflow
- Big Data Algorithms
- Big Data Architecture
- Big Data Basics
- Big Data Environment Setup
- Data Analysis
- Data Science
- Data Stream Development with Apache Spark, Kafka, and Spring Boot
- Data Virtualization
- Data Warehouse
- Elastic
- ETL
- Flume
- Flume and Kafka Integration
- Generative AI
- Hadoop
- HBase
- Hive
- Kafka
- Kafka Streams
- Machine Learning
- Pig
- Power BI
- Python Data Science Libraries
- Spark
- Spark Coding
- Spark Streaming
- Spark Streaming Real Project Tutorial
- Kafka client app (producer and consumer)
- Spark Streaming receives socket data and does word count.
- Spark Streaming processes file system (local/hdfs) data and does word count.
- Spark Streaming processes socket data with state and does word count.
- Spark Streaming processes socket data and saves the word count result into MySQL.
- Use Spark Streaming to filter blacklist.
- Integrate Spark Streaming and Spark SQL to process socket data and do word count.
- Integrate Spark Streaming and Flume to process socket data and do word count using the push-based approach.
- Integrate Spark Streaming and Flume to process socket data and do word count using the pull-based approach.
- Integrate Spark Streaming and Kafka to do word count using the direct approach.
- Web Log Streaming Workflow
- Spark Streaming Real Project
- Web Log Generator
- Web Log ==> Flume
- Web Log ==> Flume ==> Kafka
- Web Log ==> Flume ==> Kafka ==> Spark Streaming
- Web Log ==> Flume ==> Kafka ==> Spark Streaming ==> Data Cleansing
- Compute Page View (PV) Statistics
- Data Visualization
- Web Scraping
- ZooKeeper
lizhanmit/learning-notes