
Welcome to Biblioverse

πŸ“š E-Book Recommendation System

🎯 Overview

A full-stack E-Book Recommendation System that leverages:

  • User interactions stored in MongoDB
  • Batch exports to Hadoop HDFS
  • MapReduce jobs for data aggregation
  • Hive for querying processed data
  • Spark MLlib for collaborative filtering recommendations
  • Frontend in Next.js showing personalized and explored books

All running locally with manual batch jobs to keep system resource usage in check.

🧭 Visual Blueprint β€” See the System in Action

Figma Design

Click the badge to explore the full layout on Figma β€” from user flow to UI components.


🧰 Tech Stack

Layer        Tools & Frameworks
Frontend     Next.js, React, Context API
Backend      Node.js, Express, MongoDB
Big Data     Hadoop (HDFS, MapReduce), PySpark MLlib
Environment  Localhost (Hadoop & Spark), manual batch runs

πŸ—‚οΈ Backend Data Flow

MongoDB Schema for Interactions

interactions: [
  {
    userId: String,
    bookId: String,
    action: 'bookmark' | 'like' | 'dislike' | 'read',
    timestamp: Date
  }
]

Export Script (exportInteractions.js)

Run this as a standalone script, outside your Express backend, so batch exports don't compete with the API. Writing one JSON object per line (NDJSON) keeps the file friendly to line-oriented Hadoop Streaming mappers. The connection URI and model path below are placeholders; adjust them to your setup.

const fs = require('fs');
const mongoose = require('mongoose');
const Book = require('./models/Book'); // adjust to your Book model's path

async function exportInteractions() {
  try {
    await mongoose.connect('mongodb://localhost:27017/biblioverse'); // placeholder URI
    const books = await Book.find().lean();
    // Flatten the per-book interaction arrays (see the schema above),
    // assuming interactions are embedded on Book documents.
    const interactions = books.flatMap((book) => book.interactions || []);
    fs.writeFileSync(
      'interactions.json',
      interactions.map((i) => JSON.stringify(i)).join('\n')
    );
    console.log('✅ Exported to interactions.json');
  } catch (err) {
    console.error('❌ Error exporting interactions:', err);
  } finally {
    await mongoose.disconnect();
  }
}
exportInteractions();

How to push to HDFS (terminal):

hdfs dfs -put interactions.json /user/yourusername/interactions/

Hadoop MapReduce Job Example

hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
  -input /user/vineet/interactions/interactions.json \
  -output /user/vineet/processed/interaction_summary \
  -mapper "python3 mapper.py" \
  -reducer "python3 reducer.py" \
  -file backend/scripts/hadoop/mapper.py \
  -file backend/scripts/hadoop/reducer.py

-file ships each script to the job, so -mapper and -reducer reference them by bare filename. The backslash line continuations assume a Unix-style shell (e.g. Git Bash or WSL).

  • Mapper: emits (bookId, action) pairs
  • Reducer: aggregates counts per (bookId, action)
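For illustration, the two steps can be sketched in a single Python file (in the real job, mapper.py and reducer.py are the two separate scripts shipped with -file). The record fields follow the MongoDB schema above, and the comma-separated output matches the Hive table's field delimiter:

```python
import json
import sys
from collections import defaultdict

def map_line(line):
    """Mapper step: emit a ('bookId,action', 1) pair per interaction record."""
    try:
        rec = json.loads(line)
    except json.JSONDecodeError:
        return None  # skip blank or malformed lines
    return f"{rec['bookId']},{rec['action']}", 1

def reduce_pairs(pairs):
    """Reducer step: aggregate counts per (bookId, action) key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    # bookId,action,count lines: comma-separated for the Hive external table
    return [f"{key},{total}" for key, total in sorted(totals.items())]

if __name__ == "__main__" and not sys.stdin.isatty():
    # Local demo: map stdin records, then reduce in-process (Hadoop would
    # shuffle/sort between the two steps across separate processes).
    pairs = [p for p in (map_line(line) for line in sys.stdin) if p]
    print("\n".join(reduce_pairs(pairs)))
```

Piping the NDJSON export through this script locally (`cat interactions.json | python3 mapper_reducer_demo.py`) is a quick sanity check before submitting the real streaming job.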


Hive Query Sample

CREATE EXTERNAL TABLE interaction_summary (
  bookId STRING,
  action STRING,
  cnt INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/user/vineet/processed/interaction_summary';

SELECT bookId, SUM(cnt) AS total_likes
FROM interaction_summary
WHERE action = 'like'
GROUP BY bookId;

PySpark MLlib Recommendation Pipeline (Overview)

  • Input: (userId, bookId, rating) tuples from output.txt
  • Model: ALS collaborative filtering
  • Output:
    1. Personalized book recommendations
    2. Popular-choice suggestions
  • Batch-run manually to control memory use

Check the code on Google Colab
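The pipeline above can be sketched with pyspark.ml's ALS. The action-to-rating weights, fallback rating, and function names below are illustrative assumptions, not the project's actual values; training requires a local Spark/Java install:

```python
# Illustrative action-to-rating weights (assumptions, not the project's values)
ACTION_RATING = {"like": 5.0, "bookmark": 4.0, "read": 3.0, "dislike": 1.0}
DEFAULT_RATING = 2.0  # assumed fallback for any unlisted action

def to_rating_tuples(records):
    """Map raw interaction dicts to the (userId, bookId, rating) tuples ALS expects."""
    return [
        (r["userId"], r["bookId"], ACTION_RATING.get(r["action"], DEFAULT_RATING))
        for r in records
    ]

def train_and_recommend(tuples, top_n=10):
    """Fit ALS on (userId, bookId, rating) tuples; return top-N recs per user."""
    # pyspark is imported here so to_rating_tuples stays usable without Spark
    from pyspark.sql import SparkSession
    from pyspark.ml.feature import StringIndexer
    from pyspark.ml.recommendation import ALS

    spark = SparkSession.builder.appName("biblioverse-als").getOrCreate()
    df = spark.createDataFrame(tuples, ["userId", "bookId", "rating"])

    # ALS needs numeric ids, so index the string user/book ids first
    for col in ("userId", "bookId"):
        df = StringIndexer(inputCol=col, outputCol=col + "Idx").fit(df).transform(df)

    als = ALS(userCol="userIdIdx", itemCol="bookIdIdx", ratingCol="rating",
              coldStartStrategy="drop")
    return als.fit(df).recommendForAllUsers(top_n)
```

coldStartStrategy="drop" keeps users or books unseen during training from producing NaN predictions, which matters for small local datasets.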


πŸ–₯️ Frontend Highlights

Auth Context Setup

pages/_app.tsx:

import type { AppProps } from 'next/app';
import { AuthProvider } from '@/context/AuthContext';
import '../styles/globals.css';

export default function MyApp({ Component, pageProps }: AppProps) {
  return (
    <AuthProvider>
      <Component {...pageProps} />
    </AuthProvider>
  );
}

🌐 Local Hadoop UI Access

With a local Hadoop 3.x install, the web UIs are typically at:

  • NameNode: http://localhost:9870
  • YARN ResourceManager: http://localhost:8088

βš™οΈ Java & Hadoop Setup (PowerShell)

$env:JAVA_HOME="E:\hadoop\jdk-8.0.302.8-hotspot"
$env:PATH="$env:JAVA_HOME\bin;$env:PATH"

# Compile against the Hadoop jars reported by `hadoop classpath`
javac -cp "$(hadoop classpath)" -d classes GenrePreference*.java
# Hadoop config files are in E:\hadoop\hadoop324\etc\...

🏁 Summary

  • Hadoop & Spark run locally; manual batch runs prevent RAM overload
  • Full MongoDB schema and interaction logging implemented
  • Book metadata and covers (via Google Books API) fully integrated
  • Frontend ready to consume recommendations from backend API
