| Name | NIM (Student ID) |
|---|---|
| Niko Samuel Simanjuntak | 13524029 |
| Irvin Tandiarrang Sumual | 13524030 |
| Kalyca Nathania B. Manullang | 13524071 |
A web application for book search and recommendation using image similarity (PCA) and text similarity (LSA).
- Features
- Tech Stack
- Project Structure
- Setup & Installation
- Running the Application
- API Documentation
- Contributing
- Title Search - Find books by title
- Image Search - Upload a book cover and the system finds visually similar covers
- Document Search - Upload a TXT file and the system finds books with similar content
- LSA Recommendations - Book recommendations based on text content similarity
- View book covers and titles
- Read the full content of a book
- Get recommendations for similar books
- Framework: Next.js 14 (React)
- Styling: Tailwind CSS
- UI Components: NextUI
- HTTP Client: Fetch API
- TypeScript: Type safety
- Framework: FastAPI (Python)
- Server: Uvicorn
- CORS: FastAPI CORS Middleware
- File Handling: FastAPI UploadFile
- Image Similarity: PCA (Principal Component Analysis)
- Text Similarity: LSA (Latent Semantic Analysis)
- Format: JSON (mapper), TXT (documents), JPG (covers)
- Storage: Local filesystem
algeo2-lemonnipis/
├── data/                                   # Centralized data
│   ├── mapper.json                         # Book mapping (ID, title, cover, txt)
│   ├── covers/                             # Cover images (JPG)
│   ├── txt/                                # Book documents (TXT)
│   └── uploads/                            # Uploaded files (temporary)
│
├── src/
│   ├── backend/                            # FastAPI backend
│   │   ├── main.py                         # Entry point, API routes
│   │   ├── pca_model.pkl                   # Trained PCA model
│   │   ├── lsa_model.pkl                   # Trained LSA model
│   │   ├── image/
│   │   │   ├── __init__.py
│   │   │   └── image_processing.py         # PCA image similarity
│   │   └── document/
│   │       ├── __init__.py
│   │       └── document_processing.py      # LSA text similarity
│   │
│   └── frontend/                           # Next.js frontend
│       ├── app/
│       │   ├── page.tsx                    # Home page
│       │   ├── layout.tsx                  # Root layout
│       │   ├── book-collection/
│       │   │   ├── page.tsx                # Book list page
│       │   │   └── [id]/                   # Dynamic book detail page
│       │   └── search-result/
│       │       └── page.tsx                # Search results page
│       ├── components/
│       │   ├── navbar.tsx                  # Navigation bar
│       │   ├── search-input.tsx            # Search input component
│       │   ├── book-detail/
│       │   │   ├── content-view.tsx        # Book content display
│       │   │   ├── detail-wrapper.tsx      # Book detail wrapper
│       │   │   └── recommendation-view.tsx # Recommendations
│       │   └── icons.tsx                   # SVG icons & logo
│       ├── config/
│       │   └── api.ts                      # API configuration
│       ├── public/
│       │   └── LemonNipis.png              # Logo
│       └── styles/
│           └── globals.css                 # Global styles
│
├── .gitignore
├── README.md                               # This file
└── package.json
- Python 3.9+
- Node.js 18+
- npm or yarn
git clone https://github.com/lemonnipis/algeo2-lemonnipis.git
cd algeo2-lemonnipis
# Navigate to the backend
cd src/backend
# Create virtual environment
python -m venv venv
# Activate virtual environment
# Windows:
venv\Scripts\activate
# Linux/Mac:
source venv/bin/activate
# Install dependencies
pip install fastapi uvicorn python-multipart nltk pillow scikit-learn numpy scipy
# Download NLTK data (one time only)
python -c "import nltk; nltk.download('punkt'); nltk.download('stopwords')"
# Navigate to the frontend (from the project root)
cd src/frontend
# Install dependencies
npm install
# or
yarn install
# Create .env.local
echo "NEXT_PUBLIC_API_URL=http://localhost:8000" > .env.local
Make sure the data/ folder exists at the project root with the following structure:
data/
├── mapper.json    # JSON mapping
├── covers/        # Cover JPG files
├── txt/           # Text TXT files
└── uploads/       # Auto-created by backend
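Before starting the backend, it can help to confirm that this layout is in place. The following is a minimal sketch, not code from the repository: the helper name `missing_data_entries` is hypothetical, and it assumes the script is run from the project root so that `data/` resolves correctly.

```python
from pathlib import Path

# Entries listed under data/ above; uploads/ is left out because the
# backend is described as creating it automatically.
REQUIRED_ENTRIES = ("mapper.json", "covers", "txt")


def missing_data_entries(data_dir: Path) -> list[str]:
    """Return the names of required entries that are absent from data_dir."""
    return [name for name in REQUIRED_ENTRIES if not (data_dir / name).exists()]


if __name__ == "__main__":
    missing = missing_data_entries(Path("data"))
    if missing:
        print("Missing from data/:", ", ".join(missing))
    else:
        print("data/ layout looks complete")
```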
Format of mapper.json:
{
"38427": {
"title": "The World as Will and Idea (Vol. 1 of 3)",
"cover": "covers/38427.jpg",
"txt": "txt/38427.txt"
}
}
cd src/backend
# Activate the venv (if not already active)
source venv/bin/activate # Linux/Mac
# or
venv\Scripts\activate # Windows
# Run the server
python main.py
The server will run at http://localhost:8000
Expected output:
============================================================
PATH CONFIGURATION
============================================================
BASE_PATH: C:\ITB\Semester 3\AlGeo\algeo2-lemonnipis
DATA_DIR: C:\ITB\Semester 3\AlGeo\algeo2-lemonnipis\data -
...
============================================================
Starting Server
============================================================
Server Ready!
============================================================
In a new terminal:
cd src/frontend
# Development mode
npm run dev
# or
yarn dev
The frontend will run at http://localhost:3000
Frontend: http://localhost:3000
Backend API: http://localhost:8000
Get all books with pagination (GET /api/books)
Query Parameters:
- skip: int (default: 0)
- limit: int (default: 15)
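The skip and limit parameters behave like a list slice. A minimal sketch of that behaviour, assuming the books are held in an in-memory list; the `paginate` helper is hypothetical and is not taken from main.py.

```python
def paginate(books: list[dict], skip: int = 0, limit: int = 15) -> dict:
    """Return one page of books together with the total count."""
    # skip drops the first N books, limit caps the page size.
    return {"total": len(books), "results": books[skip : skip + limit]}


books = [{"id": str(i)} for i in range(40)]
page = paginate(books, skip=30)
print(page["total"], len(page["results"]))  # 40 10
```

Requesting a page past the end of the list simply yields an empty results array while total stays unchanged.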
Response:
{
"total": 1000,
"results": [
{
"id": "38427",
"title": "The World as Will and Idea",
"cover": "covers/38427.jpg",
"txt": "txt/38427.txt"
}
]
}
Search books by title (GET /api/search)
Query Parameters:
- q: string (required) - Search query
- skip: int (default: 0)
- limit: int (default: 15)
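One plausible way to serve this query is a case-insensitive substring match over the titles in mapper.json, followed by the same skip/limit slicing. This is a sketch under that assumption; the README does not state the exact matching rule, and `search_by_title` is a hypothetical name.

```python
def search_by_title(mapper: dict[str, dict], q: str, skip: int = 0, limit: int = 15) -> dict:
    """Match q against book titles (assumed: case-insensitive substring match)."""
    needle = q.strip().lower()
    matches = [
        {"id": book_id, **entry}
        for book_id, entry in mapper.items()
        if needle in entry["title"].lower()
    ]
    return {"query": q, "total": len(matches), "results": matches[skip : skip + limit]}


mapper = {
    "38427": {
        "title": "The World as Will and Idea (Vol. 1 of 3)",
        "cover": "covers/38427.jpg",
        "txt": "txt/38427.txt",
    }
}
print(search_by_title(mapper, "WILL")["total"])  # 1
```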
Response:
{
"query": "harry",
"total": 5,
"results": [...]
}
Get book details and content
Response:
{
"id": "38427",
"title": "The World as Will and Idea",
"cover": "covers/38427.jpg",
"content": "Lorem ipsum dolor sit amet..."
}
Get book recommendations based on LSA
Response:
{
"buku_yang_dicari_ditemukan": false,
"recommendations": [
{
"id": "12345",
"title": "Similar Book",
"cover": "covers/12345.jpg",
"similarity": 0.85
}
]
}
Search books by uploading an image
Request:
- Form data with a file field: file (JPG/PNG)
Response:
{
"uploaded_file": "search_20250105_153022.jpg",
"uploaded_url": "/data/uploads/search_20250105_153022.jpg",
"total": 5,
"query_results": [
{
"id": "38427",
"title": "The World as Will and Idea",
"cover": "covers/38427.jpg",
"similarity": 0.92
}
]
}
Search books by uploading a TXT document
Request:
- Form data with a file field: file (TXT)
Response:
{
"total": 5,
"query_results": [...]
}
Health check (GET /health)
Response:
{
"status": "ok"
}
How image search works:
- The system reads all cover images from data/covers/
- Visual features are extracted using PCA
- The user uploads an image
- The system compares it against the database and returns the top 5 results
Configuration:
- Target image size: 200x300 pixels
- PCA components: 50
- Model file: src/backend/pca_model.pkl
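The steps above can be sketched with NumPy alone. This is an illustration, not the contents of image_processing.py: the function names are hypothetical, cosine similarity is an assumed ranking metric (the README does not name one), and small random arrays stand in for flattened 200x300 covers to keep the example fast.

```python
import numpy as np


def fit_pca(images: np.ndarray, k: int):
    """Fit PCA on row-vector images of shape (n_images, n_pixels)."""
    mean = images.mean(axis=0)
    centered = images - mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]
    return mean, components, centered @ components.T


def top_matches(query: np.ndarray, mean, components, projected, top_n: int = 5):
    """Return (index, cosine similarity) pairs for the closest stored images."""
    q = (query - mean) @ components.T
    norms = np.linalg.norm(projected, axis=1) * np.linalg.norm(q)
    sims = (projected @ q) / np.where(norms == 0, 1.0, norms)
    order = np.argsort(-sims)[:top_n]
    return [(int(i), float(sims[i])) for i in order]


rng = np.random.default_rng(0)
images = rng.random((20, 30 * 20))  # stand-ins for flattened cover images
mean, components, projected = fit_pca(images, k=10)
print(top_matches(images[3], mean, components, projected)[0][0])  # 3
```

Querying with an image that is already in the collection returns that image first, which is a quick sanity check for any real implementation.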
How document search works:
- The system reads all TXT files from data/txt/
- Semantic features are extracted using LSA with stemming
- The user uploads a TXT file
- The system compares it against the database and returns the top 5 results
Configuration:
- LSA components: 50
- Stemming: Enabled (Porter Stemmer)
- Model file: src/backend/lsa_model.pkl
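As an illustration of the LSA pipeline described above, here is a minimal NumPy sketch. It is not the code in document_processing.py: the function names are hypothetical, TF-IDF weighting and cosine ranking are assumptions, and stemming and stop-word removal are omitted to keep the example short.

```python
import math
import re
from collections import Counter

import numpy as np


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z]+", text.lower())


def fit_lsa(docs: list[str], k: int):
    """Build a TF-IDF document-term matrix and keep its top-k latent directions."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    df = Counter(term for c in counts for term in c)
    idf = np.array([math.log(len(docs) / df[t]) + 1.0 for t in vocab])
    tfidf = np.array([[c[t] for t in vocab] for c in counts], dtype=float) * idf
    # Rows of vt span the latent "topic" space over the vocabulary.
    _, _, vt = np.linalg.svd(tfidf, full_matrices=False)
    topics = vt[:k]
    return vocab, idf, topics, tfidf @ topics.T


def query_lsa(text: str, vocab, idf, topics, doc_vectors, top_n: int = 5):
    """Project a new text into the latent space and rank stored documents by cosine similarity."""
    c = Counter(tokenize(text))
    q = (np.array([c[t] for t in vocab], dtype=float) * idf) @ topics.T
    norms = np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q)
    sims = (doc_vectors @ q) / np.where(norms == 0, 1.0, norms)
    order = np.argsort(-sims)[:top_n]
    return [(int(i), float(sims[i])) for i in order]


docs = [
    "world will idea philosophy",
    "world will idea representation",
    "wizard school magic castle",
    "wizard school magic train",
]
ranked = query_lsa("magic wizard", *fit_lsa(docs, k=2), top_n=2)
print(sorted(i for i, _ in ranked))  # [2, 3]
```

With two latent components, the two philosophy texts and the two wizard texts each collapse onto their own direction, so the query lands next to documents 2 and 3.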
How recommendations work:
- The user opens a particular book
- The system retrieves the content of that book
- The content is used to query the LSA model
- Books with similar content are returned as recommendations
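The recommendation step can be sketched as a nearest-neighbour lookup over per-book LSA vectors. This is a sketch under assumptions: the README does not say how the backend stores those vectors, so the `vectors` dictionary, the two-dimensional values, the ID 99999, and the `recommend` helper are all hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(book_id: str, vectors: dict[str, list[float]], top_n: int = 5) -> list[dict]:
    """Rank every other book by cosine similarity to book_id in the latent space."""
    target = vectors[book_id]
    scored = [
        {"id": other, "similarity": round(cosine(target, vec), 2)}
        for other, vec in vectors.items()
        if other != book_id  # never recommend the book being read
    ]
    return sorted(scored, key=lambda r: r["similarity"], reverse=True)[:top_n]


# Hypothetical latent vectors keyed by book ID.
vectors = {"38427": [0.9, 0.1], "12345": [0.8, 0.2], "99999": [0.0, 1.0]}
print(recommend("38427", vectors)[0]["id"])  # 12345
```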
# Paths
BASE_PATH = Path(__file__).parent.parent.parent # Root project
DATA_DIR = BASE_PATH / "data"
MAPPER_PATH = DATA_DIR / "mapper.json"
# CORS
CORS Origins: ["http://localhost:3000", "http://localhost:3001"]
# PCA Model
target_width = 200
target_height = 300
k = 50
# LSA Model
k = 50
use_stemming = True
export const API_BASE_URL =
  process.env.NEXT_PUBLIC_API_URL || "http://localhost:8000";
cd src/backend
# Test health endpoint
curl http://localhost:8000/health
# Test get all books
curl http://localhost:8000/api/books
# Test search by title
curl "http://localhost:8000/api/search?q=harry"
cd src/frontend
# Run tests
npm run test
# Build for production
npm run build
# Start production server
npm run start
mapper.json must contain:
{
"ID": {
"title": "Book Title",
"cover": "covers/ID.jpg",
"txt": "txt/ID.txt"
}
}
File naming:
- Cover: {ID}.jpg (e.g., 38427.jpg)
- Text: {ID}.txt (e.g., 38427.txt)
- IDs must be unique and match the entries in mapper.json
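These naming rules are easy to check mechanically. The following is a minimal sketch, assuming the layout described in this README; `find_mapper_problems` is a hypothetical helper and is not part of the repository.

```python
import json
from pathlib import Path


def find_mapper_problems(data_dir: Path) -> list[str]:
    """Return human-readable problems found in data_dir/mapper.json."""
    mapper = json.loads((data_dir / "mapper.json").read_text(encoding="utf-8"))
    problems = []
    for book_id, entry in mapper.items():
        if not entry.get("title"):
            problems.append(f"{book_id}: missing title")
        expected_paths = (("cover", f"covers/{book_id}.jpg"), ("txt", f"txt/{book_id}.txt"))
        for key, expected in expected_paths:
            path = entry.get(key)
            if path != expected:
                problems.append(f"{book_id}: {key} is {path!r}, expected {expected!r}")
            elif not (data_dir / path).is_file():
                problems.append(f"{book_id}: {path} does not exist")
    return problems


if __name__ == "__main__":
    if Path("data/mapper.json").is_file():
        for problem in find_mapper_problems(Path("data")):
            print(problem)
```

An empty result means every entry follows the {ID}.jpg and {ID}.txt convention and both files are present on disk.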
- Large datasets: Adjust the limit parameter used for pagination
- Model training: Trained models are cached in src/backend/*.pkl
- Image size: Standardize covers to 200x300 for consistency
# Check Python version
python --version # must be 3.9+
# Check dependencies
pip list | grep fastapi
# Check whether port 8000 is already in use
# Windows:
netstat -ano | findstr :8000
# Linux/Mac:
lsof -i :8000
# Kill the process and restart
python -c "import nltk; nltk.download('punkt'); nltk.download('stopwords')"
- Make sure data/covers/ and data/txt/ exist
- Make sure the JPG and TXT files are valid
- Check the console for error messages
# Check .env.local
cat src/frontend/.env.local
# It must contain:
# NEXT_PUBLIC_API_URL=http://localhost:8000
# Restart the Next.js dev server
npm run dev
fastapi==0.104.1
uvicorn==0.24.0
python-multipart==0.0.6
nltk==3.8.1
pillow==10.1.0
scikit-learn==1.3.2
numpy==1.26.2
scipy==1.11.4
{
"next": "14.0.0",
"react": "^18.2.0",
"tailwindcss": "^3.3.0",
"@nextui-org/react": "^2.2.0"
}
MIT License - Free to use for any purpose
Happy Searching!