A lightweight, log-structured key-value store built from scratch in C++17, inspired by Bitcask-style storage engines. TitanDB demonstrates core systems concepts like persistent storage, crash recovery, append-only design, and in-memory indexing.
Basic CRUD + Compaction in action
-
⚡ Append-Only Log Storage
- All writes (
put,update,delete) are appended to disk - Ensures sequential I/O for better performance
- All writes (
-
🧠 In-Memory Hash Index
- Maps keys → file offsets
- Enables O(1) lookups without loading entire DB into memory
-
💾 Persistent Storage
- Data survives across program restarts
- Automatic index reconstruction via file scan on startup
-
🪦 Soft Deletes (Tombstones)
- Deletes are handled via tombstone records
- Avoids expensive in-place modifications
-
🛡️ Crash Resilience (Basic)
- Safe record reading using
rec_read_safe - Prevents corruption from partial writes
- Safe record reading using
-
🔁 Update via Append
- Updates create new versions instead of overwriting
- Older versions are cleaned during compaction (planned)
+----------------------+
| In-Memory Index |
| key → file offset |
+----------+-----------+
|
↓
+----------------------+
| Append-Only Log |
| [Record][Record]... |
+----------------------+
struct Record {
char key[32];
char val[64];
bool alive; // tombstone flag
};- Fixed-size binary serialization
- Enables direct offset-based access (
seekg,seekp)
- Appends new record to file
- Updates index with latest offset
- Looks up offset from index
- Reads record from disk
- Returns value if
alive == true
- Internally calls
put() - Follows append-only semantics
- Appends tombstone record (
alive = false) - Removes key from index
On initialization:
- File is scanned sequentially
- Index is rebuilt from records
- Latest state of each key is restored
for (pos = 0; pos < file_end; pos += sizeof(Record)) {
if (!rec_read_safe(...)) break;
if (r.alive)
ind[r.key] = pos;
else
ind.erase(r.key);
}g++ -std=c++17 main.cpp TitanDB.cpp record.cpp -o titan./titanTitanDB db("titan.db");
db.put("username", "abhigyan");
db.put("college", "NIT Silchar");
db.update("college", "IIT Bombay");
auto val = db.get("username");
if (val) std::cout << *val << std::endl; // abhigyan
db.remove("username");
auto gone = db.get("username");
if (!gone) std::cout << "Key not found\n"; // Key not found- Fixed-size keys/values
- No concurrency control (single-threaded)
- No checksum validation (corruption detection is basic)
- No compaction (log grows indefinitely)
- 🔁 Log Compaction — Remove stale records and reclaim disk space
- 🔐 Thread Safety — Add mutex/shared_mutex for concurrent access
- 🧾 Write-Ahead Logging (WAL) — Improve crash guarantees
- 🧠 Bloom Filters — Faster negative lookups
- 📦 Variable-Length Records — More flexible storage
- 🛡️ Checksums — Detect corrupted/partial writes reliably
- File I/O (
seekg,seekp, binary serialization) - Log-structured storage design
- In-memory indexing
- Crash recovery techniques
- Tombstone-based deletion
- Trade-offs between in-place vs append-only updates
29 automated tests across 8 categories:
| Category | What it covers |
|---|---|
| Basic CRUD | put, get, update, delete, duplicates |
| Persistence | index rebuild after restart |
| Compaction | live records preserved, dead dropped |
| Stale Cleanup | latest version kept after multiple updates |
| Compaction Persistence | compacted file survives restart |
| Corruption Handling | partial writes don't corrupt valid data |
| Edge Cases | spaces in values, bulk insert/delete |
| Size Limits | keys/values exceeding fixed-width bounds |
g++ -std=c++17 tests/test_titandb.cpp src/TitanDB.cpp src/record.cpp -Iinclude -o run_tests
./run_testsExpected output:
Results: 28 passed, 1 failed
Success Rate: 96.5517%
Execution Time: X ms
Abhigyan Tiwari
- GitHub: @Abh-igyan
- B.Tech CSE, NIT Silchar (2024–2028)
This project is licensed under the MIT License.
