ARCHIVED — This repository has been merged into open-pryv.io. All development continues there. This repo is kept as read-only reference.
Pryv.io core server — handles user data (events, streams, accesses, webhooks) with pluggable storage engines.
- Current version: 2.0.0-pre.2
- Working branch: refactor/pre-v2
- Node.js: 22.x
- Linting: neostandard with `{ semi: true }`
- System streams refactor — clean Mall-based account store replaces scattered serializer (639→158 lines)
- openSource flag removed — all features always enabled (webhooks, HFS, cache sync, email check)
- Webhooks in-process — webhooks service runs inside the API server process (no separate container)
- Metadata updater inlined — direct function call in HFS replaces TChannel RPC; `metadata/tprpc` components and `tchannel`/`protobufjs` dependencies removed
- Cluster mode — `bin/master.js` manages N API workers via the Node.js cluster module (replaces runit)
- Storage plugin architecture — engines (MongoDB, PostgreSQL, SQLite, filesystem, InfluxDB) are plugins under `storages/engines/` with manifest-driven config
- Engine-agnostic production code — zero `@pryv/boiler` imports in engines, config/logging injected via `_internals.js`
- Formal storage interfaces — `storages/interfaces/` with contracts for all storage types
node bin/master.js
│
├── Master process
│   ├── TCP pub/sub broker (:4222)
│   └── Process manager (fork/monitor workers)
│
├── N × API Worker (cluster, shared :3000)
│   ├── API routes (events, streams, accesses, auth, …)
│   ├── Socket.IO (real-time notifications)
│   └── Webhooks subscriber (in-process)
│
├── M × HFS Worker (cluster, shared :4000, 0 = disabled)
│   ├── Series routes (high-frequency data)
│   └── Metadata updater (in-process)
│
└── 0-1 × Previews Worker (:3001, lazy/optional)
Standalone mode (dev/test): just start api-server
Cluster mode: just start-master or node bin/master.js
cluster:
apiWorkers: 2 # number of API workers (default: 2)
hfsWorkers: 1 # number of HFS workers (default: 1, 0 = disabled)
webhooks:
inProcess: true # start webhooks in API server (default: true)
minIntervalMs: 5000
maxRetries: 5

| Component | Purpose |
|---|---|
| api-server | HTTP API + Socket.IO + webhooks service |
| hfs-server | High-frequency series data (InfluxDB) |
| previews-server | Image preview generation (GraphicsMagick) |
| business | Business logic, webhooks, series, system streams |
| storage | Engine-agnostic storage layer |
| storages | Plugin barrel — engines, interfaces, init |
| mall | Data access layer (events, streams, accesses) |
| cache | In-memory caching with pub/sub invalidation |
| messages | TCP pub/sub broker + client |
| audit | Audit logging (SQLite) |
| middleware | Express middleware (auth, versioning, errors) |
| test-helpers | Shared test infrastructure |
| Engine | Storage types | Status |
|---|---|---|
| MongoDB | base, dataStore | Production |
| PostgreSQL | base, dataStore, series, audit | Production |
| SQLite | dataStore (per-user), user account, user index, audit | Production |
| rqlite | platform (single- and multi-core) | Production |
| Filesystem | file (attachments) | Production |
| InfluxDB | series (HFS) | Production |
Prerequisites:
- `make` and C/C++ compilation support
- Node.js 22.x (use nvm or n)
- MongoDB 4.2+ (included via `scripts/setup-dev-env`)
- InfluxDB 1.x (`storages/engines/influxdb/scripts/setup` on Linux, `brew install influxdb@1` on macOS)
- GraphicsMagick (optional, for previews): `apt-get install graphicsmagick` / `brew install graphicsmagick`
- just
just setup-dev-env # setup local file structure + MongoDB
just install          # install node modules
just start-deps       # start MongoDB + InfluxDB
just start-master # cluster mode (N API workers)
just start api-server # single API server (dev)
just start hfs-server # HFS server (dev)
just start-mon api-server # auto-restart on file changes

Use nginx as a reverse proxy in front of bin/master.js for SSL termination, domain routing, and serving multiple Pryv services on a single host.
Each backend uses cluster mode internally — bin/master.js forks N API workers sharing a single port, and HFS workers share their own port. nginx routes traffic to those ports.
# Single command — master manages all workers:
# N API workers sharing :3000 (includes webhooks)
# M HFS workers sharing :4000 (0 = disabled)
# Previews worker on :3001 (optional, lazy)
NODE_ENV=production node bin/master.js --config /path/to/config.yml

# Config keys for worker counts
cluster:
apiWorkers: 2 # N API workers (default: 2)
hfsWorkers: 1 # M HFS workers (default: 1, 0 = disabled)
previewsWorker: true # lazy spawn on first request

The master process hosts the TCP pub/sub broker (:4222). All workers connect as clients automatically.
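Workers talk to the broker over plain TCP. One plausible wire format for such a link is newline-delimited JSON; this framing sketch is an assumption for illustration, not the broker's actual protocol.

```javascript
// Hypothetical newline-delimited JSON framing for a TCP pub/sub link.
// The actual broker protocol may differ; this only illustrates the idea.

// Serialize one message as a single line
function encodeMessage (topic, payload) {
  return JSON.stringify({ topic, payload }) + '\n';
}

// Decode a received chunk into complete messages plus any trailing partial line
function decodeChunk (buffer) {
  const lines = buffer.split('\n');
  const rest = lines.pop(); // incomplete tail, kept for the next chunk
  return { messages: lines.filter(Boolean).map((l) => JSON.parse(l)), rest };
}
```

Line-delimited framing keeps the parser stateless apart from the `rest` carry-over, which suits a broker that may receive messages split across TCP packets.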
upstream api_backend {
# Cluster workers share :3000 — single upstream entry
# ip_hash recommended for connection affinity (optional: Socket.IO uses WebSocket-only in cluster mode)
ip_hash;
server 127.0.0.1:3000;
}
upstream hfs_backend {
server 127.0.0.1:4000;
}
server {
listen 443 ssl;
server_name core.example.com;
ssl_certificate /path/to/cert.pem;
ssl_certificate_key /path/to/key.pem;
# API server (default)
location / {
proxy_pass http://api_backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
# Socket.IO — requires WebSocket upgrade
location /socket.io/ {
proxy_pass http://api_backend;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
}
# HFS (high-frequency series)
location ~ ^/[^/]+/series/ {
proxy_pass http://hfs_backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}

| | bin/master.js alone | Behind nginx |
|---|---|---|
| SSL termination | Via backloop.dev or config | nginx handles SSL |
| Domain routing | Single service | Multiple services per host |
| Static files | Not handled | nginx serves directly |
| Process management | Built-in auto-restart | Combine with systemd or PM2 |
| Clustering | Built-in (cluster module) | Same — cluster runs behind nginx |
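As the table notes, process management can be combined with systemd. A hypothetical unit file sketch follows; the service name, paths, and user are placeholders, not files shipped with the repo.

```ini
# Hypothetical systemd unit for bin/master.js — paths and user are placeholders
[Unit]
Description=Pryv.io core (cluster master)
After=network.target mongod.service

[Service]
User=pryv
Environment=NODE_ENV=production
ExecStart=/usr/bin/node /opt/service-core/bin/master.js --config /etc/pryv/config.yml
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

systemd then supervises only the master; the master's own cluster logic supervises the workers.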
Export all user data (events, streams, accesses, profile, webhooks, account, audit, attachments) and platform data to a portable JSONL+gzip archive.
# Full backup (compressed by default)
node bin/backup.js --output /path/to/backup
# Single user
node bin/backup.js --output /path/to/backup --user <userId>
# Uncompressed (for debugging / human inspection)
node bin/backup.js --output /path/to/backup --no-compress
# Incremental (only changes since previous backup, auto-detected per user)
node bin/backup.js --output /path/to/backup --incremental

Output is engine-agnostic: the same backup can be restored into MongoDB, PostgreSQL, or SQLite.
Backups use snapshot consistency: a timestamp is recorded at start, and only items modified before that timestamp are exported. Concurrent writes during backup are excluded and will be captured by the next incremental run. No system interruption or user freeze required.
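The snapshot rule above ("only items modified before the start timestamp") can be sketched as a filter. `selectForBackup` and the flat item shape are assumptions for the example; only the `modified` field mirrors Pryv's event format.

```javascript
// Sketch of snapshot-consistent selection for backup export.
// Items modified at/after the snapshot timestamp are deferred to the next
// incremental run; `modified` (epoch time) mirrors Pryv's event format.
function selectForBackup (items, snapshotTs, sinceTs = 0) {
  return items.filter((item) =>
    item.modified < snapshotTs && // exclude concurrent writes during backup
    item.modified >= sinceTs      // incremental: only changes since last backup
  );
}
```

An incremental run simply passes the previous run's snapshot timestamp as `sinceTs`, so anything excluded last time is picked up now.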
# Full restore
node bin/backup.js --restore /path/to/backup
# Overwrite existing data
node bin/backup.js --restore /path/to/backup --overwrite
# Single user
node bin/backup.js --restore /path/to/backup --user <userId>
# Skip conflicting users + cleanup
node bin/backup.js --restore /path/to/backup --skip-conflicts --move-on-success /path/to/done
# Verify integrity after restore (rolls back on failure)
node bin/backup.js --restore /path/to/backup --overwrite --verify-integrity

When --verify-integrity is set, integrity hashes are recomputed on every restored event and access. If any mismatch is found, the affected user's data is rolled back (cleared).
Standalone per-user integrity verification for health data compliance. Recomputes hashes on events and accesses and compares against stored values.
# Check all users
node bin/integrity-check.js
# Check a single user
node bin/integrity-check.js --user <userId>
# JSON output (for automation)
node bin/integrity-check.js --json

Exit code 0 = all checks passed, 1 = integrity errors found.
just test all # all components (MongoDB)
just test api-server # single component
just test-pg all # PostgreSQL mode
just test-detailed api-server # verbose output
just test-debug api-server # with debugger
just test-parallel all # parallel file execution
just clean-test-data          # reset SQLite DBs + user dirs

Extra Mocha params: --bail (stop on first failure), --grep <text> (filter tests)
Environment variables: LOGS=<level> (show server output), DEBUG="*" (debug info)
service-core/
├── bin/                      # Entry points
│   ├── master.js             # Cluster master (N API workers)
│   ├── backup.js             # Backup/restore CLI
│   └── integrity-check.js    # Data integrity verification CLI
├── components/               # Application components (npm workspaces)
│   ├── api-server/           # Main API server
│   ├── hfs-server/           # High-frequency series server
│   ├── previews-server/      # Image previews
│   ├── business/             # Business logic
│   ├── storage/              # Storage abstraction layer
│   ├── mall/                 # Data access layer
│   ├── cache/                # Caching
│   ├── messages/             # TCP pub/sub
│   ├── audit/                # Audit logging
│   ├── middleware/           # Express middleware
│   ├── webhooks/             # Webhook business logic (runs in api-server)
│   └── test-helpers/         # Test infrastructure
├── storages/                 # Plugin system (npm workspace)
│   ├── engines/              # mongodb, postgresql, sqlite, filesystem, influxdb
│   └── interfaces/           # Formal contracts per storage type
│       └── backup/           # Backup/restore writer/reader interfaces
├── build/                    # Docker + deployment
└── justfile                  # Development commands
Configuration loads from (last takes precedence):
1. Component default config (`components/<name>/config/default-config.yml`)
2. Environment-specific config (`{env}-config.yml`)
3. Config file via `--config <path>`
4. Command-line options (`--key:path=value`)
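The "last takes precedence" behaviour amounts to a deep merge of layers. This sketch is illustrative only; the actual loader (`@pryv/boiler`) may behave differently, and `mergeConfig` is a name invented for the example.

```javascript
// Illustrative deep merge of config layers, later layers winning.
// The actual loader (@pryv/boiler) may behave differently.
function mergeConfig (...layers) {
  return layers.reduce(function merge (acc, layer) {
    for (const [key, value] of Object.entries(layer)) {
      if (value && typeof value === 'object' && !Array.isArray(value)) {
        acc[key] = merge(acc[key] ?? {}, value); // recurse into nested objects
      } else {
        acc[key] = value; // later layers override earlier ones
      }
    }
    return acc;
  }, {});
}
```

Nested objects merge key by key, so an env config can override `http.port` without wiping the rest of the `http` block from the defaults.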
Multi-core deployments host users across N cores sharing a single rqlite-replicated PlatformDB. Two topology variants:
| Variant | DNS | core.url |
|---|---|---|
| Domain-derived (legacy) | `{username}.{domain}` resolved by the embedded DNS server or external wildcard | Auto-derived from `core.id` + `dns.domain` |
| DNSless (Plan 27 Phase 2) | Externally managed (load balancer, fixed FQDNs) | Explicit `core.url` per core in YAML |
Both variants use:
- `GET /reg/cores?username=X` — discovery route. Returns the URL of the core hosting the user. Any core can answer (reads from PlatformDB).
- HTTP 421 wrong-core middleware — if a `/:username/*` request hits the wrong core, the response is `421 Misdirected Request` with `{ error: { id: 'wrong-core', coreUrl } }`. SDKs retry against `coreUrl`. No HTTP redirect (cross-origin redirects strip `Authorization` headers).
- `/reg/*` and `/system/*` are intentionally load-balanced — the wrong-core middleware is bypassed for those.
See SINGLE-TO-MULTIPLE.md for the full upgrade procedure.
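The wrong-core behaviour can be sketched as Express-style middleware. `lookupCoreUrl` and the factory shape are hypothetical stand-ins for the PlatformDB lookup; only the 421 status, error shape, and bypassed prefixes come from the description above.

```javascript
// Sketch of the HTTP 421 wrong-core middleware described above.
// lookupCoreUrl() is a hypothetical stand-in for the PlatformDB lookup.
function wrongCoreMiddleware (thisCoreUrl, lookupCoreUrl) {
  return function (req, res, next) {
    // /reg/* and /system/* are load-balanced: bypass the check
    if (req.path.startsWith('/reg/') || req.path.startsWith('/system/')) return next();
    const username = req.path.split('/')[1];
    const coreUrl = lookupCoreUrl(username);
    if (coreUrl && coreUrl !== thisCoreUrl) {
      // 421, not a redirect: cross-origin redirects strip Authorization headers
      return res.status(421).json({ error: { id: 'wrong-core', coreUrl } });
    }
    next();
  };
}
```

SDK-side, a 421 response is retried once against the returned `coreUrl`, keeping the `Authorization` header intact.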
service-core v2 groups configuration into three categories. Multi-core deployments must respect this split or cores will drift and users will see inconsistent behaviour depending on which core answers their request.
| Category | Meaning | Source |
|---|---|---|
| Per-core | Local to this node — ports, IPs, worker counts, log paths, DB credentials for this host, local tuning | YAML/env, each node has its own values |
| Platform-wide | MUST be identical across all cores in a deployment — policy, user schema, identity, feature toggles | PlatformDB (rqlite-replicated) is authoritative; YAML seeds on first boot |
| Bootstrap | Platform-wide in meaning, but needed before PlatformDB is reachable — how to connect to PlatformDB, admin key, first-boot seeds | YAML only; operator responsibility to keep identical across cores |
Per-core examples: http.port, cluster.apiWorkers, core.id, logs.*, storages.engines.mongodb.*
Platform-wide examples: dns.domain, hostings, invitationTokens (already in PlatformDB), custom.systemStreams, password policy, integrity.algorithm, webhook retry contract
Bootstrap examples: storages.platform.engine + storages.engines.rqlite.*, auth.adminAccessKey, first-boot seeds
Every block in config/default-config.yml is annotated with its category. On startup, cores log a warning when local YAML disagrees with PlatformDB for known platform-wide values — look for [platform-drift] in the logs.
For the full categorization of every config key, see _plans/27-pre-open-pryv-merge-atwork/CONFIG-SEPARATION.md.
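The startup drift warning can be sketched as a comparison between local YAML values and PlatformDB values. The function name, key list, and return shape are assumptions; only the `[platform-drift]` log marker comes from the description above.

```javascript
// Sketch of startup drift detection for platform-wide config keys.
// Only the [platform-drift] marker is from the actual system; the rest
// of this function is an illustrative assumption.
function detectPlatformDrift (localConfig, platformDb, platformWideKeys) {
  const drifted = [];
  for (const key of platformWideKeys) {
    const local = JSON.stringify(localConfig[key]);
    const authoritative = JSON.stringify(platformDb[key]);
    if (local !== authoritative) {
      drifted.push(key);
      console.warn(`[platform-drift] ${key}: local=${local} platform=${authoritative}`);
    }
  }
  return drifted; // PlatformDB stays authoritative; YAML only seeds first boot
}
```

A warning rather than a hard failure lets a drifted core keep serving while operators reconcile the YAML.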
Consolidating service-core's separate processes behind a single master.
- Phase 1.1: Inline metadata updater into HFS (remove TChannel RPC)
- Phase 1.2: Webhooks as in-process subscriber (remove separate container)
- Phase 2: Create `bin/master.js` with cluster module
- Phase 3: Add HFS as configurable child processes (M workers, 0 = disabled)
- Phase 4: Add previews worker (config-toggleable, GM check)
- Phase 5: Single Dockerfile (replace per-component Dockerfiles + runit)
- Phase 6: Socket.IO WebSocket-only in cluster mode (no sticky sessions needed)
| Plan | Description | Priority |
|---|---|---|
| Merge service-register | Integrate user discovery service into service-core; evaluate PG/SQLite for shared register data | High |
| Merge service-mfa | Absorb MFA service as in-process module within API server | High |
| Previews: replace GM | Replace GraphicsMagick with pure-Node image processing | Medium |
| Finalize storage plugins | Complete test infrastructure for storage plugin architecture | Medium |
| SQLite streams storage | Re-implement SQLite nested-set tree for streams + prevent system stream leaks into storage | Low |
| TypeScript + ESM | Migrate from CommonJS to TypeScript with ESM output; enable top-level await | Low |