Stats resources and usage sorting by unique field#10472
Conversation
📝 WalkthroughWalkthroughAdded deterministic pre-write sorting in multiple locations to align document write order with unique/index keys and reduce DB lock/deadlock contention. Changes:
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20–30 minutes Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).Please share your feedback with us on this Discord post. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…ts-resources-try-catch
Security Scan Results for PRDocker Image Scan Results
Source Code Scan Results🎉 No vulnerabilities found! |
| Console::error('Error: ' . $message . ' | Exception: ' . $e->getMessage()); | ||
| throw $e; | ||
| } finally { | ||
| $this->documents = []; |
There was a problem hiding this comment.
Reverting since it is set in the beginning of the script.
Was added in the previous PR , which was merged
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/Appwrite/Platform/Workers/StatsResources.php (1)
437-451: Tighten comparator: type-safe accessors, null-safe time ordering, and PSR-12 fixes
- Prefer Document::getAttribute over array access for consistency and type-safety.
- Handle null time explicitly (period "inf") to avoid loose comparisons.
- Add parameter/return types to the comparator and fix spacing for PSR-12 (addresses the linter failure).
- Clarify the comment (was incomplete).
Apply:
- // sort by unique index key to make - usort($this->documents, function($a, $b) { - // metric DESC - $cmp = strcmp($b['metric'], $a['metric']); - if ($cmp !== 0) return $cmp; - - // period ASC - $cmp = strcmp($a['period'], $b['period']); - if ($cmp !== 0) return $cmp; - - // time ASC - if ($a['time'] === $b['time']) return 0; - return ($a['time'] < $b['time']) ? -1 : 1; - }); + // Sort by composite unique index: metric DESC, period ASC, time ASC. + usort($this->documents, function (Document $a, Document $b): int { + // metric DESC + $cmp = strcmp((string) $b->getAttribute('metric'), (string) $a->getAttribute('metric')); + if ($cmp !== 0) { + return $cmp; + } + // period ASC + $cmp = strcmp((string) $a->getAttribute('period'), (string) $b->getAttribute('period')); + if ($cmp !== 0) { + return $cmp; + } + // time ASC (null treated as smallest) + $ta = $a->getAttribute('time'); + $tb = $b->getAttribute('time'); + if ($ta === $tb) { + return 0; + } + if ($ta === null) { + return -1; + } + if ($tb === null) { + return 1; + } + return strcmp((string) $ta, (string) $tb); + });If region can vary within a batch (unlikely here), include it in the sort key for strict index-order writes.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/Appwrite/Platform/Workers/StatsResources.php(1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Linter
src/Appwrite/Platform/Workers/StatsResources.php
[error] 1-1: Pint lint failed: PSR-12 style issue detected in StatsResources.php. Step: docker run --rm -v $PWD:/app composer sh -c 'composer install --profile --ignore-platform-reqs && composer lint'.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: E2E Service Test (Teams)
- GitHub Check: E2E Service Test (Health)
- GitHub Check: E2E Service Test (Webhooks)
- GitHub Check: E2E Service Test (Messaging)
- GitHub Check: E2E Service Test (Users)
- GitHub Check: E2E Service Test (Migrations)
- GitHub Check: E2E Service Test (VCS)
- GitHub Check: E2E Service Test (Account)
- GitHub Check: E2E Service Test (Sites)
- GitHub Check: E2E Service Test (Console)
- GitHub Check: E2E Service Test (Projects)
- GitHub Check: E2E Service Test (Functions)
- GitHub Check: E2E Service Test (Avatars)
- GitHub Check: E2E Service Test (Locale)
- GitHub Check: E2E Service Test (GraphQL)
- GitHub Check: E2E Service Test (Databases/TablesDB)
- GitHub Check: E2E Service Test (Dev Keys)
- GitHub Check: E2E Service Test (Site Screenshots)
- GitHub Check: E2E General Test
🔇 Additional comments (1)
src/Appwrite/Platform/Workers/StatsResources.php (1)
455-456: Replace magic number with constant, improve comment, and verify parameter meaning
- Declare a class constant (e.g.
private const UPSERT_CHUNK_SIZE = 10) and use it instead of the literal10to convey intent.- Change the placeholder comment to something descriptive, for example:
// Batch size for upsert/chunking- Ensure that this value maps to the correct parameter in
Database::createOrUpdateDocuments(the method signature lives in an external dependency), and adjust if needed before merging.Apply:
- $this->documents, - 10 // See if this make an effect + $this->documents, + self::UPSERT_CHUNK_SIZE // Batch size for upsert/chunkingAdd near the top of the class:
private const UPSERT_CHUNK_SIZE = 10;
✨ Benchmark results
⚡ Benchmark Comparison
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/Appwrite/Platform/Workers/StatsResources.php (3)
437-439: Comment clarityNit: “sort by unique index key reduce locks/deadlocks” → “Sort by unique index key to reduce locks/deadlocks.”
440-447: Optional: explicit period orderingIf the intended order is 1h < 1d < inf, lexicographic compare won’t guarantee it. Consider a map for deterministic ASC.
- // Period ASC - $cmp = strcmp($a->getAttribute('period'), $b->getAttribute('period')); + // Period ASC (explicit order) + $order = ['1h' => 0, '1d' => 1, 'inf' => 2]; + $cmp = ($order[$a->getAttribute('period')] ?? 999) + <=> ($order[$b->getAttribute('period')] ?? 999);
460-462: Batch size: verify API, avoid magic number
- Ensure
Database::createOrUpdateDocumentssupports a third param and that 10 is an intended batch size.- Replace magic number with a named constant/variable.
- 10 // See if this make an effect + 10 // Batch sizeOutside this hunk (suggested):
// At class top private const LOGS_BATCH_SIZE = 10; // Here $dbForLogs->createOrUpdateDocuments('stats', $this->documents, self::LOGS_BATCH_SIZE);
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/Appwrite/Platform/Workers/StatsResources.php(1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Linter
src/Appwrite/Platform/Workers/StatsResources.php
[error] 1-1: Pint PSR-12 lint issue: 'control_structure_braces' violation in StatsResources.php.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: E2E Service Test (Teams)
- GitHub Check: E2E Service Test (Site Screenshots)
- GitHub Check: E2E Service Test (Locale)
- GitHub Check: E2E Service Test (Tokens)
- GitHub Check: E2E Service Test (VCS)
- GitHub Check: E2E Service Test (Users)
- GitHub Check: E2E Service Test (Storage)
- GitHub Check: E2E Service Test (Messaging)
- GitHub Check: E2E Service Test (Databases/Legacy)
- GitHub Check: E2E Service Test (Console)
- GitHub Check: E2E Service Test (GraphQL)
- GitHub Check: E2E Service Test (Health)
- GitHub Check: E2E Service Test (Databases/TablesDB)
- GitHub Check: E2E Service Test (Realtime)
- GitHub Check: E2E Service Test (FunctionsSchedule)
- GitHub Check: Unit Test
- GitHub Check: E2E Service Test (Dev Keys)
- GitHub Check: E2E General Test
- GitHub Check: Benchmark
- GitHub Check: scan
🔇 Additional comments (1)
src/Appwrite/Platform/Workers/StatsResources.php (1)
440-454: Confirm stats index alignment
No explicit unique/index definition for thestatscollection was found—please verify the actual index (columns and sort directions) and ensure themetric DESC, period ASC, time ASCorder matches it.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/Appwrite/Platform/Workers/StatsResources.php (1)
440-462: Type the comparator and use Document::getAttribute (optional: make closure static)Improves type-safety, readability, and avoids array-style access on Document. Also caches attributes locally. Optionally mark the closure static.
[ suggest_optional_refactor ]
Apply:- usort($this->documents, function ($a, $b) { + usort($this->documents, static function (Document $a, Document $b): int { // Metric DESC - $cmp = strcmp($b['metric'], $a['metric']); + $aMetric = $a->getAttribute('metric'); + $bMetric = $b->getAttribute('metric'); + $cmp = strcmp($bMetric, $aMetric); if ($cmp !== 0) { return $cmp; } // Period ASC - $cmp = strcmp($a['period'], $b['period']); + $aPeriod = $a->getAttribute('period'); + $bPeriod = $b->getAttribute('period'); + $cmp = strcmp($aPeriod, $bPeriod); if ($cmp !== 0) { return $cmp; } // Time ASC, NULLs first - if ($a['time'] === null) { - return ($b['time'] === null) ? 0 : -1; - } - if ($b['time'] === null) { + $aTime = $a->getAttribute('time'); + $bTime = $b->getAttribute('time'); + if ($aTime === null) { + return ($bTime === null) ? 0 : -1; + } + if ($bTime === null) { return 1; } - return strcmp($a['time'], $b['time']); + return strcmp($aTime, $bTime); });If the unique index for stats also includes region (or any other field), extend the tie-breakers accordingly to fully match index order to minimize lock contention. To confirm, run:
#!/bin/bash # Locate stats collection index definitions (PHP/JSON) rg -nP -C3 '(create|add).*index.*stats|indexes?.*stats' -g '!**/vendor/**'
🧹 Nitpick comments (1)
src/Appwrite/Platform/Workers/StatsResources.php (1)
437-439: Polish the comment for claritySuggested wording: “Sort by unique index columns to reduce lock contention and deadlocks.”
Apply:
- /** - * sort by unique index key reduce locks/deadlocks - */ + /** + * Sort by unique index columns to reduce lock contention and deadlocks. + */
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/Appwrite/Platform/Workers/StatsResources.php(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
- GitHub Check: E2E Service Test (Teams)
- GitHub Check: E2E Service Test (Users)
- GitHub Check: E2E Service Test (Storage)
- GitHub Check: E2E Service Test (Locale)
- GitHub Check: E2E Service Test (Realtime)
- GitHub Check: E2E Service Test (Messaging)
- GitHub Check: E2E Service Test (VCS)
- GitHub Check: E2E Service Test (Databases/Legacy)
- GitHub Check: E2E Service Test (Projects)
- GitHub Check: E2E Service Test (GraphQL)
- GitHub Check: E2E Service Test (Account)
- GitHub Check: E2E Service Test (Avatars)
- GitHub Check: E2E Service Test (Console)
- GitHub Check: E2E General Test
- GitHub Check: E2E Service Test (Dev Keys)
- GitHub Check: E2E Service Test (Site Screenshots)
- GitHub Check: scan
🔇 Additional comments (1)
src/Appwrite/Platform/Workers/StatsResources.php (1)
467-468: Trailing comma in function call is finePHP 7.3+ allows trailing commas in calls; keeps diffs cleaner. LGTM.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/Appwrite/Platform/Workers/StatsUsage.php (1)
540-544: Clear$this->statDocumentsafter a successful push to avoid double-counting on the next batch.Without clearing, subsequent runs will re-send previous docs and keep increasing their values.
$dbForLogs->createOrUpdateDocumentsWithIncrease( 'stats', 'value', $this->statDocuments ); Console::success('Usage logs pushed to Logs DB'); + $this->statDocuments = [];
🧹 Nitpick comments (2)
src/Appwrite/Platform/Workers/StatsUsage.php (2)
431-454: Type the comparator and simplify comparisons.Add types to catch misuse and use the spaceship operator for brevity; keep NULLs-first handling for time.
- usort($projectStats['stats'], function ($a, $b) { + usort($projectStats['stats'], static function (Document $a, Document $b): int { // Metric DESC - $cmp = strcmp($b['metric'], $a['metric']); + $cmp = ($b['metric'] ?? '') <=> ($a['metric'] ?? ''); if ($cmp !== 0) { return $cmp; } // Period ASC - $cmp = strcmp($a['period'], $b['period']); + $cmp = ($a['period'] ?? '') <=> ($b['period'] ?? ''); if ($cmp !== 0) { return $cmp; } // Time ASC, NULLs first - if ($a['time'] === null) { + $at = $a['time'] ?? null; + $bt = $b['time'] ?? null; + if ($at === null) { - return ($b['time'] === null) ? 0 : -1; + return ($bt === null) ? 0 : -1; } - if ($b['time'] === null) { + if ($bt === null) { return 1; } - - return strcmp($a['time'], $b['time']); + return $at <=> $bt; });
429-454: De-duplicate the comparator logic.Optional: extract private static comparators and reuse to keep behavior consistent and testable.
Example:
private static function compareStatsDocs(Document $a, Document $b): int { /* same logic */ } private static function compareLogsDocs(Document $a, Document $b): int { /* includes $tenant first */ }Then:
usort($projectStats['stats'], self::compareStatsDocs(...)); usort($this->statDocuments, self::compareLogsDocs(...));Also applies to: 505-534
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/Appwrite/Platform/Workers/StatsUsage.php(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: E2E Service Test (Messaging)
- GitHub Check: E2E Service Test (Tokens)
- GitHub Check: E2E Service Test (Users)
- GitHub Check: E2E Service Test (Teams)
- GitHub Check: E2E Service Test (Webhooks)
- GitHub Check: E2E Service Test (Databases/TablesDB)
- GitHub Check: E2E Service Test (Databases/Legacy)
- GitHub Check: E2E Service Test (Locale)
- GitHub Check: E2E Service Test (Projects)
- GitHub Check: E2E Service Test (Functions)
- GitHub Check: E2E Service Test (Health)
- GitHub Check: E2E Service Test (Account)
- GitHub Check: E2E Service Test (Avatars)
- GitHub Check: E2E Service Test (Site Screenshots)
- GitHub Check: E2E Service Test (Dev Keys)
- GitHub Check: E2E General Test
- GitHub Check: Unit Test
- GitHub Check: Benchmark
- GitHub Check: scan
🔇 Additional comments (3)
src/Appwrite/Platform/Workers/StatsUsage.php (3)
429-431: Good call adding deterministic sort before writes.This should reduce lock/deadlock contention by aligning with the unique key order.
500-504: Sorting by the logs unique key is the right approach.Including tenant first matches the composite unique key and should minimize conflicts.
429-454: Confirm DB unique index ordering matches comparator (tenant ASC, metric DESC, period ASC, time ASC)Found a comment in src/Appwrite/Platform/Workers/StatsUsage.php (lines ~499–503) documenting UNIQUE KEY "_key_metric_period_time" ("_tenant","metric" DESC,"period","time") and noting sort by _tenant due to setTenantPerDocument. Repository search did not return any migration/schema definitions for this index. Verify the collection migrations/schema define a composite UNIQUE key with fields and directions exactly: _tenant ASC, metric DESC, period ASC, time ASC — and ensure the stats/logs comparator/sorting matches that ordering (add tenant sorting where required).
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/Appwrite/Platform/Workers/StatsUsage.php (1)
505-533: Bug: comparator reads_tenantbut documents store$tenant; also add types.prepareForLogsDB() sets
$tenant, so$a['_tenant']is always null and tenant ordering is skipped, undermining the lock/deadlock reduction goal. Fix by reading$tenantvia getAttribute() and type the comparator.- usort($this->statDocuments, function ($a, $b) { + usort($this->statDocuments, static function (Document $a, Document $b): int { // Tenant ASC - $cmp = $a['_tenant'] <=> $b['_tenant']; + $cmp = ((int) $a->getAttribute('$tenant', 0)) <=> ((int) $b->getAttribute('$tenant', 0)); if ($cmp !== 0) { return $cmp; } // Metric DESC - $cmp = strcmp($b['metric'], $a['metric']); + $cmp = ($b->getAttribute('metric', '')) <=> ($a->getAttribute('metric', '')); if ($cmp !== 0) { return $cmp; } // Period ASC - $cmp = strcmp($a['period'], $b['period']); + $cmp = ($a->getAttribute('period', '')) <=> ($b->getAttribute('period', '')); if ($cmp !== 0) { return $cmp; } // Time ASC, NULLs first - if ($a['time'] === null) { - return ($b['time'] === null) ? 0 : -1; - } - if ($b['time'] === null) { - return 1; - } - - return strcmp($a['time'], $b['time']); + $at = $a->getAttribute('time'); + $bt = $b->getAttribute('time'); + if ($at === null) { + return ($bt === null) ? 0 : -1; + } + if ($bt === null) { + return 1; + } + return $at <=> $bt; });
🧹 Nitpick comments (2)
src/Appwrite/Platform/Workers/StatsUsage.php (2)
431-453: Add types + null-safe comparator for deterministic sort (no behavior change).Use typed comparator with Document::getAttribute() and the spaceship operator for clarity and static analysis friendliness.
- usort($projectStats['stats'], function ($a, $b) { + usort($projectStats['stats'], static function (Document $a, Document $b): int { // Metric DESC - $cmp = strcmp($b['metric'], $a['metric']); + $cmp = ($b->getAttribute('metric', '')) <=> ($a->getAttribute('metric', '')); if ($cmp !== 0) { return $cmp; } // Period ASC - $cmp = strcmp($a['period'], $b['period']); + $cmp = ($a->getAttribute('period', '')) <=> ($b->getAttribute('period', '')); if ($cmp !== 0) { return $cmp; } // Time ASC, NULLs first - if ($a['time'] === null) { - return ($b['time'] === null) ? 0 : -1; - } - if ($b['time'] === null) { - return 1; - } - - return strcmp($a['time'], $b['time']); + $at = $a->getAttribute('time'); + $bt = $b->getAttribute('time'); + if ($at === null) { + return ($bt === null) ? 0 : -1; + } + if ($bt === null) { + return 1; + } + return $at <=> $bt; });
431-453: DRY: extract shared sorters into private static comparators.Both blocks implement the same metric/period/time ordering (one with an extra tenant step). Consider:
- private static function compareStats(Document $a, Document $b): int
- private static function compareStatsWithTenant(Document $a, Document $b): int
Then call them from both usort sites. Improves consistency and reduces risk of drift.
Also applies to: 505-533
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/Appwrite/Platform/Workers/StatsUsage.php(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: Benchmark
- GitHub Check: E2E Service Test (Messaging)
- GitHub Check: E2E Service Test (Storage)
- GitHub Check: E2E Service Test (Teams)
- GitHub Check: E2E Service Test (Webhooks)
- GitHub Check: E2E Service Test (VCS)
- GitHub Check: E2E Service Test (Tokens)
- GitHub Check: E2E Service Test (Users)
- GitHub Check: E2E Service Test (Databases/TablesDB)
- GitHub Check: E2E Service Test (Projects)
- GitHub Check: E2E Service Test (Proxy)
- GitHub Check: E2E Service Test (Sites)
- GitHub Check: E2E Service Test (Realtime)
- GitHub Check: E2E Service Test (GraphQL)
- GitHub Check: E2E Service Test (FunctionsSchedule)
- GitHub Check: E2E Service Test (Functions)
- GitHub Check: E2E Service Test (Databases/Legacy)
- GitHub Check: E2E Service Test (Account)
- GitHub Check: E2E General Test
🔇 Additional comments (1)
src/Appwrite/Platform/Workers/StatsUsage.php (1)
459-461: Good move: cleanup in finally ensures per-project batch memory is released.Keeps memory bounded and avoids leaking state across iterations. LGTM.
| /** | ||
| * todo: Do we need to unset $this->statDocuments? | ||
| */ | ||
|
|
There was a problem hiding this comment.
Clear batch after successful push to avoid duplicate increments and memory growth.
Without clearing, subsequent calls will re-send the same documents, risking double-counting with “increase” writes.
- /**
- * todo: Do we need to unset $this->statDocuments?
- */
+ // Clear batch after successful push
+ $this->statDocuments = [];If you prefer clearing even on partial failures, add a finally block that conditionally requeues only the failed subset.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /** | |
| * todo: Do we need to unset $this->statDocuments? | |
| */ | |
| // Clear batch after successful push | |
| $this->statDocuments = []; |
🤖 Prompt for AI Agents
In src/Appwrite/Platform/Workers/StatsUsage.php around lines 542 to 545, the
batch of $this->statDocuments isn't cleared after a successful push which causes
duplicate "increase" writes and memory growth; modify the push logic to clear
$this->statDocuments (e.g., set to [] or unset) immediately after a confirmed
successful send so subsequent runs don't resend the same docs, and if you want
safer behavior on partial failures wrap the send in try/catch/finally and in
finally requeue only the failed subset while clearing the successfully sent
items.
…ts-resources-try-catch
| } catch (Throwable $e) { | ||
| Console::error('Error processing stats: ' . $e->getMessage()); | ||
| } finally { | ||
| unset($this->projects[$sequence]); |
There was a problem hiding this comment.
we shouldn't clear if the insert fails, or we will loose the data.
There was a problem hiding this comment.
We have 3 retry attempts in the Utopia lib in case of failure.
When we do INSERT ... ON DUPLICATE KEY UPDATE in the upsert method
If we have many concurrent workers inserting/updating the same table, they may lock different parts of the unique index in different orders.
That can create the classic deadlock
What does this PR do?
(Provide a description of what this PR does and why it's needed.)
Test Plan
(Write your test plan here. If you changed any code, please provide us with clear instructions on how you verified your changes work. Screenshots may also be helpful.)
Related PRs and Issues
Checklist