fix(sync): persist compaction counter when no cache exists

The incrementCompactionCounter() function returned 1 without persisting
when no state cache existed. This caused the compaction threshold to
never be reached, leading to unbounded operation log growth.

Also updated review docs:
- Marked compaction counter bug as fixed
- Identified memory leak in TaskElectronEffects as false positive
  (take(1) ensures automatic cleanup)
- Identified TODAY_TAG "board-style" claim as false positive
  (code correctly REMOVES TODAY_TAG from task.tagIds)
- Identified cache invalidation claim as false positive
  (sequence-based invalidation works correctly)
- Identified repair mutex race condition as false positive
  (JavaScript single-threaded, no await between assignment)
This commit is contained in:
Johannes Millan 2025-12-08 10:54:36 +01:00
parent 78c65acf4d
commit bbd50c223a
5 changed files with 958 additions and 1 deletions

View file

@ -0,0 +1,92 @@
# NgRx Implementation Analysis
**Date:** December 8, 2025
**Scope:** `src` directory, focusing on `FeatureStoresModule`, `Task` feature, and core state management patterns.
**Last Updated:** December 8, 2025 (added cross-references, validated findings)
## 1. Architecture Overview
The application employs a mature, feature-based NgRx architecture. It avoids a monolithic root state in favor of a modular approach where each domain feature manages its own slice of the state.
- **Modular Organization:** State is strictly compartmentalized. The `src/app/root-store/feature-stores.module.ts` acts as a central registry, importing over 20 distinct feature modules (e.g., `tasks`, `projects`, `tags`, `config`). This is a scalable and clean pattern that keeps feature concerns separated.
- **Root Setup:** The root store is initialized in `src/main.ts` rather than `AppModule`. This unique setup likely caters to the specific bootstrapping requirements of the Electron/Web hybrid environment, allowing for early state initialization or meta-reducer configuration before the full Angular app launches.
- **Core State (Tasks):** The `task` feature (`src/app/features/tasks/store/`) is the most complex part of the application. It utilizes `@ngrx/entity` to manage a normalized state of tasks. This is crucial for handling the relational nature of tasks (parents, subtasks) and their efficient updates.
- **Interaction Pattern:** The application uses a hybrid "Service-Facade" pattern. Components like `TaskListComponent` inject the `Store` directly for dispatching actions but often rely on Services (e.g., `TaskService`) to expose selectors as Observables. This provides a pragmatic balance between the strict Command/Query segregation of NgRx and the developer ergonomics of Services.
## 2. Potential Bugs & Risks
### Type Safety Violations in Effects
- **Location:** `src/app/features/tasks/store/task-internal.effects.ts`
- **Issue:** The `autoSetNextTask$` effect contains distinct type safety bypasses:
- `// @ts-ignore` is used to suppress errors.
- `const a = action as any;` is used to access properties on union types without proper type narrowing.
- **Risk:** This bypasses TypeScript's compile-time safety. If the structure of the actions handled by this effect (`updateTask`, `moveProjectTaskToBacklogList`, etc.) changes, this code could fail silently at runtime, potentially breaking the "auto-next-task" feature.
### Complex Reducer Logic
- **Location:** `src/app/features/tasks/store/task.reducer.ts`
- **Issue:** The `moveSubTask` reducer handles a complex multi-step state update. It manually updates the old parent's `subTaskIds`, the new parent's `subTaskIds`, and recalculates time estimates (`reCalcTimesForParentIfParent`) for both.
- **Risk:** Orchestrating relational data updates manually within a reducer is brittle. If one step in this sequence calculates incorrectly or fails, the state could become inconsistent (e.g., a task thinks it has a parent, but the parent doesn't list it as a subtask).
### ~~Critical: Compaction Counter Bug~~ - FIXED
- **Location:** `src/app/core/persistence/operation-log/store/operation-log-store.service.ts:434-439`
- **Original Issue:** When no state cache existed, `incrementCompactionCounter()` returned `1` without persisting the value.
- **Status:** Fixed on 2025-12-08. Counter now persists when no cache exists.
## 3. Redundancies & Complexity
### Logic in Effects
- **Observation:** The `TaskInternalEffects` class contains a private method `_findNextTask`. This method encapsulates significant business logic for determining which task should be focused next.
- **Impact:** Coupling complex business logic with the Effects class makes it harder to test. Effects should ideally be "dumb" pipelines that connect Actions to Services.
### Cross-Feature Coupling
- **Observation:** The `Planner` feature and `Task` feature are tightly coupled via Actions. For instance, `PlannerActions.planTaskForDay` is caught and handled inside `task.reducer.ts`.
- **Impact:** While "Event-Driven" communication via Actions is a correct NgRx pattern, this "scattered" logic means that understanding how a Task entity changes requires auditing reducers in multiple different feature directories.
## 4. Testing Analysis
- **Strengths:**
- **Selector Coverage:** `src/app/features/tasks/store/task.selectors.spec.ts` is robust. It mocks the full state tree (including dependencies like `Project` and `Tag`) and verifies logic for complex derived data (e.g., `selectTasksDueForDay`).
- **Artifact coverage:** Most Store artifacts (Reducers, Effects, Selectors) have corresponding `*.spec.ts` files, indicating a strong testing culture.
- **Gaps:**
- The complex helper logic in `task.reducer.util.ts` (used by the reducer for calculations) should be verified to have its own independent unit tests, as the reducer relies heavily on its correctness for data integrity.
- Effects coverage is at 51% - see `docs/ai/ngrx-implementation-review.md` for the full list.
## 5. Well-Designed Patterns
### TODAY_TAG Virtual Pattern (Correctly Implemented)
- **Location:** Meta-reducers in `src/app/root-store/meta/task-shared-meta-reducers/`
- **Pattern:** TODAY_TAG is correctly implemented as a "virtual tag" where:
- Membership is determined by `task.dueDay`, NOT by `task.tagIds`
- `TODAY_TAG.taskIds` only stores ordering
- Code actively removes any legacy TODAY_TAG from `task.tagIds` (cleanup)
- **Note:** This pattern was initially misidentified as a bug in early reviews. The code is correct.
### Meta-Reducer Pattern for Atomic Operations
- **Location:** `src/app/root-store/meta/task-shared-meta-reducers/`
- **Pattern:** Multi-entity changes are handled atomically in meta-reducers to ensure sync consistency.
- **See:** `docs/ai/today-tag-architecture.md` and CLAUDE.md point 8.
## 6. Recommendations
1. **Refactor for Type Safety:**
- Rewrite the `autoSetNextTask$` effect in `task-internal.effects.ts`. Use proper TypeScript Discriminated Unions or type guards to narrow the `action` type safely instead of casting to `any`.
2. **Extract Logic from Effects:**
- Move the `_findNextTask` logic into a stateless utility function or a domain service. This would allow it to be unit tested in isolation from the RxJS stream and the Store.
3. **Solidify Reducer Helpers:**
- Ensure that `src/app/features/tasks/store/task.reducer.util.ts` has comprehensive unit tests covering edge cases, specifically for `reCalcTimesForParentIfParent`, to prevent data corruption during subtask moves.
4. ~~**Fix Critical Bugs:**~~ - COMPLETED
- ✅ Compaction counter bug fixed on 2025-12-08.
## 7. Related Documentation
- `docs/ai/ngrx-implementation-review.md` - Detailed bug list and testing gaps
- `docs/ai/sync/operation-log-implementation-review.md` - Operation log specific issues
- `docs/ai/operation-log-review-plan.md` - Architecture review and implementation plan
- `docs/ai/today-tag-architecture.md` - TODAY_TAG virtual pattern explanation

View file

@ -0,0 +1,411 @@
# NgRx Implementation Review
**Date:** December 8, 2025
**Branch:** `feat/operation-logs`
**Reviewed by:** Claude (AI-assisted code review)
**Last Updated:** December 8, 2025 (corrected false positives)
---
## Executive Summary
A comprehensive review of the NgRx implementation identified **22 issues** across bugs, redundancies, architecture concerns, and testing gaps.
| Category | Critical | High | Medium | Total |
| ------------ | ----------- | ---- | ------ | ----- |
| Bugs | 0 (1 fixed) | 4 | 8 | 12 |
| Testing Gaps | 3 | 4 | 2 | 9 |
### Testing Coverage
| Category | Total | Tested | Coverage | Status |
| ------------- | ----- | ------ | -------- | ---------------- |
| Reducers | 35 | 28 | 80% | Good |
| Effects | 39 | 20 | **51%** | **Critical Gap** |
| Selectors | 13 | 12 | 92% | Excellent |
| Meta-reducers | 10 | 8 | 80% | Good |
---
## Corrected Analysis (False Positives Removed)
### ~~TODAY_TAG Pattern Inconsistency~~ - FALSE POSITIVE
**Status:** Not a bug - code is correct.
**Original Claim:** Functions like `handleTransferTask` and `handlePlanTaskForDay` use "board-style" (adding TODAY_TAG to `task.tagIds`).
**Actual Code Behavior:** The code explicitly **REMOVES** TODAY_TAG from `task.tagIds`:
```typescript
// planner-shared.reducer.ts:102-114
// Remove TODAY from task.tagIds if present
if (hasTaskTodayTag) {
state = {
...state,
[TASK_FEATURE_NAME]: taskAdapter.updateOne(
{
id: task.id,
changes: { tagIds: currentTagIds.filter((id) => id !== TODAY_TAG.id) },
},
state[TASK_FEATURE_NAME],
),
};
}
```
The comments in the code confirm: "Ensure TODAY_TAG is NOT in task.tagIds (cleanup if present from legacy data)".
**Conclusion:** The code correctly enforces the virtual tag pattern by removing any legacy TODAY_TAG references from `task.tagIds`. No fix needed.
---
### ~~Repair Mutex Race Condition~~ - FALSE POSITIVE
**Status:** Not a bug in JavaScript's single-threaded model.
**Original Claim:** Between promise creation and mutex assignment, another call can pass the `if (!this._repairMutex)` check.
**Why It's Not a Bug:** JavaScript is single-threaded. The code:
```typescript
const repairPromise = (async () => { ... })(); // Returns immediately
this._repairMutex = repairPromise; // Assigned synchronously on next line
```
There is no `await` between these lines, so the event loop cannot yield. No other code can execute between promise creation and mutex assignment.
**Optional Improvement:** Setting the mutex before creating the promise is a valid defensive coding practice for clarity, but the current implementation is not buggy.
---
### ~~Cache Not Invalidated on Append~~ - FALSE POSITIVE
**Status:** Not a bug - cache self-invalidates.
**Original Claim:** `append()` doesn't invalidate `_appliedOpIdsCache`, causing stale data.
**Actual Implementation:**
```typescript
async getAppliedOpIds(): Promise<Set<string>> {
const currentLastSeq = await this.getLastSeq();
// Cache valid only if lastSeq hasn't changed
if (this._appliedOpIdsCache && this._cacheLastSeq === currentLastSeq) {
return new Set(this._appliedOpIdsCache);
}
// Rebuild cache if seq changed
...
}
```
When new ops are appended, `lastSeq` changes (auto-increment). The cache automatically invalidates because `currentLastSeq !== _cacheLastSeq`.
**Conclusion:** No bug - the sequence-based invalidation works correctly.
---
## Critical Bugs (Fix Immediately)
### 1. ~~Compaction Counter Never Persists~~ - FIXED
**Severity:** Critical
**File:** `src/app/core/persistence/operation-log/store/operation-log-store.service.ts`
**Lines:** 434-439
**Status:** Fixed on 2025-12-08
**Original Issue:** When no state cache exists, `incrementCompactionCounter()` returned 1 without persisting the value.
**Fix Applied:**
```typescript
if (!cache) {
// No state cache yet - create one with counter starting at 1
await store.put({ id: 'current', compactionCounter: 1 });
await tx.done;
return 1;
}
```
---
### ~~Memory Leak in TaskElectronEffects~~ - FALSE POSITIVE
**Status:** Not a bug - `take(1)` ensures automatic cleanup.
**Original Claim:** Subscriptions accumulate without cleanup in the IPC handler.
**Why It's Not a Bug:** Looking at the actual code:
```typescript
this._store$.pipe(
select(selectCurrentTask),
withLatestFrom(...),
take(1), // <-- Completes after 1 emission, auto-unsubscribes
).subscribe(...)
```
The `take(1)` operator ensures the observable completes after receiving one value, which automatically triggers unsubscription. Each IPC event creates a short-lived subscription that self-destructs after emitting once.
---
## High Priority Issues
### 1. Snapshot Saved Before Validation
**File:** `src/app/core/persistence/operation-log/store/operation-log-hydrator.service.ts`
**Lines:** 167-176
Snapshot saved at line 171, validation runs at line 176. If validation finds corruption and repairs, the snapshot is stale.
**Fix:** Move validation (Checkpoint C) BEFORE saving snapshot.
---
### 2. Missing Subtask Cascade in Tag Deletion
**File:** `src/app/root-store/meta/task-shared-meta-reducers/tag-shared.reducer.ts`
**Lines:** 85-116
Orphan detection only runs on parent tasks. Subtasks of surviving parents that lose all tags become orphaned but aren't cleaned up.
**Fix:** After removing tags from tasks, also check subtasks of surviving parents.
---
### 3. Race Conditions in TaskDueEffects
**File:** `src/app/features/tasks/store/task-due.effects.ts`
**Lines:** 40-65
Issues:
- Double sync wait (before AND after debounce)
- Multiple effects (`addTasksForTomorrow$`, `removeOverdueFormToday$`) compete on same signals
- Potential infinite loop if action triggers date change
**Fix:** Consolidate sync waits, ensure effects don't trigger each other.
---
### 4. Inconsistent Error Handling Across Effects
| File | Has Error Handling |
| -------------------------- | --------------------- |
| `short-syntax.effects.ts` | Yes (`catchError()`) |
| `task-electron.effects.ts` | No |
| `idle.effects.ts` | No |
| `tag.effects.ts` | No (async operations) |
**Fix:** Standardize error handling pattern across all effects. Use `catchError()` with appropriate fallbacks.
---
## Medium Priority Issues
### 5. Missing Bounds Checking
**File:** `src/app/root-store/meta/task-shared-meta-reducers/planner-shared.reducer.ts`
**Line:** 247
```typescript
const targetIndex = taskIds.indexOf(toTaskId);
taskIds.splice(targetIndex, 0, fromTask.id); // What if indexOf returns -1?
```
If `toTaskId` not found, `targetIndex = -1`, and `splice(-1, 0, id)` inserts at wrong position.
---
### 6. TOCTOU Race in Compaction
**File:** `src/app/core/persistence/operation-log/store/operation-log-compaction.service.ts`
**Lines:** 66-93
Time-of-check-time-of-use between `getLastSeq()` and `deleteOpsWhere()`. New ops could be written between these calls.
---
### 7. State Capture ID Non-Deterministic
**File:** `src/app/core/persistence/operation-log/processing/state-change-capture.service.ts`
**Lines:** 354-360
`JSON.stringify(action)` hash is non-deterministic for object property order. Could cause capture ID mismatches.
---
### 8. LOCAL_ACTIONS Filter Edge Case
**File:** `src/app/util/local-actions.token.ts`
**Line:** 37
```typescript
return actions$.pipe(filter((action: Action) => !(action as any).meta?.isRemote));
```
Actions without `meta` field pass through (undefined !== true). Remote operations without explicit `isRemote: true` could trigger effects twice.
---
### 9. Duplicated TODAY_TAG Logic
**Locations:**
- `planner-shared.reducer.ts:158-234`
- `short-syntax-shared.reducer.ts:241-309`
Same logic for "moving to/from today" duplicated. Changes require updates in multiple places.
**Fix:** Extract to shared helper function.
---
### 10. getTag() Throws But Callers Don't Handle
**File:** `src/app/root-store/meta/task-shared-meta-reducers/task-shared-helpers.ts`
**Lines:** 51-80
Meta-reducers are synchronous and can't catch thrown errors. If `getTag()` throws, entire reducer chain fails.
**Fix:** Use `getTagOrUndefined()` pattern consistently, or add guards before calling `getTag()`.
---
### 11. Async Operations Without Error Handling
**File:** `src/app/features/tag/store/tag.effects.ts`
**Lines:** 101-118
```typescript
tap(async (tagIdsToRemove: string[]) => {
await this._taskArchiveService.removeTagsFromAllTasks(tagIdsToRemove);
// No error handling, effect completes before async finishes
});
```
**Fix:** Use proper async handling with `concatMap` and error boundaries.
---
### 12. Orphan Detection Doesn't Verify Project Exists
**File:** `src/app/root-store/meta/task-shared-meta-reducers/tag-shared.reducer.ts`
**Lines:** 105-108
Checks `!task.projectId` but doesn't verify project still exists (could be deleted in same batch).
---
## Testing Gaps
### Critical: Effects Coverage at 51%
**19 untested effect files:**
**Priority 1 (Core):**
- `src/app/imex/sync/sync.effects.ts` - Core sync engine
- `src/app/core-ui/layout/store/layout.effects.ts` - Navigation/panel management
- `src/app/features/tasks/store/task-related-model.effects.ts` - Task model sync
- `src/app/features/work-context/store/work-context.effects.ts` - Context switching
**Priority 2 (Features):**
- `src/app/features/project/store/project.effects.ts`
- `src/app/features/tag/store/tag.effects.ts`
- `src/app/features/tasks/store/short-syntax.effects.ts`
- `src/app/features/tasks/store/task-internal.effects.ts`
- `src/app/features/tasks/store/task-ui.effects.ts`
- Issue provider effects (caldav, gitlab, jira, open-project)
---
### Critical: Disabled Test Files
- `src/app/root-store/app-state/app-state.effects.spec.ts` - Completely commented out
- `src/app/features/note/store/note.effects.spec.ts` - Completely commented out
- `src/app/features/boards/store/boards.selectors.spec.ts` - Disabled
---
### Missing Meta-Reducer Tests
- `src/app/root-store/meta/task-shared-meta-reducers/issue-provider-shared.reducer.ts`
- `src/app/root-store/meta/task-shared-meta-reducers/task-repeat-cfg-shared.reducer.ts`
---
### Missing Integration Tests
- Race conditions (concurrent append + compaction)
- Error recovery (quota exceeded mid-write)
- Multi-tab coordination (lock deadlocks)
- 3-way conflict resolution
- Very large conflict sets (100+ ops on same entity)
- Network failure mid-pagination (2 tests marked `pending()`)
---
## Architecture Observations
### Well-Designed Aspects
1. **Meta-reducer pattern** - Atomic multi-entity operations ensure sync consistency
2. **Persistent action pattern** - Explicit `isPersistent: true` enables operation logging
3. **LOCAL_ACTIONS token** - Filters remote operations from side-effect triggers
4. **Entity adapters** - Normalized state with proper NgRx patterns
5. **Self-healing selectors** - `computeOrderedTaskIdsForTag()` dynamically computes membership
6. **TODAY_TAG virtual pattern** - Correctly implemented with cleanup for legacy data
### Concerns
1. **Meta-reducer ordering** - 10 ordered reducers create tight coupling; order changes break logic
2. **Action metadata duplication** - Easy to forget `isPersistent` or use incorrect `opType`
3. **Three-way relationships** - `task.tagIds`, `tag.taskIds`, `planner.days` require careful synchronization
---
## Prioritized Fix Recommendations
### Phase 1: Critical (Immediate) - COMPLETED
| # | Issue | Status |
| --- | ------------------ | -------------------------------------- |
| 1 | Compaction counter | **FIXED** (2025-12-08) |
| 2 | ~~Memory leak~~ | FALSE POSITIVE - `take(1)` auto-cleans |
### Phase 2: High Priority (Current)
| # | Issue | File |
| --- | ------------------------------ | ----------------------------------------- |
| 1 | Snapshot timing | operation-log-hydrator.service.ts:167-176 |
| 2 | Subtask orphan detection | tag-shared.reducer.ts |
| 3 | Race conditions | task-due.effects.ts:40-65 |
| 4 | Error handling standardization | Multiple effects |
### Phase 3: Add Tests
| # | Test | Priority |
| --- | -------------------------------------------- | -------- |
| 1 | Re-enable app-state.effects.spec.ts | Critical |
| 2 | sync.effects.spec.ts (new) | Critical |
| 3 | issue-provider-shared.reducer.spec.ts (new) | High |
| 4 | task-repeat-cfg-shared.reducer.spec.ts (new) | High |
| 5 | layout.effects.spec.ts (new) | High |
### Phase 4: Medium Priority
- Bounds checking in planner-shared.reducer.ts
- Extract duplicated TODAY_TAG logic
- Fix getTag() error handling pattern
- Add async error handling in tag.effects.ts
---
## Related Documentation
- `docs/ai/sync/operation-log-implementation-review.md` - Detailed operation log review
- `docs/ai/operation-log-review-plan.md` - Original review plan
- `src/app/core/persistence/operation-log/docs/operation-log-architecture.md` - Architecture docs
- `docs/ai/today-tag-architecture.md` - TODAY_TAG virtual pattern

View file

@ -0,0 +1,145 @@
# Operation Log Architecture Review & Plan
**Date:** Monday, December 8, 2025
**Status:** Planning
**Module:** `src/app/core/persistence/operation-log/`
**Last Updated:** December 8, 2025 (added compaction counter bug, validated findings)
## Executive Summary
A comprehensive architectural analysis of the `operation-log` module has identified **one remaining critical issue**: a data loss risk in the file-based synchronization mechanism (WebDAV/Dropbox/etc.). The compaction counter bug has been fixed. This document outlines the findings and the proposed implementation plan to address remaining issues.
## 1. Critical Findings
### ✅ ~~Critical Bug: Compaction Counter Never Persists~~ - FIXED
**Severity:** Critical
**Location:** `OperationLogStoreService.incrementCompactionCounter` (lines 434-439)
**Status:** Fixed on 2025-12-08
**Original Issue:** When no state cache existed, the function returned `1` without persisting that value. The compaction threshold was never reached.
**Fix Applied:**
```typescript
if (!cache) {
// No state cache yet - create one with counter starting at 1
await store.put({ id: 'current', compactionCounter: 1 });
await tx.done;
return 1;
}
```
---
### 🚨 Data Loss Risk in Manifest Sync (File-Based Providers)
**Severity:** Critical
**Location:** `OperationLogManifestService.uploadRemoteManifest` & `OperationLogUploadService`
**The Issue:**
When uploading the list of operation files (`manifest.json`), the service explicitly forces an overwrite (`revToMatch: null`, `force: true`). This creates a "Lost Update" race condition.
**Scenario:**
1. **Client A** reads manifest: `['file1']`.
2. **Client B** reads manifest: `['file1']`.
3. **Client A** uploads `file2`, writes manifest: `['file1', 'file2']`.
4. **Client B** uploads `file3`, writes manifest: `['file1', 'file3']`.
**Outcome:**
`file2` (Client A's data) is orphaned. **Client C** will never see Client A's operations. If `file3` depends on `file2` (e.g., Update Task vs Create Task), Client C will crash or have corrupted state.
**Recommendation:**
Implement Optimistic Locking.
- `loadRemoteManifest` must return the file's `rev` (revision ID).
- `uploadRemoteManifest` must accept `revToMatch`.
- Implement a retry loop: if upload fails due to conflict (412), re-download, merge file lists (union), and retry.
---
### ⚠️ Performance Bottleneck in Compaction
**Severity:** High
**Location:** `OperationLogStoreService.deleteOpsWhere`
**The Issue:**
Compaction deletes old operations based on the `appliedAt` timestamp. However, the IndexedDB store only has indexes for `byId` and `bySyncedAt`.
**Impact:**
Compaction performs a full table scan of the `ops` store. As the log grows (the very problem compaction solves), this operation becomes slower ($O(N)$), potentially causing UI jank or locking timeouts during the critical compaction phase.
**Recommendation:**
Update `DB_VERSION` and add an index for `appliedAt` in `OperationLogStoreService.init()`.
---
### ⚠️ UX Deadlock in Conflict Resolution
**Severity:** Medium
**Location:** `ConflictResolutionService`
**The Issue:**
If the user cancels the Conflict Resolution dialog (e.g., clicks outside or presses Escape), the service returns without applying any changes or advancing the sync state.
**Impact:**
The sync cycle aborts. The next sync attempt will likely hit the exact same conflict, prompting the user again. This creates a "stuck" state where the user is pestered indefinitely until they resolve it.
**Recommendation:**
Treat "Cancel" as "Defer" (pause sync temporarily) or provide a "Skip/Ignore" option that allows other non-conflicting ops to proceed (with strict dependency checks). At minimum, log a clear warning that sync is stalled.
## 2. Architectural Observations
- **Redundant Logic:** `ConflictResolutionService` (Step 3) duplicates the logic of applying operations found in `OperationApplierService` (error handling, snackbars, etc.). This violates DRY and risks inconsistent behavior.
- _Plan:_ Refactor `ConflictResolutionService` to delegate fully to `OperationApplier`.
- **Locking Integrity:** `OperationLogEffects` (writer) and `OperationLogCompactionService` (cleaner) correctly use the `sp_op_log` lock, preventing race conditions during snapshot creation.
- **Fresh Client Safety:** The `isWhollyFreshClient` check in `OperationLogSyncService` correctly safeguards against accidental remote overwrites by empty clients.
- **Cache Self-Invalidation:** The `_appliedOpIdsCache` uses sequence-based invalidation which works correctly (not a bug as previously thought).
## 3. Missing Test Coverage
The current test suite lacks coverage for distributed system edge cases:
1. **Manifest Concurrency:** No tests simulate two clients updating the manifest simultaneously to reproduce the lost update scenario.
2. **Compaction Race:** No tests verify that `append` waits for `compact` (locking verification).
3. **Partial Sync:** No tests for scenarios where a client downloads `file1` but fails to download `file2` (network error).
4. **Compaction Counter Persistence:** No test verifies counter is persisted when cache doesn't exist.
## 4. Implementation Plan
### Phase 1: Fix Critical Bugs (High Priority)
1. **Fix Compaction Counter Bug:**
- **File:** `src/app/core/persistence/operation-log/store/operation-log-store.service.ts`
- **Lines:** 428-447
- **Change:** Persist counter=1 when no cache exists.
2. **Modify `OperationLogManifestService`:**
- Update `loadRemoteManifest` to return `{ manifest, rev }`.
- Update `uploadRemoteManifest` to accept `revToMatch`.
3. **Update `OperationLogUploadService`:**
- Implement the read-modify-write loop with retries for `_uploadPendingOpsViaFiles`.
- Logic: `while (retries < max) { load(rev) -> merge -> upload(rev) }`
### Phase 2: Optimize Store Indexes
1. **Update `OperationLogStoreService`:**
- Increment `DB_VERSION`.
- Add `opStore.createIndex('byAppliedAt', 'appliedAt')` in the upgrade callback.
- Optimize `deleteOpsWhere` to use `IDBKeyRange.upperBound` on the new index.
### Phase 3: Refactor Conflict Resolution
1. **Refactor `ConflictResolutionService`:**
- Remove custom batch application logic.
- Use `OperationApplierService.applyOperations` directly.
### Phase 4: Enhance Test Suite
1. Create `manifest-concurrency.spec.ts` to reproduce the lost update scenario and verify the fix.
2. Create `compaction-counter.spec.ts` to verify counter persistence.
3. Add performance benchmarks for compaction with large datasets.

View file

@ -0,0 +1,308 @@
# Operation Log Implementation Review
**Date:** 2025-12-08
**Branch:** `feat/operation-logs`
**Last Updated:** December 8, 2025 (corrected false positives)
This document summarizes findings from a comprehensive review of the operation log implementation, covering bugs, redundancies, architecture issues, and test coverage gaps.
---
## Executive Summary
The operation log implementation has a solid architectural foundation with event sourcing, vector clocks, and meta-reducers for atomic operations. However, several issues and test gaps were discovered that should be addressed before production use.
| Category | Critical | High | Medium | Total |
| --------- | ----------- | ---- | ------ | ----- |
| Bugs | 0 (1 fixed) | 3 | 6 | 9 |
| Test Gaps | 3 | 4 | 3 | 10 |
---
## Corrected Analysis (False Positives Removed)
### ~~Repair Mutex Race Condition~~ - FALSE POSITIVE
**Status:** Not a bug in JavaScript's single-threaded model.
**Original Claim:** Between promise creation and mutex assignment, another call can pass the `if (!this._repairMutex)` check.
**Why It's Not a Bug:** JavaScript is single-threaded. The code at lines 700-722:
```typescript
const repairPromise = (async () => { ... })(); // Returns immediately
this._repairMutex = repairPromise; // Assigned synchronously on next line
```
There is no `await` between these lines, so the event loop cannot yield. No other code can execute between promise creation and mutex assignment.
**Optional Improvement:** Setting the mutex before creating the promise is a valid defensive coding practice for clarity, but the current implementation is not buggy.
---
### ~~Cache Not Invalidated on Append~~ - FALSE POSITIVE
**Status:** Not a bug - cache self-invalidates.
**Original Claim:** `append()` doesn't invalidate `_appliedOpIdsCache`, causing stale data.
**Actual Implementation:**
```typescript
async getAppliedOpIds(): Promise<Set<string>> {
const currentLastSeq = await this.getLastSeq();
// Cache valid only if lastSeq hasn't changed
if (this._appliedOpIdsCache && this._cacheLastSeq === currentLastSeq) {
return new Set(this._appliedOpIdsCache);
}
// Rebuild cache if seq changed
...
}
```
When new ops are appended, `lastSeq` changes (auto-increment). The cache automatically invalidates because `currentLastSeq !== _cacheLastSeq`.
**Conclusion:** No bug - the sequence-based invalidation works correctly.
---
## Critical Issues (Fix Immediately)
### 1. ~~incrementCompactionCounter() Bug - Compaction Never Triggers~~ - FIXED
**File:** `src/app/core/persistence/operation-log/store/operation-log-store.service.ts:434-439`
**Status:** Fixed on 2025-12-08
**Original Issue:** When no cache exists, returned 1 without persisting. Compaction threshold was never reached.
**Fix Applied:**
```typescript
if (!cache) {
// No state cache yet - create one with counter starting at 1
await store.put({ id: 'current', compactionCounter: 1 });
await tx.done;
return 1;
}
```
---
## High Priority Issues
### 2. Missing Subtask Cascade in Tag Deletion
**File:** `src/app/root-store/meta/task-shared-meta-reducers/tag-shared.reducer.ts:85-116`
When tags deleted, orphan detection only runs on parent tasks. Subtasks of surviving parents that lose all tags become orphaned but aren't cleaned up.
### 3. Snapshot Saved Before Validation
**File:** `src/app/core/persistence/operation-log/store/operation-log-hydrator.service.ts:167-176`
Snapshot saved at line 171, validation runs at line 176. If validation finds corruption and repairs, the snapshot is now stale/wrong.
**Fix:** Move validation (Checkpoint C) BEFORE saving snapshot.
### 4. No Error Handling in applyShortSyntax Sub-Functions
**File:** `src/app/root-store/meta/task-shared-meta-reducers/short-syntax-shared.reducer.ts:53-126`
Multiple state mutations (`moveTaskToProject`, `handleScheduleWithTime`, `handlePlanForDay`) with no try-catch. If any throws mid-operation, state becomes partially updated.
### 5. Missing Recovery When clientId Load Fails
**File:** `src/app/core/persistence/operation-log/operation-log.effects.ts:70-76`
If `loadClientId()` fails or returns empty, operation is lost. No retry mechanism.
---
## Medium Priority Issues
### 6. TOCTOU Race in Compaction
**File:** `src/app/core/persistence/operation-log/store/operation-log-compaction.service.ts:66-93`
Time-of-check-time-of-use between `getLastSeq()` and `deleteOpsWhere()`. New ops written between these calls could theoretically cause issues.
### 7. Orphan Detection Doesn't Verify Project Exists
**File:** `src/app/root-store/meta/task-shared-meta-reducers/tag-shared.reducer.ts:105-108`
```typescript
if (newTagIds.length === 0 && !task.projectId && !task.parentId) {
```
Checks `!task.projectId` but doesn't verify project still exists (could be deleted in same batch).
### 8. State Capture ID Non-Deterministic
**File:** `src/app/core/persistence/operation-log/processing/state-change-capture.service.ts:354-360`
`JSON.stringify(action)` hash is non-deterministic for object property order. Could cause capture ID mismatch.
### 9. Stale Capture Cleanup Only on New Capture
**File:** `src/app/core/persistence/operation-log/processing/state-change-capture.service.ts:231`
`cleanupStaleCaptures()` only runs when new capture created. If operations are infrequent, stale captures accumulate.
### 10. TimeTracking Archive Not Cleaned Atomically
**File:** `src/app/root-store/meta/task-shared-meta-reducers/tag-shared.reducer.ts:188-210`
Current time tracking cleaned in meta-reducer, archives handled async in effect. Can cause sync inconsistency.
### 11. LOCAL_ACTIONS Filter Edge Case
**File:** `src/app/util/local-actions.token.ts:37`
```typescript
return actions$.pipe(filter((action: Action) => !(action as any).meta?.isRemote));
```
If action has no `meta` field, it passes (undefined !== true). Remote operations without explicit `isRemote: true` could trigger effects twice.
---
## Test Coverage Gaps
### Critical Missing Tests
| Gap | Scenario | Impact |
| -------------------------- | ------------------------------ | ---------------- |
| **Race conditions** | Concurrent append + compaction | Data loss |
| **Error recovery** | Quota exceeded mid-write | State corruption |
| **Multi-tab coordination** | Tab A append + Tab B compact | Lock deadlock |
### Missing Scenarios
1. **Concurrent operations:** No test for append during compaction
2. **Quota exceeded:** Emergency compaction path never tested
3. **Schema migration:** Version mismatch during hydration
4. **3-way conflicts:** Only 2-way conflict resolution tested
5. **Conflict on deleted entity:** Update for entity local deleted
6. **Very large conflict sets:** 100+ ops on same entity
7. **Download retry:** Network failure mid-pagination (2 tests skipped with `pending()`)
### Test Infrastructure Issues
- `sync-scenarios.integration.spec.ts:18-32`: All SimulatedClients share same IndexedDB (not true isolation)
- `operation-log-download.service.spec.ts:360-373`: 2 tests marked `pending()` due to timeout
- Benchmark tests disabled (`xdescribe`)
---
## Architecture Observations
### Well-Designed Aspects
1. **Event sourcing model** with vector clocks for conflict detection
2. **Meta-reducer pattern** for atomic multi-entity operations
3. **Snapshot + tail replay** for fast hydration
4. **LOCAL_ACTIONS token** for filtering remote operations from effects
5. **Sequence-based cache invalidation** for `_appliedOpIdsCache`
### Potential Redundancies
1. **Dual compaction triggers:** Both counter-based and time-based cleanup
2. **Multiple validation checkpoints:** A, B, C, D with similar logic
3. **Lock service fallback:** Both Web Locks API and localStorage locking maintained
### Known Limitations (Documented)
- Cross-version sync (A.7.11) not implemented
- Archive data bypasses operation log intentionally
- Schema version currently fixed at v1
---
## Implementation Plan
### Phase 1: Fix Critical Bug
#### 1.1 Fix incrementCompactionCounter() Bug
**File:** `src/app/core/persistence/operation-log/store/operation-log-store.service.ts`
**Lines:** 428-447
### Phase 2: Fix High Priority Issues
#### 2.1 Fix Snapshot Timing in Hydrator
**File:** `src/app/core/persistence/operation-log/store/operation-log-hydrator.service.ts`
**Lines:** 167-176
**Fix:** Move validation (Checkpoint C) BEFORE saving snapshot.
#### 2.2 Add Subtask Orphan Re-detection
**File:** `src/app/root-store/meta/task-shared-meta-reducers/tag-shared.reducer.ts`
**Lines:** After 116
**Fix:** After removing tags from tasks, run orphan detection on subtasks of surviving parents.
### Phase 3: Add Missing Tests
#### 3.1 Race Condition Tests
**File:** `src/app/core/persistence/operation-log/integration/race-conditions.integration.spec.ts` (new)
Tests to add:
- Concurrent append + compaction
- Concurrent upload + download
- Multi-tab lock acquisition
- Rapid local ops during sync
#### 3.2 Error Recovery Tests
**File:** `src/app/core/persistence/operation-log/integration/error-recovery.integration.spec.ts` (new)
Tests to add:
- Quota exceeded mid-write
- Network failure during download pagination
- Corrupted operation in log
- Schema mismatch during hydration
#### 3.3 Fix Pending Download Tests
**File:** `src/app/core/persistence/operation-log/sync/operation-log-download.service.spec.ts`
**Lines:** 360-373
**Fix:** Either mock timers properly or increase test timeout.
### Phase 4: Verify & Run Tests
1. Run `npm run checkFile` on all modified files
2. Run existing tests: `npm test`
3. Run new integration tests
4. Verify no regressions
---
## Files Reference
| Category | Files |
| ----------------- | --------------------------------------------------------------------------------------------------------------------- |
| **Core effects** | `operation-log.effects.ts` |
| **Store** | `operation-log-store.service.ts`, `operation-log-hydrator.service.ts`, `operation-log-compaction.service.ts` |
| **State capture** | `state-change-capture.service.ts`, `state-capture.meta-reducer.ts` |
| **Sync** | `operation-log-sync.service.ts`, `conflict-resolution.service.ts`, `vector-clock.service.ts` |
| **Meta-reducers** | `planner-shared.reducer.ts`, `short-syntax-shared.reducer.ts`, `tag-shared.reducer.ts`, `task-shared-crud.reducer.ts` |
| **Filter** | `local-actions.token.ts` |
---
## Files to Modify
| File | Changes |
| ------------------------------------------- | ---------------------------- |
| `operation-log-store.service.ts` | Fix compaction counter |
| `operation-log-hydrator.service.ts` | Fix snapshot timing |
| `tag-shared.reducer.ts` | Add subtask orphan detection |
| `operation-log-download.service.spec.ts` | Fix pending tests |
| (new) `race-conditions.integration.spec.ts` | Add race condition tests |
| (new) `error-recovery.integration.spec.ts` | Add error recovery tests |

View file

@ -432,7 +432,8 @@ export class OperationLogStoreService {
const cache = await store.get('current');
if (!cache) {
// No state cache yet - counter starts at 1
// No state cache yet - create one with counter starting at 1
await store.put({ id: 'current', compactionCounter: 1 });
await tx.done;
return 1;
}