diff --git a/docs/sync/vector-clocks.md b/docs/sync/vector-clocks.md
index 61124d743..5546c3daf 100644
--- a/docs/sync/vector-clocks.md
+++ b/docs/sync/vector-clocks.md
@@ -207,6 +207,42 @@ web.vectorClock = { desktop: 4, mobile: 3, web: 7 };
 // Mobile vs Web: Web is ahead (7 > 2, everything else equal)
 ```
 
+### Example 4: Vector Clock Dominance (SYNC_IMPORT Handling)
+
+When a client receives a full state import (SYNC_IMPORT), it must replay local synced operations that happened "after" the import. Vector clock comparison determines which ops are "dominated" (happened-before) vs "not dominated" (happened-after or concurrent).
+
+```typescript
+// Client receives SYNC_IMPORT with this vector clock:
+const syncImportClock = { clientA: 10, clientB: 5 };
+
+// Local synced operations to evaluate:
+const op1 = { vectorClock: { clientB: 1 } }; // LESS_THAN - dominated
+const op2 = { vectorClock: { clientA: 5, clientB: 3 } }; // LESS_THAN - dominated
+const op3 = { vectorClock: { clientB: 6 } }; // CONCURRENT (clientB ahead, clientA behind) - NOT dominated
+const op4 = { vectorClock: { clientA: 10, clientB: 5, clientC: 1 } }; // GREATER_THAN - NOT dominated
+
+// Only op3 and op4 should be replayed
+// op1 and op2 are dominated - their state is already in the SYNC_IMPORT
+
+// Comparison logic (runs inside the filter callback, once per op):
+const comparison = compareVectorClocks(op.vectorClock, syncImportClock);
+if (comparison === VectorClockComparison.LESS_THAN) {
+  // Op is dominated - skip (state already captured in SYNC_IMPORT)
+  return false;
+}
+// EQUAL, GREATER_THAN, or CONCURRENT - replay the op
+return true;
+```
+
+**Why This Matters:**
+
+- **LESS_THAN** (dominated): The op's changes are already reflected in the SYNC_IMPORT snapshot. Replaying it would be redundant and could reapply stale changes on top of the snapshot.
+- **GREATER_THAN**: The op happened after the SYNC_IMPORT was created. Must replay to preserve local work.
+- **CONCURRENT**: The op happened independently of the SYNC_IMPORT. Must replay because it may contain unique changes not in the snapshot.
+- **EQUAL**: Edge case where clocks match exactly. Safe to replay.
+
+See the operation log architecture docs for detailed diagrams of this late-joiner replay scenario.
+
 ## Debugging
 
 ### Enable Verbose Logging
diff --git a/src/app/core/persistence/operation-log/docs/operation-log-architecture-diagrams.md b/src/app/core/persistence/operation-log/docs/operation-log-architecture-diagrams.md
index cdf075998..6981a18ca 100644
--- a/src/app/core/persistence/operation-log/docs/operation-log-architecture-diagrams.md
+++ b/src/app/core/persistence/operation-log/docs/operation-log-architecture-diagrams.md
@@ -1,6 +1,6 @@
 # Operation Log: Architecture Diagrams
 
-**Last Updated:** December 8, 2025
+**Last Updated:** December 12, 2025
 **Status:** All core diagrams reflect current implementation
 
 These diagrams visualize the Operation Log system architecture. For implementation details, see [operation-log-architecture.md](./operation-log-architecture.md).
@@ -254,6 +254,113 @@ flowchart TB
 2. **Efficiency**: Snapshot endpoint is designed for large payloads and stores state directly
 3. **Server-Side Handling**: Server creates a synthetic operation record for audit purposes
 
+## 2c. Late-Joiner Replay with Vector Clock Dominance ✅ IMPLEMENTED
+
+When a client receives a SYNC_IMPORT (full state from another client), it must replay any local synced operations that happened "after" the import's vector clock. This ensures local work isn't lost when receiving a full state snapshot.
+
+**Implementation Status:** Complete. See `OperationLogSyncService._replayLocalSyncedOpsAfterImport()`.
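To make "happened after the import's vector clock" concrete, the following is a minimal TypeScript sketch of a dominance check. It is an illustration under stated assumptions, not the project's code: this `compareVectorClocks` is a hypothetical stand-in for the real function (whose implementation is not shown in this change), the `VectorClock` type and `isDominated` helper are invented for the example, and a client missing from a clock is assumed to count as 0.

```typescript
// Hypothetical types for illustration; the real project types may differ.
type VectorClock = Record<string, number>;

enum VectorClockComparison {
  EQUAL = 'EQUAL',
  LESS_THAN = 'LESS_THAN',
  GREATER_THAN = 'GREATER_THAN',
  CONCURRENT = 'CONCURRENT',
}

// Stand-in comparison: entry-wise over the union of client ids,
// treating a missing entry as 0 (an assumption of this sketch).
function compareVectorClocks(a: VectorClock, b: VectorClock): VectorClockComparison {
  let aAhead = false;
  let bAhead = false;
  const clientIds = Object.keys(a).concat(Object.keys(b));
  for (const id of clientIds) {
    const aVal = a[id] ?? 0;
    const bVal = b[id] ?? 0;
    if (aVal > bVal) aAhead = true;
    if (aVal < bVal) bAhead = true;
  }
  if (aAhead && bAhead) return VectorClockComparison.CONCURRENT;
  if (aAhead) return VectorClockComparison.GREATER_THAN;
  if (bAhead) return VectorClockComparison.LESS_THAN;
  return VectorClockComparison.EQUAL;
}

// An op is dominated only when its clock is strictly LESS_THAN the import's clock.
function isDominated(opClock: VectorClock, importClock: VectorClock): boolean {
  return compareVectorClocks(opClock, importClock) === VectorClockComparison.LESS_THAN;
}

const syncImportClock: VectorClock = { A: 10, B: 5 };
const opClocks: VectorClock[] = [{ B: 1 }, { A: 5, B: 3 }, { B: 6 }, { A: 10, B: 5, C: 1 }];

// Keep only the ops that are not dominated by the import.
const toReplay = opClocks.filter((clock) => !isDominated(clock, syncImportClock));
console.log(toReplay);
```

Under these assumptions, `{ B: 6 }` is kept because it is ahead on `B` but behind on `A`, and `{ A: 10, B: 5, C: 1 }` is kept because it matches the import on `A` and `B` and is ahead on `C`.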
+
+### The Late-Joiner Problem
+
+```mermaid
+sequenceDiagram
+    participant A as Client A
+    participant S as Server
+    participant B as Client B
+
+    Note over A,B: Both start synced
+
+    A->>S: Upload Op1 (task created)
+    A->>S: Upload Op2 (task updated)
+
+    B->>B: Make local changes (Op3, Op4)
+    B->>S: Upload Op3, Op4
+
+    Note over B: Client B goes offline
+
+    Note over B: Client B comes online, receives SYNC_IMPORT from A
+
+    S->>B: SYNC_IMPORT (A's full state)
+
+    Note over B: Problem: Op3, Op4 were already synced!<br/>If we just apply SYNC_IMPORT, we lose B's work
+```
+
+### The Solution: Vector Clock Dominance Filter
+
+Before replaying local synced ops after a SYNC_IMPORT, we filter out ops that are "dominated" by the SYNC_IMPORT's vector clock. An op is dominated if its vector clock is `LESS_THAN` the SYNC_IMPORT's clock - meaning the op's state is already captured in the imported snapshot.
+
+```mermaid
+flowchart TD
+    subgraph Input["SYNC_IMPORT Received"]
+        SI["SYNC_IMPORT<br/>vectorClock: {A:10, B:5}"]
+    end
+
+    subgraph LocalOps["Local Synced Operations"]
+        Op1["Op1: {B:1}<br/>LESS_THAN → dominated"]
+        Op2["Op2: {A:5, B:3}<br/>LESS_THAN → dominated"]
+        Op3["Op3: {B:6}<br/>CONCURRENT → NOT dominated"]
+        Op4["Op4: {A:10, B:5, C:1}<br/>GREATER_THAN → NOT dominated"]
+    end
+
+    subgraph Filter["Vector Clock Comparison"]
+        Check{Compare each op's clock<br/>with SYNC_IMPORT clock}
+    end
+
+    subgraph Result["Ops to Replay"]
+        Replay["Only Op3 and Op4<br/>(not dominated)"]
+    end
+
+    SI --> Check
+    LocalOps --> Check
+    Check --> |"LESS_THAN"| Skip[Skip - already in snapshot]
+    Check --> |"Otherwise"| Replay
+
+    style Op1 fill:#ffcdd2,stroke:#c62828
+    style Op2 fill:#ffcdd2,stroke:#c62828
+    style Op3 fill:#c8e6c9,stroke:#2e7d32
+    style Op4 fill:#c8e6c9,stroke:#2e7d32
+    style Skip fill:#ffebee,stroke:#c62828
+    style Replay fill:#e8f5e9,stroke:#2e7d32
+```
+
+### Vector Clock Comparison Results
+
+| Comparison     | Meaning                        | Action                             |
+| -------------- | ------------------------------ | ---------------------------------- |
+| `LESS_THAN`    | Op happened-before SYNC_IMPORT | Skip (state already captured)      |
+| `EQUAL`        | Same causal history            | Replay (edge case, safe to replay) |
+| `GREATER_THAN` | Op happened-after SYNC_IMPORT  | Replay (newer than snapshot)       |
+| `CONCURRENT`   | Independent changes            | Replay (may have unique changes)   |
+
+### Implementation Details
+
+```typescript
+// In OperationLogSyncService._replayLocalSyncedOpsAfterImport()
+const localSyncedOps = allEntries.filter((entry) => {
+  // Must be created by this client
+  if (entry.op.clientId !== clientId) return false;
+  // Must be synced (accepted by server)
+  if (!entry.syncedAt) return false;
+  // Must NOT be a full-state op itself
+  if (entry.op.opType === OpType.SyncImport || entry.op.opType === OpType.BackupImport)
+    return false;
+
+  // Must NOT be dominated by the SYNC_IMPORT's vector clock
+  const comparison = compareVectorClocks(entry.op.vectorClock, syncImportClock);
+  if (comparison === VectorClockComparison.LESS_THAN) {
+    return false; // Skip - state already captured in SYNC_IMPORT
+  }
+  return true;
+});
+```
+
+**Key Points:**
+
+- Only filters local ops (created by this client)
+- Only considers synced ops (accepted by server)
+- Uses vector clock comparison to determine dominance
+- `LESS_THAN` means dominated (skip), all other results mean not dominated (replay)
+
 ---
 
 ## 3. 
Conflict-Aware Migration Strategy (The Migration Shield)
diff --git a/src/app/core/persistence/operation-log/docs/operation-log-architecture.md b/src/app/core/persistence/operation-log/docs/operation-log-architecture.md
index 7462fea95..1c98c42d0 100644
--- a/src/app/core/persistence/operation-log/docs/operation-log-architecture.md
+++ b/src/app/core/persistence/operation-log/docs/operation-log-architecture.md
@@ -2,7 +2,7 @@
 
 **Status:** Parts A, B, C, D Complete (single-version; cross-version sync requires A.7.11)
 **Branch:** `feat/operation-logs`
-**Last Updated:** December 8, 2025
+**Last Updated:** December 12, 2025
 
 ---
 
@@ -1268,6 +1268,64 @@ interface OperationDependency {
 // After MAX_RETRY_ATTEMPTS (3), they're marked as permanently failed
 ```
 
+## C.7 Late-Joiner Replay (SYNC_IMPORT Handling)
+
+When a client receives a `SYNC_IMPORT` (full state from another client), local synced operations must be replayed on top of the imported state to preserve work that was already accepted by the server.
+
+### The Problem
+
+Consider this scenario:
+
+1. Client B uploads ops to server (Op3, Op4)
+2. Client B goes offline
+3. Client A uploads a SYNC_IMPORT (full state snapshot)
+4. Client B comes online and downloads the SYNC_IMPORT
+5. **Without replay**: Client B loses Op3 and Op4's changes
+
+### The Solution: Vector Clock Dominance Filtering
+
+When replaying local synced ops after a SYNC_IMPORT, we filter out ops that are **dominated** by the SYNC_IMPORT's vector clock:
+
+```typescript
+// In OperationLogSyncService._replayLocalSyncedOpsAfterImport()
+const localSyncedOps = allEntries.filter((entry) => {
+  // Must be created by this client
+  if (entry.op.clientId !== clientId) return false;
+  // Must be synced (accepted by server)
+  if (!entry.syncedAt) return false;
+  // Must NOT be a full-state op itself
+  if (entry.op.opType === OpType.SyncImport || entry.op.opType === OpType.BackupImport)
+    return false;
+
+  // Must NOT be dominated by the SYNC_IMPORT's vector clock
+  const comparison = compareVectorClocks(entry.op.vectorClock, syncImportClock);
+  if (comparison === VectorClockComparison.LESS_THAN) {
+    return false; // Skip - state already captured in SYNC_IMPORT
+  }
+  return true;
+});
+```
+
+### Vector Clock Dominance
+
+An operation is "dominated" if its vector clock is `LESS_THAN` the SYNC_IMPORT's clock:
+
+| Comparison     | Meaning                        | Replay?                         |
+| -------------- | ------------------------------ | ------------------------------- |
+| `LESS_THAN`    | Op happened-before SYNC_IMPORT | No (state captured in snapshot) |
+| `EQUAL`        | Same causal history            | Yes (edge case)                 |
+| `GREATER_THAN` | Op happened-after SYNC_IMPORT  | Yes (newer than snapshot)       |
+| `CONCURRENT`   | Independent changes            | Yes (may have unique changes)   |
+
+**Example:**
+
+- SYNC_IMPORT clock: `{A: 10, B: 5}`
+- Local op clock: `{B: 3}` → `LESS_THAN` → Skip (dominated)
+- Local op clock: `{B: 6}` → `CONCURRENT` (ahead on B, behind on A) → Replay (not dominated)
+- Local op clock: `{A: 10, B: 5, C: 1}` → `GREATER_THAN` → Replay (not dominated)
+
+See [operation-log-architecture-diagrams.md](./operation-log-architecture-diagrams.md) Section 2c for visual diagrams.
+
 ---
 
 # Part D: Data Validation & Repair