docs: improve mermaid diagrams with file names

Johannes Millan 2025-12-03 18:05:48 +01:00
parent 4cfd73cbd6
commit fb590c7fd6
2 changed files with 153 additions and 23 deletions


@@ -0,0 +1,130 @@
# Hybrid Manifest & Snapshot Architecture for File-Based Sync
**Status:** Proposal / Planned
**Context:** Optimizing WebDAV/Dropbox sync for the Operation Log architecture.
---
## 1. The Problem
The current `OperationLogSyncService` fallback for file-based providers (WebDAV, Dropbox) is inefficient for frequent, small updates.
**Current Workflow (Naive Fallback):**
1. **Write Operation File:** Upload `ops/ops_CLIENT_TIMESTAMP.json`.
2. **Read Manifest:** Download `ops/manifest.json` to get the current list of operation files.
3. **Update Manifest:** Upload new `ops/manifest.json` with the new filename added.
**Issues:**
- **High Request Count:** Minimum 3 HTTP requests per sync cycle.
- **File Proliferation:** Rapidly creates thousands of small files, degrading WebDAV directory listing performance.
- **Latency:** On slow connections (typical for standard WebDAV), the extra round trips make sync feel sluggish.
## 2. Proposed Solution: Hybrid Manifest
Instead of treating the manifest solely as an _index_ of files, we treat it as a **buffer** for recent operations.
### 2.1. Concept
- **Embedded Operations:** Small batches of operations are stored directly inside `manifest.json`.
- **Lazy Flush:** New operation files (`ops_*.json`) are only created when the manifest buffer fills up.
- **Snapshots:** A "base state" file allows us to delete old operation files and clear the manifest history.
### 2.2. New Data Structures
**Updated Manifest:**
```typescript
interface HybridManifest {
  version: number; // e.g., 2

  // The baseline state (snapshot). If present, clients load this first.
  lastSnapshot?: {
    fileName: string; // e.g. "snapshots/state_v1_170123.json"
    serverSeq: number; // The max sequence number included in this snapshot
    timestamp: number;
  };

  // Ops stored directly in the manifest (The Buffer)
  // Limit: ~50 ops or 100KB
  embeddedOperations: Operation[];

  // References to external operation files (The Overflow)
  // Older ops that were flushed out of the buffer
  operationFiles: string[];
}
```
## 3. Workflows
### 3.1. Upload (Write Path)
When a client has local pending operations to sync:
1. **Lock & Read:** Acquire remote lock (if applicable) and download `manifest.json`.
2. **Evaluate Buffer:**
   - Check size of `manifest.embeddedOperations`.
   - Check size of `pendingOps`.
3. **Strategy Selection:**
   - **Scenario A (Standard):** If the combined size of `manifest.embeddedOperations` and `pendingOps` is below `THRESHOLD`:
     - Append pending ops to `manifest.embeddedOperations`.
     - **Result:** 1 Write (Manifest). 0 New files.
   - **Scenario B (Overflow):** Otherwise (the buffer is full):
     - Move _existing_ `embeddedOperations` into a new external file (e.g., `ops/overflow_TIMESTAMP.json`).
     - Add that new filename to `manifest.operationFiles`.
     - Place _new_ `pendingOps` into the now-empty `manifest.embeddedOperations`.
     - **Result:** 1 Upload (Overflow) + 1 Write (Manifest).
4. **Write:** Upload updated `manifest.json`.
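The strategy selection above can be sketched as a pure function that takes the downloaded manifest and the pending ops and returns what needs to be uploaded. This is a minimal sketch, not the planned implementation: `planUpload`, `UploadPlan`, and `MAX_EMBEDDED_OPS` are hypothetical names, the `Operation` shape is a stand-in, and the threshold is count-based only (the byte limit is ignored).

```typescript
// Minimal stand-ins for the types used in this document (assumed shapes).
interface Operation {
  id: string;
}

interface HybridManifest {
  version: number;
  embeddedOperations: Operation[];
  operationFiles: string[];
}

// Hypothetical result type: what the client has to upload.
interface UploadPlan {
  manifest: HybridManifest; // always uploaded in step 4
  overflowFile?: { fileName: string; operations: Operation[] }; // Scenario B only
}

// Hypothetical threshold; the proposal suggests roughly 50 ops or 100KB.
const MAX_EMBEDDED_OPS = 50;

function planUpload(
  manifest: HybridManifest,
  pendingOps: Operation[],
  now: number,
): UploadPlan {
  const total = manifest.embeddedOperations.length + pendingOps.length;

  // Scenario A: everything still fits into the buffer.
  if (total < MAX_EMBEDDED_OPS) {
    return {
      manifest: {
        ...manifest,
        embeddedOperations: [...manifest.embeddedOperations, ...pendingOps],
      },
    };
  }

  // Scenario B: flush the existing buffer into an overflow file,
  // then start a fresh buffer that holds only the pending ops.
  const fileName = `ops/overflow_${now}.json`;
  return {
    manifest: {
      ...manifest,
      embeddedOperations: [...pendingOps],
      operationFiles: [...manifest.operationFiles, fileName],
    },
    overflowFile: { fileName, operations: [...manifest.embeddedOperations] },
  };
}
```

Keeping the decision free of I/O makes both scenarios easy to unit test. Edge cases such as an empty buffer combined with a very large pending batch are not handled in this sketch.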
### 3.2. Download (Read Path)
When a client checks for updates:
1. **Read Manifest:** Download `manifest.json`.
2. **Check Snapshot:**
   - If `manifest.lastSnapshot` is newer than local data (for example, its `serverSeq` is higher than the last sequence number applied locally), download and apply the snapshot file first.
3. **Process Files:**
   - Download and apply any files in `manifest.operationFiles` that haven't been seen yet.
4. **Process Embedded:**
   - Apply the operations in `manifest.embeddedOperations` that have not already been applied locally.
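The read path can be sketched as a function that turns a downloaded manifest into an ordered list of steps. This is an illustration under assumptions: `planDownload`, `LocalSyncState`, and `DownloadStep` are hypothetical names, "newer" is interpreted as a `serverSeq` comparison, and embedded operations are de-duplicated by an assumed `id` field because the buffer is re-read on every sync.

```typescript
// Minimal stand-ins for the types used in this document (assumed shapes).
interface Operation {
  id: string;
}

interface HybridManifest {
  version: number;
  lastSnapshot?: { fileName: string; serverSeq: number; timestamp: number };
  embeddedOperations: Operation[];
  operationFiles: string[];
}

// Hypothetical record of what this client has already processed.
interface LocalSyncState {
  lastAppliedSeq: number;
  seenOperationFiles: Set<string>;
  appliedOpIds: Set<string>;
}

type DownloadStep =
  | { kind: 'snapshot'; fileName: string }
  | { kind: 'operationFile'; fileName: string }
  | { kind: 'embedded'; operations: Operation[] };

function planDownload(
  manifest: HybridManifest,
  local: LocalSyncState,
): DownloadStep[] {
  const steps: DownloadStep[] = [];

  // 2. Snapshot first, but only if it covers more than we already have.
  const snapshot = manifest.lastSnapshot;
  if (snapshot && snapshot.serverSeq > local.lastAppliedSeq) {
    steps.push({ kind: 'snapshot', fileName: snapshot.fileName });
  }

  // 3. Overflow files that this client has not seen yet, in manifest order.
  for (const fileName of manifest.operationFiles) {
    if (!local.seenOperationFiles.has(fileName)) {
      steps.push({ kind: 'operationFile', fileName });
    }
  }

  // 4. Embedded ops last, skipping those already applied locally.
  const fresh = manifest.embeddedOperations.filter(
    (op) => !local.appliedOpIds.has(op.id),
  );
  if (fresh.length > 0) {
    steps.push({ kind: 'embedded', operations: fresh });
  }

  return steps;
}
```

The order of the returned steps mirrors the numbered list: snapshot, then external files, then the embedded buffer.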
## 4. Snapshotting (Compaction)
To prevent the "chain" of operation files from growing forever, any client can trigger a snapshot.
**Trigger:**
- Total external `operationFiles` count > 50.
- OR Total distinct operations in history > 5000.
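The trigger condition is a simple disjunction and could be expressed as follows. This is a sketch: `shouldSnapshot` and the constant names are hypothetical, and how the total operation count is tracked (it cannot be read from the manifest alone) is left open.

```typescript
// Thresholds taken from the trigger list above.
const MAX_OPERATION_FILES = 50;
const MAX_TOTAL_OPERATIONS = 5000;

// Returns true when either limit is exceeded.
// totalOperationCount is assumed to be tracked by the caller.
function shouldSnapshot(
  operationFileCount: number,
  totalOperationCount: number,
): boolean {
  return (
    operationFileCount > MAX_OPERATION_FILES ||
    totalOperationCount > MAX_TOTAL_OPERATIONS
  );
}
```

Both comparisons are strict, so exactly 50 files or exactly 5000 operations does not yet trigger a snapshot.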
**Process:**
1. **Download Everything:** Ensure the client has the full, consistent state.
2. **Generate Snapshot:** Serialize the current `AppDataComplete` to a file (e.g., `snapshots/snap_SEQ_TIMESTAMP.json`).
3. **Upload Snapshot:** Upload the new snapshot file.
4. **Update Manifest:**
   - Set `lastSnapshot` to the new file.
   - **Clear** `operationFiles` (the referenced files are deleted from the server in the cleanup step).
   - **Clear** `embeddedOperations`.
5. **Cleanup:** (Async) Delete the obsolete snapshot files and old operation files from the server.
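The manifest update and the input for the cleanup step can be sketched as one pure function. This is only an illustration: `compactManifest`, `SnapshotRef`, and `CompactionResult` are hypothetical names, and the sketch assumes the new snapshot already contains every operation from the buffer and the overflow files.

```typescript
// Minimal stand-ins for the types used in this document (assumed shapes).
interface Operation {
  id: string;
}

interface SnapshotRef {
  fileName: string;
  serverSeq: number;
  timestamp: number;
}

interface HybridManifest {
  version: number;
  lastSnapshot?: SnapshotRef;
  embeddedOperations: Operation[];
  operationFiles: string[];
}

// Hypothetical result: the manifest to upload plus files that may be deleted.
interface CompactionResult {
  manifest: HybridManifest;
  obsoleteFiles: string[];
}

function compactManifest(
  manifest: HybridManifest,
  newSnapshot: SnapshotRef,
): CompactionResult {
  // Everything the old manifest referenced becomes obsolete.
  const obsoleteFiles = [...manifest.operationFiles];
  const previous = manifest.lastSnapshot;
  if (previous && previous.fileName !== newSnapshot.fileName) {
    obsoleteFiles.push(previous.fileName);
  }

  return {
    manifest: {
      version: manifest.version,
      lastSnapshot: newSnapshot,
      embeddedOperations: [], // cleared: covered by the snapshot
      operationFiles: [], // cleared: covered by the snapshot
    },
    obsoleteFiles,
  };
}
```

Returning the obsolete file names separately lets the caller upload the new manifest first and delete the old files asynchronously afterwards, so a failed cleanup never leaves the manifest pointing at missing files.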
## 5. Advantages
| Metric | Old Approach | New Approach |
| :---------------------- | :----------------------------------- | :----------------------------------------------------- |
| **Requests per Sync**   | 3 (Upload Op + Read Man + Write Man) | **2** (Read Man + Write Man), 3 on overflow            |
| **Files on Server** | 1 per sync (Unlimited growth) | **Bounded** (1 Manifest + ~0-50 Op Files + 1 Snapshot) |
| **Fresh Install Speed** | Slow (Replay thousands of JSONs) | **Fast** (Download 1 Snapshot + recent delta) |
| **Conflict Handling** | Same (Vector Clocks) | Same (Vector Clocks) |
## 6. Implementation Plan
1. **Modify `OperationLogSyncService`:**
   - Update `_loadRemoteManifest` to handle the v2 format.
   - Refactor `_uploadPendingOpsViaFiles` to implement the buffer/overflow logic.
2. **Add Snapshot Logic:**
   - Create `OperationLogSnapshotService` to handle generating and hydrating from the large snapshot files.
   - Add a simple heuristic to `sync.service` to decide when to snapshot.
3. **Migration:**
   - When a v2 client sees a v1 manifest, it should automatically "upgrade" it (move the existing file entries into the `operationFiles` list and add the version tag).
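The upgrade step might look like the sketch below. The v1 manifest shape is not specified in this document, so `ManifestV1` (a plain list of file names without a version tag, or with `version: 1`) is an assumption, as are the names `upgradeManifest` and `isHybridManifest`.

```typescript
// Minimal stand-ins for the types used in this document (assumed shapes).
interface Operation {
  id: string;
}

// Assumed v1 shape: an index of operation files, no version tag (or version 1).
interface ManifestV1 {
  version?: number;
  operationFiles: string[];
}

interface HybridManifest {
  version: number;
  lastSnapshot?: { fileName: string; serverSeq: number; timestamp: number };
  embeddedOperations: Operation[];
  operationFiles: string[];
}

// A missing version tag is treated as v1.
function isHybridManifest(m: ManifestV1 | HybridManifest): m is HybridManifest {
  return (m.version ?? 1) >= 2;
}

function upgradeManifest(m: ManifestV1 | HybridManifest): HybridManifest {
  if (isHybridManifest(m)) {
    return m; // already v2, nothing to do
  }
  return {
    version: 2,
    embeddedOperations: [], // v1 had no buffer
    operationFiles: [...m.operationFiles], // keep the existing file references
  };
}
```

The upgrade is idempotent, so it can run on every manifest load without checking the version at the call site.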


@@ -12,47 +12,47 @@ graph TD
classDef legacy fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,stroke-dasharray: 5 5,color:black;
classDef trigger fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:black;
User((User / UI)) -->|Dispatch Action| NgRx[NgRx Store <br/> Runtime Source of Truth]
User((User / UI)) -->|Dispatch Action| NgRx["NgRx Store <br/> Runtime Source of Truth<br/><sub>*.effects.ts / *.reducer.ts</sub>"]
subgraph "Write Path (Runtime)"
NgRx -->|Action Stream| OpEffects[OperationLogEffects]
NgRx -->|Action Stream| OpEffects["OperationLogEffects<br/><sub>operation-log.effects.ts</sub>"]
OpEffects -->|1. Check isPersistent| Filter{Is Persistent?}
OpEffects -->|1. Check isPersistent| Filter{"Is Persistent?<br/><sub>persistent-action.interface.ts</sub>"}
Filter -- No --> Ignore[Ignore / UI Only]
Filter -- Yes --> Transform[Transform to Operation<br/>UUIDv7, Timestamp, VectorClock]
Filter -- Yes --> Transform["Transform to Operation<br/>UUIDv7, Timestamp, VectorClock<br/><sub>operation-converter.util.ts</sub>"]
Transform -->|2. Validate| PayloadValid{Payload<br/>Valid?}
Transform -->|2. Validate| PayloadValid{"Payload<br/>Valid?<br/><sub>validate-operation-payload.ts</sub>"}
PayloadValid -- No --> ErrorSnack[Show Error Snackbar]
PayloadValid -- Yes --> DBWrite
end
subgraph "Persistence Layer (IndexedDB)"
DBWrite[Write to SUP_OPS]:::storage
DBWrite["Write to SUP_OPS<br/><sub>operation-log-store.service.ts</sub>"]:::storage
DBWrite -->|Append| OpsTable[Table: ops<br/>The Event Log]:::storage
DBWrite -->|Update| StateCache[Table: state_cache<br/>Snapshots]:::storage
DBWrite -->|Append| OpsTable["Table: ops<br/>The Event Log<br/><sub>IndexedDB</sub>"]:::storage
DBWrite -->|Update| StateCache["Table: state_cache<br/>Snapshots<br/><sub>IndexedDB</sub>"]:::storage
end
subgraph "Legacy Bridge (PFAPI)"
DBWrite -.->|3. Bridge| LegacyMeta[META_MODEL<br/>Vector Clock]:::legacy
LegacyMeta -.->|Update| LegacySync[Legacy Sync Adapters<br/>WebDAV / Dropbox / Local]:::legacy
DBWrite -.->|3. Bridge| LegacyMeta["META_MODEL<br/>Vector Clock<br/><sub>pfapi.service.ts</sub>"]:::legacy
LegacyMeta -.->|Update| LegacySync["Legacy Sync Adapters<br/>WebDAV / Dropbox / Local<br/><sub>pfapi.service.ts</sub>"]:::legacy
noteLegacy[Updates Vector Clock so<br/>Legacy Sync detects changes]:::legacy
end
subgraph "Compaction System"
OpsTable -->|Count > 500| CompactionTrig{Compaction<br/>Trigger}:::trigger
CompactionTrig -->|Yes| Compactor[CompactionService]:::process
OpsTable -->|Count > 500| CompactionTrig{"Compaction<br/>Trigger<br/><sub>operation-log.effects.ts</sub>"}:::trigger
CompactionTrig -->|Yes| Compactor["CompactionService<br/><sub>operation-log-compaction.service.ts</sub>"]:::process
Compactor -->|Read State| NgRx
Compactor -->|Save Snapshot| StateCache
Compactor -->|Delete Old Ops| OpsTable
end
subgraph "Read Path (Hydration)"
Startup((App Startup)) --> Hydrator[OperationLogHydrator]:::process
Startup((App Startup)) --> Hydrator["OperationLogHydrator<br/><sub>operation-log-hydrator.service.ts</sub>"]:::process
Hydrator -->|1. Load| StateCache
StateCache -->|Check| Schema{Schema<br/>Version?}
Schema -- Old --> Migrator[SchemaMigrationService]:::process
StateCache -->|Check| Schema{"Schema<br/>Version?<br/><sub>schema-migration.service.ts</sub>"}
Schema -- Old --> Migrator["SchemaMigrationService<br/><sub>schema-migration.service.ts</sub>"]:::process
Migrator -->|Transform State| MigratedState
Schema -- Current --> CurrentState
@@ -60,12 +60,12 @@ graph TD
MigratedState -->|Load State| StoreInit
Hydrator -->|2. Load Tail| OpsTable
OpsTable -->|Replay Ops| Replayer[OperationApplier]:::process
OpsTable -->|Replay Ops| Replayer["OperationApplier<br/><sub>operation-applier.service.ts</sub>"]:::process
Replayer -->|Dispatch| NgRx
end
subgraph "Multi-Tab"
DBWrite -->|4. Broadcast| BC[BroadcastChannel]
DBWrite -->|4. Broadcast| BC["BroadcastChannel<br/><sub>multi-tab-coordinator.service.ts</sub>"]
BC -->|Notify| OtherTabs((Other Tabs))
end
@@ -91,9 +91,9 @@ graph TD
end
subgraph "Client: Sync Loop"
Scheduler((Scheduler)) -->|Interval| SyncService[OperationLogSyncService]
Scheduler((Scheduler)) -->|Interval| SyncService["OperationLogSyncService<br/><sub>operation-log-sync.service.ts</sub>"]
SyncService -->|1. Get Last Seq| LocalMeta[Sync Metadata]
SyncService -->|1. Get Last Seq| LocalMeta["Sync Metadata<br/><sub>operation-log-store.service.ts</sub>"]
%% Download Flow
SyncService -->|2. Download Ops| API
@@ -105,14 +105,14 @@ graph TD
end
subgraph "Client: Conflict Management"
ConflictDet{Conflict<br/>Detection}:::conflict
ConflictDet{"Conflict<br/>Detection<br/><sub>conflict-resolution.service.ts</sub>"}:::conflict
ConflictDet -->|Check Vector Clocks| VCCheck[Entity-Level Check]
VCCheck -- Concurrent --> ConflictFound[Conflict Found!]:::conflict
VCCheck -- Sequential --> NoConflict[No Conflict]
ConflictFound --> UserDialog[User Resolution Dialog]:::conflict
ConflictFound --> UserDialog["User Resolution Dialog<br/><sub>dialog-conflict-resolution.component.ts</sub>"]:::conflict
UserDialog -- "Keep Remote" --> MarkRejected[Mark Local Ops<br/>as Rejected]:::conflict
MarkRejected --> ApplyRemote[Apply Remote Ops]
@@ -125,10 +125,10 @@ graph TD
subgraph "Client: Application & Validation"
ApplyRemote -->|Apply to Store| Store[NgRx Store]
Store -->|Post-Apply| Validator{Validate<br/>State?}:::repair
Store -->|Post-Apply| Validator{"Validate<br/>State?<br/><sub>validate-state.service.ts</sub>"}:::repair
Validator -- Valid --> Done((Sync Done))
Validator -- Invalid --> Repair[Auto-Repair Service]:::repair
Validator -- Invalid --> Repair["Auto-Repair Service<br/><sub>repair-operation.service.ts</sub>"]:::repair
Repair -->|Fix Data| RepairedState
Repair -->|Create Op| RepairOp[Create REPAIR Op]:::repair