Mirror of https://github.com/johannesjo/super-productivity.git (synced 2026-01-23 02:36:05 +00:00)
docs: reorganize sync and operation-log documentation

Move scattered architecture docs into centralized locations:

- Move operation-log docs from src/app/core/persistence/operation-log/docs/ to docs/op-log/
- Flatten docs/sync/sync/ nested structure to docs/sync/
- Move supersync-encryption-architecture.md from docs/ai/ to docs/sync/
- Copy pfapi sync README to docs/sync/pfapi-sync-overview.md
- Update all cross-references to use new paths

This improves discoverability and keeps architecture documentation separate from source code.

Parent: a850f8af9e
Commit: b4ce9d5da6
23 changed files with 168 additions and 12 deletions
132 docs/op-log/README.md (Normal file)

@@ -0,0 +1,132 @@
# Operation Log Documentation

**Last Updated:** December 2025

This directory contains the architectural documentation for Super Productivity's Operation Log system - an event-sourced persistence and synchronization layer.

## Quick Start

| If you want to... | Read this |
| --- | --- |
| Understand the overall architecture | [operation-log-architecture.md](./operation-log-architecture.md) |
| See visual diagrams | [operation-log-architecture-diagrams.md](./operation-log-architecture-diagrams.md) |
| Learn the design rules | [operation-rules.md](./operation-rules.md) |
| Understand file-based sync | [hybrid-manifest-architecture.md](./hybrid-manifest-architecture.md) |
| Understand legacy PFAPI sync | [pfapi-sync-persistence-architecture.md](./pfapi-sync-persistence-architecture.md) |

## Documentation Overview

### Core Documentation

| Document | Description | Status |
| --- | --- | --- |
| [operation-log-architecture.md](./operation-log-architecture.md) | Comprehensive architecture reference covering Parts A-F: Local Persistence, Legacy Sync Bridge, Server Sync, Validation & Repair, Smart Archive Handling, and Atomic State Consistency | ✅ Active |
| [operation-log-architecture-diagrams.md](./operation-log-architecture-diagrams.md) | Mermaid diagrams visualizing data flows, sync protocols, and state management | ✅ Active |
| [operation-rules.md](./operation-rules.md) | Design rules and guidelines for the operation log store and operations | ✅ Active |

### Sync Architecture

| Document | Description | Status |
| --- | --- | --- |
| [hybrid-manifest-architecture.md](./hybrid-manifest-architecture.md) | File-based sync optimization using embedded operations buffer and snapshots (WebDAV/Dropbox) | ✅ Implemented |
| [pfapi-sync-persistence-architecture.md](./pfapi-sync-persistence-architecture.md) | Legacy PFAPI sync system that coexists with operation log | ✅ Active |

### Planning & Proposals

| Document | Description | Status |
| --- | --- | --- |
| [e2e-encryption-plan.md](./e2e-encryption-plan.md) | End-to-end encryption design proposal | 📋 Planned |
| [tiered-archive-proposal.md](./tiered-archive-proposal.md) | Multi-tier archive storage proposal | 📋 Planned |
| [operation-payload-optimization-discussion.md](./operation-payload-optimization-discussion.md) | Discussion on payload optimization strategies | 📋 Historical |

## Architecture at a Glance

The Operation Log system serves four distinct purposes:

```
┌────────────────────────────────────────────────────────────────────┐
│                            User Action                             │
└────────────────────────────────────────────────────────────────────┘
                                │
                                ▼
                           NgRx Store
                   (Runtime Source of Truth)
                                │
            ┌───────────────────┼───────────────────┐
            ▼                   │                   ▼
      OpLogEffects              │             Other Effects
            │                   │
            ├──► SUP_OPS ◄──────┘
            │    (Local Persistence - Part A)
            │
            └──► META_MODEL vector clock
                 (Legacy Sync Bridge - Part B)

PFAPI reads from NgRx for sync (not from op-log)
```

### The Four Parts

| Part | Purpose | Description |
| --- | --- | --- |
| **A. Local Persistence** | Fast writes, crash recovery | Operations stored in IndexedDB (`SUP_OPS`), with snapshots for fast hydration |
| **B. Legacy Sync Bridge** | PFAPI compatibility | Updates vector clocks so WebDAV/Dropbox sync continues to work |
| **C. Server Sync** | Operation-based sync | Upload/download individual operations via SuperSync server |
| **D. Validation & Repair** | Data integrity | Checkpoint validation with automatic repair and REPAIR operations |

Additional architectural patterns:

| Pattern | Purpose |
| --- | --- |
| **E. Smart Archive Handling** | Deterministic archive operations synced via instructions, not data |
| **F. Atomic State Consistency** | Meta-reducers ensure multi-entity changes are atomic |

## Key Concepts

### Event Sourcing

The Operation Log treats the database as a **timeline of events** rather than mutable state:

- **Source of Truth**: The log is truth; current state is derived by replaying the log
- **Immutability**: Operations are never modified, only appended
- **Snapshots**: Periodic snapshots speed up hydration (replay from snapshot + tail ops)
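
The replay model can be sketched as a pure reducer over the log (a minimal illustration only; `Operation` and `applyOp` here are simplified stand-ins for the real types and reducers):

```typescript
// Minimal event-sourcing sketch: state is derived by replaying operations.
// `Operation` is a simplified stand-in for the real Operation type.
interface Operation {
  id: string;
  opType: 'CREATE' | 'UPDATE' | 'DELETE';
  entityId: string;
  payload?: Record<string, unknown>;
}

type State = Record<string, Record<string, unknown>>;

// Pure reducer: replaying the same log always yields the same state.
const applyOp = (state: State, op: Operation): State => {
  if (op.opType === 'DELETE') {
    const { [op.entityId]: _removed, ...rest } = state;
    return rest;
  }
  // CREATE and UPDATE both merge the payload into the entity.
  return { ...state, [op.entityId]: { ...state[op.entityId], ...op.payload } };
};

// Hydration: start from a snapshot and replay only the tail of the log.
const hydrate = (snapshot: State, tailOps: Operation[]): State =>
  tailOps.reduce(applyOp, snapshot);

const state = hydrate({}, [
  { id: '1', opType: 'CREATE', entityId: 't1', payload: { title: 'Task 1' } },
  { id: '2', opType: 'UPDATE', entityId: 't1', payload: { done: true } },
]);
```

Because `applyOp` is pure, hydrating from a snapshot plus the tail is equivalent to replaying the whole log from the start.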

### Vector Clocks

Vector clocks track causality for conflict detection:

- Each client has its own counter in the vector clock
- Comparison reveals: `EQUAL`, `LESS_THAN`, `GREATER_THAN`, or `CONCURRENT`
- `CONCURRENT` indicates a true conflict requiring resolution
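
The four outcomes can be sketched with a pairwise counter comparison (illustrative only, not the app's actual implementation; see `/docs/sync/vector-clocks.md` for that):

```typescript
// Illustrative vector clock comparison: each client ID maps to a counter.
type VectorClock = Record<string, number>;

type ClockOrder = 'EQUAL' | 'LESS_THAN' | 'GREATER_THAN' | 'CONCURRENT';

const compareClocks = (a: VectorClock, b: VectorClock): ClockOrder => {
  const clientIds = new Set([...Object.keys(a), ...Object.keys(b)]);
  let aAhead = false;
  let bAhead = false;
  for (const id of clientIds) {
    const av = a[id] ?? 0; // missing entries count as 0
    const bv = b[id] ?? 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return 'CONCURRENT'; // true conflict: neither saw the other
  if (aAhead) return 'GREATER_THAN';
  if (bAhead) return 'LESS_THAN';
  return 'EQUAL';
};
```

For example, `{ a: 2, b: 1 }` vs `{ a: 1, b: 2 }` is `CONCURRENT`: each side is ahead on one counter, so the edits happened without knowledge of each other.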

### LOCAL_ACTIONS Token

Effects that perform side effects (snacks, external APIs, UI) must use `LOCAL_ACTIONS` instead of `Actions`:

```typescript
private _actions$ = inject(LOCAL_ACTIONS); // Excludes remote operations
```

This prevents duplicate side effects when syncing operations from other clients.

## Related Documentation

| Location | Content |
| --- | --- |
| [/docs/sync/vector-clocks.md](/docs/sync/vector-clocks.md) | Vector clock implementation details |
| [/docs/ai/sync/](/docs/ai/sync/) | Historical planning documents |
| [/packages/super-sync-server/](/packages/super-sync-server/) | SuperSync server implementation |
| [/src/app/pfapi/api/sync/README.md](/src/app/pfapi/api/sync/README.md) | PFAPI sync overview |

## Implementation Status

| Component | Status |
| --- | --- |
| Local Persistence (Part A) | ✅ Complete |
| Legacy Sync Bridge (Part B) | ✅ Complete |
| Server Sync (Part C) | ✅ Complete (single-version) |
| Validation & Repair (Part D) | ✅ Complete |
| Cross-version Sync (A.7.11) | ⚠️ Not implemented |
| Schema Migrations | ✅ Infrastructure ready (no migrations defined yet) |

See [operation-log-architecture.md#implementation-status](./operation-log-architecture.md#implementation-status) for detailed status.

411 docs/op-log/e2e-encryption-plan.md (Normal file)

@@ -0,0 +1,411 @@
# E2E Encryption for SuperSync Server

## Summary

Add end-to-end encryption to SuperSync so that the server cannot read operation payloads. Users provide a separate encryption password, which is used to derive an encryption key client-side. This is the same approach used by the legacy sync providers (Dropbox, WebDAV, Local File).

## Key Decisions

| Decision | Choice |
| --- | --- |
| Encryption scope | Payload-only (metadata stays plaintext for conflict detection) |
| Key derivation | User-provided encryption password → Argon2id → key |
| Password change | Not supported (would require re-encrypting all data) |
| Server changes | None required |
| Missing key handling | Fail gracefully with dialog to enter password |

---

## Architecture

### Encryption Flow

```
User Encryption Password → Argon2id (64MB, 3 iter) → AES-256 Key → encrypt/decrypt payloads
```

### Data Flow

```
Upload: Operation → encrypt payload with key → upload (metadata plaintext)
Download: Receive ops → decrypt payload with key → apply to state
```

### Why Payload-Only Encryption?

The server needs plaintext metadata for:

- **Conflict detection** - Uses vector clocks to detect concurrent edits
- **Deduplication** - Uses operation IDs to prevent duplicates
- **Ordering** - Uses timestamps and server sequence numbers
- **Tombstone tracking** - Uses entity IDs for delete tracking

The server does NOT need to read:

- **Payloads** - The actual data being created/updated/deleted

This design encrypts payloads while keeping metadata accessible, giving the server enough information to coordinate sync without seeing user data.

---

## Implementation Plan

### Phase 1: Data Model Changes

**File:** `src/app/pfapi/api/sync/sync-provider.interface.ts`

Add an encryption flag to `SyncOperation`:

```typescript
export interface SyncOperation {
  // ... existing fields ...
  isPayloadEncrypted?: boolean; // NEW: true if payload is an encrypted string
}
```

**File:** `src/app/pfapi/api/sync/providers/super-sync/super-sync.model.ts`

The `encryptKey` field already exists in `SyncProviderPrivateCfgBase`; only the enable flag needs to be added:

```typescript
export interface SuperSyncPrivateCfg extends SyncProviderPrivateCfgBase {
  // ... existing fields ...
  isEncryptionEnabled?: boolean; // NEW
  // encryptKey?: string; // Already inherited from base
}
```

---

### Phase 2: Client-Side Encryption Service

**New file:** `src/app/core/persistence/operation-log/sync/operation-encryption.service.ts`

```typescript
import { Injectable } from '@angular/core';
import { encrypt, decrypt } from '../../../../pfapi/api/encryption/encryption';
import { SyncOperation } from '../../../../pfapi/api/sync/sync-provider.interface';

@Injectable({ providedIn: 'root' })
export class OperationEncryptionService {
  /**
   * Encrypts the payload of a SyncOperation.
   * Returns a new operation with encrypted payload and isPayloadEncrypted=true.
   */
  async encryptOperation(op: SyncOperation, encryptKey: string): Promise<SyncOperation> {
    const payloadStr = JSON.stringify(op.payload);
    const encryptedPayload = await encrypt(payloadStr, encryptKey);
    return {
      ...op,
      payload: encryptedPayload,
      isPayloadEncrypted: true,
    };
  }

  /**
   * Decrypts the payload of a SyncOperation.
   * Returns a new operation with decrypted payload.
   * Throws DecryptError if decryption fails.
   */
  async decryptOperation(op: SyncOperation, encryptKey: string): Promise<SyncOperation> {
    if (!op.isPayloadEncrypted) {
      return op; // Pass through unencrypted ops
    }
    const decryptedStr = await decrypt(op.payload as string, encryptKey);
    return {
      ...op,
      payload: JSON.parse(decryptedStr),
      isPayloadEncrypted: false,
    };
  }

  /**
   * Batch encrypt operations for upload.
   */
  async encryptOperations(ops: SyncOperation[], encryptKey: string): Promise<SyncOperation[]> {
    return Promise.all(ops.map((op) => this.encryptOperation(op, encryptKey)));
  }

  /**
   * Batch decrypt operations after download.
   * Non-encrypted ops pass through unchanged.
   */
  async decryptOperations(ops: SyncOperation[], encryptKey: string): Promise<SyncOperation[]> {
    return Promise.all(ops.map((op) => this.decryptOperation(op, encryptKey)));
  }
}
```

**Reuses:** the existing `src/app/pfapi/api/encryption/encryption.ts` (AES-GCM, Argon2id). (The `SyncOperation` import path above assumes the interface lives in `sync-provider.interface.ts`, as noted in Phase 1.)

---

### Phase 3: Upload Integration

**File:** `src/app/core/persistence/operation-log/sync/operation-log-upload.service.ts`

Modify `_uploadPendingOpsViaApi()`:

```typescript
// Add injection
private encryptionService = inject(OperationEncryptionService);

// In _uploadPendingOpsViaApi(), after converting to SyncOperation format:
const privateCfg = await syncProvider.privateCfg.load();
let opsToUpload = syncOps;

if (privateCfg?.isEncryptionEnabled && privateCfg?.encryptKey) {
  opsToUpload = await this.encryptionService.encryptOperations(syncOps, privateCfg.encryptKey);
}

// Upload opsToUpload instead of syncOps
const response = await syncProvider.uploadOps(opsToUpload, clientId, lastKnownServerSeq);

// Piggybacked ops returned in the response also need to be decrypted:
if (response.newOps && response.newOps.length > 0) {
  let ops = response.newOps.map((serverOp) => serverOp.op);
  if (privateCfg?.encryptKey) {
    ops = await this.encryptionService.decryptOperations(ops, privateCfg.encryptKey);
  }
  const operations = ops.map((op) => syncOpToOperation(op));
  piggybackedOps.push(...operations);
}
```

---

### Phase 4: Download Integration

**File:** `src/app/core/persistence/operation-log/sync/operation-log-download.service.ts`

Modify `_downloadRemoteOpsViaApi()`:

```typescript
// Add injections
private encryptionService = inject(OperationEncryptionService);
private matDialog = inject(MatDialog);

// After downloading ops, before converting to Operation format:
const privateCfg = await syncProvider.privateCfg.load();
let syncOps = response.ops
  .filter((serverOp) => !appliedOpIds.has(serverOp.op.id))
  .map((serverOp) => serverOp.op);

// Check if any ops are encrypted
const hasEncryptedOps = syncOps.some((op) => op.isPayloadEncrypted);

if (hasEncryptedOps) {
  let encryptKey = privateCfg?.encryptKey;

  // If no key cached, prompt user
  if (!encryptKey) {
    encryptKey = await this._promptForEncryptionPassword();
    if (!encryptKey) {
      // User cancelled - abort sync
      return { newOps: [], success: false };
    }
  }

  try {
    syncOps = await this.encryptionService.decryptOperations(syncOps, encryptKey);
  } catch (e) {
    if (e instanceof DecryptError) {
      // Wrong password - prompt again
      await this._showDecryptionErrorDialog();
      return { newOps: [], success: false };
    }
    throw e;
  }
}

const operations = syncOps.map((op) => syncOpToOperation(op));
```

---

### Phase 5: UI Changes

**File:** `src/app/features/config/form-cfgs/sync-form.const.ts`

Add encryption fields to the SuperSync provider form:

```typescript
// In the SuperSync fieldGroup, add:
{
  key: 'isEncryptionEnabled',
  type: 'checkbox',
  props: {
    label: T.F.SYNC.FORM.SUPER_SYNC.L_ENABLE_E2E_ENCRYPTION,
  },
},
{
  hideExpression: (model: any) => !model.isEncryptionEnabled,
  key: 'encryptKey',
  type: 'input',
  props: {
    type: 'password',
    label: T.F.SYNC.FORM.L_ENCRYPTION_PASSWORD,
    required: true,
  },
},
{
  hideExpression: (model: any) => !model.isEncryptionEnabled,
  type: 'tpl',
  props: {
    tpl: `<div class="warn-text">{{ T.F.SYNC.FORM.SUPER_SYNC.ENCRYPTION_WARNING | translate }}</div>`,
  },
},
```

**Translations:** `src/assets/i18n/en.json`

```json
{
  "F": {
    "SYNC": {
      "FORM": {
        "SUPER_SYNC": {
          "L_ENABLE_E2E_ENCRYPTION": "Enable end-to-end encryption",
          "ENCRYPTION_WARNING": "Warning: If you forget your encryption password, your data cannot be recovered. This password is separate from your login password."
        }
      },
      "S": {
        "DECRYPTION_FAILED": "Failed to decrypt synced data. Please check your encryption password.",
        "ENCRYPTION_PASSWORD_REQUIRED": "Encryption password required to sync encrypted data."
      }
    }
  }
}
```

**New dialog component:** `src/app/imex/sync/dialog-encryption-password/`

A simple dialog that prompts for the encryption password when needed:

- Input field for password
- Cancel and OK buttons
- Used when encrypted ops are received but no password is cached

---

## File Summary

### New Files

| File | Purpose |
| --- | --- |
| `src/app/core/persistence/operation-log/sync/operation-encryption.service.ts` | Encrypt/decrypt operations |
| `src/app/imex/sync/dialog-encryption-password/` | Password prompt dialog |

### Modified Files

| File | Changes |
| --- | --- |
| `src/app/pfapi/api/sync/sync-provider.interface.ts` | Add `isPayloadEncrypted` to SyncOperation |
| `src/app/pfapi/api/sync/providers/super-sync/super-sync.model.ts` | Add `isEncryptionEnabled` flag |
| `src/app/core/persistence/operation-log/sync/operation-log-upload.service.ts` | Encrypt before upload |
| `src/app/core/persistence/operation-log/sync/operation-log-download.service.ts` | Decrypt after download |
| `src/app/features/config/form-cfgs/sync-form.const.ts` | Add encryption toggle + password field |
| `src/app/t.const.ts` | Add translation keys |
| `src/assets/i18n/en.json` | Add translation strings |

### No Server Changes Required

The server treats encrypted payloads as opaque strings - no modifications needed.

---

## Security Considerations

### What's Protected

1. **Payload content** - All user data (tasks, projects, notes, etc.) is encrypted
2. **Zero-knowledge** - Server never sees encryption password or plaintext data
3. **Strong crypto** - AES-256-GCM with Argon2id key derivation

### What's Exposed (by design)

1. **Operation metadata** - IDs, timestamps, entity types, vector clocks
2. **Traffic patterns** - Server knows when you sync and how many operations
3. **Encryption status** - Server can see `isPayloadEncrypted: true`

### Cryptographic Details

| Component | Algorithm | Parameters |
| --- | --- | --- |
| Key derivation | Argon2id | 64MB memory, 3 iterations |
| Payload encryption | AES-256-GCM | Random 12-byte IV, 16-byte salt per operation |

### Limitations

| Limitation | Reason |
| --- | --- |
| No password change | Would require re-encrypting all operations on all clients |
| No password recovery | True zero-knowledge means no recovery possible |
| Two passwords | Login password + encryption password (by design) |

### Threat Model

| Threat | Mitigated? | Notes |
| --- | --- | --- |
| Server reads data | Yes | Payloads encrypted client-side |
| Server breach | Yes | Attacker gets encrypted blobs, needs password |
| MITM attack | Yes | HTTPS + authenticated encryption |
| Password brute force | Partially | Argon2id makes attacks expensive (64MB per attempt) |
| Lost password | No | Data unrecoverable without password |

---

## Migration Path

### Enabling Encryption (Existing User)

1. User enables "E2E Encryption" in SuperSync settings
2. User enters encryption password
3. Warning shown about password recovery
4. Password saved to `SuperSyncPrivateCfg.encryptKey`
5. `isEncryptionEnabled` set to true
6. Future operations encrypted; existing operations remain plaintext
7. Other clients prompted for password when they encounter encrypted ops

### Multi-Client Scenario

When encryption is enabled on one client:

1. Other clients download operations normally
2. When they encounter `isPayloadEncrypted: true`, decryption is attempted
3. If no password cached, dialog prompts for password
4. Password cached in local `SuperSyncPrivateCfg` for future syncs

### Disabling Encryption

1. User unchecks encryption toggle
2. `isEncryptionEnabled` set to false
3. Future operations sent without encryption
4. Existing encrypted operations remain encrypted (still readable with password)

---

## Testing Strategy

1. **Unit tests** for `OperationEncryptionService`

   - Encrypt/decrypt round-trips with various payload types
   - Non-encrypted ops pass through unchanged
   - Wrong password throws DecryptError

2. **Integration tests** for upload/download

   - Encrypted operations sync correctly
   - Mixed encrypted/unencrypted history works
   - Piggybacked operations decrypt correctly

3. **E2E tests**

   - Two clients with same password sync correctly
   - Missing password shows dialog
   - Wrong password shows error and retries

627 docs/op-log/hybrid-manifest-architecture.md (Normal file)

@@ -0,0 +1,627 @@
# Hybrid Manifest & Snapshot Architecture for File-Based Sync

**Status:** ✅ Implemented (December 2025)
**Context:** Optimizing WebDAV/Dropbox sync for the Operation Log architecture.
**Related:** [Operation Log Architecture](./operation-log-architecture.md)

> **Implementation Note:** This architecture is fully implemented in `OperationLogManifestService`, `OperationLogUploadService`, and `OperationLogDownloadService`. The embedded operations buffer, overflow file creation, and snapshot support are all operational.

---

## 1. The Problem

The current `OperationLogSyncService` fallback for file-based providers (WebDAV, Dropbox) is inefficient for frequent, small updates.

**Current Workflow (Naive Fallback):**

1. **Write Operation File:** Upload `ops/ops_CLIENT_TIMESTAMP.json`.
2. **Read Manifest:** Download `ops/manifest.json` to get the current file list.
3. **Update Manifest:** Upload a new `ops/manifest.json` with the new filename added.

**Issues:**

- **High Request Count:** Minimum 3 HTTP requests per sync cycle.
- **File Proliferation:** Rapidly creates thousands of small files, degrading WebDAV directory listing performance.
- **Latency:** On slow connections (standard WebDAV), this makes sync feel sluggish.

---

## 2. Proposed Solution: Hybrid Manifest

Instead of treating the manifest solely as an _index_ of files, we treat it as a **buffer** for recent operations.

### 2.1. Concept

- **Embedded Operations:** Small batches of operations are stored directly inside `manifest.json`.
- **Lazy Flush:** New operation files (`ops_*.json`) are only created when the manifest buffer fills up.
- **Snapshots:** A "base state" file allows us to delete old operation files and clear the manifest history.

### 2.2. Data Structures

**Updated Manifest:**

```typescript
interface HybridManifest {
  version: 2;

  // The baseline state (snapshot). If present, clients load this first.
  lastSnapshot?: SnapshotReference;

  // Ops stored directly in the manifest (The Buffer)
  // Limit: ~50 ops or 100KB payload size
  embeddedOperations: EmbeddedOperation[];

  // References to external operation files (The Overflow)
  // Older ops that were flushed out of the buffer
  operationFiles: OperationFileReference[];

  // Merged vector clock from all embedded operations
  // Used for quick conflict detection without parsing all ops
  frontierClock: VectorClock;

  // Last modification timestamp (for ETag-like cache invalidation)
  lastModified: number;
}

interface SnapshotReference {
  fileName: string; // e.g. "snapshots/snap_1701234567890.json"
  schemaVersion: number; // Schema version of the snapshot
  vectorClock: VectorClock; // Clock state at snapshot time
  timestamp: number; // When snapshot was created
}

interface OperationFileReference {
  fileName: string; // e.g. "ops/overflow_1701234567890.json"
  opCount: number; // Number of operations in file (for progress estimation)
  minSeq: number; // First operation's logical sequence in this file
  maxSeq: number; // Last operation's logical sequence
}

// Embedded operations are lightweight - a full Operation minus redundant fields
interface EmbeddedOperation {
  id: string;
  actionType: string;
  opType: OpType;
  entityType: EntityType;
  entityId?: string;
  entityIds?: string[];
  payload: unknown;
  clientId: string;
  vectorClock: VectorClock;
  timestamp: number;
  schemaVersion: number;
}
```

**Snapshot File Format:**

```typescript
interface SnapshotFile {
  version: 1;
  schemaVersion: number; // App schema version
  vectorClock: VectorClock; // Merged clock at snapshot time
  timestamp: number;
  data: AppDataComplete; // Full application state
  checksum?: string; // Optional SHA-256 for integrity verification
}
```

---

## 3. Workflows

### 3.1. Upload (Write Path)

When a client has local pending operations to sync:

```
┌─────────────────────────────────────────────────────────────────┐
│                          Upload Flow                            │
└─────────────────────────────────────────────────────────────────┘
                │
                ▼
┌───────────────────────────────┐
│ 1. Download manifest.json     │
└───────────────────────────────┘
                │
                ▼
┌───────────────────────────────┐
│ 2. Detect remote changes      │
│    (compare frontierClock)    │
└───────────────────────────────┘
                │
┌───────────────┴───────────────┐
▼                               ▼
Remote has new ops?         No remote changes
│                               │
▼                               │
Download & apply first ◄────────┘
                │
                ▼
┌───────────────────────────────┐
│ 3. Check buffer capacity      │
│    embedded.length + pending  │
└───────────────────────────────┘
                │
┌───────────────┴───────────────┐
▼                               ▼
< BUFFER_LIMIT (50)         >= BUFFER_LIMIT
│                               │
▼                               ▼
Append to embedded          Flush embedded to file
│                           + add pending to empty buffer
│                               │
└───────────────┬───────────────┘
                ▼
┌───────────────────────────────┐
│ 4. Check snapshot trigger     │
│    (operationFiles > 50 OR    │
│     total ops > 5000)         │
└───────────────────────────────┘
                │
┌───────────────┴───────────────┐
▼                               ▼
Trigger snapshot            No snapshot needed
│                               │
└───────────────┬───────────────┘
                ▼
┌───────────────────────────────┐
│ 5. Upload manifest.json       │
└───────────────────────────────┘
```

**Detailed Steps:**

1. **Download Manifest:** Fetch `manifest.json` (or create an empty v2 manifest if not found).
2. **Detect Remote Changes:**
   - Compare `manifest.frontierClock` with the local `lastSyncedClock`.
   - If remote has unseen changes → download and apply before uploading (prevents lost updates).
3. **Evaluate Buffer:**
   - `BUFFER_LIMIT = 50` operations (configurable)
   - `BUFFER_SIZE_LIMIT = 100KB` payload size (prevents manifest bloat)
4. **Strategy Selection:**
   - **Scenario A (Append):** If `embedded.length + pending.length < BUFFER_LIMIT`:
     - Append `pendingOps` to `manifest.embeddedOperations`.
     - Update `manifest.frontierClock` with merged clocks.
     - **Result:** 1 write (manifest). Fast path.
   - **Scenario B (Overflow):** If the buffer would exceed the limit:
     - Upload `manifest.embeddedOperations` to a new file `ops/overflow_TIMESTAMP.json`.
     - Add the file reference to `manifest.operationFiles`.
     - Place `pendingOps` into the now-empty `manifest.embeddedOperations`.
     - **Result:** 1 upload (overflow file) + 1 write (manifest).
5. **Upload Manifest:** Write the updated `manifest.json`.
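
The strategy selection in step 4 can be sketched as a pure decision function (illustrative only; `BUFFER_LIMIT` follows the constant above, the types are simplified stand-ins for `HybridManifest`):

```typescript
// Illustrative sketch of the step-4 buffer decision (Scenario A vs. B).
interface Op { id: string; payload: unknown; }
interface Manifest {
  embeddedOperations: Op[];
  operationFiles: { fileName: string; opCount: number }[];
}

const BUFFER_LIMIT = 50; // max embedded ops before flushing to an overflow file

// Returns the updated manifest plus an overflow file to upload, if any.
const planUpload = (
  manifest: Manifest,
  pendingOps: Op[],
  now: number,
): { manifest: Manifest; overflowFile?: { fileName: string; ops: Op[] } } => {
  if (manifest.embeddedOperations.length + pendingOps.length < BUFFER_LIMIT) {
    // Scenario A: append to the embedded buffer; 1 manifest write, no new files.
    return {
      manifest: {
        ...manifest,
        embeddedOperations: [...manifest.embeddedOperations, ...pendingOps],
      },
    };
  }
  // Scenario B: flush the old buffer to an overflow file, start fresh with pending.
  const fileName = `ops/overflow_${now}.json`;
  return {
    manifest: {
      ...manifest,
      embeddedOperations: pendingOps,
      operationFiles: [
        ...manifest.operationFiles,
        { fileName, opCount: manifest.embeddedOperations.length },
      ],
    },
    overflowFile: { fileName, ops: manifest.embeddedOperations },
  };
};
```

In the common case (Scenario A) the function returns no overflow file, which is what keeps the fast path at a single manifest write.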

### 3.2. Download (Read Path)

When a client checks for updates:

```
┌─────────────────────────────────────────────────────────────────┐
│                         Download Flow                           │
└─────────────────────────────────────────────────────────────────┘
                │
                ▼
┌───────────────────────────────┐
│ 1. Download manifest.json     │
└───────────────────────────────┘
                │
                ▼
┌───────────────────────────────┐
│ 2. Quick-check: any changes?  │
│    Compare frontierClock      │
└───────────────────────────────┘
                │
┌───────────────┴───────────────┐
▼                               ▼
No changes (clocks equal)   Changes detected
│                               │
▼                               ▼
Done                ┌────────────────────────┐
                    │ 3. Need snapshot?      │
                    │ (local behind snapshot)│
                    └────────────────────────┘
                                │
                ┌───────────────┴───────────────┐
                ▼                               ▼
        Download snapshot                  Skip to ops
        + apply as base                         │
                │                               │
                └───────────────┬───────────────┘
                                ▼
                    ┌────────────────────────┐
                    │ 4. Download new op     │
                    │    files (filter seen) │
                    └────────────────────────┘
                                │
                                ▼
                    ┌────────────────────────┐
                    │ 5. Apply embedded ops  │
                    │    (filter by op.id)   │
                    └────────────────────────┘
                                │
                                ▼
                    ┌────────────────────────┐
                    │ 6. Update local        │
                    │    lastSyncedClock     │
                    └────────────────────────┘
```
|
||||
|
||||
**Detailed Steps:**
|
||||
|
||||
1. **Download Manifest:** Fetch `manifest.json`.
|
||||
2. **Quick-Check Changes:**
|
||||
- Compare `manifest.frontierClock` against local `lastSyncedClock`.
|
||||
- If clocks are equal → no changes, done.
|
||||
3. **Check Snapshot Needed:**
|
||||
- If local state is older than `manifest.lastSnapshot.vectorClock` → download snapshot first.
|
||||
- Apply snapshot as base state (replaces local state).
|
||||
4. **Download Operation Files:**
|
||||
- Filter `manifest.operationFiles` to only files with `maxSeq > localLastAppliedSeq`.
|
||||
- Download and parse each file.
|
||||
- Collect all operations.
|
||||
5. **Apply Embedded Operations:**
|
||||
- Filter `manifest.embeddedOperations` by `op.id` (skip already-applied).
|
||||
- Add to collected operations.
|
||||
6. **Apply All Operations:**
|
||||
- Sort by `vectorClock` (causal order).
|
||||
- Detect conflicts using existing `detectConflicts()` logic.
|
||||
- Apply non-conflicting ops; present conflicts to user.
|
||||
7. **Update Tracking:**
|
||||
- Set `localLastSyncedClock = manifest.frontierClock`.
|
||||
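Steps 4 and 5 reduce to two pure filters. A minimal sketch (types simplified; function names are illustrative):

```typescript
interface OperationFileReference {
  path: string;
  maxSeq: number;
}

interface EmbeddedOp {
  id: string;
}

// Step 4: only fetch files that may contain unseen operations.
const selectFilesToDownload = (
  files: OperationFileReference[],
  localLastAppliedSeq: number,
): OperationFileReference[] => files.filter((f) => f.maxSeq > localLastAppliedSeq);

// Step 5: skip embedded operations that were already applied locally.
const selectUnappliedEmbedded = (
  embedded: EmbeddedOp[],
  appliedIds: Set<string>,
): EmbeddedOp[] => embedded.filter((op) => !appliedIds.has(op.id));
```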

---

## 4. Snapshotting (Compaction)

To prevent unbounded growth of operation files, any client can trigger a snapshot.

### 4.1. Triggers

| Condition                       | Threshold | Rationale                                   |
| ------------------------------- | --------- | ------------------------------------------- |
| External `operationFiles` count | > 50      | Prevent WebDAV directory bloat              |
| Total operations since snapshot | > 5000    | Bound replay time for fresh installs        |
| Time since last snapshot        | > 7 days  | Ensure periodic cleanup                     |
| Manifest size                   | > 500KB   | Prevent the manifest from growing too large |
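The trigger table translates directly into a predicate. The thresholds below mirror the table (and section 9); the function signature itself is illustrative:

```typescript
const SNAPSHOT_FILE_THRESHOLD = 50;
const SNAPSHOT_OP_THRESHOLD = 5000;
const SNAPSHOT_AGE_MS = 7 * 24 * 60 * 60 * 1000; // 7 days
const MANIFEST_SIZE_LIMIT_BYTES = 500 * 1024;

const shouldCreateSnapshot = (s: {
  operationFileCount: number;
  opsSinceSnapshot: number;
  lastSnapshotAt: number; // epoch ms
  manifestSizeBytes: number;
  now: number; // epoch ms
}): boolean =>
  // Any single condition is sufficient to trigger compaction.
  s.operationFileCount > SNAPSHOT_FILE_THRESHOLD ||
  s.opsSinceSnapshot > SNAPSHOT_OP_THRESHOLD ||
  s.now - s.lastSnapshotAt > SNAPSHOT_AGE_MS ||
  s.manifestSizeBytes > MANIFEST_SIZE_LIMIT_BYTES;
```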

### 4.2. Process

```
┌─────────────────────────────────────────────────────────────────┐
│                          Snapshot Flow                          │
└─────────────────────────────────────────────────────────────────┘
                │
                ▼
┌───────────────────────────────┐
│ 1. Ensure full sync complete  │
│    (no pending local/remote)  │
└───────────────────────────────┘
                │
                ▼
┌───────────────────────────────┐
│ 2. Read current state from    │
│    NgRx (authoritative)       │
└───────────────────────────────┘
                │
                ▼
┌───────────────────────────────┐
│ 3. Generate snapshot file     │
│    + compute checksum         │
└───────────────────────────────┘
                │
                ▼
┌───────────────────────────────┐
│ 4. Upload snapshot file       │
│    (atomic, verify success)   │
└───────────────────────────────┘
                │
                ▼
┌───────────────────────────────┐
│ 5. Update manifest            │
│    - Set lastSnapshot         │
│    - Clear operationFiles     │
│    - Clear embeddedOperations │
│    - Reset frontierClock      │
└───────────────────────────────┘
                │
                ▼
┌───────────────────────────────┐
│ 6. Upload manifest            │
└───────────────────────────────┘
                │
                ▼
┌───────────────────────────────┐
│ 7. Cleanup (async, best-      │
│    effort): delete old files  │
└───────────────────────────────┘
```

### 4.3. Snapshot Atomicity

**Problem:** If the client crashes between uploading the snapshot and updating the manifest, other clients won't see the new snapshot.

**Solution:** Snapshot files are immutable and safe to leave orphaned. The manifest is the source of truth. Cleanup is best-effort.

**Invariant:** Never delete the current `lastSnapshot` file until a new snapshot is confirmed.

---

## 5. Conflict Handling

The hybrid manifest doesn't change conflict detection - it still uses vector clocks. However, the `frontierClock` in the manifest enables **early conflict detection**.

### 5.1. Early Conflict Detection

Before downloading all operations, compare clocks:

```typescript
const comparison = compareVectorClocks(localFrontierClock, manifest.frontierClock);

switch (comparison) {
  case VectorClockComparison.LESS_THAN:
    // Remote is ahead - safe to download
    break;
  case VectorClockComparison.GREATER_THAN:
    // Local is ahead - upload our changes
    break;
  case VectorClockComparison.CONCURRENT:
    // Potential conflicts - download ops for detailed analysis
    break;
  case VectorClockComparison.EQUAL:
    // No changes - skip download
    break;
}
```

### 5.2. Conflict Resolution

When conflicts are detected at the operation level, the existing `ConflictResolutionService` handles them. The hybrid manifest doesn't change this flow.

---

## 6. Edge Cases & Failure Modes

### 6.1. Concurrent Uploads (Race Condition)

**Scenario:** Two clients download the manifest simultaneously; both append ops and both upload.

**Problem:** The second upload overwrites the first client's operations.

**Solution:** Use provider-specific mechanisms:

| Provider    | Mechanism                                   |
| ----------- | ------------------------------------------- |
| **Dropbox** | Use `update` mode with the `rev` parameter  |
| **WebDAV**  | Use the `If-Match` header with an ETag      |
| **Local**   | File locking (already implemented in PFAPI) |

**Implementation:**

```typescript
interface HybridManifest {
  // ... existing fields

  // Optimistic concurrency control
  etag?: string; // Server-assigned revision (Dropbox rev, WebDAV ETag)
}

async function uploadManifest(manifest: HybridManifest, expectedEtag?: string): Promise<void> {
  // If expectedEtag is provided, use a conditional upload.
  // On conflict (412 Precondition Failed), re-download and retry.
}
```

### 6.2. Manifest Corruption

**Scenario:** The manifest JSON is invalid (partial write, encoding issue).

**Recovery Strategy:**

1. Attempt to parse the manifest.
2. On parse failure, check for a backup manifest (`manifest.json.bak`).
3. If there is no backup, reconstruct from operation files using `listFiles()`.
4. If reconstruction fails, fall back to snapshot-only state.

```typescript
async loadManifestWithRecovery(): Promise<HybridManifest> {
  try {
    return await this._loadRemoteManifest();
  } catch (parseError) {
    PFLog.warn('Manifest corrupted, attempting recovery...');

    // Try backup
    try {
      return await this._loadBackupManifest();
    } catch {
      // Reconstruct from files
      return await this._reconstructManifestFromFiles();
    }
  }
}
```

### 6.3. Snapshot File Missing

**Scenario:** The manifest references a snapshot that doesn't exist on the server.

**Recovery Strategy:**

1. Log the error and notify the user.
2. Fall back to replaying all available operation files.
3. If the operation files also reference missing ops, show a data-loss warning.

### 6.4. Schema Version Mismatch

**Scenario:** The snapshot was created with schema version 3, but the local app is on version 2.

**Handling:**

- If `snapshot.schemaVersion > CURRENT_SCHEMA_VERSION + MAX_VERSION_SKIP`:
  - Reject the snapshot and prompt the user to update the app.
- If `snapshot.schemaVersion > CURRENT_SCHEMA_VERSION`:
  - Load with a warning (some fields may be stripped by Typia).
- If `snapshot.schemaVersion < CURRENT_SCHEMA_VERSION`:
  - Run migrations on the loaded state.
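The skew rules above can be sketched as a single decision function. The constant values and the return labels are illustrative assumptions; only the three comparison rules come from the text:

```typescript
const CURRENT_SCHEMA_VERSION = 2; // assumed value for this sketch
const MAX_VERSION_SKIP = 1; // assumed value for this sketch

type SnapshotLoadDecision = 'reject' | 'load-with-warning' | 'migrate' | 'load';

const decideSnapshotLoad = (snapshotVersion: number): SnapshotLoadDecision => {
  // Too far ahead: the app cannot safely interpret the snapshot.
  if (snapshotVersion > CURRENT_SCHEMA_VERSION + MAX_VERSION_SKIP) return 'reject';
  // Slightly ahead: load, but unknown fields may be stripped.
  if (snapshotVersion > CURRENT_SCHEMA_VERSION) return 'load-with-warning';
  // Behind: run migrations on the loaded state.
  if (snapshotVersion < CURRENT_SCHEMA_VERSION) return 'migrate';
  return 'load';
};
```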
### 6.5. Large Pending Operations

**Scenario:** The user was offline for a week and has 500 pending operations.

**Handling:**

- Don't try to embed all 500 in the manifest.
- Batch them into multiple overflow files (100 ops each).
- Upload the files first, then update the manifest once.

```typescript
const BATCH_SIZE = 100;
const chunks = chunkArray(pendingOps, BATCH_SIZE);

for (const chunk of chunks) {
  await this._uploadOverflowFile(chunk);
}

// Single manifest update at the end
await this._uploadManifest(manifest);
```

---

## 7. Advantages Summary

| Metric                  | Current (v1)                         | Hybrid Manifest (v2)                                  |
| :---------------------- | :----------------------------------- | :---------------------------------------------------- |
| **Requests per Sync**   | 3 (Upload Op + Read Man + Write Man) | **1-2** (Read Man, optional Write)                    |
| **Files on Server**     | Unbounded growth                     | **Bounded** (1 Manifest + 0-50 Op Files + 1 Snapshot) |
| **Fresh Install Speed** | O(n) - replay all ops                | **O(1)** - load snapshot + small delta                |
| **Conflict Detection**  | Must parse all ops                   | **Quick check** via frontierClock                     |
| **Bandwidth per Sync**  | ~2KB (op file) + manifest overhead   | **~1KB** (manifest only for small changes)            |
| **Offline Resilience**  | Good                                 | **Same** (operations buffered locally)                |

---

## 8. Implementation Status

All phases have been implemented as of December 2025:

### ✅ Phase 1: Core Infrastructure (Complete)

1. **Types** (`operation.types.ts`):
   - `HybridManifest`, `SnapshotReference`, and `OperationFileReference` interfaces defined
   - Backward compatibility maintained with the existing `OperationLogManifest`

2. **Manifest Handling** (`operation-log-manifest.service.ts`):
   - `loadManifest()` handles both v1 and v2 formats
   - Automatic v1-to-v2 migration on first write
   - Buffer/overflow logic in the upload services

3. **FrontierClock Tracking:**
   - Vector clocks are merged when adding embedded operations
   - `lastSyncedFrontierClock` is stored locally for the quick-check

### ✅ Phase 2: Snapshot Support (Complete)

4. **Snapshot Operations** (in `operation-log-upload.service.ts` and `operation-log-download.service.ts`):
   - Snapshot generation with current-state serialization
   - Upload with retry logic
   - Download + validate + apply

5. **Snapshot Triggers:**
   - Automatic triggers based on file count and operation count
   - Remote file cleanup after 14 days (`REMOTE_OP_FILE_RETENTION_MS`)

### ✅ Phase 3: Robustness (Complete)

6. **Concurrency Control:**
   - Provider-specific revision checking (Dropbox rev, WebDAV ETag)
   - Retry-on-conflict logic implemented

7. **Recovery Logic:**
   - Manifest corruption recovery with a file-listing fallback
   - Missing-file handling with graceful degradation

### ✅ Phase 4: Testing (Complete)

8. **Tests:**
   - Unit tests in `operation-log-manifest.service.spec.ts`
   - Integration tests in `sync-scenarios.integration.spec.ts`
   - E2E tests in `supersync.spec.ts`

### Key Implementation Files

| File                                | Purpose                                     |
| ----------------------------------- | ------------------------------------------- |
| `operation-log-manifest.service.ts` | Manifest loading, saving, buffer management |
| `operation-log-upload.service.ts`   | Upload with buffer/overflow logic           |
| `operation-log-download.service.ts` | Download with snapshot support              |

---

## 9. Configuration Constants

```typescript
// Buffer limits
const EMBEDDED_OP_LIMIT = 50; // Max operations in manifest buffer
const EMBEDDED_SIZE_LIMIT_KB = 100; // Max payload size in KB

// Snapshot triggers
const SNAPSHOT_FILE_THRESHOLD = 50; // Trigger when operationFiles exceeds this
const SNAPSHOT_OP_THRESHOLD = 5000; // Trigger when total ops exceed this
const SNAPSHOT_AGE_DAYS = 7; // Trigger if no snapshot in N days

// Batching
const UPLOAD_BATCH_SIZE = 100; // Ops per overflow file

// Retry
const MAX_UPLOAD_RETRIES = 3;
const RETRY_DELAY_MS = 1000;
```

---

## 10. Resolved Design Questions

The following questions were resolved during implementation:

1. **Encryption:** Snapshots use the same encryption as operation files (via `EncryptAndCompressHandlerService`).
2. **Compression:** Snapshots are compressed using the same compression scheme as other sync files.
3. **Checksum Verification:** Currently uses timestamp-based validation; checksums can be added if needed.
4. **Clock Drift:** Vector clocks handle ordering; timestamps are informational only.

---

## 11. File Reference

### Remote Storage Layout (v2)

```
/ (or /DEV/ in development)
├── manifest.json                    # HybridManifest (buffer + references)
├── ops/
│   ├── ops_CLIENT1_170123.json      # Flushed operations
│   └── ops_CLIENT2_170456.json
└── snapshots/
    └── snap_170789.json             # Full state snapshot (if present)
```

### Code Files

```
src/app/core/persistence/operation-log/
├── operation.types.ts                       # HybridManifest, SnapshotReference types
├── store/
│   └── operation-log-manifest.service.ts    # Manifest management
└── sync/
    ├── operation-log-upload.service.ts      # Upload with buffer/overflow
    └── operation-log-download.service.ts    # Download with snapshot support

docs/op-log/
└── hybrid-manifest-architecture.md          # This document
```
1863
docs/op-log/operation-log-architecture-diagrams.md
Normal file
File diff suppressed because it is too large
2038
docs/op-log/operation-log-architecture.md
Normal file
File diff suppressed because it is too large
202
docs/op-log/operation-payload-optimization-discussion.md
Normal file
@ -0,0 +1,202 @@
# Operation Payload Optimization Discussion

**Date:** December 5, 2025
**Context:** Analysis of operation payload sizes and optimization opportunities

---

## Initial Analysis

We analyzed the codebase for occasions where many or very large operations are produced.

### Issues Identified

| Issue                             | Severity | Impact                             |
| --------------------------------- | -------- | ---------------------------------- |
| Tag deletion cascade              | **High** | Creates N+1 operations for N tasks |
| Full payload storage              | **High** | Large payloads stored repeatedly   |
| batchUpdateForProject nesting     | Medium   | Single op contains nested array    |
| Archive operations                | Medium   | One bulk op for many tasks         |
| Single operations per bulk entity | Medium   | N operations instead of 1          |

### Fixes Implemented

1. **Payload size monitoring** - Added `LARGE_PAYLOAD_WARNING_THRESHOLD_BYTES` (10KB) and logging when it is exceeded
2. **Bulk task-repeat-cfg operations** - Tag deletion now uses a bulk delete instead of N individual operations
3. **Batch operation chunking** - `batchUpdateForProject` now chunks large operations into batches of `MAX_BATCH_OPERATIONS_SIZE` (50)

---

## Archive Operation Deep Dive

The `moveToArchive` action was identified as having large payloads (~2KB per task). We explored multiple optimization approaches.

### The Core Problem

Two sync systems exist:

1. **Operation Log (SuperSync)** - Real-time operation sync
2. **PFAPI** - Model file sync (daily for archive files)

When Client A archives tasks:

- The operation syncs immediately
- The `archiveYoung` model file syncs later (daily)

When Client B receives the operation:

- It must write the tasks to its local archive
- But the tasks are deleted from the originating client's state
- The archive file hasn't synced yet

**The operation must carry full task data.**

### Solutions Explored

#### Option A: Hybrid Payload with Private Field

```typescript
moveToArchive: {
  taskIds: string[], // Persisted
  _tasks: TaskWithSubTasks[] // Stripped before storage
}
```

**Problem:** Remote operations won't have `_tasks` - the full data is still needed for sync.

#### Option B: Meta-Reducer Enrichment

Capture tasks from state before deletion and attach them to the action for the effect.

**Why it seemed possible:**

- Dependency resolution ensures `addTask` ops are applied before `moveToArchive`
- Tasks exist in the remote client's state when the operation arrives
- The meta-reducer runs before the main reducer

**Problems:**

- Complex action mutation
- Meta-reducers should be pure
- Awkward async queue from the sync reducer

#### Option C: Two-Phase Archive

Split into `writeToArchive` (full data) + `deleteTasks` (IDs only).

**Problem:** Same total payload size - just added complexity without benefit.

#### Option D: Operation-Derived Archive Store

The archive becomes a separate IndexedDB store populated entirely by operations:

```typescript
archiveTask: { taskIds: string[] } // IDs only
```

A meta-reducer moves task data from active state to the archive before deletion.

**Benefits:**

- Tiny payloads
- Single source of truth
- No PFAPI archive sync needed

**Drawbacks:**

1. Migration complexity (years of existing archive data)
2. Initial sync must replay ALL archive ops (20K+ for heavy users)
3. Operation log growth (archive ops span years)
4. Compaction complexity (must preserve archive state)
5. Two storage systems to coordinate
6. PFAPI compatibility during the transition
7. Query performance for 20K+ tasks
### The Scale Concern

> "There can be more than 20,000 archived tasks"

If the archive were in the NgRx store:

- Selectors would iterate 20K+ entities
- Entity adapter operations would slow down
- Memory bloat on app start
- DevTools would become unusable

This ruled out simple "add an isArchived flag" approaches.

### Key Insight: Dependency Resolution

Operations have causal ordering. When a remote client receives `moveToArchive`:

1. The `addTask` operations are already applied (dependency)
2. The task exists in the remote client's active state
3. It could theoretically be looked up from state before deletion

But the effect runs AFTER the reducer deletes the entities. The timing makes this approach impractical without complex meta-reducer side effects.

---

## Final Decision

**Keep the current full-payload approach.**

### Rationale

1. **It works correctly** - Already implemented, tested, documented
2. **Sync reliability** - No edge cases or timing issues
3. **Simplicity** - Single action, clear semantics
4. **Acceptable size** - ~100KB for 50 tasks is manageable
5. **Infrequent operation** - Archiving happens at the end of the day, not constantly

### Mitigation

For very large archives, chunk the operations:

```typescript
const ARCHIVE_CHUNK_SIZE = 25;

async moveToArchive(parentTasks: TaskWithSubTasks[]): Promise<void> {
  const chunks = chunkArray(parentTasks, ARCHIVE_CHUNK_SIZE);
  for (const chunk of chunks) {
    this._store.dispatch(TaskSharedActions.moveToArchive({ tasks: chunk }));
  }
  await this._archiveService.moveTasksToArchiveAndFlushArchiveIfDue(parentTasks);
}
```

### Trade-off Summary

| Approach                  | Payload Size      | Complexity | Reliability |
| ------------------------- | ----------------- | ---------- | ----------- |
| Full payload (current)    | Large (~2KB/task) | Low        | High        |
| Meta-reducer enrichment   | Small             | High       | Medium      |
| Two-phase archive         | Same as current   | Higher     | High        |
| Operation-derived archive | Small             | Very High  | Medium      |

**The payload size reduction doesn't justify the added complexity.**

---

## Related Documentation

- `archive-operation-redesign.md` - Detailed analysis of archive options
- `code-audit.md` - Overall operation compliance audit
- `operation-size-analysis.md` - Initial payload size analysis

---

## Future Considerations

If payload size becomes a real problem (not a theoretical one), revisit Option D (operation-derived archive) with:

1. A proper migration plan for existing PFAPI data
2. A compaction strategy for long-lived archive operations
3. Performance testing with 20K+ tasks
4. PFAPI compatibility during the transition

**Alternative Optimization:**

- **Payload compression:** Since task data (text/JSON) compresses extremely well (often >90%), the `_tasks` payload within the `moveToArchive` operation could be compressed (e.g., using LZ-string or GZIP) before sending. This would address the size concern without the architectural overhaul of Option D.

Until then, the current approach is the right balance.
215
docs/op-log/operation-rules.md
Normal file
@ -0,0 +1,215 @@
# Operation Log: Design Rules & Guidelines

**Last Updated:** December 2025
**Related:** [Operation Log Architecture](./operation-log-architecture.md)

This document establishes the core rules and principles for designing the Operation Log store and defining new Operations. Adherence to these rules ensures data integrity, synchronization reliability, and system performance.

## 1. Store Design Rules

### 1.1 Append-Only Persistence

- **Rule:** The `ops` table in the store must be strictly **append-only** for active operations.
- **Reasoning:** History preservation is critical for event sourcing and conflict resolution.
- **Exception:** Operations can only be deleted by the **Compaction Service**, and only if they are:
  1. Older than the retention window.
  2. Successfully synced (`syncedAt` is set).
  3. "Baked" into a secure snapshot.

### 1.2 Immutable History

- **Rule:** Once an operation is written to `SUP_OPS`, it **MUST NOT** be modified.
- **Reasoning:** Modifying history breaks the cryptographic chain (if implemented later) and confuses sync peers that have already received the operation.
- **Correction:** If an operation was incorrect, append a new _compensating operation_ (e.g., an undo or correction op) rather than editing the old one.

### 1.3 Single Source of Truth

- **Rule:** The Operation Log (`SUP_OPS`) is the ultimate source of truth for the application state.
- **Context:** The `state_cache` and the runtime NgRx store are _projections_ derived from the log.
- **Implication:** If the runtime state disagrees with the log replay, the log wins.

### 1.4 Snapshot Mandate

- **Rule:** The store must maintain a valid `state_cache` (snapshot).
- **Frequency:** Snapshots must be updated based on configurable thresholds:
  - **Operation count:** After N operations (default: 500, configurable).
  - **Time-based:** After T minutes of inactivity following changes.
  - **Size-based:** When tail ops exceed S kilobytes.
  - **Event-triggered:** Immediately after significant events (large imports, sync completion).
- **Recovery:** The system must be able to rebuild the state entirely from `Snapshot + Tail Ops`.

## 2. Operation Design Rules

### 2.1 Granularity & Atomicity

- **Rule:** Operations should be **atomic** and focused on a **single entity** where possible.
  - **Good:** `UPDATE_TASK { id: "A", changes: { title: "New" } }`
  - **Bad:** `UPDATE_ALL_TASKS { [ ... entire tasks array ... ] }`
- **Reasoning:** Granular ops reduce conflict probability. Large "dump" ops cause massive conflicts during sync.
- **Exception:** `SYNC_IMPORT` and `BACKUP_IMPORT` are allowed to replace large chunks of state but must be treated as special "reset" events.

### 2.2 Idempotency

- **Rule:** Applying the same operation twice must be safe.
- **Implementation:**
  - Use explicit IDs (UUID v7) for creation. A `CREATE` with an existing ID must be **ignored** (not merged or updated). If updates are needed, a separate `UPDATE` operation must follow.
  - A `DELETE` on a missing entity should be a no-op.
  - An `UPDATE` on a missing entity should be queued for retry (see 3.4 Dependency Awareness).
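The idempotency rules above can be sketched with a plain entity map standing in for the store (the helper names are illustrative):

```typescript
interface Entity {
  id: string;
  title: string;
}

type EntityMap = Record<string, Entity>;

// CREATE with an existing ID is ignored, never merged or updated.
const applyCreate = (state: EntityMap, entity: Entity): EntityMap =>
  state[entity.id] ? state : { ...state, [entity.id]: entity };

// DELETE on a missing entity is a no-op.
const applyDelete = (state: EntityMap, id: string): EntityMap => {
  if (!(id in state)) return state;
  const { [id]: _removed, ...rest } = state;
  return rest;
};
```

Applying either function twice with the same input leaves the state unchanged, which is exactly what the rule requires.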

### 2.3 Serializable Payload

- **Rule:** Operation payloads must be **pure JSON**.
- **Forbidden:**
  - `Date` objects (use `timestamp` numbers).
  - Functions or class instances.
  - `undefined` (use `null` or omit the key, depending on semantics).
  - Circular references.

### 2.4 Causality Tracking

- **Rule:** Every operation **MUST** carry a `vectorClock`.
- **Purpose:** To determine whether the operation is concurrent with others or causally follows them.
- **Responsibility:** The `OperationLogEffects` (or equivalent creator) captures the clock at the moment of creation.

### 2.5 Schema Versioning

- **Rule:** Every operation **MUST** carry a `schemaVersion`.
- **Purpose:** To allow future versions of the app to migrate or interpret old operations correctly.
- **Default:** Use `CURRENT_SCHEMA_VERSION` from `SchemaMigrationService` at the time of creation.
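Rules 2.3 through 2.5 combine into the shape of an operation envelope. The field names below approximate the real types; the ID, the version constant, and the factory are illustrative assumptions:

```typescript
interface VectorClock {
  [clientId: string]: number;
}

interface Operation {
  id: string; // UUID v7 in practice (rule 2.2)
  type: 'CRT' | 'UPD' | 'DEL' | 'MOV';
  vectorClock: VectorClock; // rule 2.4: causality tracking
  schemaVersion: number; // rule 2.5: schema versioning
  payload: unknown; // rule 2.3: pure JSON only
}

const CURRENT_SCHEMA_VERSION = 2; // assumed value for this sketch

const createOp = (clock: VectorClock): Operation => ({
  id: 'op-1', // illustrative; a real op gets a UUID v7
  type: 'CRT',
  vectorClock: { ...clock }, // clock captured at the moment of creation
  schemaVersion: CURRENT_SCHEMA_VERSION,
  payload: { title: 'New task' },
});
```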

### 2.6 Explicit Intent (OpType)

- **Rule:** Use specific `OpType`s (`CRT`, `UPD`, `DEL`, `MOV`) rather than a generic `CHANGE`.
- **Reasoning:** Specific types allow for smarter conflict resolution and UI feedback (e.g., "Task was deleted remotely" vs. "Task was moved").

## 3. Interaction & Safety Rules

### 3.1 Validation First

- **Rule:** Validate operation payloads **before** appending to the log.
- **Checkpoint:** Structural validation (required fields) happens at the boundary. Deep semantic validation happens during application/replay.
- **Failure:** Reject malformed operations immediately; do not corrupt the log.

### 3.2 Robust Replay

- **Rule:** The replay mechanism (Hydrator) **MUST NOT CRASH** on invalid operations.
- **Behavior:** If an operation fails to apply (e.g., it references a missing parent):
  1. Log a warning.
  2. Skip the operation (or queue it for retry).
  3. Continue replaying the rest.
  4. Trigger a `REPAIR` cycle at the end if needed.

### 3.3 Sync Isolation

- **Rule:** The `OperationLogStore` should not contain logic specific to a sync provider (Dropbox, WebDAV).
- **Separation:** The store manages _persistence_. The Sync Services manage _transport_.
- **Interface:** The store exposes `getUnsynced()`, `markSynced()`, `markRejected()` as generic methods.

### 3.4 Dependency Awareness

- **Rule:** Operations creating dependent entities (e.g., a subtask) must ensure the dependency (the parent task) exists.
- **Handling:** If a parent is missing during sync, the child creation op should be buffered in a `DependencyQueue` until the parent arrives.
- **Safeguards:**
  - **Cycle detection:** Before queuing, verify the dependency graph is acyclic. Reject operations that would create circular dependencies.
  - **Buffer limits:** The queue must enforce a maximum depth (default: 1000 pending ops) and a timeout (default: 5 minutes). Operations exceeding the limits should be logged and dropped.
  - **Retry policy:** Queued operations should be retried after each batch of new operations is applied, with exponential backoff for repeated failures.
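A bounded buffer with the limits above can be sketched as follows (the class shape and method names are illustrative; the limits come from the safeguards list):

```typescript
interface PendingOp {
  id: string;
  dependsOn: string; // e.g., the missing parent task's ID
  enqueuedAt: number; // epoch ms
}

class DependencyQueue {
  private readonly maxDepth = 1000; // buffer limit from the rule
  private readonly timeoutMs = 5 * 60 * 1000; // timeout from the rule
  private pending: PendingOp[] = [];

  // Returns false when the op is dropped (would be logged in practice).
  enqueue(op: PendingOp, now: number): boolean {
    // Expire timed-out entries first, then enforce the depth limit.
    this.pending = this.pending.filter((p) => now - p.enqueuedAt < this.timeoutMs);
    if (this.pending.length >= this.maxDepth) return false;
    this.pending.push(op);
    return true;
  }

  // Called after each batch of applied operations: release ops whose
  // dependency now exists.
  takeReady(existingIds: Set<string>): PendingOp[] {
    const ready = this.pending.filter((p) => existingIds.has(p.dependsOn));
    this.pending = this.pending.filter((p) => !existingIds.has(p.dependsOn));
    return ready;
  }
}
```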

### 3.5 Deletion & Tombstones

> **Status (December 2025):** Tombstones are **DEFERRED**. After a comprehensive evaluation, the current event-sourced architecture provides sufficient safeguards without explicit tombstones. See `todo.md` Item 1 for the full evaluation.

- **Current Implementation:** Deletions use **DELETE operations** in the event log (immutable events, not destructive).
- **Alternative Safeguards in Place:**
  - Vector clocks detect concurrent delete+update conflicts; a user resolution UI is presented.
  - Tag sanitization filters non-existent taskIds at the reducer level.
  - Subtask cascading deletes include all child tasks.
  - Auto-repair removes orphaned references and creates REPAIR operations.
- **When to Revisit:**
  - If undo/restore functionality is needed.
  - If audit compliance requires explicit "entity deleted at time X" records.
  - If cross-version sync (A.7.11) reveals edge cases not handled by the current safeguards.

### 3.6 Operation Batching

- **Rule:** Normal operations should be batched with reasonable limits.
- **Limits:**
  - **Max batch size:** 100 operations per batch for normal sync uploads.
  - **Max payload size:** 1 MB per batch to prevent timeout issues.
- **Exception:** `SYNC_IMPORT` and `BACKUP_IMPORT` bypass these limits but must be clearly marked as bulk operations and trigger immediate snapshot creation afterward.
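Batching by both limits at once can be sketched as a single pass over the pending ops (the function name and the `sizeOf` callback are illustrative; the two limits come from the rule):

```typescript
const MAX_BATCH_OPS = 100; // max operations per batch
const MAX_BATCH_BYTES = 1024 * 1024; // max payload bytes per batch

const batchOps = <T>(ops: T[], sizeOf: (op: T) => number): T[][] => {
  const batches: T[][] = [];
  let current: T[] = [];
  let currentBytes = 0;
  for (const op of ops) {
    const bytes = sizeOf(op);
    // Close the current batch when either limit would be exceeded.
    if (
      current.length >= MAX_BATCH_OPS ||
      (current.length > 0 && currentBytes + bytes > MAX_BATCH_BYTES)
    ) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(op);
    currentBytes += bytes;
  }
  if (current.length > 0) batches.push(current);
  return batches;
};
```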
|
||||
## 4. Effect Rules

### 4.1 LOCAL_ACTIONS for Side Effects

- **Rule:** All NgRx effects that perform side effects MUST use `inject(LOCAL_ACTIONS)` instead of `inject(Actions)`.
- **Reasoning:** Effects should NEVER run for remote sync operations. Side effects (snackbars, API calls, sounds) happen exactly once on the originating client.
- **Exception:** Effects that only dispatch state-modifying actions (not side effects) may use regular `Actions`.

**Example:**

```typescript
@Injectable()
export class MyEffects {
  private _actions$ = inject(LOCAL_ACTIONS); // ✅ Correct for side effects

  showSnack$ = createEffect(
    () =>
      this._actions$.pipe(
        ofType(completeTask),
        tap(() => this.snackService.show('Task completed!')),
      ),
    { dispatch: false },
  );
}
```
### 4.2 Avoid Selector-Based Effects That Dispatch Actions

- **Rule:** Prefer action-based effects (`this._actions$.pipe(ofType(...))`) over selector-based effects (`this._store$.select(...)`).
- **Reasoning:** Selector-based effects fire whenever the store changes, including during hydration and sync replay, bypassing `LOCAL_ACTIONS` filtering.
- **Workaround:** If you must use a selector-based effect that dispatches actions, guard it with `HydrationStateService.isApplyingRemoteOps()`.
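The workaround above amounts to a guard predicate around the dispatching callback. Here is a minimal, framework-free sketch of that pattern; `hydrationState` is a stand-in for `HydrationStateService`, and the names are illustrative, not the real API.

```typescript
// Minimal sketch: a selector-driven handler is suppressed while remote
// operations are being applied (hydration/sync replay).
const hydrationState = {
  applyingRemoteOps: false,
  isApplyingRemoteOps(): boolean {
    return this.applyingRemoteOps;
  },
};

const dispatched: string[] = [];

// Wraps a selector-driven handler so it becomes a no-op during sync replay.
const guardAgainstRemoteOps =
  <T>(handler: (value: T) => void) =>
  (value: T): void => {
    if (hydrationState.isApplyingRemoteOps()) {
      return; // skip: change came from hydration/sync replay, not the user
    }
    handler(value);
  };

const onSelectorEmit = guardAgainstRemoteOps((taskId: string) => {
  dispatched.push(taskId); // would dispatch an action in a real effect
});
```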
### 4.3 Archive Side Effects

- **Rule:** Archive operations (writing to IndexedDB) are handled by `ArchiveOperationHandler`, NOT by regular effects.
  - **Local operations:** `ArchiveOperationHandlerEffects` routes through `ArchiveOperationHandler` (via LOCAL_ACTIONS)
  - **Remote operations:** `OperationApplierService` calls `ArchiveOperationHandler` directly after dispatch
## 5. Multi-Entity Operation Rules

### 5.1 Use Meta-Reducers for Atomic Changes

- **Rule:** When one action affects multiple entities, use **meta-reducers** instead of effects.
- **Reasoning:** Meta-reducers ensure all changes happen in a single reducer pass, creating one operation in the sync log and preventing partial sync.
- **Example:** Deleting a tag also removes it from tasks → handled in `tagSharedMetaReducer`, not in an effect.
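The tag-deletion example can be sketched as a meta-reducer. The state shape, action name, and reducer here are simplified stand-ins for the real `tagSharedMetaReducer`, shown only to illustrate the single-pass atomicity.

```typescript
// Illustrative meta-reducer: a single `deleteTag` action atomically removes
// the tag AND strips its id from every task, in one reducer pass.
interface AppState {
  tags: string[];
  tasks: { id: string; tagIds: string[] }[];
}

interface Action {
  type: string;
  tagId?: string;
}

type Reducer = (state: AppState, action: Action) => AppState;

const tagSharedMetaReducer =
  (reducer: Reducer): Reducer =>
  (state, action) => {
    if (action.type === 'deleteTag' && action.tagId) {
      state = {
        tags: state.tags.filter((t) => t !== action.tagId),
        tasks: state.tasks.map((task) => ({
          ...task,
          tagIds: task.tagIds.filter((id) => id !== action.tagId),
        })),
      };
    }
    return reducer(state, action);
  };

const rootReducer = tagSharedMetaReducer((state) => state);
```

Because both changes happen before the wrapped reducer returns, the sync layer sees one state transition, not two.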
### 5.2 Capture Multi-Entity Changes

- **Rule:** The `OperationCaptureService` automatically captures all entity changes from a single action.
- **Implementation:** The `operation-capture.meta-reducer` calls `OperationCaptureService.enqueue()` with the action.
- **Result:** A single operation with an `entityChanges[]` array containing all affected entities.
## 6. Configuration Constants

See `operation-log.const.ts` for all configurable values:

| Constant                            | Value    | Description                               |
| ----------------------------------- | -------- | ----------------------------------------- |
| `COMPACTION_TRIGGER`                | 500 ops  | Operations before automatic compaction    |
| `COMPACTION_RETENTION_MS`           | 7 days   | Synced ops older than this may be deleted |
| `EMERGENCY_COMPACTION_RETENTION_MS` | 1 day    | Shorter retention for quota exceeded      |
| `MAX_COMPACTION_FAILURES`           | 3        | Failures before user notification         |
| `MAX_DOWNLOAD_OPS_IN_MEMORY`        | 50,000   | Bounds memory during API download         |
| `REMOTE_OP_FILE_RETENTION_MS`       | 14 days  | Server-side operation file retention      |
| `PENDING_OPERATION_EXPIRY_MS`       | 24 hours | Pending ops older than this are rejected  |
## 7. Quick Reference Checklist

When adding a new persistent action:

- [ ] Add `meta.isPersistent: true` to the action
- [ ] Add `meta.entityType` and `meta.opType`
- [ ] Ensure related entity changes are in a meta-reducer (not effects)
- [ ] Effects with side effects use `LOCAL_ACTIONS`
- [ ] Archive operations route through `ArchiveOperationHandler`
- [ ] Add action to `ACTION_AFFECTED_ENTITIES` if multi-entity
860
docs/op-log/pfapi-sync-persistence-architecture.md
Normal file

@ -0,0 +1,860 @@
# PFAPI Sync and Persistence Architecture

This document describes the architecture and implementation of the persistence and synchronization system (PFAPI) in Super Productivity.

## Overview

PFAPI (Persistence Framework API) is a comprehensive system for:

1. **Local Persistence**: Storing application data in IndexedDB
2. **Cross-Device Synchronization**: Syncing data across devices via multiple cloud providers
3. **Conflict Detection**: Using vector clocks for distributed conflict detection
4. **Data Validation & Migration**: Ensuring data integrity across versions

## Architecture Layers
```
┌─────────────────────────────────────────────────────────────────┐
│                      Angular Application                        │
│                   (Components & Services)                       │
└────────────────────────────┬────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                     PfapiService (Angular)                      │
│  - Injectable wrapper around Pfapi                              │
│  - Exposes RxJS Observables for UI integration                  │
│  - Manages sync provider activation                             │
└────────────────────────────┬────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                         Pfapi (Core)                            │
│  - Main orchestrator for all persistence operations             │
│  - Coordinates Database, Models, Sync, and Migration            │
└────────────────────────────┬────────────────────────────────────┘
                             │
        ┌────────────────────┼────────────────────┐
        │                    │                    │
        ▼                    ▼                    ▼
┌───────────────┐    ┌───────────────┐    ┌───────────────┐
│   Database    │    │  SyncService  │    │   Migration   │
│  (IndexedDB)  │    │ (Orchestrator)│    │    Service    │
└───────────────┘    └───────┬───────┘    └───────────────┘
                             │
                ┌────────────┼────────────┐
                │            │            │
                ▼            ▼            ▼
          ┌──────────┐ ┌───────────┐ ┌───────────┐
          │   Meta   │ │   Model   │ │  Encrypt/ │
          │   Sync   │ │   Sync    │ │  Compress │
          └──────────┘ └───────────┘ └───────────┘
                │            │
                └────────────┼────────────┐
                             │            │
                             ▼            ▼
                 ┌───────────────────────────┐
                 │  SyncProvider Interface   │
                 └───────────────┬───────────┘
                                 │
         ┌───────────────────────┼───────────────────────┐
         │                       │                       │
         ▼                       ▼                       ▼
 ┌───────────────┐       ┌───────────────┐       ┌───────────────┐
 │    Dropbox    │       │    WebDAV     │       │  Local File   │
 └───────────────┘       └───────────────┘       └───────────────┘
```
## Directory Structure

```
src/app/pfapi/
├── pfapi.service.ts                 # Angular service wrapper
├── pfapi-config.ts                  # Model and provider configuration
├── pfapi-helper.ts                  # RxJS integration helpers
├── api/
│   ├── pfapi.ts                     # Main API class
│   ├── pfapi.model.ts               # Type definitions
│   ├── pfapi.const.ts               # Enums and constants
│   ├── db/                          # Database abstraction
│   │   ├── database.ts              # Database wrapper with locking
│   │   ├── database-adapter.model.ts
│   │   └── indexed-db-adapter.ts    # IndexedDB implementation
│   ├── model-ctrl/                  # Model controllers
│   │   ├── model-ctrl.ts            # Generic model controller
│   │   └── meta-model-ctrl.ts       # Metadata controller
│   ├── sync/                        # Sync orchestration
│   │   ├── sync.service.ts          # Main sync orchestrator
│   │   ├── meta-sync.service.ts     # Metadata sync
│   │   ├── model-sync.service.ts    # Model sync
│   │   ├── sync-provider.interface.ts
│   │   ├── encrypt-and-compress-handler.service.ts
│   │   └── providers/               # Provider implementations
│   ├── migration/                   # Data migration
│   ├── util/                        # Utilities (vector-clock, etc.)
│   └── errors/                      # Custom error types
├── migrate/                         # Cross-model migrations
├── repair/                          # Data repair utilities
└── validate/                        # Validation functions
```
## Core Components

### 1. Database Layer

#### Database Class (`api/db/database.ts`)

The `Database` class wraps the storage adapter and provides:

- **Locking mechanism**: Prevents concurrent writes during sync
- **Error handling**: Centralized error management
- **CRUD operations**: `load`, `save`, `remove`, `loadAll`, `clearDatabase`

```typescript
class Database {
  lock(): void; // Prevents writes
  unlock(): void; // Re-enables writes
  load<T>(key: string): Promise<T>;
  save<T>(key: string, data: T, isIgnoreDBLock?: boolean): Promise<void>;
  remove(key: string): Promise<unknown>;
}
```

The database is locked during sync operations to prevent race conditions.
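The locking behavior can be sketched in a few lines. This is an illustrative model, not the real class: storage is an in-memory map instead of IndexedDB, methods are synchronous for brevity (the real API is Promise-based), and whether the real implementation throws or queues on a locked write is an assumption here.

```typescript
// Sketch: save() refuses writes while locked unless the caller explicitly
// opts out via isIgnoreDBLock (as sync-internal writes do).
class DatabaseSketch {
  private _isLocked = false;
  private _store = new Map<string, unknown>();

  lock(): void {
    this._isLocked = true;
  }

  unlock(): void {
    this._isLocked = false;
  }

  save<T>(key: string, data: T, isIgnoreDBLock = false): void {
    if (this._isLocked && !isIgnoreDBLock) {
      throw new Error(`DB is locked, refusing write to "${key}"`);
    }
    this._store.set(key, data);
  }

  load<T>(key: string): T | undefined {
    return this._store.get(key) as T | undefined;
  }
}
```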
#### IndexedDB Adapter (`api/db/indexed-db-adapter.ts`)

Implements the `DatabaseAdapter` interface using IndexedDB:

- Database name: `'pf'`
- Main store: `'main'`
- Uses the `idb` library for async IndexedDB operations

```typescript
class IndexedDbAdapter implements DatabaseAdapter {
  async init(): Promise<IDBPDatabase>; // Opens/creates database
  async load<T>(key: string): Promise<T>; // db.get(store, key)
  async save<T>(key: string, data: T): Promise<void>; // db.put(store, data, key)
  async remove(key: string): Promise<unknown>; // db.delete(store, key)
  async loadAll<A>(): Promise<A>; // Returns all entries as object
  async clearDatabase(): Promise<void>; // db.clear(store)
}
```
## Local Storage Structure (IndexedDB)

All data is stored in a single IndexedDB database with one object store. Each entry is keyed by a string identifier.

### IndexedDB Keys

#### System Keys

| Key                   | Content                   | Description                                             |
| --------------------- | ------------------------- | ------------------------------------------------------- |
| `__meta_`             | `LocalMeta`               | Sync metadata (vector clock, revMap, timestamps)        |
| `__client_id_`        | `string`                  | Unique client identifier (e.g., `"BCL1234567890_12_5"`) |
| `__sp_cred_Dropbox`   | `DropboxPrivateCfg`       | Dropbox credentials                                     |
| `__sp_cred_WebDAV`    | `WebdavPrivateCfg`        | WebDAV credentials                                      |
| `__sp_cred_LocalFile` | `LocalFileSyncPrivateCfg` | Local file sync config                                  |
| `__TMP_BACKUP`        | `AllSyncModels`           | Temporary backup during imports                         |
#### Model Keys (all defined in `pfapi-config.ts`)

| Key              | Content               | Main File | Description                   |
| ---------------- | --------------------- | --------- | ----------------------------- |
| `task`           | `TaskState`           | Yes       | Tasks data (EntityState)      |
| `timeTracking`   | `TimeTrackingState`   | Yes       | Time tracking records         |
| `project`        | `ProjectState`        | Yes       | Projects (EntityState)        |
| `tag`            | `TagState`            | Yes       | Tags (EntityState)            |
| `simpleCounter`  | `SimpleCounterState`  | Yes       | Simple counters (EntityState) |
| `note`           | `NoteState`           | Yes       | Notes (EntityState)           |
| `taskRepeatCfg`  | `TaskRepeatCfgState`  | Yes       | Recurring task configs        |
| `reminders`      | `Reminder[]`          | Yes       | Reminder array                |
| `planner`        | `PlannerState`        | Yes       | Planner state                 |
| `boards`         | `BoardsState`         | Yes       | Kanban boards                 |
| `menuTree`       | `MenuTreeState`       | No        | Menu structure                |
| `globalConfig`   | `GlobalConfigState`   | No        | User settings                 |
| `issueProvider`  | `IssueProviderState`  | No        | Issue tracker configs         |
| `metric`         | `MetricState`         | No        | Metrics (EntityState)         |
| `improvement`    | `ImprovementState`    | No        | Improvements (EntityState)    |
| `obstruction`    | `ObstructionState`    | No        | Obstructions (EntityState)    |
| `pluginUserData` | `PluginUserDataState` | No        | Plugin user data              |
| `pluginMetadata` | `PluginMetaDataState` | No        | Plugin metadata               |
| `archiveYoung`   | `ArchiveModel`        | No        | Recent archived tasks         |
| `archiveOld`     | `ArchiveModel`        | No        | Old archived tasks            |
### Local Storage Diagram

```
┌──────────────────────────────────────────────────────────────────┐
│                        IndexedDB: "pf"                           │
│                        Store: "main"                             │
├──────────────────────┬───────────────────────────────────────────┤
│ Key                  │ Value                                     │
├──────────────────────┼───────────────────────────────────────────┤
│ __meta_              │ { lastUpdate, vectorClock, revMap, ... }  │
│ __client_id_         │ "BCLm1abc123_12_5"                        │
│ __sp_cred_Dropbox    │ { accessToken, refreshToken, encryptKey } │
│ __sp_cred_WebDAV     │ { url, username, password, encryptKey }   │
├──────────────────────┼───────────────────────────────────────────┤
│ task                 │ { ids: [...], entities: {...} }           │
│ project              │ { ids: [...], entities: {...} }           │
│ tag                  │ { ids: [...], entities: {...} }           │
│ note                 │ { ids: [...], entities: {...} }           │
│ globalConfig         │ { misc: {...}, keyboard: {...}, ... }     │
│ timeTracking         │ { ... }                                   │
│ planner              │ { ... }                                   │
│ boards               │ { ... }                                   │
│ archiveYoung         │ { task: {...}, timeTracking: {...} }      │
│ archiveOld           │ { task: {...}, timeTracking: {...} }      │
│ ...                  │ ...                                       │
└──────────────────────┴───────────────────────────────────────────┘
```
### How Models Are Saved Locally

When a model is saved via `ModelCtrl.save()`:

```typescript
// 1. Data is validated
if (modelCfg.validate) {
  const result = modelCfg.validate(data);
  if (!result.success && modelCfg.repair) {
    data = modelCfg.repair(data); // Auto-repair if possible
  }
}

// 2. Metadata is updated (if requested via isUpdateRevAndLastUpdate)
// Always:
vectorClock = incrementVectorClock(vectorClock, clientId);
lastUpdate = Date.now();

// Only for NON-main-file models (isMainFileModel: false):
if (!modelCfg.isMainFileModel) {
  revMap[modelId] = Date.now().toString();
}
// Main file models are tracked via mainModelData in the meta file, not revMap

// 3. Data is saved to IndexedDB
await db.put('main', data, modelId); // e.g., key='task', value=TaskState
```

**Important distinction:**

- **Main file models** (`isMainFileModel: true`): The vector clock is incremented, but `revMap` is NOT updated. These models are embedded in `mainModelData` within the meta file.
- **Separate model files** (`isMainFileModel: false`): Both the vector clock and `revMap` are updated. The `revMap` entry tracks the revision of the individual remote file.
### 2. Model Control Layer

#### ModelCtrl (`api/model-ctrl/model-ctrl.ts`)

Generic controller for each data model (tasks, projects, tags, etc.):

```typescript
class ModelCtrl<MT extends ModelBase> {
  save(
    data: MT,
    options?: {
      isUpdateRevAndLastUpdate: boolean;
      isIgnoreDBLock?: boolean;
    },
  ): Promise<unknown>;

  load(): Promise<MT>;
  remove(): Promise<unknown>;
}
```

Key behaviors:

- **Validation on save**: Uses Typia for runtime type checking
- **Auto-repair**: Attempts to repair invalid data if a `repair` function is provided
- **In-memory caching**: Keeps data in memory for fast reads
- **Revision tracking**: Updates metadata on save when `isUpdateRevAndLastUpdate` is true
#### MetaModelCtrl (`api/model-ctrl/meta-model-ctrl.ts`)

Manages synchronization metadata:

```typescript
interface LocalMeta {
  lastUpdate: number; // Timestamp of last local change
  lastSyncedUpdate: number | null; // Timestamp of last sync
  metaRev: string | null; // Remote metadata revision
  vectorClock: VectorClock; // Client-specific clock values
  lastSyncedVectorClock: VectorClock | null;
  revMap: RevMap; // Model ID -> revision mapping
  crossModelVersion: number; // Data schema version
}
```

Key responsibilities:

- **Client ID management**: Generates and stores unique client identifiers
- **Vector clock updates**: Increments on local changes
- **Revision map tracking**: Tracks which model versions are synced
### 3. Sync Service Layer

#### SyncService (`api/sync/sync.service.ts`)

Main sync orchestrator. The `sync()` method:

1. **Check readiness**: Verify the sync provider is configured and authenticated
2. **Operation log sync**: Upload/download operation logs (new feature)
3. **Early return check**: If `lastSyncedUpdate === lastUpdate` and the meta revision matches, return `InSync`
4. **Download remote metadata**: Get the current remote state
5. **Determine sync direction**: Compare local and remote states using `getSyncStatusFromMetaFiles`
6. **Execute sync**: Upload, download, or report a conflict

```typescript
async sync(): Promise<{ status: SyncStatus; conflictData?: ConflictData }>
```

Possible sync statuses:

- `InSync` - No changes needed
- `UpdateLocal` - Download needed (remote is newer)
- `UpdateRemote` - Upload needed (local is newer)
- `UpdateLocalAll` / `UpdateRemoteAll` - Full sync needed
- `Conflict` - Concurrent changes detected
- `NotConfigured` - No sync provider set
#### MetaSyncService (`api/sync/meta-sync.service.ts`)

Handles metadata file operations:

- `download()`: Gets remote metadata, checks for locks
- `upload()`: Uploads metadata with encryption
- `lock()`: Creates a lock file during multi-file upload
- `getRev()`: Gets the remote metadata revision

#### ModelSyncService (`api/sync/model-sync.service.ts`)

Handles individual model file operations:

- `upload()`: Uploads a model with encryption
- `download()`: Downloads a model with revision verification
- `remove()`: Deletes a remote model file
- `getModelIdsToUpdateFromRevMaps()`: Determines which models need syncing
### 4. Vector Clock System

#### Purpose

Vector clocks provide **causality-based conflict detection** for distributed systems. Unlike simple timestamps:

- They detect **concurrent changes** (true conflicts)
- They preserve **happened-before relationships**
- They work without synchronized clocks

#### Implementation (`api/util/vector-clock.ts`)

```typescript
interface VectorClock {
  [clientId: string]: number; // Maps client ID to update count
}

enum VectorClockComparison {
  EQUAL, // Same state
  LESS_THAN, // A happened before B
  GREATER_THAN, // B happened before A
  CONCURRENT, // True conflict - both changed independently
}
```

Key operations:

- `incrementVectorClock(clock, clientId)` - Increment on local change
- `mergeVectorClocks(a, b)` - Take the max of each component
- `compareVectorClocks(a, b)` - Determine the relationship
- `hasVectorClockChanges(current, reference)` - Check for local changes
- `limitVectorClockSize(clock, clientId)` - Prune to a max of 50 clients
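The three core operations above can be implemented in a few lines. This is a runnable sketch matching the semantics described (increment, component-wise max merge, pairwise comparison), not the project's actual `vector-clock.ts`.

```typescript
// Runnable sketch of the vector-clock operations listed above.
type VectorClock = Record<string, number>;

enum VectorClockComparison {
  EQUAL = 'EQUAL',
  LESS_THAN = 'LESS_THAN',
  GREATER_THAN = 'GREATER_THAN',
  CONCURRENT = 'CONCURRENT',
}

const incrementVectorClock = (clock: VectorClock, clientId: string): VectorClock => ({
  ...clock,
  [clientId]: (clock[clientId] ?? 0) + 1,
});

const mergeVectorClocks = (a: VectorClock, b: VectorClock): VectorClock => {
  const merged: VectorClock = { ...a };
  for (const [clientId, count] of Object.entries(b)) {
    merged[clientId] = Math.max(merged[clientId] ?? 0, count);
  }
  return merged;
};

const compareVectorClocks = (a: VectorClock, b: VectorClock): VectorClockComparison => {
  let aHasGreater = false;
  let bHasGreater = false;
  // Missing components are treated as 0.
  for (const clientId of Object.keys({ ...a, ...b })) {
    const av = a[clientId] ?? 0;
    const bv = b[clientId] ?? 0;
    if (av > bv) aHasGreater = true;
    if (bv > av) bHasGreater = true;
  }
  if (aHasGreater && bHasGreater) return VectorClockComparison.CONCURRENT;
  if (aHasGreater) return VectorClockComparison.GREATER_THAN;
  if (bHasGreater) return VectorClockComparison.LESS_THAN;
  return VectorClockComparison.EQUAL;
};
```

For example, `{A: 5, B: 3}` vs `{A: 4, B: 5}` is `CONCURRENT`: each side is ahead for one component, so neither dominates.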
#### Sync Status Determination (`api/util/get-sync-status-from-meta-files.ts`)

```typescript
function getSyncStatusFromMetaFiles(remote: RemoteMeta, local: LocalMeta) {
  // 1. Check for empty local/remote
  // 2. Compare vector clocks
  // 3. Return appropriate SyncStatus
}
```

The algorithm (simplified - the actual implementation has more nuances):

1. **Empty data checks:**

   - If remote has no data (`isRemoteDataEmpty`), return `UpdateRemoteAll`
   - If local has no data (`isLocalDataEmpty`), return `UpdateLocalAll`

2. **Vector clock validation:**

   - If either local or remote lacks a vector clock, return `Conflict` with reason `NoLastSync`
   - Both `vectorClock` and `lastSyncedVectorClock` must be present

3. **Change detection using `hasVectorClockChanges`:**

   - Local changes: compare the current `vectorClock` vs `lastSyncedVectorClock`
   - Remote changes: compare the remote `vectorClock` vs the local `lastSyncedVectorClock`

4. **Sync status determination:**
   - No local changes + no remote changes → `InSync`
   - Local changes only → `UpdateRemote`
   - Remote changes only → `UpdateLocal`
   - Both have changes → `Conflict` with reason `BothNewerLastSync`

**Note:** The actual implementation also handles edge cases like minimal-update bootstrap scenarios and validates that clocks are properly initialized.
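Steps 3–4 above can be condensed into a small runnable sketch: detect local/remote changes by comparing each side's current clock against the last-synced clock, then map the two booleans onto a status. The empty-data and missing-clock edge cases (steps 1–2) are deliberately omitted, and the function names are illustrative.

```typescript
// Condensed sketch of steps 3–4 of the algorithm described above.
type VectorClock = Record<string, number>;

type SyncStatus = 'InSync' | 'UpdateRemote' | 'UpdateLocal' | 'Conflict';

// True if `current` has advanced past `reference` for any client.
const hasVectorClockChanges = (current: VectorClock, reference: VectorClock): boolean =>
  Object.keys(current).some((id) => current[id] > (reference[id] ?? 0));

const getSyncStatus = (
  localClock: VectorClock,
  remoteClock: VectorClock,
  lastSyncedClock: VectorClock,
): SyncStatus => {
  const hasLocalChanges = hasVectorClockChanges(localClock, lastSyncedClock);
  const hasRemoteChanges = hasVectorClockChanges(remoteClock, lastSyncedClock);
  if (hasLocalChanges && hasRemoteChanges) return 'Conflict';
  if (hasLocalChanges) return 'UpdateRemote';
  if (hasRemoteChanges) return 'UpdateLocal';
  return 'InSync';
};
```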
### 5. Sync Providers

#### Interface (`api/sync/sync-provider.interface.ts`)

```typescript
interface SyncProviderServiceInterface<PID extends SyncProviderId> {
  id: PID;
  isUploadForcePossible?: boolean;
  isLimitedToSingleFileSync?: boolean;
  maxConcurrentRequests: number;

  getFileRev(targetPath: string, localRev: string | null): Promise<FileRevResponse>;
  downloadFile(targetPath: string): Promise<FileDownloadResponse>;
  uploadFile(
    targetPath: string,
    dataStr: string,
    revToMatch: string | null,
    isForceOverwrite?: boolean,
  ): Promise<FileRevResponse>;
  removeFile(targetPath: string): Promise<void>;
  listFiles?(targetPath: string): Promise<string[]>;
  isReady(): Promise<boolean>;
  setPrivateCfg(privateCfg): Promise<void>;
}
```
#### Available Providers

| Provider      | Description                 | Force Upload | Max Concurrent |
| ------------- | --------------------------- | ------------ | -------------- |
| **Dropbox**   | OAuth2 PKCE authentication  | Yes          | 4              |
| **WebDAV**    | Nextcloud, ownCloud, etc.   | No           | 10             |
| **LocalFile** | Electron/Android filesystem | No           | 10             |
| **SuperSync** | WebDAV-based custom sync    | No           | 10             |
### 6. Data Encryption & Compression

#### EncryptAndCompressHandlerService

Handles data transformation before upload/after download:

- **Compression**: Compresses payloads to reduce data size
- **Encryption**: AES encryption with a user-provided key

The data format prefix `pf_` indicates processed data.
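A compress-then-encrypt pipeline like the one described can be sketched as follows. This is an assumption-laden illustration, not the real `EncryptAndCompressHandlerService`: the specific algorithms (gzip, AES-256-GCM, scrypt key derivation) and the salt/IV/tag packing are choices made for this sketch only.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from 'node:crypto';
import { gzipSync, gunzipSync } from 'node:zlib';

// Illustrative pipeline: gzip the JSON, then AES-256-GCM-encrypt it with a
// key derived from the user's passphrase; pack salt + iv + auth tag + data.
const encryptAndCompress = (plainJson: string, password: string): string => {
  const compressed = gzipSync(Buffer.from(plainJson, 'utf8'));
  const salt = randomBytes(16);
  const iv = randomBytes(12);
  const key = scryptSync(password, salt, 32);
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const encrypted = Buffer.concat([cipher.update(compressed), cipher.final()]);
  const tag = cipher.getAuthTag(); // 16-byte GCM auth tag
  return Buffer.concat([salt, iv, tag, encrypted]).toString('base64');
};

const decryptAndDecompress = (packed: string, password: string): string => {
  const buf = Buffer.from(packed, 'base64');
  const salt = buf.subarray(0, 16);
  const iv = buf.subarray(16, 28);
  const tag = buf.subarray(28, 44);
  const encrypted = buf.subarray(44);
  const key = scryptSync(password, salt, 32);
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  const compressed = Buffer.concat([decipher.update(encrypted), decipher.final()]);
  return gunzipSync(compressed).toString('utf8');
};
```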
### 7. Migration System

#### MigrationService (`api/migration/migration.service.ts`)

Handles data schema evolution:

- Checks the version on app startup
- Applies cross-model migrations sequentially in order
- **Only supports forward (upgrade) migrations** - throws `CanNotMigrateMajorDownError` if the data version is higher than the code version (major version mismatch)

```typescript
interface CrossModelMigrations {
  [version: number]: (fullData) => transformedData;
}
```

**Migration behavior:**

- If `dataVersion === codeVersion`: No migration needed
- If `dataVersion < codeVersion`: Run all migrations from `dataVersion` to `codeVersion`
- If `dataVersion > codeVersion` (major version differs): Throws an error - downgrade not supported

Current version: `4.4` (from `pfapi-config.ts`)
### 8. Validation & Repair

#### Validation

Uses **Typia** for runtime type validation:

- Each model can define a `validate` function
- Returns `IValidation<T>` with a success flag and errors

#### Repair

Auto-repair system for corrupted data:

- Each model can define a `repair` function
- Applied when validation fails
- Falls back to an error if repair fails
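The validate → repair → error fallback can be sketched generically. The `IValidation` shape loosely mirrors Typia's result type; the helper name and the example model config are illustrative, not the project's API.

```typescript
// Generic sketch of the validate-then-repair flow described above.
interface IValidation<T> {
  success: boolean;
  errors?: string[];
}

interface ModelCfg<T> {
  validate: (data: T) => IValidation<T>;
  repair?: (data: T) => T;
}

const validateOrRepair = <T>(cfg: ModelCfg<T>, data: T): T => {
  const result = cfg.validate(data);
  if (result.success) return data;
  if (cfg.repair) {
    const repaired = cfg.repair(data);
    if (cfg.validate(repaired).success) return repaired; // repair succeeded
  }
  throw new Error(`Validation failed: ${(result.errors ?? []).join('; ')}`);
};

// Example model: a state whose `ids` must all exist in `entities`.
type State = { ids: string[]; entities: Record<string, unknown> };
const exampleCfg: ModelCfg<State> = {
  validate: (s) => ({
    success: s.ids.every((id) => id in s.entities),
    errors: ['orphaned id'],
  }),
  repair: (s) => ({ ...s, ids: s.ids.filter((id) => id in s.entities) }),
};
```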
## Sync Flow Diagrams

### Normal Sync Flow

```
┌─────────┐          ┌─────────┐          ┌─────────┐
│ Device A│          │ Remote  │          │ Device B│
└────┬────┘          └────┬────┘          └────┬────┘
     │                    │                    │
     │ 1. sync()          │                    │
     ├───────────────────►│                    │
     │                    │                    │
     │ 2. download        │                    │
     │    metadata        │                    │
     │◄───────────────────┤                    │
     │                    │                    │
     │ 3. compare         │                    │
     │    vector clocks   │                    │
     │                    │                    │
     │ 4. upload          │                    │
     │    changes         │                    │
     ├───────────────────►│                    │
     │                    │                    │
     │                    │ 5. sync()          │
     │                    │◄───────────────────┤
     │                    │                    │
     │                    │ 6. download        │
     │                    │    metadata        │
     │                    ├───────────────────►│
     │                    │                    │
     │                    │ 7. download        │
     │                    │    changed         │
     │                    │    models          │
     │                    ├───────────────────►│
```
### Conflict Detection Flow

```
┌─────────┐                            ┌─────────┐
│ Device A│                            │ Device B│
│ VC: {A:5, B:3}                       │ VC: {A:4, B:5}
└────┬────┘                            └────┬────┘
     │                                      │
     │ Both made changes offline            │
     │                                      │
     │  ┌───────────────────────────────────┼─────────────────────────┐
     │  │ Compare: CONCURRENT               │                         │
     │  │ A has A:5 (higher)                │ B has B:5 (higher)      │
     │  │ Neither dominates                 │                         │
     │  └───────────────────────────────────┴─────────────────────────┘
     │                                      │
     │ Conflict!                            │
     │ User must choose which               │
     │ version to keep                      │
```
### Multi-File Upload with Locking

```
┌─────────┐          ┌─────────┐
│ Client  │          │ Remote  │
└────┬────┘          └────┬────┘
     │                    │
     │ 1. Create lock     │
     │    (upload lock    │
     │     content)       │
     ├───────────────────►│
     │                    │
     │ 2. Upload          │
     │    model A         │
     ├───────────────────►│
     │                    │
     │ 3. Upload          │
     │    model B         │
     ├───────────────────►│
     │                    │
     │ 4. Upload          │
     │    metadata        │
     │    (replaces lock) │
     ├───────────────────►│
     │                    │
     │ Lock released      │
```
## Remote Storage Structure

The remote storage (Dropbox, WebDAV, local folder) contains multiple files. The structure is designed to optimize sync performance by separating frequently-changed small data from large archives.

### Remote Files Overview

```
/ (or /DEV/ in development)
├── __meta_           # Metadata file (REQUIRED - always synced first)
├── globalConfig      # User settings
├── menuTree          # Menu structure
├── issueProvider     # Issue tracker configurations
├── metric            # Metrics data
├── improvement       # Improvement entries
├── obstruction       # Obstruction entries
├── pluginUserData    # Plugin user data
├── pluginMetadata    # Plugin metadata
├── archiveYoung      # Recent archived tasks (can be large)
└── archiveOld        # Old archived tasks (can be very large)
```
### The Meta File (`__meta_`)

The meta file is the **central coordination file** for sync. It contains:

1. **Sync metadata** (vector clock, timestamps, version)
2. **Revision map** (`revMap`) - tracks which revision each model file has
3. **Main file model data** - frequently-accessed data embedded directly

```typescript
interface RemoteMeta {
  // Sync coordination
  lastUpdate: number; // When data was last changed
  crossModelVersion: number; // Schema version (e.g., 4.4)
  vectorClock: VectorClock; // For conflict detection
  revMap: RevMap; // Model ID -> revision string

  // Embedded data (main file models)
  mainModelData: {
    task: TaskState;
    project: ProjectState;
    tag: TagState;
    note: NoteState;
    timeTracking: TimeTrackingState;
    simpleCounter: SimpleCounterState;
    taskRepeatCfg: TaskRepeatCfgState;
    reminders: Reminder[];
    planner: PlannerState;
    boards: BoardsState;
  };

  // For single-file sync providers
  isFullData?: boolean; // If true, all data is in this file
}
```
### Main File Models vs Separate Model Files

Models are categorized into two types:

#### Main File Models (`isMainFileModel: true`)

These are embedded in the `__meta_` file's `mainModelData` field:

| Model           | Reason                                |
| --------------- | ------------------------------------- |
| `task`          | Frequently accessed, relatively small |
| `project`       | Core data, always needed              |
| `tag`           | Small, frequently referenced          |
| `note`          | Often viewed together with tasks      |
| `timeTracking`  | Frequently updated                    |
| `simpleCounter` | Small, frequently updated             |
| `taskRepeatCfg` | Needed for task creation              |
| `reminders`     | Small array, time-critical            |
| `planner`       | Viewed on app startup                 |
| `boards`        | Part of main UI                       |

**Benefits:**

- Single HTTP request to get all core data
- Atomic update of related models
- Faster initial sync
#### Separate Model Files (`isMainFileModel: false` or undefined)

These are stored as individual files:

| Model                                  | Reason                                      |
| -------------------------------------- | ------------------------------------------- |
| `globalConfig`                         | User-specific, rarely synced                |
| `menuTree`                             | UI state, not critical                      |
| `issueProvider`                        | Contains credentials, separate for security |
| `metric`, `improvement`, `obstruction` | Historical data, can grow large             |
| `archiveYoung`                         | Can be large, changes infrequently          |
| `archiveOld`                           | Very large, rarely accessed                 |
| `pluginUserData`, `pluginMetadata`     | Plugin-specific, isolated                   |

**Benefits:**

- Only download what changed (via `revMap` comparison)
- Large files (archives) don't slow down regular sync
- Can sync individual models independently
### RevMap: Tracking Model Versions

The `revMap` tracks which version of each separate model file is on the remote:

```typescript
interface RevMap {
  [modelId: string]: string; // Model ID -> revision/timestamp
}

// Example
{
  "globalConfig": "1701234567890",
  "menuTree": "1701234567891",
  "archiveYoung": "1701234500000",
  "archiveOld": "1701200000000",
  // ... (main file models NOT included - they're in mainModelData)
}
```

When syncing:

1. Download the `__meta_` file
2. Compare the remote `revMap` with the local `revMap`
3. Only download model files where the revision differs
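Steps 2–3 above reduce to a one-line diff over the two revMaps. A minimal sketch (the function name is illustrative; models present remotely but missing locally also count as changed):

```typescript
// Given remote and local revMaps, compute which separate model files
// actually need downloading.
type RevMap = Record<string, string>;

const getModelIdsToDownload = (remote: RevMap, local: RevMap): string[] =>
  Object.keys(remote).filter((modelId) => remote[modelId] !== local[modelId]);
```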
### Upload Flow

```
┌─────────────────────────────────────────────────────────────────────────┐
│                              UPLOAD FLOW                                │
└─────────────────────────────────────────────────────────────────────────┘

1. Determine what changed (compare local/remote revMaps)
   local.revMap:  { archiveYoung: "100", globalConfig: "200" }
   remote.revMap: { archiveYoung: "100", globalConfig: "150" }
   → globalConfig needs upload

2. For multi-file upload, create lock:
   Upload to __meta_: "SYNC_IN_PROGRESS__BCLm1abc123_12_5"

3. Upload changed model files:
   Upload to globalConfig: { encrypted/compressed data }
   → Get new revision: "250"

4. Upload metadata (replaces lock):
   Upload to __meta_: {
     lastUpdate: 1701234567890,
     vectorClock: { "BCLm1abc123_12_5": 42 },
     revMap: { archiveYoung: "100", globalConfig: "250" },
     mainModelData: { task: {...}, project: {...}, ... }
   }
```
### Download Flow

```
┌─────────────────────────────────────────────────────────────────────────┐
│                             DOWNLOAD FLOW                               │
└─────────────────────────────────────────────────────────────────────────┘

1. Download __meta_ file
   → Get mainModelData (task, project, tag, etc.)
   → Get revMap for separate files

2. Compare revMaps:
   remote.revMap: { archiveYoung: "300", globalConfig: "250" }
   local.revMap:  { archiveYoung: "100", globalConfig: "250" }
   → archiveYoung needs download

3. Download changed model files (parallel with load balancing):
   Download archiveYoung → decrypt/decompress → save locally

4. Update local metadata:
   - Save all mainModelData to IndexedDB
   - Save downloaded models to IndexedDB
   - Update local revMap to match remote
   - Merge vector clocks
   - Set lastSyncedUpdate = lastUpdate
```
### Single-File Sync Mode

Some providers (or configurations) use `isLimitedToSingleFileSync: true`. In this mode:

- **All data** is stored in the `__meta_` file
- `mainModelData` contains ALL models, not just main file models
- `isFullData: true` flag is set
- No separate model files are created
- Simpler but less efficient for large datasets

### File Content Format

All files are stored as JSON strings with optional encryption/compression:

```
Raw:        { "ids": [...], "entities": {...} }
  ↓ (if compression enabled)
Compressed: <binary compressed data>
  ↓ (if encryption enabled)
Encrypted:  <AES encrypted data>
  ↓
Prefixed:   "pf_" + <cross_model_version> + "__" + <base64 encoded data>
```

The `pf_` prefix indicates the data has been processed and needs decryption/decompression.

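
The prefixing step can be sketched as follows (a sketch only: the helper names and exact version encoding are assumptions, not the real pfapi code; the compression/encryption transforms are treated as opaque):

```typescript
// Wrap an already-compressed/encrypted base64 payload with the "pf_" prefix
// so a reader knows it must be decoded before JSON.parse.
function wrapFileContent(processedBase64: string, crossModelVersion: number): string {
  return `pf_${crossModelVersion}__${processedBase64}`;
}

// Inverse: detect the prefix and split out version + payload.
// Returns null for plain JSON that needs no decode step.
function unwrapFileContent(raw: string): { version: number; payload: string } | null {
  const match = /^pf_(\d+(?:\.\d+)?)__(.*)$/s.exec(raw);
  if (!match) return null;
  return { version: Number(match[1]), payload: match[2] };
}
```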
## Data Model Configurations

From `pfapi-config.ts`:

| Model            | Main File | Description            |
| ---------------- | --------- | ---------------------- |
| `task`           | Yes       | Tasks data             |
| `timeTracking`   | Yes       | Time tracking records  |
| `project`        | Yes       | Projects               |
| `tag`            | Yes       | Tags                   |
| `simpleCounter`  | Yes       | Simple Counters        |
| `note`           | Yes       | Notes                  |
| `taskRepeatCfg`  | Yes       | Recurring task configs |
| `reminders`      | Yes       | Reminders              |
| `planner`        | Yes       | Planner data           |
| `boards`         | Yes       | Kanban boards          |
| `menuTree`       | No        | Menu structure         |
| `globalConfig`   | No        | User settings          |
| `issueProvider`  | No        | Issue tracker configs  |
| `metric`         | No        | Metrics data           |
| `improvement`    | No        | Metric improvements    |
| `obstruction`    | No        | Metric obstructions    |
| `pluginUserData` | No        | Plugin user data       |
| `pluginMetadata` | No        | Plugin metadata        |
| `archiveYoung`   | No        | Recent archive         |
| `archiveOld`     | No        | Old archive            |

**Main file models** are stored in the metadata file itself for faster sync of frequently-accessed data.

## Error Handling

Custom error types in `api/errors/errors.ts`:

- **API Errors**: `NoRevAPIError`, `RemoteFileNotFoundAPIError`, `AuthFailSPError`
- **Sync Errors**: `LockPresentError`, `LockFromLocalClientPresentError`, `UnknownSyncStateError`
- **Data Errors**: `DataValidationFailedError`, `ModelValidationError`, `DataRepairNotPossibleError`

## Event System

```typescript
type PfapiEvents =
  | 'syncDone' // Sync completed
  | 'syncStart' // Sync starting
  | 'syncError' // Sync failed
  | 'syncStatusChange' // Status changed
  | 'metaModelChange' // Metadata updated
  | 'providerChange' // Provider switched
  | 'providerReady' // Provider authenticated
  | 'providerPrivateCfgChange' // Provider credentials updated
  | 'onBeforeUpdateLocal'; // About to download changes
```

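
Consumers subscribe to these events through an emitter-style API. The snippet below is a generic typed-emitter sketch to illustrate the pattern; it is not the actual pfapi event API (a subset of the event union is used for brevity):

```typescript
type PfapiEventName = 'syncDone' | 'syncStart' | 'syncError';
type Listener = (payload?: unknown) => void;

// Minimal typed emitter: listeners are keyed by event name.
class TinyEmitter {
  private listeners = new Map<PfapiEventName, Listener[]>();

  on(ev: PfapiEventName, fn: Listener): void {
    const arr = this.listeners.get(ev) ?? [];
    arr.push(fn);
    this.listeners.set(ev, arr);
  }

  emit(ev: PfapiEventName, payload?: unknown): void {
    (this.listeners.get(ev) ?? []).forEach((fn) => fn(payload));
  }
}

const ev = new TinyEmitter();
ev.on('syncStart', () => console.log('sync starting'));
ev.emit('syncStart');
```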
## Security Considerations

1. **Encryption**: Optional AES encryption with user-provided key
2. **No tracking**: All data stays local unless explicitly synced
3. **Credential storage**: Provider credentials stored in IndexedDB with prefix `__sp_cred_`
4. **OAuth security**: Dropbox uses PKCE flow

## Key Design Decisions

1. **Vector clocks over timestamps**: More reliable conflict detection in distributed systems
2. **Main file models**: Frequently accessed data bundled with metadata for faster sync
3. **Database locking**: Prevents corruption during sync operations
4. **Adapter pattern**: Easy to add new storage backends
5. **Provider abstraction**: Consistent interface across Dropbox, WebDAV, local files
6. **Typia validation**: Runtime type safety without heavy dependencies

## Future Considerations

The system has been extended with **Operation Log Sync** for more granular synchronization at the operation level rather than full model replacement. See `operation-log-architecture.md` for details.

---

docs/op-log/tiered-archive-proposal.md (Normal file, 205 lines)

@@ -0,0 +1,205 @@

# Tiered Archive Model Proposal

**Date:** December 5, 2025
**Status:** Draft

---

## Overview

Introduce a tiered archive system that bounds the operation log to a configurable time window, making full op-log sync viable while preserving historical time tracking data.

---

## Architecture

```
┌─────────────────────────────────────────────────────────┐
│ Active Tasks (~500)                                     │ Op-log synced (real-time)
├─────────────────────────────────────────────────────────┤
│ Recent Archive (0-3 years)                              │ Op-log synced (full data)
├─────────────────────────────────────────────────────────┤
│ Old Archive (3+ years)                                  │ Compressed to time stats
│                                                         │ Device-local only
└─────────────────────────────────────────────────────────┘
```

### Tiers

| Tier           | Age       | Data               | Sync Method        |
| -------------- | --------- | ------------------ | ------------------ |
| Active         | Current   | Full task data     | Op-log (real-time) |
| Recent Archive | 0-3 years | Full task data     | Op-log (real-time) |
| Old Archive    | 3+ years  | Time tracking only | Device-local       |

---

## Configuration

```typescript
interface ArchiveConfig {
  // Years of full task data to keep synced
  // Tasks older than this are converted to time tracking records
  recentArchiveYears: number; // Default: 3
}
```

### Rationale for 3-Year Default

- Covers most practical use cases (searching recent work)
- Bounds synced task count to ~5,500 tasks (assuming 5 tasks/day)
- Keeps op-log manageable for initial sync
- Still preserves time tracking data indefinitely

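
The ~5,500 figure follows directly from the stated assumption of 5 tasks per day:

```typescript
// Upper bound on the number of tasks kept in synced storage under the
// proposal's assumptions (ignoring leap days).
function maxSyncedTasks(years: number, tasksPerDay: number): number {
  return years * 365 * tasksPerDay;
}

maxSyncedTasks(3, 5); // 5475, i.e. the "~5,500 tasks" quoted above
```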
---

## Data Model

### Recent Archive (Synced)

Full `TaskWithSubTasks` data, same as today.

### Old Archive (Compressed)

```typescript
interface TimeTrackingRecord {
  date: string; // YYYY-MM-DD
  projectId?: string;
  tagIds: string[];
  timeSpent: number; // milliseconds
}

interface OldArchiveModel {
  // Aggregated time tracking data
  timeTracking: TimeTrackingRecord[];

  // Summary stats
  totalTasksConverted: number;
  oldestConvertedDate: string;
}
```

### Size Comparison

| Model                              | 10 Years of Data        |
| ---------------------------------- | ----------------------- |
| Full tasks (current)               | ~40MB (20K tasks × 2KB) |
| Tiered (3yr full + 7yr compressed) | ~12MB + ~250KB          |

---

## Implementation

### Conversion Trigger

Run during daily archive flush:

```typescript
async flushArchive(): Promise<void> {
  // Existing flush logic...

  // After flush, check for tasks to convert
  await this.convertOldArchiveTasks();
}

async convertOldArchiveTasks(): Promise<void> {
  const cutoffDate = subYears(new Date(), config.recentArchiveYears);
  const tasksToConvert = await this.getTasksArchivedBefore(cutoffDate);

  if (tasksToConvert.length === 0) return;

  // Extract time tracking data
  const timeRecords = tasksToConvert.flatMap((task) =>
    Object.entries(task.timeSpentOnDay).map(([date, ms]) => ({
      date,
      projectId: task.projectId,
      tagIds: task.tagIds,
      timeSpent: ms,
    })),
  );

  // Append to old archive
  await this.appendToOldArchive(timeRecords);

  // Remove from recent archive
  await this.removeFromRecentArchive(tasksToConvert.map((t) => t.id));
}
```

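
The `flatMap` above emits one record per task per day, so records sharing the same date and project could additionally be merged to keep the old archive compact. A hypothetical aggregation step (not part of the proposal's code):

```typescript
interface TimeTrackingRecord {
  date: string; // YYYY-MM-DD
  projectId?: string;
  tagIds: string[];
  timeSpent: number; // milliseconds
}

// Merge records that share date + projectId, summing timeSpent and
// unioning tagIds.
function aggregateRecords(records: TimeTrackingRecord[]): TimeTrackingRecord[] {
  const byKey = new Map<string, TimeTrackingRecord>();
  for (const r of records) {
    const key = `${r.date}|${r.projectId ?? ''}`;
    const existing = byKey.get(key);
    if (existing) {
      existing.timeSpent += r.timeSpent;
      existing.tagIds = [...new Set([...existing.tagIds, ...r.tagIds])];
    } else {
      byKey.set(key, { ...r, tagIds: [...r.tagIds] });
    }
  }
  return [...byKey.values()];
}
```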
### Op-Log Compaction

With bounded recent archive, compaction becomes straightforward:

1. Snapshot current state (active + recent archive)
2. Discard all ops older than snapshot
3. Old archive is excluded from op-log entirely

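
These steps can be sketched as follows (illustrative; the `Op` shape and sequence numbers are assumptions, not the real operation-log types):

```typescript
interface Op {
  seq: number; // monotonically increasing sequence number
  payload: unknown;
}

interface CompactionResult {
  snapshotSeq: number;
  remainingOps: Op[];
}

// Compact the op log: everything up to and including snapshotSeq is assumed
// captured by the state snapshot, so only newer ops need to be kept.
function compact(ops: Op[], snapshotSeq: number): CompactionResult {
  return {
    snapshotSeq,
    remainingOps: ops.filter((op) => op.seq > snapshotSeq),
  };
}
```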
---

## Migration Path

### Phase 1: Implement Tiered Model

- Add `OldArchiveModel` storage
- Implement conversion logic
- Add configuration option

### Phase 2: Enable by Default

- Set 3-year default
- Run initial conversion on existing archives

### Phase 3: Op-Log Optimization

- Exclude old archive from op-log
- Implement efficient compaction

---

## Trade-offs

### What Users Lose (for 3+ year old tasks)

- Task titles and details
- Notes and attachments
- Issue links
- Ability to restore individual tasks

### What Users Keep

- Time tracking per day/project/tag (for reports)
- Summary statistics

### Mitigation

- 3-year default is generous
- Configurable for users who need more
- Time tracking data (the main value) is preserved

---

## Open Questions

1. **Should old archive sync via PFAPI?**

   - Pro: Data available on all devices
   - Con: Adds complexity, defeats purpose of bounding sync
   - Recommendation: Device-local only (users can export/import manually)

2. **Count-based alternative?**

   - Instead of years, keep last N tasks (e.g., 5000)
   - More predictable performance characteristics
   - Could offer both options

3. **What about subtasks?**

   - Convert parent and subtasks together
   - Aggregate time tracking at parent level?

---

## Success Metrics

- Op-log initial sync < 10 seconds for typical users
- Archive operation payload < 100KB
- Memory usage stable regardless of total historical tasks

@@ -2,6 +2,6 @@

  > **Note:** This document has been moved to the canonical location. Please see:
  >
- > **[/src/app/core/persistence/operation-log/docs/hybrid-manifest-architecture.md](/src/app/core/persistence/operation-log/docs/hybrid-manifest-architecture.md)**
+ > **[/docs/op-log/hybrid-manifest-architecture.md](/docs/op-log/hybrid-manifest-architecture.md)**

  This redirect exists for historical reference. All updates should be made to the canonical document.

docs/sync/pfapi-sync-overview.md (Normal file, 156 lines)

@@ -0,0 +1,156 @@

# Sync System Overview (PFAPI)

**Last Updated:** December 2025

This directory contains the **legacy PFAPI** synchronization implementation for Super Productivity. This system enables data sync across devices through file-based providers (Dropbox, WebDAV, Local File).

> **Note:** Super Productivity now has **two sync systems** running in parallel:
>
> 1. **PFAPI Sync** (this directory) - File-based sync via Dropbox/WebDAV
> 2. **Operation Log Sync** - Operation-based sync via SuperSync Server
>
> See [Operation Log Architecture](/docs/op-log/operation-log-architecture.md) for the newer operation-based system.

## Key Components

### Core Services

- **`sync.service.ts`** - Main orchestrator for sync operations
- **`meta-sync.service.ts`** - Handles sync metadata file operations
- **`model-sync.service.ts`** - Manages individual model synchronization
- **`conflict-handler.service.ts`** - User interface for conflict resolution

### Sync Providers

Located in `sync-providers/`:

- Dropbox
- WebDAV
- Local File System

### Sync Algorithm

The sync system uses vector clocks for accurate conflict detection:

1. **Physical Timestamps** (`lastUpdate`) - For ordering events
2. **Vector Clocks** (`vectorClock`) - For accurate causality tracking and conflict detection
3. **Sync State** (`lastSyncedUpdate`, `lastSyncedVectorClock`) - To track last successful sync

## How Sync Works

### 1. Change Detection

When a user modifies data:

```typescript
// In meta-model-ctrl.ts
lastUpdate = Date.now();
vectorClock[clientId] = vectorClock[clientId] + 1;
```

### 2. Sync Status Determination

The system compares local and remote metadata to determine:

- **InSync**: No changes needed
- **UpdateLocal**: Download remote changes
- **UpdateRemote**: Upload local changes
- **Conflict**: Both have changes (requires user resolution)

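
A simplified decision function based on vector clock comparison (a sketch; the real logic in `sync.service.ts` handles additional cases, and the helper here is an illustrative pairwise comparison, not the actual implementation):

```typescript
type VectorClock = { [clientId: string]: number };
type SyncStatus = 'InSync' | 'UpdateLocal' | 'UpdateRemote' | 'Conflict';

// Pairwise comparison: missing entries count as 0.
function compareVectorClocks(
  local: VectorClock,
  remote: VectorClock,
): 'EQUAL' | 'LOCAL_AHEAD' | 'REMOTE_AHEAD' | 'CONCURRENT' {
  const ids = new Set([...Object.keys(local), ...Object.keys(remote)]);
  let localAhead = false;
  let remoteAhead = false;
  for (const id of ids) {
    const l = local[id] ?? 0;
    const r = remote[id] ?? 0;
    if (l > r) localAhead = true;
    if (r > l) remoteAhead = true;
  }
  if (localAhead && remoteAhead) return 'CONCURRENT';
  if (localAhead) return 'LOCAL_AHEAD';
  if (remoteAhead) return 'REMOTE_AHEAD';
  return 'EQUAL';
}

function determineSyncStatus(local: VectorClock, remote: VectorClock): SyncStatus {
  switch (compareVectorClocks(local, remote)) {
    case 'LOCAL_AHEAD':
      return 'UpdateRemote'; // only we have new changes -> upload
    case 'REMOTE_AHEAD':
      return 'UpdateLocal'; // only remote has new changes -> download
    case 'CONCURRENT':
      return 'Conflict'; // independent changes -> user must resolve
    default:
      return 'InSync';
  }
}
```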
### 3. Conflict Detection

Uses vector clocks for accurate detection:

```typescript
const comparison = compareVectorClocks(localVector, remoteVector);
if (comparison === VectorClockComparison.CONCURRENT) {
  // True conflict - changes were made independently
}
```

### 4. Data Transfer

- **Upload**: Sends changed models and updated metadata
- **Download**: Retrieves and merges remote changes
- **Conflict Resolution**: User chooses which version to keep

## Key Files

### Metadata Structure

```typescript
interface LocalMeta {
  lastUpdate: number; // Physical timestamp
  lastSyncedUpdate: number; // Last synced timestamp
  vectorClock?: VectorClock; // Causality tracking
  lastSyncedVectorClock?: VectorClock; // Last synced vector clock
  revMap: RevMap; // Model revision map
  crossModelVersion: number; // Schema version
}
```

### Important Considerations

1. **Vector Clocks**: Each client maintains its own counter for accurate causality tracking
2. **Backwards Compatibility**: Supports migration from older versions
3. **Conflict Minimization**: Vector clocks eliminate false conflicts
4. **Atomic Operations**: Meta file serves as transaction coordinator

## Common Sync Scenarios

### Scenario 1: Simple Update

1. Device A makes changes
2. Device A uploads to cloud
3. Device B downloads changes
4. Both devices now in sync

### Scenario 2: Conflict Resolution

1. Device A and B both make changes
2. Device A syncs first
3. Device B detects conflict
4. User chooses which version to keep
5. Chosen version propagates to all devices

### Scenario 3: Multiple Devices

1. Devices A, B, C all synced
2. Device A makes changes while offline
3. Device B makes different changes
4. Device C acts as intermediary
5. Vector clocks ensure proper ordering

## Debugging Sync Issues

1. Enable verbose logging in `pfapi/api/util/log.ts`
2. Check vector clock states in sync status
3. Verify client IDs are stable
4. Ensure metadata files are properly updated

## Integration with Operation Log

When using file-based sync (Dropbox, WebDAV), the Operation Log system maintains compatibility through:

1. **Vector Clock Updates**: `VectorClockFacadeService` updates the PFAPI meta-model's vector clock when operations are persisted locally
2. **State Source**: PFAPI reads current state from NgRx store (not from operation log IndexedDB)
3. **Bridge Effect**: `updateModelVectorClock$` in `operation-log.effects.ts` ensures clocks stay in sync

This allows users to:

- Use file-based sync (Dropbox/WebDAV) while benefiting from Operation Log's local persistence
- Migrate between sync providers without data loss

## Future Direction

The PFAPI sync system is **stable but not receiving new features**. New sync features are being developed in the Operation Log system:

- ✅ Entity-level conflict resolution (Operation Log)
- ✅ Incremental sync (Operation Log)
- 📋 Planned: Deprecate file-based sync in favor of Operation Log with file fallback

## Related Documentation

- [Vector Clocks](./vector-clocks.md) - Conflict detection implementation
- [Operation Log Architecture](/docs/op-log/operation-log-architecture.md) - Newer operation-based sync
- [Hybrid Manifest Architecture](/docs/op-log/hybrid-manifest-architecture.md) - File-based Operation Log sync

@@ -2,6 +2,6 @@

  > **Note:** This document has been moved to the canonical location. Please see:
  >
- > **[/src/app/core/persistence/operation-log/docs/pfapi-sync-persistence-architecture.md](/src/app/core/persistence/operation-log/docs/pfapi-sync-persistence-architecture.md)**
+ > **[/docs/op-log/pfapi-sync-persistence-architecture.md](/docs/op-log/pfapi-sync-persistence-architecture.md)**

  This redirect exists for historical reference. All updates should be made to the canonical document.

@@ -8,8 +8,8 @@ Super Productivity uses vector clocks to provide accurate conflict detection and

  > **Related Documentation:**
  >
- > - [Operation Log Architecture](/src/app/core/persistence/operation-log/docs/operation-log-architecture.md) - How vector clocks are used in the operation log
- > - [Operation Log Diagrams](/src/app/core/persistence/operation-log/docs/operation-log-architecture-diagrams.md) - Visual diagrams including conflict detection
+ > - [Operation Log Architecture](/docs/op-log/operation-log-architecture.md) - How vector clocks are used in the operation log
+ > - [Operation Log Diagrams](/docs/op-log/operation-log-architecture-diagrams.md) - Visual diagrams including conflict detection

  ## Table of Contents

@@ -302,4 +302,4 @@ The Operation Log system uses vector clocks in several ways:

  3. **Conflict Detection**: `detectConflicts()` compares clocks between pending local ops and remote ops
  4. **SYNC_IMPORT Handling**: Vector clock dominance filtering determines which ops to replay after full state imports

- For detailed information, see [Operation Log Architecture - Part C: Server Sync](/src/app/core/persistence/operation-log/docs/operation-log-architecture.md#part-c-server-sync).
+ For detailed information, see [Operation Log Architecture - Part C: Server Sync](/docs/op-log/operation-log-architecture.md#part-c-server-sync).