mirror of
https://github.com/photoprism/photoprism.git
synced 2026-01-22 18:18:39 +00:00
PhotoPrism — pkg/fs
Last Updated: November 25, 2025
Overview
pkg/fs provides safe, cross-platform filesystem helpers used across PhotoPrism. It supplies permission constants, copy/move utilities with force-aware semantics, safe path joins, archive extraction with size limits, MIME and extension lookups, hashing, canonical path casing, and fast directory walking with ignore lists.
Goals
- Offer reusable, side-effect-safe filesystem helpers that other packages can call without importing `internal/*`.
- Enforce shared permission defaults (`ModeDir`, `ModeFile`, `ModeConfigFile`, `ModeSecretFile`, `ModeBackupFile`).
- Protect against common filesystem attacks (path traversal, overwrite of non-empty files without `force`, unsafe zip extraction).
- Provide consistent file-type detection (extensions/MIME), hashing, and fast walkers with skip logic for caches and `.ppstorage` markers.
Non-Goals
- Database migrations or metadata parsing (handled elsewhere).
- Edition-specific behavior; all helpers are edition-agnostic.
Package Layout (Code Map)
- Permissions & paths: `mode.go`, `filepath.go`, `canonical.go`, `case.go`.
- Copy/Move & write helpers: `copy_move.go`, `write.go`, `cache.go`, `purge.go`.
- Archive extraction: `zip.go` (size limits, safe join), tests in `zip_test.go`.
- File info & types: `file_type*.go`, `mime.go`, `file_ext*.go`, `name.go`.
- Hashing & IDs: `hash.go`, `id.go`.
- Walkers & ignore rules: `walk.go`, `ignore.go`, `done.go`.
- Utilities: `bytes.go`, `resolve.go`, `symlink.go`, `modtime.go`, `readlines.go`.
Usage & Test Guidelines
- Overwrite semantics: pass `force=true` only when the caller explicitly confirmed replacement; empty files may be replaced without `force`.
- Permissions: use provided mode constants; do not mix with stdlib `io/fs` bits.
- Zip extraction: always set `fileSizeLimit`/`totalSizeLimit` in `Unzip` for untrusted inputs; ensure tests cover path traversal and size caps (see `zip_test.go`).
- Focused tests: `go test ./pkg/fs -run 'Copy|Move|Unzip|Write' -count=1` keeps feedback quick; full package: `go test ./pkg/fs -count=1`.
Recent Changes & Improvements
- Hardened `safeJoin`: normalize `\\`/`/` separators, use `filepath.Rel` to reject paths escaping `baseDir`, and keep volume/absolute checks.
- Added an optional max-entries guard in `Unzip`; `totalSizeLimit=0` is now treated as “no limit”, and `-1` is documented as unlimited.
- Added pooled copy buffers (128–256 KiB) used with `io.CopyBuffer` in `Copy`, `Hash`, `Checksum`, and `WriteFileFromReader` to cut allocations and GC pressure.
Pool Copy Buffers
- Read/write iterations per 4 GiB file:
  - Before: 131,072 iterations (4 GiB / 32 KiB).
  - After: 16,384 iterations (4 GiB / 256 KiB).
  - ~8× fewer syscalls and less loop bookkeeping.
- Latency saved (order-of-magnitude):
  - If each read+write pair costs ~2 µs of syscall/loop overhead, skipping ~115k iterations saves ≈0.23 s on a 4 GiB stream.
  - On SSD/NVMe where disk I/O dominates, expect ~5–10% throughput gain; on spinning disks or network mounts with higher syscall cost, closer to 10–20% is realistic.
  - CPU-bound hashing (SHA-1) sees mostly overhead reduction; the hash itself stays the dominant cost, but ~8× fewer buffer-boundary checks and syscalls typically yield an improvement of a few percent.
- Allocation/GC savings:
  - Before: each call allocated a fresh 32 KiB buffer; hashing and copy both did this per invocation.
  - After: a pooled 256 KiB buffer is reused, so these paths make effectively zero steady-state allocations. This is most noticeable when hashing or copying many files in a batch (less GC pressure, fewer pauses).
- Net effect on large video files (several GB):
  - Wall-clock improvement: modest but measurable (sub-second on SSDs; up to a couple of seconds on slower media per 4 GiB).
  - CPU usage: a few percentage points lower due to fewer syscalls and eliminated buffer allocations.
  - GC: reduced garbage-collection churn during bulk imports/hashes.