Mirror of https://github.com/photoprism/photoprism.git · synced 2026-01-23 02:24:24 +00:00

AI: Use OLLAMA_API_KEY as API auth token if specified #5361
Signed-off-by: Michael Mayer <michael@photoprism.app>

Parent d4aef5cf49 · Commit 2660bacdec

14 changed files with 367 additions and 199 deletions

@@ -53,41 +53,59 @@ The `vision.yml` file is usually kept in the `storage/config` directory (overrid
The model `Options` adjust model parameters such as temperature, top-p, and schema constraints when using [Ollama](ollama/README.md) or [OpenAI](openai/README.md):

| Option            | Default                                                                                  | Description                                                                                          |
|-------------------|------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------|
| `Temperature`     | engine default (`0.1` for Ollama)                                                          | Controls randomness with a value between `0.01` and `2.0`; not used for OpenAI's GPT-5.               |
| `TopK`            | engine default (model-specific)                                                            | Limits sampling to the top K tokens to reduce rare or noisy outputs.                                  |
| `TopP`            | engine default (`0.9` for some Ollama label defaults; unset for OpenAI)                    | Nucleus sampling; keeps the smallest token set whose cumulative probability ≥ `p`.                    |
| `MinP`            | engine default (unset unless provided)                                                     | Drops tokens whose probability mass is below `p`, trimming the long tail.                             |
| `TypicalP`        | engine default (unset unless provided)                                                     | Keeps tokens whose typicality is under the threshold; combine with `TopP`/`MinP` to shape sampling.   |
| `Seed`            | random per run (unless set)                                                                | Set a fixed value for reproducible outputs; leave unset for more variety between runs.                |
| `RepeatLastN`     | engine default (model-specific)                                                            | Number of recent tokens considered for repetition penalties.                                          |
| `RepeatPenalty`   | engine default (model-specific)                                                            | Multiplier >1 discourages repeating the same tokens or phrases.                                       |
| `NumPredict`      | engine default (Ollama only)                                                               | Ollama-specific maximum output tokens; serves the same purpose as `MaxOutputTokens`.                  |
| `MaxOutputTokens` | engine default (OpenAI caption 512, labels 1024)                                           | Upper bound on generated tokens; adapters raise low values to defaults.                               |
| `ForceJson`       | engine-specific (`true` for OpenAI labels; `false` for Ollama labels; captions `false`)    | Forces structured output when enabled.                                                                |
| `SchemaVersion`   | derived from schema name                                                                   | Override when coordinating schema migrations.                                                         |
| `Stop`            | engine default                                                                             | Array of stop sequences (e.g., `["\\n\\n"]`).                                                         |
| `NumThread`       | runtime auto                                                                               | Caps CPU threads for local engines.                                                                   |
| `NumCtx`          | engine default                                                                             | Context window length (tokens).                                                                       |

| Option             | Default                                                                                  | Description                                                                                          |
|--------------------|------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------|
| `Temperature`      | engine default (`0.1` for Ollama)                                                          | Controls randomness with a value between `0.01` and `2.0`; not used for OpenAI's GPT-5.               |
| `TopK`             | engine default (model-specific)                                                            | Limits sampling to the top K tokens to reduce rare or noisy outputs.                                  |
| `TopP`             | engine default (`0.9` for some Ollama label defaults; unset for OpenAI)                    | Nucleus sampling; keeps the smallest token set whose cumulative probability ≥ `p`.                    |
| `MinP`             | engine default (unset unless provided)                                                     | Drops tokens whose probability mass is below `p`, trimming the long tail.                             |
| `TypicalP`         | engine default (unset unless provided)                                                     | Keeps tokens whose typicality is under the threshold; combine with `TopP`/`MinP` to shape sampling.   |
| `Seed`             | random per run (unless set)                                                                | Set a fixed value for reproducible outputs; leave unset for more variety between runs.                |
| `RepeatLastN`      | engine default (model-specific)                                                            | Number of recent tokens considered for repetition penalties.                                          |
| `RepeatPenalty`    | engine default (model-specific)                                                            | Multiplier >1 discourages repeating the same tokens or phrases.                                       |
| `PenalizeNewline`  | engine default                                                                             | Whether to apply repetition penalties to newline tokens.                                              |
| `PresencePenalty`  | engine default (OpenAI-style)                                                              | Increases the likelihood of introducing new tokens by penalizing existing ones.                       |
| `FrequencyPenalty` | engine default (OpenAI-style)                                                              | Penalizes tokens in proportion to their frequency so far.                                             |
| `TfsZ`             | engine default                                                                             | Tail free sampling parameter; lower values reduce repetition.                                         |
| `NumKeep`          | engine default (Ollama)                                                                    | Number of prompt tokens to keep before sampling starts.                                               |
| `NumPredict`       | engine default (Ollama only)                                                               | Ollama-specific maximum output tokens; serves the same purpose as `MaxOutputTokens`.                  |
| `MaxOutputTokens`  | engine default (OpenAI caption 512, labels 1024)                                           | Upper bound on generated tokens; adapters raise low values to defaults.                               |
| `ForceJson`        | engine-specific (`true` for OpenAI labels; `false` for Ollama labels; captions `false`)    | Forces structured output when enabled.                                                                |
| `SchemaVersion`    | derived from schema name                                                                   | Override when coordinating schema migrations.                                                         |
| `Stop`             | engine default                                                                             | Array of stop sequences (e.g., `["\\n\\n"]`).                                                         |
| `NumThread`        | runtime auto                                                                               | Caps CPU threads for local engines.                                                                   |
| `NumCtx`           | engine default                                                                             | Context window length (tokens).                                                                       |
| `Mirostat`         | engine default (Ollama)                                                                    | Enables Mirostat sampling (`0` off, `1/2` modes).                                                     |
| `MirostatTau`      | engine default                                                                             | Controls the surprise target for Mirostat sampling.                                                   |
| `MirostatEta`      | engine default                                                                             | Learning rate for Mirostat adaptation.                                                                |
| `NumBatch`         | engine default (Ollama)                                                                    | Batch size for prompt processing.                                                                     |
| `NumGpu`           | engine default (Ollama)                                                                    | Number of GPUs to distribute work across.                                                             |
| `MainGpu`          | engine default (Ollama)                                                                    | Primary GPU index when multiple GPUs are present.                                                     |
| `LowVram`          | engine default (Ollama)                                                                    | Enables VRAM-saving mode; may reduce performance.                                                     |
| `VocabOnly`        | engine default (Ollama)                                                                    | Loads the vocabulary only, for quick metadata inspection.                                             |
| `UseMmap`          | engine default (Ollama)                                                                    | Memory-maps model weights instead of fully loading them.                                              |
| `UseMlock`         | engine default (Ollama)                                                                    | Locks model weights in RAM to reduce paging.                                                          |
| `Numa`             | engine default (Ollama)                                                                    | Enables NUMA-aware allocations when available.                                                        |
| `Detail`           | engine default (OpenAI)                                                                    | Controls OpenAI vision detail level (`low`, `high`, `auto`).                                          |
| `CombineOutputs`   | engine default (OpenAI multi-output)                                                       | Controls whether multi-output models combine results automatically.                                   |
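
For orientation, here is a minimal sketch of how these options can appear in a `vision.yml` model entry; the model name and all values are illustrative assumptions, not recommended settings:

```yaml
# Hypothetical vision.yml fragment; option names follow the table above,
# the model name and values are made up for illustration.
Models:
  - Type: labels
    Engine: ollama
    Name: example-vision-model   # hypothetical model name
    Options:
      Temperature: 0.1           # low randomness for stable labels
      TopP: 0.9                  # nucleus sampling cutoff
      Seed: 42                   # fixed seed for reproducible runs
      NumCtx: 4096               # context window in tokens
      Stop: ["\n\n"]             # stop generating at a blank line
```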

#### Model Service

Configures the endpoint URL, method, format, and authentication for [Ollama](ollama/README.md), [OpenAI](openai/README.md), and other engines that perform remote HTTP requests:

| Field                              | Default                                   | Notes                                                  |
|------------------------------------|-------------------------------------------|--------------------------------------------------------|
| `Uri`                              | required for remote                       | Endpoint base URL. Leave empty to keep the model local (TensorFlow). |
| `Method`                           | `POST`                                    | Override the HTTP verb if the provider requires it.    |
| `Key`                              | `""`                                      | Bearer token; prefer env expansion.                    |
| `Username` / `Password`            | `""`                                      | Injected as basic auth when the URI lacks userinfo.    |
| `Model`                            | `""`                                      | Endpoint-specific override; takes precedence over the model name. |
| `Org` / `Project`                  | `""`                                      | OpenAI headers (org/project IDs).                      |
| `RequestFormat` / `ResponseFormat` | set by engine alias                       | Explicit values win over alias defaults.               |
| `FileScheme`                       | set by engine alias (`data` or `base64`)  | Controls image transport.                              |
| `Disabled`                         | `false`                                   | Disables the endpoint without removing the model.      |

| Field                              | Default                                   | Notes                                                                                      |
|------------------------------------|-------------------------------------------|--------------------------------------------------------------------------------------------|
| `Uri`                              | required for remote                       | Endpoint base URL. Leave empty to keep the model local (TensorFlow).                       |
| `Method`                           | `POST`                                    | Override the HTTP verb if the provider requires it.                                        |
| `Key`                              | `""`                                      | Bearer token; prefer env expansion (OpenAI: `OPENAI_API_KEY`, Ollama: `OLLAMA_API_KEY`).   |
| `Username` / `Password`            | `""`                                      | Injected as basic auth when the URI lacks userinfo.                                        |
| `Model`                            | `""`                                      | Endpoint-specific override; takes precedence over the model name.                          |
| `Org` / `Project`                  | `""`                                      | OpenAI headers (org/project IDs).                                                          |
| `RequestFormat` / `ResponseFormat` | set by engine alias                       | Explicit values win over alias defaults.                                                   |
| `FileScheme`                       | set by engine alias (`data` or `base64`)  | Controls image transport.                                                                  |
| `Disabled`                         | `false`                                   | Disables the endpoint without removing the model.                                          |
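
As a concrete example, a remote Ollama endpoint can be configured like this; the values mirror the `vision.yml` testdata fragment later in this diff:

```yaml
Service:
  Uri: http://ollama:11434/api/generate
  Key: ${OLLAMA_API_KEY}   # expanded from the environment at request time
  FileScheme: base64       # Ollama transports images as base64
  RequestFormat: ollama
  ResponseFormat: ollama
```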

> **Authentication:** All credentials and identifiers support `${ENV_VAR}` expansion. `Service.Key` sets `Authorization: Bearer <token>`; `Username`/`Password` injects HTTP basic authentication into the service URI when it is not already present.

> **Authentication:** All credentials and identifiers support `${ENV_VAR}` expansion. `Service.Key` sets `Authorization: Bearer <token>`; `Username`/`Password` injects HTTP basic authentication into the service URI when it is not already present. When `Service.Key` is empty, PhotoPrism defaults to `OPENAI_API_KEY` (OpenAI engine) or `OLLAMA_API_KEY` (Ollama engine), also honoring their `_FILE` counterparts.
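
For example, the `_FILE` fallback pairs naturally with Docker Compose secrets; the secret name and file paths below are assumptions for illustration only:

```yaml
# Hypothetical compose fragment: the token is mounted as a secret file and
# picked up via OLLAMA_API_KEY_FILE when OLLAMA_API_KEY itself is empty.
services:
  photoprism:
    environment:
      OLLAMA_API_KEY_FILE: /run/secrets/ollama_api_key
    secrets:
      - ollama_api_key

secrets:
  ollama_api_key:
    file: ./secrets/ollama_api_key.txt
```
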
### Field Behavior & Precedence

@@ -5,7 +5,6 @@ import (
"strings"
|
||||
"sync"
|
||||
|
||||
"github.com/photoprism/photoprism/internal/ai/vision/openai"
|
||||
"github.com/photoprism/photoprism/pkg/http/scheme"
|
||||
)
|
||||
|
||||
|
|

@@ -61,14 +60,6 @@ func init() {
		FileScheme:        scheme.Data,
		DefaultResolution: DefaultResolution,
	})

	RegisterEngineAlias(openai.EngineName, EngineInfo{
		Uri:               "https://api.openai.com/v1/responses",
		RequestFormat:     ApiFormatOpenAI,
		ResponseFormat:    ApiFormatOpenAI,
		FileScheme:        scheme.Data,
		DefaultResolution: openai.DefaultResolution,
	})
}

// RegisterEngine adds/overrides an engine implementation for a specific API format.

@@ -85,6 +76,7 @@ type EngineInfo struct {
	ResponseFormat    ApiFormat
	FileScheme        string
	DefaultResolution int
	DefaultKey        string // Optional placeholder key (e.g., ${OPENAI_API_KEY}); applied only when Service.Key is empty.
}

// RegisterEngineAlias maps a logical engine name (e.g., "ollama") to a

@@ -30,6 +30,7 @@ func init() {
		ResponseFormat:    ApiFormatOllama,
		FileScheme:        scheme.Base64,
		DefaultResolution: ollama.DefaultResolution,
		DefaultKey:        ollama.APIKeyPlaceholder,
	})

	CaptionModel.Engine = ollama.EngineName

@@ -28,6 +28,15 @@ func init() {
		Parser:   openaiParser{},
		Defaults: openaiDefaults{},
	})

	RegisterEngineAlias(openai.EngineName, EngineInfo{
		Uri:               "https://api.openai.com/v1/responses",
		RequestFormat:     ApiFormatOpenAI,
		ResponseFormat:    ApiFormatOpenAI,
		FileScheme:        scheme.Data,
		DefaultResolution: openai.DefaultResolution,
		DefaultKey:        openai.APIKeyPlaceholder,
	})
}

// SystemPrompt returns the default OpenAI system prompt for the specified model type.

@@ -491,10 +491,10 @@ func (m *Model) ApplyEngineDefaults() {
		if info.DefaultResolution > 0 && m.Resolution <= 0 {
			m.Resolution = info.DefaultResolution
		}
	}

	if engine == openai.EngineName && strings.TrimSpace(m.Service.Key) == "" {
		m.Service.Key = "${OPENAI_API_KEY}"
	if strings.TrimSpace(m.Service.Key) == "" && strings.TrimSpace(info.DefaultKey) != "" {
		m.Service.Key = info.DefaultKey
	}
	}

	m.Engine = engine

@@ -1,38 +1,39 @@
package vision

// ModelOptions represents additional model parameters listed in the documentation.
// Comments note which engines currently honor each field.
type ModelOptions struct {
	NumKeep          int      `yaml:"NumKeep,omitempty" json:"num_keep,omitempty"` // Ollama ↓
	Seed             int      `yaml:"Seed,omitempty" json:"seed,omitempty"`
	NumPredict       int      `yaml:"NumPredict,omitempty" json:"num_predict,omitempty"`
	Temperature      float64  `yaml:"Temperature,omitempty" json:"temperature,omitempty"`
	TopK             int      `yaml:"TopK,omitempty" json:"top_k,omitempty"`
	TopP             float64  `yaml:"TopP,omitempty" json:"top_p,omitempty"`
	MinP             float64  `yaml:"MinP,omitempty" json:"min_p,omitempty"`
	TypicalP         float64  `yaml:"TypicalP,omitempty" json:"typical_p,omitempty"`
	TfsZ             float64  `yaml:"TfsZ,omitempty" json:"tfs_z,omitempty"`
	RepeatLastN      int      `yaml:"RepeatLastN,omitempty" json:"repeat_last_n,omitempty"`
	RepeatPenalty    float64  `yaml:"RepeatPenalty,omitempty" json:"repeat_penalty,omitempty"`
	PresencePenalty  float64  `yaml:"PresencePenalty,omitempty" json:"presence_penalty,omitempty"`
	FrequencyPenalty float64  `yaml:"FrequencyPenalty,omitempty" json:"frequency_penalty,omitempty"`
	Mirostat         int      `yaml:"Mirostat,omitempty" json:"mirostat,omitempty"`
	MirostatTau      float64  `yaml:"MirostatTau,omitempty" json:"mirostat_tau,omitempty"`
	MirostatEta      float64  `yaml:"MirostatEta,omitempty" json:"mirostat_eta,omitempty"`
	PenalizeNewline  bool     `yaml:"PenalizeNewline,omitempty" json:"penalize_newline,omitempty"`
	Stop             []string `yaml:"Stop,omitempty" json:"stop,omitempty"`
	Numa             bool     `yaml:"Numa,omitempty" json:"numa,omitempty"`
	NumCtx           int      `yaml:"NumCtx,omitempty" json:"num_ctx,omitempty"`
	NumBatch         int      `yaml:"NumBatch,omitempty" json:"num_batch,omitempty"`
	NumGpu           int      `yaml:"NumGpu,omitempty" json:"num_gpu,omitempty"`
	MainGpu          int      `yaml:"MainGpu,omitempty" json:"main_gpu,omitempty"`
	LowVram          bool     `yaml:"LowVram,omitempty" json:"low_vram,omitempty"`
	VocabOnly        bool     `yaml:"VocabOnly,omitempty" json:"vocab_only,omitempty"`
	UseMmap          bool     `yaml:"UseMmap,omitempty" json:"use_mmap,omitempty"`
	UseMlock         bool     `yaml:"UseMlock,omitempty" json:"use_mlock,omitempty"`
	NumThread        int      `yaml:"NumThread,omitempty" json:"num_thread,omitempty"`
	MaxOutputTokens  int      `yaml:"MaxOutputTokens,omitempty" json:"max_output_tokens,omitempty"` // OpenAI ↓
	Detail           string   `yaml:"Detail,omitempty" json:"detail,omitempty"`
	ForceJson        bool     `yaml:"ForceJson,omitempty" json:"force_json,omitempty"`
	SchemaVersion    string   `yaml:"SchemaVersion,omitempty" json:"schema_version,omitempty"`
	CombineOutputs   string   `yaml:"CombineOutputs,omitempty" json:"combine_outputs,omitempty"`
	Temperature      float64  `yaml:"Temperature,omitempty" json:"temperature,omitempty"` // Ollama, OpenAI
	TopK             int      `yaml:"TopK,omitempty" json:"top_k,omitempty"` // Ollama
	TopP             float64  `yaml:"TopP,omitempty" json:"top_p,omitempty"` // Ollama, OpenAI
	MinP             float64  `yaml:"MinP,omitempty" json:"min_p,omitempty"` // Ollama
	TypicalP         float64  `yaml:"TypicalP,omitempty" json:"typical_p,omitempty"` // Ollama
	TfsZ             float64  `yaml:"TfsZ,omitempty" json:"tfs_z,omitempty"` // Ollama
	Seed             int      `yaml:"Seed,omitempty" json:"seed,omitempty"` // Ollama
	NumKeep          int      `yaml:"NumKeep,omitempty" json:"num_keep,omitempty"` // Ollama
	RepeatLastN      int      `yaml:"RepeatLastN,omitempty" json:"repeat_last_n,omitempty"` // Ollama
	RepeatPenalty    float64  `yaml:"RepeatPenalty,omitempty" json:"repeat_penalty,omitempty"` // Ollama
	PresencePenalty  float64  `yaml:"PresencePenalty,omitempty" json:"presence_penalty,omitempty"` // OpenAI
	FrequencyPenalty float64  `yaml:"FrequencyPenalty,omitempty" json:"frequency_penalty,omitempty"` // OpenAI
	PenalizeNewline  bool     `yaml:"PenalizeNewline,omitempty" json:"penalize_newline,omitempty"` // Ollama
	Stop             []string `yaml:"Stop,omitempty" json:"stop,omitempty"` // Ollama, OpenAI
	Mirostat         int      `yaml:"Mirostat,omitempty" json:"mirostat,omitempty"` // Ollama
	MirostatTau      float64  `yaml:"MirostatTau,omitempty" json:"mirostat_tau,omitempty"` // Ollama
	MirostatEta      float64  `yaml:"MirostatEta,omitempty" json:"mirostat_eta,omitempty"` // Ollama
	NumPredict       int      `yaml:"NumPredict,omitempty" json:"num_predict,omitempty"` // Ollama
	MaxOutputTokens  int      `yaml:"MaxOutputTokens,omitempty" json:"max_output_tokens,omitempty"` // Ollama, OpenAI
	ForceJson        bool     `yaml:"ForceJson,omitempty" json:"force_json,omitempty"` // Ollama, OpenAI
	SchemaVersion    string   `yaml:"SchemaVersion,omitempty" json:"schema_version,omitempty"` // Ollama, OpenAI
	CombineOutputs   string   `yaml:"CombineOutputs,omitempty" json:"combine_outputs,omitempty"` // OpenAI
	Detail           string   `yaml:"Detail,omitempty" json:"detail,omitempty"` // OpenAI
	NumCtx           int      `yaml:"NumCtx,omitempty" json:"num_ctx,omitempty"` // Ollama, OpenAI
	NumThread        int      `yaml:"NumThread,omitempty" json:"num_thread,omitempty"` // Ollama
	NumBatch         int      `yaml:"NumBatch,omitempty" json:"num_batch,omitempty"` // Ollama
	NumGpu           int      `yaml:"NumGpu,omitempty" json:"num_gpu,omitempty"` // Ollama
	MainGpu          int      `yaml:"MainGpu,omitempty" json:"main_gpu,omitempty"` // Ollama
	LowVram          bool     `yaml:"LowVram,omitempty" json:"low_vram,omitempty"` // Ollama
	VocabOnly        bool     `yaml:"VocabOnly,omitempty" json:"vocab_only,omitempty"` // Ollama
	UseMmap          bool     `yaml:"UseMmap,omitempty" json:"use_mmap,omitempty"` // Ollama
	UseMlock         bool     `yaml:"UseMlock,omitempty" json:"use_mlock,omitempty"` // Ollama
	Numa             bool     `yaml:"Numa,omitempty" json:"numa,omitempty"` // Ollama
}

@@ -226,6 +226,20 @@ func TestModelApplyEngineDefaultsSetsServiceDefaults(t *testing.T) {
		assert.Equal(t, ApiFormatOpenAI, model.Service.RequestFormat)
		assert.Equal(t, ApiFormatOpenAI, model.Service.ResponseFormat)
		assert.Equal(t, scheme.Data, model.Service.FileScheme)
		assert.Equal(t, openai.APIKeyPlaceholder, model.Service.Key)
	})
	t.Run("OllamaEngineDefaults", func(t *testing.T) {
		model := &Model{
			Type:   ModelTypeLabels,
			Engine: ollama.EngineName,
		}

		model.ApplyEngineDefaults()

		assert.Equal(t, ApiFormatOllama, model.Service.RequestFormat)
		assert.Equal(t, ApiFormatOllama, model.Service.ResponseFormat)
		assert.Equal(t, scheme.Base64, model.Service.FileScheme)
		assert.Equal(t, ollama.APIKeyPlaceholder, model.Service.Key)
	})
	t.Run("PreserveExistingService", func(t *testing.T) {
		model := &Model{

@@ -235,6 +249,7 @@ func TestModelApplyEngineDefaultsSetsServiceDefaults(t *testing.T) {
			Uri:           "https://custom.example",
			FileScheme:    scheme.Base64,
			RequestFormat: ApiFormatOpenAI,
			Key:           "custom-key",
		},
	}

@@ -242,6 +257,7 @@ func TestModelApplyEngineDefaultsSetsServiceDefaults(t *testing.T) {

		assert.Equal(t, "https://custom.example", model.Service.Uri)
		assert.Equal(t, scheme.Base64, model.Service.FileScheme)
		assert.Equal(t, "custom-key", model.Service.Key)
	})
}

@@ -295,6 +311,38 @@ func TestModelEndpointKeyOpenAIFallbacks(t *testing.T) {
	})
}

func TestModelEndpointKeyOllamaFallbacks(t *testing.T) {
	t.Run("EnvFile", func(t *testing.T) {
		dir := t.TempDir()
		path := filepath.Join(dir, "ollama.key")
		if err := os.WriteFile(path, []byte("ollama-from-file\n"), 0o600); err != nil {
			t.Fatalf("write key file: %v", err)
		}

		ensureEnvOnce = sync.Once{}

		t.Setenv("OLLAMA_API_KEY", "")
		t.Setenv("OLLAMA_API_KEY_FILE", path)

		model := &Model{Type: ModelTypeCaption, Engine: ollama.EngineName}
		model.ApplyEngineDefaults()

		if got := model.EndpointKey(); got != "ollama-from-file" {
			t.Fatalf("expected file key, got %q", got)
		}
	})
	t.Run("EnvVariable", func(t *testing.T) {
		t.Setenv("OLLAMA_API_KEY", "ollama-env")

		model := &Model{Type: ModelTypeCaption, Engine: ollama.EngineName}
		model.ApplyEngineDefaults()

		if got := model.EndpointKey(); got != "ollama-env" {
			t.Fatalf("expected env key, got %q", got)
		}
	})
}

func TestModelGetSource(t *testing.T) {
	t.Run("NilModel", func(t *testing.T) {
		var model *Model

@@ -347,7 +395,7 @@ func TestModelApplyService(t *testing.T) {
}

func TestModel_IsDefault(t *testing.T) {
	nasnetCopy := *NasnetModel //nolint:govet // copy for test inspection only
	nasnetCopy := NasnetModel.Clone() //nolint:govet // copy for test inspection only
	nasnetCopy.Default = false

	cases := []struct {

@@ -362,7 +410,7 @@ func TestModel_IsDefault(t *testing.T) {
		},
		{
			name:  "NasnetCopy",
			model: &nasnetCopy,
			model: nasnetCopy,
			want:  true,
		},
		{

@@ -72,6 +72,7 @@ This package provides PhotoPrism’s native adapter for Ollama-compatible multim
- `PHOTOPRISM_VISION_LABEL_SCHEMA_FILE` — Absolute path to a JSON snippet that overrides the default label schema (applies to every Ollama label model).
- `PHOTOPRISM_VISION_YAML` — Custom `vision.yml` path. Keep it synced in Git if you automate deployments.
- `OLLAMA_HOST`, `OLLAMA_MODELS`, `OLLAMA_MAX_QUEUE`, `OLLAMA_NUM_PARALLEL`, etc. — Provided in `compose*.yaml` to tune the Ollama daemon (see the sketch after this list). Adjust `OLLAMA_KEEP_ALIVE` if you want models to stay loaded between worker batches.
- `OLLAMA_API_KEY` / `OLLAMA_API_KEY_FILE` — Default bearer token picked up when `Service.Key` is empty; useful for hosted Ollama services (e.g., Ollama Cloud).
- `PHOTOPRISM_LOG_LEVEL=trace` — Enables verbose request/response previews (truncated to avoid leaking images). Use temporarily when debugging parsing issues.
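
A compose-style sketch of the daemon tuning knobs mentioned above; the service layout and values are illustrative assumptions, not recommendations:

```yaml
# Hypothetical compose fragment for tuning the Ollama daemon.
services:
  ollama:
    image: ollama/ollama:latest
    environment:
      OLLAMA_HOST: "0.0.0.0:11434"  # API listen address
      OLLAMA_MAX_QUEUE: "64"        # queued requests before rejecting new ones
      OLLAMA_NUM_PARALLEL: "2"      # concurrent requests per loaded model
      OLLAMA_KEEP_ALIVE: "10m"      # keep models loaded between worker batches
```
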
#### `vision.yml` Example

@@ -5,4 +5,10 @@ const (
	EngineName = "ollama"
	// ApiFormat identifies Ollama-compatible request and response payloads.
	ApiFormat = "ollama"
	// APIKeyEnv defines the environment variable used for Ollama API tokens.
	APIKeyEnv = "OLLAMA_API_KEY" //nolint:gosec // environment variable name, not a secret
	// APIKeyFileEnv defines the file-based fallback environment variable for Ollama API tokens.
	APIKeyFileEnv = "OLLAMA_API_KEY_FILE" //nolint:gosec // environment variable name, not a secret
	// APIKeyPlaceholder is the `${VAR}` form injected when no explicit key is provided.
	APIKeyPlaceholder = "${" + APIKeyEnv + "}"
)

@@ -5,4 +5,10 @@ const (
	EngineName = "openai"
	// ApiFormat identifies OpenAI-compatible request and response payloads.
	ApiFormat = "openai"
	// APIKeyEnv defines the environment variable used for OpenAI API tokens.
	APIKeyEnv = "OPENAI_API_KEY" //nolint:gosec // environment variable name, not a secret
	// APIKeyFileEnv defines the file-based fallback environment variable for OpenAI API tokens.
	APIKeyFileEnv = "OPENAI_API_KEY_FILE" //nolint:gosec // environment variable name, not a secret
	// APIKeyPlaceholder is the `${VAR}` form injected when no explicit key is provided.
	APIKeyPlaceholder = "${" + APIKeyEnv + "}"
)

internal/ai/vision/testdata/vision.yml (vendored, 1 line changed)

@@ -71,6 +71,7 @@ Models:
    Resolution: 720
    Service:
      Uri: http://ollama:11434/api/generate
      Key: ${OLLAMA_API_KEY}
      FileScheme: base64
      RequestFormat: ollama
      ResponseFormat: ollama

@@ -29,6 +29,8 @@ import (
"strings"
|
||||
"sync"
|
||||
|
||||
"github.com/photoprism/photoprism/internal/ai/vision/ollama"
|
||||
"github.com/photoprism/photoprism/internal/ai/vision/openai"
|
||||
"github.com/photoprism/photoprism/internal/event"
|
||||
"github.com/photoprism/photoprism/pkg/clean"
|
||||
"github.com/photoprism/photoprism/pkg/fs"
|
||||
|
|

@@ -39,21 +41,33 @@ var log = event.Log
var ensureEnvOnce sync.Once

// ensureEnv loads environment-backed credentials once so adapters can look up
// OPENAI_API_KEY even when operators rely on OPENAI_API_KEY_FILE. Future engine
// integrations can reuse this hook to normalise additional secrets.
// OPENAI_API_KEY / OLLAMA_API_KEY even when operators rely on *_FILE fallbacks.
// Future engine integrations can reuse this hook to normalise additional
// secrets.
func ensureEnv() {
	ensureEnvOnce.Do(func() {
		if os.Getenv("OPENAI_API_KEY") != "" {
			return
		}

		if path := strings.TrimSpace(os.Getenv("OPENAI_API_KEY_FILE")); fs.FileExistsNotEmpty(path) {
			// #nosec G304 path provided via env
			if data, err := os.ReadFile(path); err == nil {
				if key := clean.Auth(string(data)); key != "" {
					_ = os.Setenv("OPENAI_API_KEY", key)
				}
			}
		}
		loadEnvKeyFromFile(openai.APIKeyEnv, openai.APIKeyFileEnv)
		loadEnvKeyFromFile(ollama.APIKeyEnv, ollama.APIKeyFileEnv)
	})
}

// loadEnvKeyFromFile populates envVar from fileVar when the environment value
// is empty and the referenced file exists and is non-empty.
func loadEnvKeyFromFile(envVar, fileVar string) {
	if os.Getenv(envVar) != "" {
		return
	}

	filePath := strings.TrimSpace(os.Getenv(fileVar))

	if !fs.FileExistsNotEmpty(filePath) {
		return
	}

	// #nosec G304 path provided via env
	if data, err := os.ReadFile(filePath); err == nil {
		if key := clean.Auth(string(data)); key != "" {
			_ = os.Setenv(envVar, key)
		}
	}
}

internal/ai/vision/vision_env_test.go (new file, 38 lines)

@@ -0,0 +1,38 @@
package vision

import (
	"os"
	"path/filepath"
	"testing"
)

// TestLoadEnvKeyFromFile verifies that loadEnvKeyFromFile reads API keys from
// *_FILE variables when the primary env var is empty.
func TestLoadEnvKeyFromFile(t *testing.T) {
	t.Run("ReadsFileWhenUnset", func(t *testing.T) {
		dir := t.TempDir()
		path := filepath.Join(dir, "key.txt")
		if err := os.WriteFile(path, []byte("file-secret\n"), 0o600); err != nil {
			t.Fatalf("write key file: %v", err)
		}

		t.Setenv("TEST_KEY", "")
		t.Setenv("TEST_KEY_FILE", path)

		loadEnvKeyFromFile("TEST_KEY", "TEST_KEY_FILE")

		if got := os.Getenv("TEST_KEY"); got != "file-secret" {
			t.Fatalf("expected file-secret, got %q", got)
		}
	})
	t.Run("EnvWinsOverFile", func(t *testing.T) {
		t.Setenv("TEST_KEY", "keep-env")
		t.Setenv("TEST_KEY_FILE", "/nonexistent")

		loadEnvKeyFromFile("TEST_KEY", "TEST_KEY_FILE")

		if got := os.Getenv("TEST_KEY"); got != "keep-env" {
			t.Fatalf("expected keep-env, got %q", got)
		}
	})
}

@@ -4540,7 +4540,7 @@
"type": "string"
|
||||
},
|
||||
"options": {
|
||||
"$ref": "#/definitions/vision.ApiRequestOptions"
|
||||
"$ref": "#/definitions/vision.ModelOptions"
|
||||
},
|
||||
"org": {
|
||||
"type": "string"
|
||||
|
|

@@ -4575,113 +4575,6 @@
      },
      "type": "object"
    },
    "vision.ApiRequestOptions": {
      "properties": {
        "combine_outputs": {
          "type": "string"
        },
        "detail": {
          "type": "string"
        },
        "force_json": {
          "type": "boolean"
        },
        "frequency_penalty": {
          "type": "number"
        },
        "low_vram": {
          "type": "boolean"
        },
        "main_gpu": {
          "type": "integer"
        },
        "max_output_tokens": {
          "type": "integer"
        },
        "min_p": {
          "type": "number"
        },
        "mirostat": {
          "type": "integer"
        },
        "mirostat_eta": {
          "type": "number"
        },
        "mirostat_tau": {
          "type": "number"
        },
        "num_batch": {
          "type": "integer"
        },
        "num_ctx": {
          "type": "integer"
        },
        "num_gpu": {
          "type": "integer"
        },
        "num_keep": {
          "type": "integer"
        },
        "num_predict": {
          "type": "integer"
        },
        "num_thread": {
          "type": "integer"
        },
        "numa": {
          "type": "boolean"
        },
        "penalize_newline": {
          "type": "boolean"
        },
        "presence_penalty": {
          "type": "number"
        },
        "repeat_last_n": {
          "type": "integer"
        },
        "repeat_penalty": {
          "type": "number"
        },
        "schema_version": {
          "type": "string"
        },
        "seed": {
          "type": "integer"
        },
        "stop": {
          "items": {
            "type": "string"
          },
          "type": "array"
        },
        "temperature": {
          "type": "number"
        },
        "tfs_z": {
          "type": "number"
        },
        "top_k": {
          "type": "integer"
        },
        "top_p": {
          "type": "number"
        },
        "typical_p": {
          "type": "number"
        },
        "use_mlock": {
          "type": "boolean"
        },
        "use_mmap": {
          "type": "boolean"
        },
        "vocab_only": {
          "type": "boolean"
        }
      },
      "type": "object"
    },
    "vision.ApiResponse": {
      "properties": {
        "code": {

@@ -4810,7 +4703,7 @@
"type": "string"
|
||||
},
|
||||
"options": {
|
||||
"$ref": "#/definitions/vision.ApiRequestOptions"
|
||||
"$ref": "#/definitions/vision.ModelOptions"
|
||||
},
|
||||
"prompt": {
|
||||
"type": "string"
|
||||
|
|

@@ -4855,6 +4748,146 @@
"EngineLocal"
|
||||
]
|
||||
},
|
||||
"vision.ModelOptions": {
|
||||
"properties": {
|
||||
"combine_outputs": {
|
||||
"description": "OpenAI",
|
||||
"type": "string"
|
||||
},
|
||||
"detail": {
|
||||
"description": "OpenAI",
|
||||
"type": "string"
|
||||
},
|
||||
"force_json": {
|
||||
"description": "Ollama, OpenAI",
|
||||
"type": "boolean"
|
||||
},
|
||||
"frequency_penalty": {
|
||||
"description": "OpenAI",
|
||||
"type": "number"
|
||||
},
|
||||
"low_vram": {
|
||||
"description": "Ollama",
|
||||
"type": "boolean"
|
||||
},
|
||||
"main_gpu": {
|
||||
"description": "Ollama",
|
||||
"type": "integer"
|
||||
},
|
||||
"max_output_tokens": {
|
||||
"description": "Ollama, OpenAI",
|
||||
"type": "integer"
|
||||
},
|
||||
"min_p": {
|
||||
"description": "Ollama",
|
||||
"type": "number"
|
||||
},
|
||||
"mirostat": {
|
||||
"description": "Ollama",
|
||||
"type": "integer"
|
||||
},
|
||||
"mirostat_eta": {
|
||||
"description": "Ollama",
|
||||
"type": "number"
|
||||
},
|
||||
"mirostat_tau": {
|
||||
"description": "Ollama",
|
||||
"type": "number"
|
||||
},
|
||||
"num_batch": {
|
||||
"description": "Ollama",
|
||||
"type": "integer"
|
||||
},
|
||||
"num_ctx": {
|
||||
"description": "Ollama, OpenAI",
|
||||
"type": "integer"
|
||||
},
|
||||
"num_gpu": {
|
||||
"description": "Ollama",
|
||||
"type": "integer"
|
||||
},
|
||||
"num_keep": {
|
||||
"description": "Ollama",
|
||||
"type": "integer"
|
||||
},
|
||||
"num_predict": {
|
||||
"description": "Ollama",
|
||||
"type": "integer"
|
||||
},
|
||||
"num_thread": {
|
||||
"description": "Ollama",
|
||||
"type": "integer"
|
||||
},
|
||||
"numa": {
|
||||
"description": "Ollama",
|
||||
"type": "boolean"
|
||||
},
|
||||
"penalize_newline": {
|
||||
"description": "Ollama",
|
||||
"type": "boolean"
|
||||
},
|
||||
"presence_penalty": {
|
||||
"description": "OpenAI",
|
||||
"type": "number"
|
||||
},
|
||||
"repeat_last_n": {
|
||||
"description": "Ollama",
|
||||
"type": "integer"
|
||||
},
|
||||
"repeat_penalty": {
|
||||
"description": "Ollama",
|
||||
"type": "number"
|
||||
},
|
||||
"schema_version": {
|
||||
"description": "Ollama, OpenAI",
|
||||
"type": "string"
|
||||
},
|
||||
"seed": {
|
||||
"description": "Ollama",
|
||||
"type": "integer"
|
||||
},
|
||||
"stop": {
|
||||
"description": "Ollama, OpenAI",
|
||||
"items": {
|
||||
"type": "string"
|
||||
},
|
||||
"type": "array"
|
||||
},
|
||||
"temperature": {
|
||||
"description": "Ollama, OpenAI",
|
||||
"type": "number"
|
||||
},
|
||||
"tfs_z": {
|
||||
"description": "Ollama",
|
||||
"type": "number"
|
||||
},
|
||||
"top_k": {
|
||||
"description": "Ollama",
|
||||
"type": "integer"
|
||||
},
|
||||
"top_p": {
|
||||
"description": "Ollama, OpenAI",
|
||||
"type": "number"
|
||||
},
|
||||
"typical_p": {
|
||||
"description": "Ollama",
|
||||
"type": "number"
|
||||
},
|
||||
"use_mlock": {
|
||||
"description": "Ollama",
|
||||
"type": "boolean"
|
||||
},
|
||||
"use_mmap": {
|
||||
"description": "Ollama",
|
||||
"type": "boolean"
|
||||
},
|
||||
"vocab_only": {
|
||||
"description": "Ollama",
|
||||
"type": "boolean"
|
||||
}
|
||||
},
|
||||
"type": "object"
|
||||
},
|
||||
"vision.ModelType": {
|
||||
"enum": [
|
||||
"labels",
|
||||
|
|
|
|||