diff --git a/.env.example b/.env.example
index cba2ef7..b24c4a1 100644
--- a/.env.example
+++ b/.env.example
@@ -1,6 +1,6 @@
 OLLAMA_HOST=http://localhost:11434
 OLLAMA_API_KEY=
-OLLAMA_DEFAULT_MODEL=gemma4:e4b-it-q_8
+OLLAMA_DEFAULT_MODEL=gemma4:31b-cloud
 
 OPENAI_API_KEY=
 ANTHROPIC_API_KEY=
diff --git a/README.md b/README.md
index ec36d0f..3116cc0 100644
--- a/README.md
+++ b/README.md
@@ -143,7 +143,7 @@
 # LLM
 OLLAMA_HOST=http://localhost:11434
 OLLAMA_API_KEY=
-OLLAMA_DEFAULT_MODEL=gemma4:e4b-it-q8_0
+OLLAMA_DEFAULT_MODEL=gemma4:31b-cloud
 OLLAMA_NUM_CTX=65536
 OLLAMA_THINK=true
 
diff --git a/docs/api.md b/docs/api.md
index 57c045d..f92a8b9 100644
--- a/docs/api.md
+++ b/docs/api.md
@@ -34,7 +34,7 @@
     "description": "General-purpose assistant",
     "enabled_tools": ["todo", "web_search", "filesystem", "..."],
     "llm_backend": "ollama",
-    "model": "gemma4:26b-a4b-it-q4_K_M"
+    "model": "gemma4:31b-cloud"
   }
 ]
 ```
diff --git a/docs/config.md b/docs/config.md
index 9dabc4c..d452d6a 100644
--- a/docs/config.md
+++ b/docs/config.md
@@ -8,7 +8,7 @@
 |---|---|---|---|
 | `OLLAMA_HOST` | str | `http://localhost:11434` | Ollama server URL |
 | `OLLAMA_API_KEY` | str | `""` | Ollama Cloud API key for direct `https://ollama.com` access |
-| `OLLAMA_DEFAULT_MODEL` | str | `gemma4:e2b-it-q8_0` | Default model (can be overridden per profile) |
+| `OLLAMA_DEFAULT_MODEL` | str | `gemma4:31b-cloud` | Default model (can be overridden per profile) |
 | `OLLAMA_NUM_CTX` | int | `65536` | Context window size in tokens |
 | `OLLAMA_THINK` | bool | `true` | Enable extended reasoning (thinking) |
 | `OPENAI_API_KEY` | str | `""` | OpenAI API key (if using OpenAI backend) |
@@ -80,7 +80,7 @@
 ```dotenv
 OLLAMA_HOST=http://localhost:11434
 OLLAMA_API_KEY=
-OLLAMA_DEFAULT_MODEL=gemma4:e2b-it-q8_0
+OLLAMA_DEFAULT_MODEL=gemma4:31b-cloud
 OLLAMA_NUM_CTX=65536
 OLLAMA_THINK=true
 
diff --git a/docs/index.md b/docs/index.md
index 768ef83..2f7ebd8 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -48,7 +48,7 @@
 
 - **Web framework**: FastAPI + uvicorn
 - **LLM**: Ollama (primary), OpenAI-compatible backend wired in
-- **Default model**: `gemma4:e2b-it-q8_0` (configurable per profile)
+- **Default model**: `gemma4:31b-cloud` (configurable per profile)
 - **Database**: SQLite via aiosqlite
 - **Logging**: structlog
 - **Config**: pydantic-settings (reads `.env`)
diff --git a/docs/profiles.md b/docs/profiles.md
index a73ca4c..1a88459 100644
--- a/docs/profiles.md
+++ b/docs/profiles.md
@@ -18,7 +18,7 @@
     system_prompt: str  # loaded from system_prompt.txt
     enabled_tools: list[str]  # tools available in the main loop
     llm_backend: str = "ollama"
-    model: str = "gemma4:26b-a4b-it-q4_K_M"
+    model: str = "gemma4:31b-cloud"
     max_iterations: int = 10
     temperature: float = 0.7
     planning_enabled: bool = False
@@ -48,9 +48,9 @@
 
 | ID | Name | Model | Temp | Planning |
 |---|---|---|---|---|
-| `secretary` | Personal Secretary | gemma4:26b-a4b-it-q4_K_M | 0.7 | Yes |
-| `server_admin` | Server Administrator | gemma4:26b-a4b-it-q4_K_M | 0.2 | Yes |
-| `developer` | Tool Developer | gemma4:26b-a4b-it-q4_K_M | 0.2 | Yes |
+| `secretary` | Personal Secretary | gemma4:31b-cloud | 0.7 | Yes |
+| `server_admin` | Server Administrator | gemma4:31b-cloud | 0.2 | Yes |
+| `developer` | Tool Developer | gemma4:31b-cloud | 0.2 | Yes |
 
 All profiles share a base tool set. User tools from `tools/enabled.json` are merged in at runtime.
 
@@ -84,7 +84,7 @@
   "name": "My Profile",
   "description": "...",
   "short_description": "...",
-  "model": "gemma4:26b-a4b-it-q4_K_M",
+  "model": "gemma4:31b-cloud",
   "temperature": 0.5,
   "max_iterations": 30,
   "planning_enabled": true,
diff --git a/docs/visual.html b/docs/visual.html
index ce00e17..b3b2f24 100644
--- a/docs/visual.html
+++ b/docs/visual.html
@@ -476,7 +476,7 @@
 secretary
-gemma4:26b-a4b-it-q4_K_M
+gemma4:31b-cloud
 server_admin
-gemma4:26b-a4b-it-q4_K_M
+gemma4:31b-cloud
 smart_home
-gemma4:26b-a4b-it-q4_K_M
+gemma4:31b-cloud
 | Variable | Default | Description |
 |---|---|---|
 | OLLAMA_HOST | http://localhost:11434 | Ollama server URL |
-| OLLAMA_DEFAULT_MODEL | gemma4:e2b-it-q8_0 | Default model (overridable per profile) |
+| OLLAMA_DEFAULT_MODEL | gemma4:31b-cloud | Default model (overridable per profile) |
 | OLLAMA_NUM_CTX | 65536 | Context window size in tokens |
 | OLLAMA_THINK | true | Enable extended reasoning |
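
The docs touched by this change describe `OLLAMA_DEFAULT_MODEL` as a default that a profile can override. A minimal sketch of that precedence, using only the standard library, might look like the following. The helper name `resolve_model` and the exact fallback order are assumptions for illustration; the project itself reads settings through pydantic-settings and may resolve the model differently.

```python
import os

# Default value as documented after this change.
FALLBACK_MODEL = "gemma4:31b-cloud"


def resolve_model(profile_model=None, env=None):
    """Hypothetical helper: choose the model name for a profile.

    Assumed precedence (not confirmed by the source):
    1. the profile's own model, if it sets one
    2. OLLAMA_DEFAULT_MODEL from the environment
    3. the documented default
    """
    env = os.environ if env is None else env
    if profile_model:
        return profile_model
    # An empty or missing variable falls through to the documented default.
    return env.get("OLLAMA_DEFAULT_MODEL") or FALLBACK_MODEL
```

Passing `env` explicitly keeps the function easy to exercise without touching the real process environment, for example `resolve_model(env={"OLLAMA_DEFAULT_MODEL": "some-other-model"})`.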