Release overview — June 2026
The Curiosity Team
June has been about visibility and choice: see exactly what your AI is costing, run several embedding models side by side, and let agents collaborate with you mid-task. Here's the month so far.
Know what your AI costs
The headline this month is full LLM cost and usage observability:
- Every chat completion now records token counts, latency and finish reason.
- A new admin dashboard charts LLM usage over time — tokens, requests and cost.
- Per-model pricing configuration with historical cost tracking, and reference prices that refresh automatically every day from the Curiosity pricing service.
"Finally, an answer to 'what does this cost us?'"
Token counts, latency, finish reasons and per-model pricing come together in one place, so you can attribute spend, spot runaway usage, and plan capacity with real numbers.
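To make the arithmetic concrete, here is a minimal sketch of how recorded token counts and a per-model price table could combine into a cost figure. The record shape, the field names, the model name and the prices (USD per million tokens) are all assumptions made for illustration; they are not the product's actual schema or pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical price entry, in USD per million tokens."""
    input_per_million: float
    output_per_million: float


@dataclass(frozen=True)
class CompletionRecord:
    """Hypothetical shape of one recorded chat completion."""
    model: str
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float
    finish_reason: str


def completion_cost(record: CompletionRecord, prices: dict[str, ModelPrice]) -> float:
    """Cost of a single completion: tokens times the per-token price for its model."""
    price = prices[record.model]
    return (
        record.prompt_tokens * price.input_per_million
        + record.completion_tokens * price.output_per_million
    ) / 1_000_000


def spend_by_model(records: list[CompletionRecord], prices: dict[str, ModelPrice]) -> dict[str, float]:
    """Sum completion costs per model so spend can be attributed."""
    totals: dict[str, float] = {}
    for record in records:
        totals[record.model] = totals.get(record.model, 0.0) + completion_cost(record, prices)
    return totals


# Illustrative data only: "model-a" and its prices are made up.
prices = {"model-a": ModelPrice(input_per_million=2.0, output_per_million=8.0)}
records = [
    CompletionRecord("model-a", 1000, 500, 820.0, "stop"),
    CompletionRecord("model-a", 2000, 250, 640.0, "length"),
]
totals = spend_by_model(records, prices)
```

Keeping prices in a separate table keyed by model is what would allow historical records to be re-costed when a price changes, which is one plausible reading of "historical cost tracking".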
Multi-model vector indexing
- Index with several embedding models at once. Per-model vector indexing is managed from a new Vector Indexing settings page with chunking options.
- A built-in compact, multilingual embedding model ships as the default — no external provider required to get good retrieval.
- Import and export embedding vectors as complete bundles.
Agents that ask
Agents can now pause mid-turn to ask multiple-choice questions, rendered as inline questionnaires. Instead of guessing, an agent can check with you and then continue — a small change that makes complex, multi-step tasks far more reliable.
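One way to picture the pause-and-resume flow is a generator that yields a question and receives the answer before continuing. This is only a sketch of the idea under assumed names (`Question`, `export_agent`, `run_turn` are hypothetical), not how agents are implemented in the product.

```python
from dataclasses import dataclass
from typing import Callable, Generator


@dataclass(frozen=True)
class Question:
    """Hypothetical multiple-choice question an agent can ask mid-turn."""
    prompt: str
    choices: tuple[str, ...]


def export_agent() -> Generator[Question, str, str]:
    """Hypothetical agent turn: pauses once to ask which format to use."""
    fmt = yield Question("Which format should I export?", ("CSV", "JSON"))
    return f"Exported report as {fmt}"


def run_turn(agent: Generator[Question, str, str],
             answer_fn: Callable[[Question], str]) -> str:
    """Drive the agent, answering each question it yields until it finishes."""
    try:
        question = next(agent)  # run until the first pause
        while True:
            answer = answer_fn(question)
            if answer not in question.choices:
                raise ValueError(f"{answer!r} is not one of {question.choices}")
            question = agent.send(answer)  # resume with the user's choice
    except StopIteration as done:
        return done.value  # the agent's final result


# Stand-in for the user picking the second option in the questionnaire.
result = run_turn(export_agent(), lambda q: q.choices[1])
```

The point of the design is that the agent's state is preserved across the pause, so the answer feeds directly into the step that needed it.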
Search that understands filters
The search box now understands inline filters like @file, @webpage, filetype:, ext: and src:, so you can narrow results without leaving the keyboard. Per-signal weights genuinely affect ranking under reciprocal-rank fusion, and internal node types are excluded from results by default.
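For readers unfamiliar with reciprocal-rank fusion, the sketch below shows a common weighted form of it: each signal contributes weight / (k + rank) for every result it returns, and the contributions are summed. The signal names, the weights and the constant k = 60 are assumptions chosen for illustration; the exact formula used by the product is not stated here.

```python
from collections import defaultdict


def weighted_rrf(rankings: dict[str, list[str]],
                 weights: dict[str, float],
                 k: int = 60) -> list[str]:
    """Fuse per-signal rankings into one list using weighted reciprocal-rank fusion.

    rankings maps a signal name to its result ids, best first.
    weights maps a signal name to its weight (missing signals default to 1.0).
    """
    scores: dict[str, float] = defaultdict(float)
    for signal, ranked_ids in rankings.items():
        weight = weights.get(signal, 1.0)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical signals: a keyword ranking and a vector-similarity ranking.
rankings = {
    "keyword": ["a", "b", "c"],
    "vector": ["b", "c", "a"],
}

equal = weighted_rrf(rankings, {"keyword": 1.0, "vector": 1.0})
keyword_heavy = weighted_rrf(rankings, {"keyword": 3.0, "vector": 1.0})
```

With equal weights the fused order is b, a, c; tripling the keyword weight moves a to the top, which is the kind of effect per-signal weights are meant to have.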
Operate with less friction
- In-product software updates for Docker deployments, from a new Software Updates page.
- A Deletion Queue observability page showing nodes scheduled for delayed deletion, and agent-run inspection with reusable admin diagnostic snippets.
- Scheduled tasks that run directly from the editor, and a Tasks page that separates custom-code tasks from built-in types.
- Newer OpenAI models (GPT-5.2, plus nano and mini variants) are selected correctly, and custom OpenAI hosts auto-detect their API style.
The little things
- Clear error messages on failed chat replies, with a "Try again" button.
- A built-in local chat-name model generates titles with no external provider.
- Faster loading of large graphs through streaming passes.
- Provider logos shown in Models & Pricing instead of raw codes.
This overview covers builds v26.6.67170 → v26.6.67397 (June 1–13). For the full, build-by-build detail see the changelog.