apairon 4020ad62c5 ✨ feat: enhance medialib image handling and add asset URL resolution

2026-05-17 00:52:41 +00:00


name: search-and-embeddings
description: Model search and semantic retrieval for tibi website projects. Covers embedding provider configuration, collection search modes, auto-regeneration, regenerate-search admin flows, and how later agents should decide between no search, classic search, ngram search, and vector search.

search-and-embeddings

When to use this skill

Use this skill when:

a project needs explicit search behavior beyond generic CRUD filtering
search should be typo-tolerant, weighted, or semantic
embedding providers must be configured
later agents need a clear yes/no decision for search instead of vague optionality

Goal

Give later agents a practical workflow for deciding whether search is needed and, if yes, which search mode belongs to the project.

This skill is separate from editor AI features. Search and embeddings affect content retrieval, operational setup, and index/regeneration behavior, not just editor assistance.

Source of truth

Use these sources when implementing or reviewing search behavior:

tibi-server/docs/02-configuration.md
tibi-server/docs/04-collections.md
tibi-server/docs/09-llm-integration.md
.agents/skills/nova-ai-editor-features/SKILL.md
.agents/skills/mongodb-and-indexes/SKILL.md

First decision: no search vs explicit search

Do not leave search in an implied state.

Make one explicit decision:

no search in this project
classic keyword search only
fuzzy substring search (ngram)
semantic/vector search
hybrid search with deliberate ranking behavior

If the answer is “not used”, document that clearly so later agents do not accidentally wire providers or regress into half-configured search.

Server-level provider setup

Embedding providers are configured server-side:

embedding:
    providers:
        - name: bge-m3
          type: native
          modelPath: /models/bge-m3
          dimensions: 1024
        - name: openai-embed
          type: openai
          model: text-embedding-3-small
          apiKey: ${EMBEDDING_OPENAI-EMBED_APIKEY}
          baseURL: https://api.openai.com/v1
          dimensions: 1536

Important:

collection search config references the provider by name
embedding secrets and model paths can come from environment variables
vector search is not only a collection concern; the server must actually provide the embedding backend
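
The name-based link between collection and server config can be checked mechanically. The following is a minimal sketch, not part of tibi: it mirrors the YAML above as Python data and assumes a hypothetical `provider` key in each collection search config.

```python
# Mirrors the server-side provider list above as plain Python data.
SERVER_CONFIG = {
    "embedding": {
        "providers": [
            {"name": "bge-m3", "type": "native", "dimensions": 1024},
            {"name": "openai-embed", "type": "openai", "dimensions": 1536},
        ]
    }
}

# Hypothetical collection search configs; the "provider" key name is an assumption.
COLLECTION_SEARCH = [
    {"name": "semantic", "mode": "vector", "provider": "bge-m3"},
    {"name": "broken", "mode": "vector", "provider": "missing-provider"},
]


def provider_names(server_config: dict) -> set[str]:
    """Collect the names of all configured embedding providers."""
    providers = server_config.get("embedding", {}).get("providers", [])
    return {p["name"] for p in providers}


def unresolved_providers(server_config: dict, search_configs: list[dict]) -> list[str]:
    """Return the names of vector search configs whose provider is not configured."""
    known = provider_names(server_config)
    return [
        cfg["name"]
        for cfg in search_configs
        if cfg.get("mode") == "vector" and cfg.get("provider") not in known
    ]


print(unresolved_providers(SERVER_CONFIG, COLLECTION_SEARCH))  # ['broken']
```

A check like this could run in CI so that a vector config never ships without a matching server-side provider.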

Collection search modes

Tibi supports multiple search modes via the collection `search:` config:

text
regex
eval
filter
ngram
vector

Use explicit search configs when search is a real product feature. Auto-fallback is useful, but it is not a substitute for a deliberate retrieval model.

Choosing the right mode

`text`

Use when:

MongoDB text indexing is sufficient
exact field ownership of the text index is clear
keyword search is enough

Requires a text index.

`regex`

Use when:

the searchable fields are explicit
case-insensitive matching is enough
weighted field scoring is useful

Good for smaller datasets or precise keyed fields.
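
To illustrate what case-insensitive matching with weighted field scoring means, here is a minimal sketch in plain Python. It does not reproduce how tibi scores `regex` results; the field names and weights are hypothetical.

```python
import re

# Hypothetical field weights; a match in a higher-weighted field counts more.
FIELD_WEIGHTS = {"title": 3.0, "summary": 1.5, "body": 1.0}


def regex_score(document: dict, query: str, weights: dict[str, float]) -> float:
    """Sum the weights of all fields that contain the query, ignoring case."""
    # re.escape treats the user input as a literal string, not as a pattern.
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    score = 0.0
    for field, weight in weights.items():
        value = document.get(field, "")
        if isinstance(value, str) and pattern.search(value):
            score += weight
    return score


docs = [
    {"title": "Contact", "summary": "Reach us", "body": "See opening hours page"},
    {"title": "Opening hours", "summary": "When we are open", "body": "Mon to Fri"},
]
ranked = sorted(docs, key=lambda d: regex_score(d, "opening", FIELD_WEIGHTS), reverse=True)
print([d["title"] for d in ranked])  # ['Opening hours', 'Contact']
```

Every document is scanned per query, which is why this style of matching suits smaller datasets.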

`filter` or `eval`

Use when:

search logic depends on auth, project context, or business-specific filtering
plain keyword matching is not the full contract

Treat these as controlled power tools. The resulting filters are still sanitized against blocked operators.
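
As an illustration of what sanitizing against blocked operators can look like, the sketch below walks a MongoDB-style filter and reports disallowed keys. The blocklist is an assumption; this skill does not list the operators the server actually blocks.

```python
# Hypothetical blocklist of MongoDB operators that execute server-side JavaScript.
BLOCKED_OPERATORS = frozenset({"$where", "$function", "$accumulator"})


def find_blocked_operators(node, blocked=BLOCKED_OPERATORS) -> list[str]:
    """Recursively collect blocked operator keys from a filter document."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key in blocked:
                found.append(key)
            found.extend(find_blocked_operators(value, blocked))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_blocked_operators(item, blocked))
    return found


safe = {"$and": [{"status": "published"}, {"title": {"$regex": "tibi", "$options": "i"}}]}
unsafe = {"$or": [{"status": "published"}, {"$where": "sleep(1000)"}]}

print(find_blocked_operators(safe))    # []
print(find_blocked_operators(unsafe))  # ['$where']
```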

`ngram`

Use when:

typo tolerance or substring matching is needed
users search codes, names, transliterated terms, or partial inputs

This is enrichment-based search. It stores generated `_search` data, so define when and how that data is regenerated.
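
The typo tolerance of n-gram matching comes from comparing overlapping character fragments rather than whole words. The sketch below shows the general technique with trigrams and Jaccard similarity; it is not tibi's enrichment code, and the normalization is an assumption.

```python
def char_ngrams(text: str, n: int = 3) -> set[str]:
    """Return the set of lowercase character n-grams of a string."""
    normalized = text.lower().strip()
    if len(normalized) < n:
        # Strings shorter than n yield themselves as a single fragment.
        return {normalized} if normalized else set()
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity between the n-gram sets of two strings."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


print(ngram_similarity("embedding", "embeding"))  # 0.625
print(ngram_similarity("embedding", "regex"))     # 0.0
```

A single dropped letter still leaves five of eight distinct trigrams in common, so the misspelled query keeps a high score while an unrelated term scores zero.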

`vector`

Use when:

semantic similarity matters more than literal keyword overlap
the project can support embedding-provider setup and operator cost expectations
search quality justifies added complexity

Vector mode can use:

fields
custom eval transformation
documentPrefix
queryPrefix
overflow: truncate|chunk
rrf tuning for hybrid scoring
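
For orientation on the `rrf` option: reciprocal rank fusion merges several ranked lists by summing `1 / (k + rank)` per document. The sketch below shows the generic formula with the commonly used constant `k = 60`; which parameters tibi exposes for tuning is not covered here, so the function signature is an assumption.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of document ids into one list sorted by RRF score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        # Ranks are 1-based: the top hit of each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


keyword_hits = ["a", "b", "c"]  # e.g. results of a keyword-style mode
vector_hits = ["b", "d", "a"]   # e.g. results of vector similarity

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print([doc_id for doc_id, _ in fused])  # ['b', 'a', 'd', 'c']
```

Documents that appear in both lists rise to the top, and a larger `k` flattens the difference between adjacent ranks.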

Auto-regeneration and admin flows

For `ngram` and `vector`, `autoRegenerate: true` can refresh stale enrichment data after config changes.

If regeneration is needed manually, the admin flow depends on project admin tokens with:

allowRegenerateSearch: true

Treat regeneration as part of the search contract, not as an implementation footnote.
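
One way to reason about stale enrichment data is to fingerprint the search config and compare it with the fingerprint recorded at the last regeneration. This is a hypothetical operator-side sketch, not a description of how tibi detects staleness; the config keys are illustrative.

```python
import hashlib
import json
from typing import Optional


def config_fingerprint(search_config: dict) -> str:
    """Stable SHA-256 fingerprint of a search config, independent of key order."""
    canonical = json.dumps(search_config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_regeneration(search_config: dict, stored_fingerprint: Optional[str]) -> bool:
    """True when no fingerprint is stored or the config changed since it was taken."""
    return stored_fingerprint != config_fingerprint(search_config)


old_config = {"mode": "ngram", "fields": ["title"]}
new_config = {"mode": "ngram", "fields": ["title", "summary"]}
stored = config_fingerprint(old_config)

print(needs_regeneration(old_config, stored))  # False
print(needs_regeneration(new_config, stored))  # True
```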

Search and LLM are related but not identical

The LLM system and the embedding system are adjacent, but they are not the same thing.

llm.providers drive chat/completion features
embedding.providers drive vector search enrichment
org/user budgets affect LLM usage workflows
search design still needs its own retrieval and operator decisions

Do not assume that enabling editor AI automatically defines a sound search architecture.

Anti-patterns

leaving search unspecified and hoping auto-fallback is “good enough”
enabling vector search without a real provider/runtime plan
forgetting text indexes for `mode: text`
enabling enrichment modes without a regeneration story
mixing editor AI decisions with search decisions until neither is clear

Verification checklist

After search-related changes, verify all of these:

the project has an explicit yes/no search decision
server-side embedding providers exist when vector search is configured
required text or search indexes exist
`?q=` and `?qName=` behavior matches the intended search contract
regeneration behavior is defined for enrichment-based modes
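
When checking the `?q=` and `?qName=` item, it helps to build request URLs in one place so that encoding is consistent. The sketch below assumes that `qName` selects a named search config and uses a placeholder base URL; confirm both against tibi-server/docs/04-collections.md.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(base_url: str, query: str, search_name: Optional[str] = None) -> str:
    """Append q (and optionally qName) as URL-encoded query parameters."""
    params = {"q": query}
    if search_name:
        params["qName"] = search_name
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint; the real collection URL depends on the project.
url = build_search_url("https://example.com/api/pages", "opening hours", "semantic")
print(url)

# Round-trip check: the decoded parameters match what was passed in.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"], decoded["qName"])
```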

What an LLM should inspect first

When asked to add or review search on this starter, inspect in this order:

tibi-server/docs/04-collections.md
tibi-server/docs/02-configuration.md
existing collection `search:` config
whether the project needs keyword, fuzzy, semantic, or no search
operator expectations for regeneration and provider secrets

This prevents over-engineered vector setups and under-specified search behavior.
