Codebase Indexing
Mita can index your codebase for retrieval-augmented generation (RAG). When enabled, the agent automatically retrieves relevant code snippets to provide better, context-aware responses.
How It Works
- Parse: Tree-sitter extracts semantic chunks (functions, classes) from your code
- Embed: Ollama generates vector embeddings using nomic-embed-text
- Store: Chunks are stored in a local LanceDB database (.mita/index/)
- Retrieve: When you ask a question, relevant chunks are injected into the LLM context
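The retrieve step can be pictured as a nearest-neighbour search over stored embeddings. The sketch below is illustrative only: it assumes cosine similarity as the ranking metric and uses toy 3-dimensional vectors in place of real nomic-embed-text embeddings. The names cosine_similarity, retrieve, and toy_index are hypothetical and are not part of Mita; in practice the lookup is handled by LanceDB.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, index, top_k=10):
    """Return the text of the top_k chunks most similar to query_vec.

    `index` is a list of (chunk_text, embedding) pairs; this in-memory
    list is a stand-in (assumption) for the on-disk vector store.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


# Toy vectors standing in for real embeddings (assumption for illustration).
toy_index = [
    ("def parse_config(path): ...", [0.9, 0.1, 0.0]),
    ("class HttpClient: ...", [0.0, 0.2, 0.9]),
    ("def load_settings(): ...", [0.8, 0.3, 0.1]),
]

top = retrieve([1.0, 0.0, 0.0], toy_index, top_k=2)
print(top)
```

The top_k parameter here mirrors the top_k setting in the configuration: a larger value puts more chunks into the LLM context at the cost of more tokens.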
Build the Index
Run mita index build to build the index. If the embedding model isn't installed, Mita will prompt you to pull it.
To rebuild from scratch, add --force: mita index build --force
Search
Status
Shows chunk count, index size, and last build time.
Clear
Deletes the index. Rebuild with mita index build.
Configuration
[index]
enabled = true
chunk_size = 512
chunk_overlap = 64
top_k = 10
exclude_patterns = [
    "*.lock",
    ".mita/**",
    "node_modules/**",
    ".git/**",
    "*.min.js",
    "*.min.css",
    "dist/**",
    "build/**",
    "__pycache__/**",
]
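To get a feel for what the default exclude_patterns filter out, the following sketch checks relative paths against the same list using Python's fnmatch module. This is an approximation under stated assumptions: Mita's actual glob semantics are not documented here and may differ (for example, in how ** or nested directories are handled), and is_excluded is a hypothetical helper, not a Mita API.

```python
from fnmatch import fnmatchcase

# Same patterns as the default configuration above.
EXCLUDE_PATTERNS = [
    "*.lock",
    ".mita/**",
    "node_modules/**",
    ".git/**",
    "*.min.js",
    "*.min.css",
    "dist/**",
    "build/**",
    "__pycache__/**",
]


def is_excluded(rel_path, patterns=EXCLUDE_PATTERNS):
    """Return True if a repo-relative path matches any exclude pattern.

    Uses fnmatch-style matching, where `*` also matches `/`.
    This is an assumption; the real matcher may behave differently.
    """
    normalized = rel_path.replace("\\", "/")  # tolerate Windows separators
    return any(fnmatchcase(normalized, pattern) for pattern in patterns)


for path in ["src/main.py", "Cargo.lock", "node_modules/react/index.js"]:
    print(path, is_excluded(path))
```

Under fnmatch rules a pattern such as dist/** only matches paths that begin with dist/, so a nested src/dist/ directory would not be excluded by this sketch. If you rely on nested matches, verify the behaviour against your own index.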
Supported Languages
Tree-sitter parsing supports semantic chunking for: Python, JavaScript, TypeScript, Rust, Go, Java, C, C++, Ruby, PHP, and more. Files in unsupported languages fall back to line-based chunking.
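As a rough illustration of line-based chunking with overlap, the sketch below splits a list of lines into fixed-size windows that share a few lines with their neighbours. It is not Mita's implementation: chunk_lines is a hypothetical function, and it assumes that chunk_size and chunk_overlap are measured in lines, which the configuration does not state (the real unit could be tokens or characters).

```python
def chunk_lines(lines, chunk_size=512, chunk_overlap=64):
    """Split `lines` into windows of up to chunk_size lines.

    Consecutive windows share chunk_overlap lines so that code spanning
    a boundary appears in full in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(lines), step):
        chunks.append(lines[start:start + chunk_size])
        if start + chunk_size >= len(lines):
            break  # the last window already reached the end of the file
    return chunks


# Small sizes so the overlap is easy to see.
sample = [f"line {i}" for i in range(10)]
chunks = chunk_lines(sample, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

A larger overlap reduces the chance of cutting a logical unit in half but increases the number of stored chunks, which is the trade-off behind the chunk_size and chunk_overlap settings.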