No configuration is required to get started. docmd-search uses sensible defaults and a tiered config system that lets you override settings at each level.
Config resolution
Settings are merged in order - each layer overrides the previous:
Defaults
Built-in values that work for most projects.
Global config
~/.docmd-search/config.json - applies to all projects on your system. Created by the first-run wizard.
Project config
.docmd-search/config.json - per-project overrides that live in your repository.
CLI flags
Command-line flags take highest priority and override everything else.
Default values
{
"model": "Xenova/all-MiniLM-L6-v2",
"chunkSize": 256,
"chunkOverlap": 32,
"include": ["**/*.md", "**/*.txt", "**/*.html"],
"exclude": [
"**/node_modules/**",
"**/dist/**",
"**/build/**",
"**/site/**",
"**/.git/**",
"**/.docmd-search/**"
],
"outDir": ".docmd-search",
"incremental": true,
"topK": 10
}
All options
| Option | Type | Default | Description |
|---|---|---|---|
model |
string |
Xenova/all-MiniLM-L6-v2 |
Embedding model identifier (see Model selection below) |
chunkSize |
number |
256 |
Maximum tokens per chunk |
chunkOverlap |
number |
32 |
Overlap between adjacent chunks (in tokens) |
include |
string[] |
["**/*.md", "**/*.txt", "**/*.html"] |
Glob patterns for files to index |
exclude |
string[] |
(see above) | Glob patterns for files to skip |
outDir |
string |
.docmd-search |
Output directory for the index (relative to project root) |
incremental |
boolean |
true |
Only re-index changed files on subsequent runs |
topK |
number |
10 |
Maximum search results to return |
Global config
Located at ~/.docmd-search/config.json. Created automatically by the first-run wizard. Contains your model selection and wizard status.
{
"model": "Xenova/all-MiniLM-L6-v2",
"wizardCompleted": true
}
To reconfigure:
docmd-search --settings
Downloaded models are cached at ~/.docmd-search/models/ - a stable location that survives npm install and package upgrades. Switching models does not delete previous downloads, so you can switch back without re-downloading.
Project config
Create .docmd-search/config.json in your project root to override settings for a specific project.
{
"model": "Xenova/paraphrase-multilingual-MiniLM-L12-v2",
"chunkSize": 512,
"chunkOverlap": 64,
"include": ["docs/**/*.md"],
"exclude": ["docs/drafts/**", "docs/archive/**"]
}
You only need to specify the fields you want to change. Unspecified fields inherit from the global config or defaults.
CLI overrides
The --model flag overrides the model for a single run without modifying any config file:
# Use the multilingual model for a single run
docmd-search ./docs --model Xenova/paraphrase-multilingual-MiniLM-L12-v2
Chunk sizing guide
Chunk size affects search quality and index size:
| Scenario | Recommended chunkSize |
Notes |
|---|---|---|
| Short API docs | 128 |
Smaller chunks = more precise matches |
| General documentation | 256 (default) |
Good balance of precision and context |
| Long-form guides | 512 |
Keeps more context per result |
| Technical references | 128-256 |
Precise matches for specific parameters |
chunkOverlap ensures content near chunk boundaries isn’t lost. A value of 32 (default) works well for most cases. Set to 0 only if your content has very clear section boundaries.
Include/exclude patterns
Patterns follow standard glob syntax:
{
"include": [
"docs/**/*.md",
"guides/**/*.md",
"api/**/*.html"
],
"exclude": [
"**/node_modules/**",
"**/drafts/**",
"**/*.draft.md",
"docs/internal/**"
]
}
The exclude list always includes common system directories (node_modules, .git, dist, build, etc.) by default. Your custom excludes are added on top.
Model selection
All models run in Int8-quantized form (q8) - 4× smaller and 2-3× faster than full precision with minimal quality loss. Models are downloaded once and cached at ~/.docmd-search/models/.
| Model | Size | Languages | Speed | Best for |
|---|---|---|---|---|
Xenova/all-MiniLM-L6-v2 (default) |
~23 MB | English only | ⚡ Fastest | English-only documentation |
Xenova/paraphrase-multilingual-MiniLM-L12-v2 |
~118 MB | 50+ languages | Fast | i18n docs (Chinese, German, etc.) |
Xenova/multilingual-e5-small |
~118 MB | 100+ languages | Fast | Wide language coverage |
Xenova/paraphrase-multilingual-mpnet-base-v2 |
~270 MB | 50+ languages | Medium | Best multilingual quality |
The default model (all-MiniLM-L6-v2) is trained on English text only. If your docs include Chinese, German, French, or other languages, switch to a multilingual model or search quality will be poor for non-English content.
You can use any HuggingFace model with ONNX weights compatible with Transformers.js. Browse at huggingface.co/models and filter by transformers.js library. Ensure the model repo contains an onnx/ folder.
docmd integration
When using docmd-search as a plugin inside a docmd project, configuration happens in your docmd.config.js:
// docmd.config.js
export default {
plugins: {
search: {
semantic: true, // ← enables docmd-search
model: 'Xenova/bge-small-en-v1.5', // optional model override
chunkSize: 512, // optional
chunkOverlap: 64, // optional
}
}
};
How it works
When semantic: true is set:
- docmd’s plugin-search dynamically imports
docmd-searchat build time - The indexer runs over your docs source directory
- The semantic index is written to
<outputDir>/.docmd-search/ - The browser client bundle is served instead of MiniSearch
If docmd-search is not installed, plugin-search falls back to keyword search and prints a helpful install message.
Pre-built index (advanced)
If you’ve already built an index with the standalone CLI, you can tell docmd to use it directly:
// docmd.config.js
export default {
plugins: {
search: {
semantic: true,
indexDir: '/path/to/.docmd-search', // ← use pre-built index
}
}
};
When indexDir is provided and contains a valid manifest.json, plugin-search skips indexing entirely and just serves from that directory. This is how docmd-search --ui works - it builds the index first, then spawns docmd with a config pointing at the pre-built index.