Models
Contents
- Configuration
- Language Detection
- Translation
- Speech-to-Text
Configuration
Models section
Models are configured by name in appsettings.json under Models. Each entry has a Type discriminator and type-specific properties. The name you choose becomes the value of the model field in API requests.
"Models": {
"<name>": {
"Type": "<type>", // required: NLLB | M2M100 | FastText | LibreTranslate | Whisper
// ... type-specific properties (see each model below)
}
}
Multiple entries of the same type are allowed — just use different names. If Type is missing or unknown, the service will fail to start with a configuration error.
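The fail-fast rule above can be sketched as follows (illustrative only; the service itself is not Python, and `validate_models` is a hypothetical helper, not part of its code):

```python
# Sketch of the Type-discriminator check described above.
# KNOWN_TYPES mirrors the discriminators listed in the config comment.
KNOWN_TYPES = {"NLLB", "M2M100", "FastText", "LibreTranslate", "Whisper"}

def validate_models(models: dict) -> list[str]:
    """Return a list of configuration errors for a Models section."""
    errors = []
    for name, entry in models.items():
        model_type = entry.get("Type")
        if model_type is None:
            errors.append(f"model '{name}': missing required Type")
        elif model_type not in KNOWN_TYPES:
            errors.append(f"model '{name}': unknown Type '{model_type}'")
    return errors

# Two entries of the same Type under different names are fine:
config = {
    "nllb-a": {"Type": "NLLB", "Path": "./models/a"},
    "nllb-b": {"Type": "NLLB", "Path": "./models/b"},
    "broken": {"Path": "./models/c"},  # no Type: would fail startup
}
print(validate_models(config))  # one error, for 'broken'
```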
Properties for NLLB and M2M100:
| Property | Default | Description |
|---|---|---|
| Path | — | Path to the directory with model files |
| EncoderFile | encoder_model.onnx | Encoder ONNX filename |
| DecoderFile | decoder_model.onnx | Decoder ONNX filename |
| TokenizerFile | sentencepiece.bpe.model | SentencePiece tokenizer filename |
| TokenizerConfigFile | "" | Secondary tokenizer config (tokenizer.json for NLLB, added_tokens.json for M2M100) |
| MaxTokens | 512 | Maximum tokens per translation |
| BeamSize | 1 | Beam search width (NLLB only; higher = better quality, slower) |
| VocabFile | "" | BPE vocabulary file (M2M100 only: vocab.json) |
Properties for FastText:
| Property | Default | Description |
|---|---|---|
| Path | — | Path to the model file (.bin or .ftz) |
| LabelFormat | "flores200" | Format of the model’s output labels. Use "iso639-1" for LID-176 (outputs __label__en), "flores200" for GlotLID (outputs __label__eng_Latn). Also supports "iso639-2" and "iso639-3". |
| LabelPrefix | "__label__" | Prefix to strip from each label before format conversion |
| LabelSuffix | "" | Optional suffix to strip from each label |
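How LabelPrefix, LabelSuffix, and LabelFormat interact can be illustrated with a small sketch (hypothetical Python, not the service's code; `parse_label` is an invented name):

```python
# Strip the configured prefix/suffix from a raw fastText label; the
# remaining code is then interpreted according to LabelFormat.
def parse_label(raw: str, prefix: str = "__label__", suffix: str = "") -> str:
    """Return the bare language code from a raw fastText label."""
    if prefix and raw.startswith(prefix):
        raw = raw[len(prefix):]
    if suffix and raw.endswith(suffix):
        raw = raw[: len(raw) - len(suffix)]
    return raw

# LID-176 emits ISO 639-1 codes, GlotLID emits FLORES-200 codes:
print(parse_label("__label__en"))        # "en"       -> LabelFormat "iso639-1"
print(parse_label("__label__eng_Latn"))  # "eng_Latn" -> LabelFormat "flores200"
```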
Properties for LibreTranslate:
| Property | Default | Description |
|---|---|---|
| BaseUrl | — | URL of the LibreTranslate instance, e.g. http://libretranslate:5000 |
| ApiKey | "" | API key, if the instance requires one |
Type compatibility
Type identifies the tokenizer format, not the exact model. This means fine-tuned or alternative models are supported as long as the tokenizer is compatible:
| Type | Compatible with |
|---|---|
| NLLB | Any ONNX encoder-decoder model using the NLLB-200 SentencePiece tokenizer with FLORES-200 language tokens |
| M2M100 | Any ONNX encoder-decoder model using the M2M-100 tokenizer (vocab.json + added_tokens.json, ISO 639-1 __lang__ tokens) |
| FastText | Any fastText supervised classification model in .bin or .ftz format |
| LibreTranslate | Any LibreTranslate-compatible HTTP API endpoint |
| Whisper | Any Whisper ggml model file (ggml-*.bin) compatible with whisper.cpp / Whisper.net |
Models not compatible without a new type: MarianMT (OPUS-MT), mBART-50, SeamlessM4T — they use different tokenizer formats.
Translation section
Controls routing and lifecycle of loaded models.
"Translation": {
"DefaultModel": "m2m100_418M", // model used when the request's model field is empty
"AutoDetect": "lid-176-ftz", // name of a FastText model for language auto-detection
"AudioToText": "whisper-small", // name of a Whisper model for speech-to-text; "" = STT disabled
"AllowedModels": ["m2m100_418M"], // allowlist; empty = all configured translation models allowed
"ModelTtlMinutes": 30 // minutes of inactivity before a model is unloaded from memory
}
| Property | Default | Description |
|---|---|---|
| DefaultModel | "" | Name of the model to use when model is not specified in the request. If empty and the request omits model, the request fails. |
| AutoDetect | "" | Name of a FastText model used for language auto-detection. Required to use source_language: "auto" in TranslateText or the DetectLanguage RPC. If empty or the model file is missing, falls back to heuristic detection. |
| AudioToText | "" | Name of a Whisper model entry used for speech-to-text transcription (TranscribeAudio RPC). If empty, the RPC returns FAILED_PRECONDITION. |
| AllowedModels | [] | Restricts which translation models clients may request by name. An empty list means all configured translation models are accessible. Whisper and FastText entries are not affected by this list. |
| ModelTtlMinutes | 30 | Translation models and the Whisper STT model are kept in memory for this many minutes after last use, then unloaded to free resources. The auto-detect language detector (Translation:AutoDetect) is loaded once at startup and is not affected by this TTL. |
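The DefaultModel/AllowedModels routing rules can be summarized in a short sketch (hypothetical Python; `resolve_model` is an invented helper mirroring the table above, not the actual implementation):

```python
# Pick the translation model for a request, following the rules above:
# fall back to DefaultModel, then enforce the AllowedModels allowlist.
def resolve_model(requested: str, default_model: str, allowed: list[str]) -> str:
    """Return the model name to use, or raise on a disallowed/missing model."""
    name = requested or default_model
    if not name:
        raise ValueError("no model in request and DefaultModel is empty")
    if allowed and name not in allowed:
        raise PermissionError(f"model '{name}' is not in AllowedModels")
    return name

print(resolve_model("", "m2m100_418M", []))            # falls back to DefaultModel
print(resolve_model("nllb", "m2m100_418M", ["nllb"]))  # explicitly allowed
```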
Language Detection
Language detection models are used for automatic source language detection (source_language: "auto" in TranslateText, or the DetectLanguage RPC). They are not translation models and cannot be used as model in translation requests. Configure the active detector via Translation:AutoDetect.
LID-176
176 languages · fastText binary format (.ftz)
Facebook’s compact language identification model. Fast and lightweight (~1 MB). Best choice when you only need common languages and care about startup time.
License: CC-BY-SA-3.0 Commercial use is allowed. You must credit Facebook Research. If you distribute a modified version of the model itself, it must remain under the same license — this does not apply to using it as a backend service.
Download
```sh
huggingface-cli download lopatnov/fasttext-language-id \
  --local-dir ./models/langdetect/lid176
```
HuggingFace repo: lopatnov/fasttext-language-id
appsettings.json
"Models": {
"langdetect": {
"Type": "FastText",
"Path": "./models/langdetect/lid176/lid.176.ftz",
"LabelFormat": "iso639-1" // LID-176 outputs ISO 639-1 codes (en, de, fr, …)
}
},
"Translation": {
"AutoDetect": "langdetect"
}
GlotLID v3
1633 language varieties · fastText binary format (.bin)
Covers a much wider range of languages than LID-176, including low-resource and minority languages. Larger model (~1.6 GB).
License: Apache 2.0 Unrestricted commercial use. No attribution required.
Download
```sh
huggingface-cli download lopatnov/glotlid \
  --local-dir ./models/langdetect/glotlid
```
HuggingFace repo: lopatnov/glotlid
appsettings.json
"Models": {
"langdetect": {
"Type": "FastText",
"Path": "./models/langdetect/glotlid/model_v3.bin",
"LabelFormat": "flores200" // GlotLID outputs FLORES-200 codes (eng_Latn, ukr_Cyrl, …) — this is the default
}
},
"Translation": {
"AutoDetect": "langdetect"
}
Translation
NLLB-200 (600M distilled)
200 languages · ONNX
Meta’s No Language Left Behind model, distilled to 600M parameters. Good balance of quality and speed. Recommended starting point.
License: CC-BY-NC-4.0 ⚠️ Non-commercial use only. Cannot be used in commercial products or services.
Download
```sh
huggingface-cli download lopatnov/nllb-200-distilled-600M-onnx \
  --local-dir ./models/nllb-600m
```
HuggingFace repo: lopatnov/nllb-200-distilled-600M-onnx
appsettings.json
"Models": {
"nllb": {
"Type": "NLLB",
"Path": "./models/nllb-600m",
"EncoderFile": "encoder_model.onnx",
"DecoderFile": "decoder_model.onnx",
"TokenizerFile": "sentencepiece.bpe.model",
"TokenizerConfigFile": "tokenizer.json",
"MaxTokens": 512,
"BeamSize": 1
}
},
"Translation": {
"DefaultModel": "nllb"
}
NLLB-200 1.3B
200 languages · ONNX
Higher quality than the 600M distilled variant at the cost of more memory (~5 GB).
License: CC-BY-NC-4.0 ⚠️ Non-commercial use only.
Download
```sh
huggingface-cli download lopatnov/nllb-200-1.3B-onnx \
  --local-dir ./models/nllb-1.3b
```
HuggingFace repo: lopatnov/nllb-200-1.3B-onnx
appsettings.json
Same as 600M distilled — change Path to ./models/nllb-1.3b.
NLLB-200 3.3B
200 languages · ONNX
Highest quality NLLB variant. Requires significant RAM/VRAM (~12 GB).
License: CC-BY-NC-4.0 ⚠️ Non-commercial use only.
Download
```sh
huggingface-cli download lopatnov/nllb-200-3.3B-onnx \
  --local-dir ./models/nllb-3.3b
```
HuggingFace repo: lopatnov/nllb-200-3.3B-onnx
appsettings.json
Same as 600M distilled — change Path to ./models/nllb-3.3b.
M2M-100 (418M)
100 languages · ONNX
Facebook’s many-to-many translation model. MIT-licensed — suitable for commercial use. 418M parameter variant, lower memory footprint.
License: MIT ✅ Unrestricted commercial use.
Download
```sh
huggingface-cli download lopatnov/m2m100_418M-onnx \
  --local-dir ./models/m2m100-418m
```
HuggingFace repo: lopatnov/m2m100_418M-onnx
appsettings.json
"Models": {
"m2m100": {
"Type": "M2M100",
"Path": "./models/m2m100-418m",
"EncoderFile": "encoder_model.onnx",
"DecoderFile": "decoder_model.onnx",
"TokenizerFile": "sentencepiece.bpe.model",
"TokenizerConfigFile": "added_tokens.json",
"VocabFile": "vocab.json",
"MaxTokens": 512
}
},
"Translation": {
"DefaultModel": "m2m100"
}
M2M-100 (1.2B)
100 languages · ONNX
Higher quality than the 418M variant. Good choice for commercial deployments where translation quality matters.
License: MIT ✅ Unrestricted commercial use.
Download
```sh
huggingface-cli download lopatnov/m2m100_1.2B-onnx \
  --local-dir ./models/m2m100-1.2b
```
HuggingFace repo: lopatnov/m2m100_1.2B-onnx
appsettings.json
Same as 418M — change Path to ./models/m2m100-1.2b.
LibreTranslate
Argos Translate backend · HTTP API
An open-source machine translation server. Runs as a separate Docker container alongside the service. Useful as an additional translation option or fallback.
License: AGPL-3.0 (LibreTranslate server) · Argos Translate language packages are MIT/CC-BY licensed. ⚠️ AGPL requires that the source code of any network-accessible service using LibreTranslate is made publicly available. Evaluate whether this is acceptable for your use case before using in production.
Setup
- In `docker/docker-compose.yml`, uncomment the `libretranslate:` service block, the `depends_on:` section in the `translate:` service, and the `libretranslate-models:` volume at the bottom of the file.
- Add to `appsettings.json`:
"Models": {
"libretranslate": {
"Type": "LibreTranslate",
"BaseUrl": "http://libretranslate:5000",
"ApiKey": "" // set if your LibreTranslate instance requires a key
}
},
"Translation": {
"DefaultModel": "libretranslate"
}
The BaseUrl http://libretranslate:5000 works when running via Docker Compose. For an external instance, replace with its URL.
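For reference, requests to such an instance follow LibreTranslate's public /translate HTTP API (q, source, target, format, api_key). The helper below is an illustrative sketch of the payload shape, not the service's code:

```python
# Build the URL and JSON body for a LibreTranslate /translate call.
def build_translate_request(base_url: str, text: str, source: str,
                            target: str, api_key: str = "") -> tuple[str, dict]:
    """Return (url, payload) for a POST to LibreTranslate's /translate."""
    payload = {"q": text, "source": source, "target": target, "format": "text"}
    if api_key:
        payload["api_key"] = api_key  # sent only when the instance requires a key
    return f"{base_url}/translate", payload

url, body = build_translate_request("http://libretranslate:5000",
                                    "Hello", "en", "de")
print(url)   # http://libretranslate:5000/translate
print(body)
```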
Speech-to-Text
Whisper
OpenAI Whisper · ggml format · 99 languages
Runs locally via Whisper.net (a managed wrapper over whisper.cpp). Accepts any WAV file — resampled automatically to 16 kHz mono before inference. The model is loaded lazily on first request and unloaded after ModelTtlMinutes of inactivity.
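The lazy-load and TTL-unload lifecycle can be sketched as follows (hypothetical Python; the real service manages native model handles and scheduling, this only mirrors the described behavior):

```python
import time

# Minimal sketch of lazy load on first use plus TTL-based unload.
class TtlModelCache:
    def __init__(self, loader, ttl_seconds: float):
        self._loader = loader
        self._ttl = ttl_seconds
        self._model = None
        self._last_used = 0.0

    def get(self):
        """Load the model on first request; refresh the idle timer on every use."""
        if self._model is None:
            self._model = self._loader()
        self._last_used = time.monotonic()
        return self._model

    def evict_if_idle(self):
        """Unload when idle longer than the TTL (called periodically)."""
        if self._model is not None and time.monotonic() - self._last_used > self._ttl:
            self._model = None

cache = TtlModelCache(lambda: "whisper-model", ttl_seconds=0.05)
cache.get()            # first request triggers the load
time.sleep(0.1)
cache.evict_if_idle()  # idle past the TTL, so the model is unloaded
```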
License: MIT ✅ Unrestricted commercial use.
Download
```sh
# small (~500 MB) — recommended default
huggingface-cli download lopatnov/whisper.cpp ggml-small.bin \
  --local-dir ./models/audio-to-text/whisper.cpp

# medium (~1.5 GB) — better quality
huggingface-cli download lopatnov/whisper.cpp ggml-medium.bin \
  --local-dir ./models/audio-to-text/whisper.cpp
```
HuggingFace repo: lopatnov/whisper.cpp
appsettings.json
"Models": {
"whisper-small": {
"Type": "Whisper",
"Path": "./models/audio-to-text/whisper.cpp/ggml-small.bin"
},
"whisper-medium": {
"Type": "Whisper",
"Path": "./models/audio-to-text/whisper.cpp/ggml-medium.bin"
}
},
"Translation": {
"AudioToText": "whisper-small" // switch to "whisper-medium" for better quality
}
To switch models: change AudioToText — no code changes needed.
Properties for Whisper:
| Property | Required | Description |
|---|---|---|
| Path | ✅ | Path to the .bin ggml model file |