
Models


Configuration

Models section

Models are configured by name in appsettings.json under Models. Each entry has a Type discriminator and type-specific properties. The name you choose becomes the value of the model field in API requests.

"Models": {
  "<name>": {
    "Type": "<type>",  // required: NLLB | M2M100 | FastText | LibreTranslate | Whisper | Piper | Redirect
    // ... type-specific properties (see each model below)
  }
}

Multiple entries of the same type are allowed — just use different names. If Type is missing or unknown, the service will fail to start with a configuration error.
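The startup rule above can be sketched as a small check. This is only an illustration in Python, not the service's own code: `validate_models` and `KNOWN_TYPES` are hypothetical names, and treating `Type` as case-sensitive is an assumption.

```python
# Hypothetical validator for the "Models" section; not the service's actual code.
KNOWN_TYPES = {"NLLB", "M2M100", "FastText", "LibreTranslate",
               "Whisper", "Piper", "Redirect"}


def validate_models(models: dict) -> list[str]:
    """Return configuration errors; an empty list means every entry has a known Type."""
    errors = []
    for name, entry in models.items():
        model_type = entry.get("Type")
        if model_type is None:
            errors.append(f"Models:{name}: missing Type")
        elif model_type not in KNOWN_TYPES:  # assumption: exact, case-sensitive match
            errors.append(f"Models:{name}: unknown Type '{model_type}'")
    return errors


if __name__ == "__main__":
    config = {
        "nllb-a": {"Type": "NLLB", "Path": "./models/nllb-600m"},
        "nllb-b": {"Type": "NLLB", "Path": "./models/nllb-1.3b"},  # same type, different name
        "broken": {"Path": "./models/unknown"},                    # no Type
    }
    for problem in validate_models(config):
        print(problem)
```

A service following the documented behaviour would refuse to start whenever such a list is non-empty.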

Properties for NLLB and M2M100:

| Property | Default | Description |
|---|---|---|
| Path | | Path to the directory with model files |
| EncoderFile | encoder_model.onnx | Encoder ONNX filename |
| DecoderFile | decoder_model.onnx | Decoder ONNX filename |
| TokenizerFile | sentencepiece.bpe.model | SentencePiece tokenizer filename |
| TokenizerConfigFile | "" | Secondary tokenizer config (tokenizer.json for NLLB, added_tokens.json for M2M100) |
| MaxTokens | 512 | Maximum tokens per translation |
| BeamSize | 1 | Beam search width (NLLB only; higher = better quality, slower) |
| VocabFile | "" | BPE vocabulary file (M2M100 only: vocab.json) |
| ExecutionProvider | "" (auto) | ONNX execution provider. "" or "auto" = probe best available (DirectML on Windows, CUDA on Linux, then CPU). Explicit: "cpu", "directml", "cuda". Falls back to CPU with a warning if the requested provider is unavailable. |
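The ExecutionProvider rules can be read as the following decision function. It is a sketch under stated assumptions: `resolve_provider`, its `available` set and the `platform` strings are hypothetical, and the real service probes ONNX Runtime rather than receiving a set.

```python
import logging


def resolve_provider(requested: str, available: set[str], platform: str) -> str:
    """Pick an execution provider following the documented rules.

    `available` stands in for whatever a real probe would report;
    "cpu" is assumed to be usable everywhere.
    """
    requested = (requested or "auto").lower()
    if requested == "auto":
        # Documented preference: DirectML on Windows, CUDA on Linux, then CPU.
        preferred = {"windows": "directml", "linux": "cuda"}.get(platform)
        return preferred if preferred in available else "cpu"
    if requested in available:
        return requested
    logging.warning("Provider %r is unavailable, falling back to CPU", requested)
    return "cpu"


if __name__ == "__main__":
    print(resolve_provider("", {"cpu", "cuda"}, "linux"))
    print(resolve_provider("cuda", {"cpu"}, "linux"))
```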

Properties for FastText:

| Property | Default | Description |
|---|---|---|
| Path | | Path to the model file (.bin or .ftz) |
| LabelFormat | "flores200" | Format of the model’s output labels. Use "iso639-1" for LID-176 (outputs __label__en), "flores200" for GlotLID (outputs __label__eng_Latn). Also supports "iso639-2", "iso639-3". |
| LabelPrefix | "__label__" | Prefix to strip from each label before format conversion. |
| LabelSuffix | "" | Optional suffix to strip from each label. |
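As an illustration of what LabelPrefix and LabelSuffix do, here is a minimal sketch of the stripping step. `strip_label` is a hypothetical helper; the subsequent conversion between label formats is not shown because it needs code tables that this page does not describe.

```python
def strip_label(raw: str, prefix: str = "__label__", suffix: str = "") -> str:
    """Remove the configured prefix and suffix from a raw fastText label."""
    if prefix and raw.startswith(prefix):
        raw = raw[len(prefix):]
    if suffix and raw.endswith(suffix):
        raw = raw[:-len(suffix)]
    return raw


if __name__ == "__main__":
    print(strip_label("__label__en"))        # LID-176 style label
    print(strip_label("__label__eng_Latn"))  # GlotLID style label
```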

Properties for LibreTranslate:

| Property | Default | Description |
|---|---|---|
| BaseUrl | | URL of the LibreTranslate instance, e.g. http://libretranslate:5000 |
| ApiKey | "" | API key, if the instance requires one |

Properties for Redirect:

Forwards translation requests to another Lopatnov.Translate gRPC service instance (e.g. on a different machine). Allows distributing models across servers.

| Property | Default | Description |
|---|---|---|
| RedirectUrl | | gRPC endpoint of the remote service, e.g. http://192.168.1.100:5100 |
| RedirectName | "" | Model name to request on the remote; defaults to the local key name if empty |

Cycle detection is built in: a random x-redirect-id header is propagated through the hop chain. If a request returns to the originating server, a FAILED_PRECONDITION error is returned immediately.

"Models": {
  "remote-m2m100": {
    "Type": "Redirect",
    "RedirectUrl": "http://192.168.1.100:5100",
    "RedirectName": "m2m100_1.2B"   // model key on the remote; "" = same as local key
  }
},
"Translation": {
  "AllowedModels": ["remote-m2m100"]
}
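The cycle detection described for Redirect can be pictured with an in-process simulation. Everything here is hypothetical (`Node`, `routes`, `in_flight`, the `FailedPrecondition` exception), and the sketch rejects a request at any server that is already forwarding the same id, which is slightly broader than the documented check at the originating server.

```python
import uuid


class FailedPrecondition(Exception):
    """Stands in for the gRPC FAILED_PRECONDITION status."""


class Node:
    """One service instance; `routes` maps local Redirect keys to (remote node, RedirectName)."""

    def __init__(self, name, routes=None):
        self.name = name
        self.routes = routes or {}
        self.in_flight = set()  # redirect ids this node is currently forwarding

    def translate(self, model, text, redirect_id=None):
        if redirect_id is not None and redirect_id in self.in_flight:
            raise FailedPrecondition(f"redirect cycle detected at {self.name}")
        target = self.routes.get(model)
        if target is None:
            return f"[{self.name}:{model}] {text}"  # handled by a local model
        remote, remote_name = target
        redirect_id = redirect_id or uuid.uuid4().hex  # plays the role of x-redirect-id
        self.in_flight.add(redirect_id)
        try:
            # An empty RedirectName means "use the local key name".
            return remote.translate(remote_name or model, text, redirect_id)
        finally:
            self.in_flight.discard(redirect_id)


if __name__ == "__main__":
    a, b = Node("a"), Node("b")
    a.routes["remote-m2m100"] = (b, "m2m100_1.2B")
    print(a.translate("remote-m2m100", "hello"))
```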

Properties for Piper:

See the Piper TTS section below.

Properties for Whisper:

See the Whisper section below.

Type compatibility

Type identifies the tokenizer format, not the exact model. This means fine-tuned or alternative models are supported as long as the tokenizer is compatible:

| Type | Compatible with |
|---|---|
| NLLB | Any ONNX encoder-decoder model using the NLLB-200 SentencePiece tokenizer with FLORES-200 language tokens |
| M2M100 | Any ONNX encoder-decoder model using the M2M-100 tokenizer (vocab.json + added_tokens.json, ISO 639-1 __lang__ tokens) |
| FastText | Any fastText supervised classification model in .bin or .ftz format |
| LibreTranslate | Any LibreTranslate-compatible HTTP API endpoint |
| Whisper | Any Whisper ggml model file (ggml-*.bin) compatible with whisper.cpp / Whisper.net |
| Piper | Any Piper TTS ONNX voice with a companion .onnx.json sidecar (phoneme_id_map, sample_rate, espeak voice) |
| Redirect | Another Lopatnov.Translate gRPC service instance (any model type on the remote) |

Models not compatible without a new type: MarianMT (OPUS-MT), mBART-50, SeamlessM4T — they use different tokenizer formats.


Translation section

Controls routing and lifecycle of loaded models.

"Translation": {
  "DefaultModel": "m2m100_418M",  // model used when the request's model field is empty
  "AutoDetect": "lid-176-ftz",    // name of a FastText model for language auto-detection
  "AudioToText": "whisper-small", // name of a Whisper model for speech-to-text; "" = STT disabled
  "AllowedModels": ["m2m100_418M"], // allowlist; empty = all configured translation models allowed
  "ModelTtlMinutes": 30,          // minutes of inactivity before a model is unloaded from memory
  "TextToAudio": {                // language code → Piper model key; absent = TTS disabled
    "en": "piper-en-US",
    "ru": "piper-ru-RU",
    "uk": "piper-uk-UA"
  },
  "WarmUp": ["m2m100_418M", "whisper-small"]  // pre-load these models at startup
}

| Property | Default | Description |
|---|---|---|
| DefaultModel | "" | Name of the model to use when model is not specified in the request. If empty and the request omits model, the request fails. |
| AutoDetect | "" | Name of a FastText model used for language auto-detection, that is, for source_language: "auto" in TranslateText and for the DetectLanguage RPC. If empty or the model file is missing, the service falls back to heuristic detection. |
| AudioToText | "" | Name of a Whisper model entry used for speech-to-text transcription (TranscribeAudio RPC). If empty, the RPC returns FAILED_PRECONDITION. |
| TextToAudio | {} | Dictionary of ISO 639-1 language codes → Piper model keys. Used by SynthesizeSpeech and TranslateAudio. If empty or absent, TTS RPCs return FAILED_PRECONDITION. |
| AllowedModels | [] | Restricts which translation models clients may request by name. An empty list means all configured translation models are accessible. Whisper, FastText, and Piper entries are not affected by this list. |
| ModelTtlMinutes | 30 | Translation, Whisper, and Piper models are kept in memory for this many minutes after last use, then unloaded to free resources. The auto-detect language detector is loaded once at startup and is not affected by this TTL. |
| WarmUp | [] | List of model keys to pre-load at service startup. Runs concurrently with request serving; each model logs elapsed time. Failures are logged as warnings and the service stays up. |
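How DefaultModel and AllowedModels interact for a single request can be sketched as below. `resolve_model` and `RequestError` are hypothetical names, and two details are assumptions not stated on this page: that the allowlist is also applied to the default model, and that an unknown name is rejected before the allowlist is consulted.

```python
class RequestError(Exception):
    """Stands in for whatever error the service returns to the client."""


def resolve_model(requested: str, default_model: str,
                  allowed_models: list[str], configured: set[str]) -> str:
    """Return the model key to use for one translation request."""
    name = requested or default_model
    if not name:
        raise RequestError("request omits model and DefaultModel is empty")
    if name not in configured:
        raise RequestError(f"unknown model '{name}'")
    # An empty allowlist means every configured translation model is allowed.
    if allowed_models and name not in allowed_models:
        raise RequestError(f"model '{name}' is not in AllowedModels")
    return name


if __name__ == "__main__":
    configured = {"m2m100_418M", "nllb"}
    print(resolve_model("", "m2m100_418M", ["m2m100_418M"], configured))
```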

Language Detection

Language detection models are used for automatic source language detection (source_language: "auto" in TranslateText, or the DetectLanguage RPC). They are not translation models and cannot be used as model in translation requests. Configure the active detector via Translation:AutoDetect.


LID-176

176 languages · fastText binary format (.ftz)

Facebook’s compact language identification model. Fast and lightweight (~1 MB). Best choice when you only need common languages and care about startup time.

License: CC-BY-SA-3.0. Commercial use is allowed. You must credit Facebook Research. If you distribute a modified version of the model itself, it must remain under the same license; this does not apply to using it as a backend service.

Download

hf download lopatnov/fasttext-language-id \
  --local-dir ./models/langdetect/lid176

HuggingFace repo: lopatnov/fasttext-language-id

appsettings.json

"Models": {
  "langdetect": {
    "Type": "FastText",
    "Path": "./models/langdetect/lid176/lid.176.ftz",
    "LabelFormat": "iso639-1"   // LID-176 outputs ISO 639-1 codes (en, de, fr, …)
  }
},
"Translation": {
  "AutoDetect": "langdetect"
}

GlotLID v3

1633 language varieties · fastText binary format (.bin)

Covers a much wider range of languages than LID-176, including low-resource and minority languages. Larger model (~1.6 GB).

License: Apache 2.0. Commercial use is allowed. If you redistribute the model, keep the license text and any copyright notices with it.

Download

hf download lopatnov/glotlid \
  --local-dir ./models/langdetect/glotlid

HuggingFace repo: lopatnov/glotlid

appsettings.json

"Models": {
  "langdetect": {
    "Type": "FastText",
    "Path": "./models/langdetect/glotlid/model_v3.bin",
    "LabelFormat": "flores200"  // GlotLID outputs FLORES-200 codes (eng_Latn, ukr_Cyrl, …) — this is the default
  }
},
"Translation": {
  "AutoDetect": "langdetect"
}

Translation


NLLB-200 (600M distilled)

200 languages · ONNX

Meta’s No Language Left Behind model, distilled to 600M parameters. Good balance of quality and speed. Recommended starting point.

License: CC-BY-NC-4.0 ⚠️ Non-commercial use only. Cannot be used in commercial products or services.

Download

hf download lopatnov/nllb-200-distilled-600M-onnx \
  --local-dir ./models/nllb-600m

HuggingFace repo: lopatnov/nllb-200-distilled-600M-onnx

appsettings.json

"Models": {
  "nllb": {
    "Type": "NLLB",
    "Path": "./models/nllb-600m",
    "EncoderFile": "encoder_model.onnx",
    "DecoderFile": "decoder_model.onnx",
    "TokenizerFile": "sentencepiece.bpe.model",
    "TokenizerConfigFile": "tokenizer.json",
    "MaxTokens": 512,
    "BeamSize": 1
  }
},
"Translation": {
  "DefaultModel": "nllb"
}

NLLB-200 1.3B

200 languages · ONNX

Higher quality than the 600M distilled variant at the cost of more memory (~5 GB).

License: CC-BY-NC-4.0 ⚠️ Non-commercial use only.

Download

hf download lopatnov/nllb-200-1.3B-onnx \
  --local-dir ./models/nllb-1.3b

HuggingFace repo: lopatnov/nllb-200-1.3B-onnx

appsettings.json

Same as 600M distilled — change Path to ./models/nllb-1.3b.


NLLB-200 3.3B

200 languages · ONNX

Highest quality NLLB variant. Requires significant RAM/VRAM (~12 GB).

License: CC-BY-NC-4.0 ⚠️ Non-commercial use only.

Download

hf download lopatnov/nllb-200-3.3B-onnx \
  --local-dir ./models/nllb-3.3b

HuggingFace repo: lopatnov/nllb-200-3.3B-onnx

appsettings.json

Same as 600M distilled — change Path to ./models/nllb-3.3b.


M2M-100 (418M)

100 languages · ONNX

Facebook’s many-to-many translation model. MIT-licensed — suitable for commercial use. 418M parameter variant, lower memory footprint.

License: MIT ✅ Unrestricted commercial use.

Download

hf download lopatnov/m2m100_418M-onnx \
  --local-dir ./models/m2m100-418m

HuggingFace repo: lopatnov/m2m100_418M-onnx

appsettings.json

"Models": {
  "m2m100": {
    "Type": "M2M100",
    "Path": "./models/m2m100-418m",
    "EncoderFile": "encoder_model.onnx",
    "DecoderFile": "decoder_model.onnx",
    "TokenizerFile": "sentencepiece.bpe.model",
    "TokenizerConfigFile": "added_tokens.json",
    "VocabFile": "vocab.json",
    "MaxTokens": 512
  }
},
"Translation": {
  "DefaultModel": "m2m100"
}

M2M-100 (1.2B)

100 languages · ONNX

Higher quality than the 418M variant. Good choice for commercial deployments where translation quality matters.

License: MIT ✅ Unrestricted commercial use.

Download

hf download lopatnov/m2m100_1.2B-onnx \
  --local-dir ./models/m2m100-1.2b

HuggingFace repo: lopatnov/m2m100_1.2B-onnx

appsettings.json

Same as 418M — change Path to ./models/m2m100-1.2b.


LibreTranslate

Argos Translate backend · HTTP API

An open-source machine translation server. Runs as a separate Docker container alongside the service. Useful as an additional translation option or fallback.

License: AGPL-3.0 (LibreTranslate server) · Argos Translate language packages are MIT/CC-BY licensed. ⚠️ AGPL-3.0 obliges you to offer the corresponding source code to users who interact with a modified LibreTranslate server over a network. Whether that obligation reaches your own service depends on how the two are combined, so evaluate this for your use case before using it in production.

Setup

  1. In docker/docker-compose.yml, uncomment the libretranslate: service block, the depends_on: section in the translate: service, and the libretranslate-models: volume at the bottom of the file.

  2. Add to appsettings.json:

"Models": {
  "libretranslate": {
    "Type": "LibreTranslate",
    "BaseUrl": "http://libretranslate:5000",
    "ApiKey": ""  // set if your LibreTranslate instance requires a key
  }
},
"Translation": {
  "DefaultModel": "libretranslate"
}

The BaseUrl http://libretranslate:5000 works when running via Docker Compose. For an external instance, replace with its URL.
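For orientation, this is roughly what a call against BaseUrl looks like on the wire. The sketch builds a request for LibreTranslate's public `/translate` endpoint (JSON fields `q`, `source`, `target`, `format`, optional `api_key`) without sending it; `build_translate_request` is a hypothetical helper, and whether the service sends exactly these fields is an assumption.

```python
import json
import urllib.request


def build_translate_request(base_url: str, text: str, source: str,
                            target: str, api_key: str = "") -> urllib.request.Request:
    """Build (but do not send) a POST request for LibreTranslate's /translate endpoint."""
    payload = {"q": text, "source": source, "target": target, "format": "text"}
    if api_key:  # only sent when the instance requires a key
        payload["api_key"] = api_key
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_translate_request("http://libretranslate:5000", "Hello", "en", "uk")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req), then read "translatedText"
    # from the JSON response body.
```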


Speech-to-Text


Whisper

OpenAI Whisper · ggml format · 99 languages

Runs locally via Whisper.net (a managed wrapper over whisper.cpp). Accepts any WAV file — resampled automatically to 16 kHz mono before inference. The model is loaded lazily on first request and unloaded after ModelTtlMinutes of inactivity.
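To make "resampled to 16 kHz mono" concrete, here is a deliberately simple sketch: channels are averaged, then the signal is resampled by linear interpolation. The function names are hypothetical, the service's actual resampler is not documented here, and a production pipeline would normally low-pass filter before downsampling.

```python
def to_mono(frames: list[tuple[float, ...]]) -> list[float]:
    """Average all channels of each frame into a single sample."""
    return [sum(frame) / len(frame) for frame in frames]


def resample_linear(samples: list[float], src_rate: int,
                    dst_rate: int = 16000) -> list[float]:
    """Resample by linear interpolation between neighbouring input samples."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    out_len = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # input samples consumed per output sample
    last = len(samples) - 1
    out = []
    for i in range(out_len):
        pos = i * step
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


if __name__ == "__main__":
    stereo_48k = [(0.5, -0.5)] * 48000            # one second of stereo audio
    mono_16k = resample_linear(to_mono(stereo_48k), 48000)
    print(len(mono_16k))
```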

License: MIT ✅ Unrestricted commercial use.

Download

# small (~500 MB) — recommended default
hf download lopatnov/whisper.cpp ggml-small.bin \
  --local-dir ./models/audio-to-text/whisper.cpp

# medium (~1.5 GB) — better quality
hf download lopatnov/whisper.cpp ggml-medium.bin \
  --local-dir ./models/audio-to-text/whisper.cpp

HuggingFace repo: lopatnov/whisper.cpp

appsettings.json

"Models": {
  "whisper-small": {
    "Type": "Whisper",
    "Path": "./models/audio-to-text/whisper.cpp/ggml-small.bin"
  },
  "whisper-medium": {
    "Type": "Whisper",
    "Path": "./models/audio-to-text/whisper.cpp/ggml-medium.bin"
  }
},
"Translation": {
  "AudioToText": "whisper-small"   // switch to "whisper-medium" for better quality
}

To switch models: change AudioToText — no code changes needed.

Properties for Whisper:

| Property | Required | Description |
|---|---|---|
| Path | yes | Path to the .bin ggml model file |
| ExecutionProvider | no | Whisper.net runtime. "" or "auto" (default) = probe best available in order: Cuda → Cuda12 → Vulkan → CoreML → Cpu. Explicit: "cpu", "cuda", "vulkan", "coreml". The selected runtime is logged at startup. |

Text-to-Speech


Piper TTS

Piper · ONNX format · language-specific voices

Runs locally via ONNX Runtime. Each voice is a separate ONNX model trained for one language. Text is phonemised by espeak-ng (must be installed separately) before inference. The model is loaded lazily on first request and unloaded after ModelTtlMinutes of inactivity.
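Lazy loading plus unloading after ModelTtlMinutes amounts to a cache keyed by model name with a last-used timestamp. The class below is a minimal, single-threaded sketch of that idea; `TtlModelCache`, `loader` and `evict_idle` are hypothetical names and the service's real bookkeeping may differ.

```python
import time


class TtlModelCache:
    """Load models on first use and drop them after a period of inactivity."""

    def __init__(self, ttl_minutes, loader, clock=time.monotonic):
        self._ttl_seconds = ttl_minutes * 60
        self._loader = loader      # callable: model key -> loaded model
        self._clock = clock        # injectable so the behaviour can be tested
        self._entries = {}         # model key -> (model, last_used)

    def get(self, name):
        """Return the model, loading it lazily, and refresh its last-used time."""
        entry = self._entries.get(name)
        model = entry[0] if entry else self._loader(name)
        self._entries[name] = (model, self._clock())
        return model

    def evict_idle(self):
        """Unload every model idle for at least the TTL; return the evicted keys."""
        now = self._clock()
        idle = [name for name, (_, last_used) in self._entries.items()
                if now - last_used >= self._ttl_seconds]
        for name in idle:
            del self._entries[name]
        return idle


if __name__ == "__main__":
    cache = TtlModelCache(30, loader=lambda key: f"<loaded {key}>")
    print(cache.get("piper-en-US"))
    print(cache.evict_idle())  # nothing is idle yet
```

In a real service `evict_idle` would run on a timer, and access would need locking.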

License:

System dependency: espeak-ng

| Platform | Install command |
|---|---|
| Debian / Ubuntu (Docker) | apt-get install -y espeak-ng |
| Windows | Download the MSI from GitHub Releases and add it to PATH |
| macOS | brew install espeak-ng |

Download voices

Voices are hosted at lopatnov/piper-voices:

# English (en_US-joe-medium)
hf download lopatnov/piper-voices \
  en_US/en_US-joe-medium.onnx en_US/en_US-joe-medium.onnx.json \
  --local-dir ./models/text-to-audio/piper-voices

# Russian (ru_RU-ruslan-medium)
hf download lopatnov/piper-voices \
  ru_RU/ru_RU-ruslan-medium.onnx ru_RU/ru_RU-ruslan-medium.onnx.json \
  --local-dir ./models/text-to-audio/piper-voices

# Ukrainian (uk_UA-ukrainian_tts-medium, 3 speakers: lada / mykyta / tetiana)
hf download lopatnov/piper-voices \
  uk_UA/uk_UA-ukrainian_tts-medium.onnx uk_UA/uk_UA-ukrainian_tts-medium.onnx.json \
  --local-dir ./models/text-to-audio/piper-voices

Each voice requires both the .onnx model file and its .onnx.json sidecar (phoneme map, sample rate, speaker info).

appsettings.json

"Models": {
  "piper-en-US": {
    "Type": "Piper",
    "Path": "./models/text-to-audio/piper-voices/en_US/en_US-joe-medium.onnx"
  },
  "piper-ru-RU": {
    "Type": "Piper",
    "Path": "./models/text-to-audio/piper-voices/ru_RU/ru_RU-ruslan-medium.onnx"
  },
  "piper-uk-UA": {
    "Type": "Piper",
    "Path": "./models/text-to-audio/piper-voices/uk_UA/uk_UA-ukrainian_tts-medium.onnx"
  }
},
"Translation": {
  "TextToAudio": {
    "en": "piper-en-US",   // ISO 639-1 code → model key
    "ru": "piper-ru-RU",
    "uk": "piper-uk-UA"
  }
}

TextToAudio is a dictionary of ISO 639-1 language codes → model keys. If empty or absent, SynthesizeSpeech returns FAILED_PRECONDITION.
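The lookup described above can be sketched as follows. `pick_voice_model` and the exception class are hypothetical, and the handling of a language that has no entry is an assumption, since this page only specifies the empty-dictionary case.

```python
class FailedPrecondition(Exception):
    """Stands in for the gRPC FAILED_PRECONDITION status."""


def pick_voice_model(language: str, text_to_audio: dict[str, str]) -> str:
    """Map an ISO 639-1 code to the Piper model key configured in TextToAudio."""
    if not text_to_audio:
        raise FailedPrecondition("TTS is disabled: Translation:TextToAudio is empty")
    try:
        return text_to_audio[language]
    except KeyError:
        # Assumption: an unmapped language is reported as an error to the caller.
        raise LookupError(f"no Piper voice configured for '{language}'") from None


if __name__ == "__main__":
    mapping = {"en": "piper-en-US", "ru": "piper-ru-RU", "uk": "piper-uk-UA"}
    print(pick_voice_model("uk", mapping))
```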

Multi-speaker voices

The Ukrainian voice (uk_UA-ukrainian_tts-medium) has 3 speakers. Select a speaker via the voice field in SynthesizeSpeechRequest:

| voice value | Speaker |
|---|---|
| lada | Female (default when omitted) |
| mykyta | Male |
| tetiana | Female (different style) |

Properties for Piper:

| Property | Required | Description |
|---|---|---|
| Path | yes | Path to the .onnx voice model file. The companion sidecar must sit next to it under the same name with .json appended (for example, voice.onnx and voice.onnx.json). |
| ExecutionProvider | no | ONNX execution provider. "" or "auto" (default) = probe best available. Explicit: "cpu", "directml", "cuda". |
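A quick way to check a voice before pointing Path at it is to read the sidecar yourself. The sketch below assumes the JSON layout used by published Piper voices (`audio.sample_rate`, `espeak.voice`, optional `speaker_id_map`); `load_voice_config` is a hypothetical helper, not part of the service.

```python
import json
from pathlib import Path


def load_voice_config(onnx_path: str) -> dict:
    """Read the .onnx.json sidecar that must sit next to a Piper voice model."""
    sidecar = Path(str(onnx_path) + ".json")
    if not sidecar.is_file():
        raise FileNotFoundError(f"missing Piper sidecar: {sidecar}")
    config = json.loads(sidecar.read_text(encoding="utf-8"))
    return {
        "sample_rate": config["audio"]["sample_rate"],
        "espeak_voice": config["espeak"]["voice"],
        # Single-speaker voices may have no speaker map at all.
        "speakers": sorted(config.get("speaker_id_map") or {}),
    }


if __name__ == "__main__":
    path = "./models/text-to-audio/piper-voices/uk_UA/uk_UA-ukrainian_tts-medium.onnx"
    try:
        print(load_voice_config(path))
    except FileNotFoundError as error:
        print(error)
```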