Models

Configuration
- Models section
- Translation section
Language Detection
- LID-176
- GlotLID v3
Translation
Speech-to-Text
- Whisper
Text-to-Speech
- Piper TTS

Configuration

Models section

Models are configured by name in appsettings.json under Models. Each entry has a Type discriminator and type-specific properties. The name you choose becomes the value of the model field in API requests.

"Models": {
  "<name>": {
    "Type": "<type>",  // required: NLLB | M2M100 | FastText | LibreTranslate | Whisper | Piper | Redirect
    // ... type-specific properties (see each model below)
  }
}

Multiple entries of the same type are allowed — just use different names. If Type is missing or unknown, the service will fail to start with a configuration error.
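
The startup check described above can be approximated in a few lines. This is a minimal sketch, not the service's actual validation code: the function name `validate_models` and the message wording are hypothetical, and it assumes `Type` values are matched case-sensitively, which the text above does not state.

```python
# Type values listed in the Models section above.
KNOWN_TYPES = {"NLLB", "M2M100", "FastText", "LibreTranslate",
               "Whisper", "Piper", "Redirect"}


def validate_models(models: dict) -> list:
    """Return a list of problems; an empty list means every entry has a known Type."""
    problems = []
    for name, entry in models.items():
        model_type = entry.get("Type")
        if model_type is None:
            problems.append(f"{name}: Type is missing")
        elif model_type not in KNOWN_TYPES:  # assumption: case-sensitive match
            problems.append(f"{name}: unknown Type {model_type!r}")
    return problems


# Two entries of the same type under different names are fine;
# the third entry has no Type and would stop the service from starting.
models = {
    "m2m100_418M": {"Type": "M2M100", "Path": "./models/m2m100-418m"},
    "m2m100_1.2B": {"Type": "M2M100", "Path": "./models/m2m100-1.2b"},
    "broken": {"Path": "./models/unknown"},
}
print(validate_models(models))  # ['broken: Type is missing']
```

Running a check like this against a draft configuration before deploying can surface a typo in `Type` earlier than a failed container start.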

Properties for NLLB and M2M100:

Property	Default	Description
`Path`	—	Path to the directory with model files
`EncoderFile`	`encoder_model.onnx`	Encoder ONNX filename
`DecoderFile`	`decoder_model.onnx`	Decoder ONNX filename
`TokenizerFile`	`sentencepiece.bpe.model`	SentencePiece tokenizer filename
`TokenizerConfigFile`	`""`	Secondary tokenizer config (`tokenizer.json` for NLLB, `added_tokens.json` for M2M100)
`MaxTokens`	`512`	Maximum tokens per translation
`BeamSize`	`1`	Beam search width (NLLB only; higher = better quality, slower)
`VocabFile`	`""`	BPE vocabulary file (M2M100 only: `vocab.json`)
`ExecutionProvider`	`""` (auto)	ONNX execution provider. `""` or `"auto"` = probe best available (DirectML on Windows, CUDA on Linux, then CPU). Explicit: `"cpu"`, `"directml"`, `"cuda"`. Falls back to CPU with a warning if the requested provider is unavailable.
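
The `ExecutionProvider` fallback rules can be read as the following decision function. It is a sketch under stated assumptions: `choose_provider`, its parameters, and the warning text are hypothetical, the set of available providers is passed in rather than probed, and it assumes only the platform's preferred GPU provider is tried before CPU in auto mode.

```python
from typing import Optional, Set, Tuple


def choose_provider(requested: str, available: Set[str],
                    platform: str) -> Tuple[str, Optional[str]]:
    """Return (provider, warning); warning is None unless an explicit request fell back."""
    requested = requested.strip().lower()  # assumption: values are case-insensitive
    if requested in ("", "auto"):
        # Auto mode: DirectML on Windows, CUDA on Linux, then CPU.
        gpu = "directml" if platform == "windows" else "cuda"
        return (gpu if gpu in available else "cpu"), None
    if requested in available:
        return requested, None
    # An explicit provider that is unavailable falls back to CPU with a warning.
    return "cpu", f"provider {requested!r} is unavailable, falling back to cpu"


print(choose_provider("", {"cpu", "cuda"}, "linux"))   # ('cuda', None)
print(choose_provider("cuda", {"cpu"}, "linux")[0])    # cpu
```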

Properties for FastText:

Property	Default	Description
`Path`	—	Path to the model file (`.bin` or `.ftz`)
`LabelFormat`	`"flores200"`	Format of the model’s output labels. Use `"iso639-1"` for LID-176 (outputs `__label__en`), `"flores200"` for GlotLID (outputs `__label__eng_Latn`). Also supports `"iso639-2"`, `"iso639-3"`.
`LabelPrefix`	`"__label__"`	Prefix to strip from each label before format conversion.
`LabelSuffix`	`""`	Optional suffix to strip from each label.
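
To illustrate what `LabelPrefix` and `LabelSuffix` do, here is a small sketch of the stripping step only. The helper name `strip_label` is hypothetical and the later conversion between label formats is not shown, since its details are not described here.

```python
def strip_label(raw: str, prefix: str = "__label__", suffix: str = "") -> str:
    """Remove the configured prefix and suffix from a raw fastText label."""
    if prefix and raw.startswith(prefix):
        raw = raw[len(prefix):]
    if suffix and raw.endswith(suffix):
        raw = raw[:-len(suffix)]
    return raw


print(strip_label("__label__en"))        # en        (LID-176, iso639-1)
print(strip_label("__label__eng_Latn"))  # eng_Latn  (GlotLID, flores200)
```

What remains after stripping is the code that `LabelFormat` describes, which is why the two settings need to match the model in use.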

Properties for LibreTranslate:

Property	Default	Description
`BaseUrl`	—	URL of the LibreTranslate instance, e.g. `http://libretranslate:5000`
`ApiKey`	`""`	API key, if the instance requires one

Properties for Redirect:

Forwards translation requests to another Lopatnov.Translate gRPC service instance (e.g. on a different machine). Allows distributing models across servers.

Property	Default	Description
`RedirectUrl`	—	gRPC endpoint of the remote service, e.g. `http://192.168.1.100:5100`
`RedirectName`	`""`	Model name to request on the remote; defaults to the local key name if empty

Cycle detection is built in: a random x-redirect-id header is propagated through the hop chain. If a request returns to the originating server, a FAILED_PRECONDITION error is returned immediately.
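
The idea behind the cycle check can be simulated in-process. This is an illustrative sketch, not the service's implementation: the `Server` class, the `FailedPrecondition` exception, and the choice to have every forwarding server remember the ids it currently has in flight are assumptions made for the example.

```python
import uuid


class FailedPrecondition(Exception):
    """Stands in for the gRPC FAILED_PRECONDITION status."""


class Server:
    def __init__(self, name):
        self.name = name
        self.redirects = {}     # local model key -> (remote Server, RedirectName)
        self.in_flight = set()  # redirect ids this server is currently forwarding

    def translate(self, model, text, headers=None):
        headers = dict(headers or {})
        if model not in self.redirects:
            return f"[{self.name}/{model}] {text}"  # pretend a local model answered
        # Reuse the incoming id, or create a random one on the first hop.
        redirect_id = headers.setdefault("x-redirect-id", uuid.uuid4().hex)
        if redirect_id in self.in_flight:
            raise FailedPrecondition(f"redirect cycle detected at {self.name}")
        self.in_flight.add(redirect_id)
        try:
            remote, remote_name = self.redirects[model]
            # An empty RedirectName means "use the local key name".
            return remote.translate(remote_name or model, text, headers)
        finally:
            self.in_flight.discard(redirect_id)


a, b = Server("a"), Server("b")
a.redirects["m"] = (b, "")
b.redirects["m"] = (a, "")  # misconfiguration: b points back at a
try:
    a.translate("m", "hello")
except FailedPrecondition as err:
    print(err)  # redirect cycle detected at a
```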

"Models": {
  "remote-m2m100": {
    "Type": "Redirect",
    "RedirectUrl": "http://192.168.1.100:5100",
    "RedirectName": "m2m100_1.2B"   // model key on the remote; "" = same as local key
  }
},
"Translation": {
  "AllowedModels": ["remote-m2m100"]
}

Properties for Piper:

See the Piper TTS section below.

Properties for Whisper:

See the Whisper section below.

Type compatibility

Type identifies the tokenizer format, not the exact model. This means fine-tuned or alternative models are supported as long as the tokenizer is compatible:

Type	Compatible with
`NLLB`	Any ONNX encoder-decoder model using the NLLB-200 SentencePiece tokenizer with FLORES-200 language tokens
`M2M100`	Any ONNX encoder-decoder model using the M2M-100 tokenizer (`vocab.json` + `added_tokens.json`, ISO 639-1 `__lang__` tokens)
`FastText`	Any fastText supervised classification model in `.bin` or `.ftz` format
`LibreTranslate`	Any LibreTranslate-compatible HTTP API endpoint
`Whisper`	Any Whisper ggml model file (`ggml-*.bin`) compatible with whisper.cpp / Whisper.net
`Piper`	Any Piper TTS ONNX voice with a companion `.onnx.json` sidecar (phoneme_id_map, sample_rate, espeak voice)
`Redirect`	Another Lopatnov.Translate gRPC service instance (any model type on the remote)

Models not compatible without a new type: MarianMT (OPUS-MT), mBART-50, SeamlessM4T — they use different tokenizer formats.

Translation section

Controls routing and lifecycle of loaded models.

"Translation": {
  "DefaultModel": "m2m100_418M",  // model used when the request's model field is empty
  "AutoDetect": "lid-176-ftz",    // name of a FastText model for language auto-detection
  "AudioToText": "whisper-small", // name of a Whisper model for speech-to-text; "" = STT disabled
  "AllowedModels": ["m2m100_418M"], // allowlist; empty = all configured translation models allowed
  "ModelTtlMinutes": 30,          // minutes of inactivity before a model is unloaded from memory
  "TextToAudio": {                // language code → Piper model key; absent = TTS disabled
    "en": "piper-en-US",
    "ru": "piper-ru-RU",
    "uk": "piper-uk-UA"
  },
  "WarmUp": ["m2m100_418M", "whisper-small"]  // pre-load these models at startup
}

Property	Default	Description
`DefaultModel`	`""`	Name of the model to use when `model` is not specified in the request. If empty and the request omits `model`, the request fails.
`AutoDetect`	`""`	Name of a `FastText` model used for language auto-detection, that is, for `source_language: "auto"` in `TranslateText` and for the `DetectLanguage` RPC. If empty or the model file is missing, the service falls back to heuristic detection.
`AudioToText`	`""`	Name of a `Whisper` model entry used for speech-to-text transcription (`TranscribeAudio` RPC). If empty, the RPC returns `FAILED_PRECONDITION`.
`TextToAudio`	`{}`	Dictionary of ISO 639-1 language codes → `Piper` model keys. Used by `SynthesizeSpeech` and `TranslateAudio`. If empty or absent, TTS RPCs return `FAILED_PRECONDITION`.
`AllowedModels`	`[]`	Restricts which translation models clients may request by name. Empty list means all configured translation models are accessible. `Whisper`, `FastText`, and `Piper` entries are not affected by this list.
`ModelTtlMinutes`	`30`	Translation models and the Whisper/Piper models are kept in memory for this many minutes after last use, then unloaded to free resources. The auto-detect language detector is loaded once at startup and is not affected by this TTL.
`WarmUp`	`[]`	List of model keys to pre-load at service startup. Runs concurrently with request serving; each model logs elapsed time. Failures are warnings — the service stays up.
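
How `DefaultModel` and `AllowedModels` interact can be summarised in a short routing sketch. The function `resolve_model` and the exception types are hypothetical, and it assumes the default model is checked against the allowlist in the same way as an explicitly requested one, which is not stated above.

```python
def resolve_model(requested, default_model, allowed_models, translation_models):
    """Pick the translation model for a request, or raise if it cannot be served."""
    name = requested or default_model
    if not name:
        raise ValueError("no model in the request and DefaultModel is empty")
    if name not in translation_models:
        raise ValueError(f"model {name!r} is not configured")
    # An empty allowlist means every configured translation model is allowed.
    if allowed_models and name not in allowed_models:
        raise PermissionError(f"model {name!r} is not in AllowedModels")
    return name


configured = {"m2m100_418M", "nllb"}
print(resolve_model("", "m2m100_418M", ["m2m100_418M"], configured))  # m2m100_418M
print(resolve_model("nllb", "m2m100_418M", [], configured))           # nllb
```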

Language Detection

Language detection models are used for automatic source language detection (source_language: "auto" in TranslateText, or the DetectLanguage RPC). They are not translation models and cannot be used as model in translation requests. Configure the active detector via Translation:AutoDetect.

LID-176

176 languages · fastText binary format (.ftz)

Facebook’s compact language identification model. Fast and lightweight (~1 MB). Best choice when you only need common languages and care about startup time.

License: CC-BY-SA-3.0. Commercial use is allowed. You must credit Facebook Research. If you distribute a modified version of the model itself, it must remain under the same license; this does not apply to using it as a backend service.

Download

hf download lopatnov/fasttext-language-id \
  --local-dir ./models/langdetect/lid176

HuggingFace repo: lopatnov/fasttext-language-id

appsettings.json

"Models": {
  "langdetect": {
    "Type": "FastText",
    "Path": "./models/langdetect/lid176/lid.176.ftz",
    "LabelFormat": "iso639-1"   // LID-176 outputs ISO 639-1 codes (en, de, fr, …)
  }
},
"Translation": {
  "AutoDetect": "langdetect"
}

GlotLID v3

1633 language varieties · fastText binary format (.bin)

Covers a much wider range of languages than LID-176, including low-resource and minority languages. Larger model (~1.6 GB).

License: Apache 2.0. Unrestricted commercial use. If you redistribute the model, keep the license text and any existing attribution notices with it.

Download

hf download lopatnov/glotlid \
  --local-dir ./models/langdetect/glotlid

HuggingFace repo: lopatnov/glotlid

appsettings.json

"Models": {
  "langdetect": {
    "Type": "FastText",
    "Path": "./models/langdetect/glotlid/model_v3.bin",
    "LabelFormat": "flores200"  // GlotLID outputs FLORES-200 codes (eng_Latn, ukr_Cyrl, …) — this is the default
  }
},
"Translation": {
  "AutoDetect": "langdetect"
}

Translation

NLLB-200 (600M distilled)

200 languages · ONNX

Meta’s No Language Left Behind model, distilled to 600M parameters. Good balance of quality and speed. Recommended starting point.

License: CC-BY-NC-4.0 ⚠️ Non-commercial use only. Cannot be used in commercial products or services.

Download

hf download lopatnov/nllb-200-distilled-600M-onnx \
  --local-dir ./models/nllb-600m

HuggingFace repo: lopatnov/nllb-200-distilled-600M-onnx

appsettings.json

"Models": {
  "nllb": {
    "Type": "NLLB",
    "Path": "./models/nllb-600m",
    "EncoderFile": "encoder_model.onnx",
    "DecoderFile": "decoder_model.onnx",
    "TokenizerFile": "sentencepiece.bpe.model",
    "TokenizerConfigFile": "tokenizer.json",
    "MaxTokens": 512,
    "BeamSize": 1
  }
},
"Translation": {
  "DefaultModel": "nllb"
}

NLLB-200 1.3B

200 languages · ONNX

Higher quality than the 600M distilled variant at the cost of more memory (~5 GB).

License: CC-BY-NC-4.0 ⚠️ Non-commercial use only.

Download

hf download lopatnov/nllb-200-1.3B-onnx \
  --local-dir ./models/nllb-1.3b

HuggingFace repo: lopatnov/nllb-200-1.3B-onnx

appsettings.json

Same as 600M distilled — change Path to ./models/nllb-1.3b.

NLLB-200 3.3B

200 languages · ONNX

Highest quality NLLB variant. Requires significant RAM/VRAM (~12 GB).

License: CC-BY-NC-4.0 ⚠️ Non-commercial use only.

Download

hf download lopatnov/nllb-200-3.3B-onnx \
  --local-dir ./models/nllb-3.3b

HuggingFace repo: lopatnov/nllb-200-3.3B-onnx

appsettings.json

Same as 600M distilled — change Path to ./models/nllb-3.3b.

M2M-100 (418M)

100 languages · ONNX

Facebook’s many-to-many translation model, MIT-licensed and suitable for commercial use. This is the 418M-parameter variant, which has a lower memory footprint than the 1.2B variant.

License: MIT ✅ Unrestricted commercial use.

Download

hf download lopatnov/m2m100_418M-onnx \
  --local-dir ./models/m2m100-418m

HuggingFace repo: lopatnov/m2m100_418M-onnx

appsettings.json

"Models": {
  "m2m100": {
    "Type": "M2M100",
    "Path": "./models/m2m100-418m",
    "EncoderFile": "encoder_model.onnx",
    "DecoderFile": "decoder_model.onnx",
    "TokenizerFile": "sentencepiece.bpe.model",
    "TokenizerConfigFile": "added_tokens.json",
    "VocabFile": "vocab.json",
    "MaxTokens": 512
  }
},
"Translation": {
  "DefaultModel": "m2m100"
}

M2M-100 (1.2B)

100 languages · ONNX

Higher quality than the 418M variant. Good choice for commercial deployments where translation quality matters.

License: MIT ✅ Unrestricted commercial use.

Download

hf download lopatnov/m2m100_1.2B-onnx \
  --local-dir ./models/m2m100-1.2b

HuggingFace repo: lopatnov/m2m100_1.2B-onnx

appsettings.json

Same as 418M — change Path to ./models/m2m100-1.2b.

LibreTranslate

Argos Translate backend · HTTP API

An open-source machine translation server. Runs as a separate Docker container alongside the service. Useful as an additional translation option or fallback.

License: AGPL-3.0 (LibreTranslate server) · Argos Translate language packages are MIT/CC-BY licensed. ⚠️ AGPL requires that users who interact with a modified LibreTranslate over a network are offered its source code. Whether that obligation extends to a service that only calls LibreTranslate over HTTP is a licensing question, so evaluate it for your use case before using in production.

Setup

1. In docker/docker-compose.yml, uncomment the libretranslate: service block, the depends_on: section in the translate: service, and the libretranslate-models: volume at the bottom of the file.
2. Add to appsettings.json:

"Models": {
  "libretranslate": {
    "Type": "LibreTranslate",
    "BaseUrl": "http://libretranslate:5000",
    "ApiKey": ""  // set if your LibreTranslate instance requires a key
  }
},
"Translation": {
  "DefaultModel": "libretranslate"
}

The BaseUrl http://libretranslate:5000 works when running via Docker Compose. For an external instance, replace with its URL.

Speech-to-Text

Whisper

OpenAI Whisper · ggml format · 99 languages

Runs locally via Whisper.net (a managed wrapper over whisper.cpp). Accepts any WAV file — resampled automatically to 16 kHz mono before inference. The model is loaded lazily on first request and unloaded after ModelTtlMinutes of inactivity.

License: MIT ✅ Unrestricted commercial use.

Download

# small (~500 MB) — recommended default
hf download lopatnov/whisper.cpp ggml-small.bin \
  --local-dir ./models/audio-to-text/whisper.cpp

# medium (~1.5 GB) — better quality
hf download lopatnov/whisper.cpp ggml-medium.bin \
  --local-dir ./models/audio-to-text/whisper.cpp

HuggingFace repo: lopatnov/whisper.cpp

appsettings.json

"Models": {
  "whisper-small": {
    "Type": "Whisper",
    "Path": "./models/audio-to-text/whisper.cpp/ggml-small.bin"
  },
  "whisper-medium": {
    "Type": "Whisper",
    "Path": "./models/audio-to-text/whisper.cpp/ggml-medium.bin"
  }
},
"Translation": {
  "AudioToText": "whisper-small"   // switch to "whisper-medium" for better quality
}

To switch models: change AudioToText — no code changes needed.

Properties for Whisper:

Property	Required	Description
`Path`	✅	Path to the `.bin` ggml model file
`ExecutionProvider`	—	Whisper.net runtime. `""` or `"auto"` (default) = probe best available in order: Cuda → Cuda12 → Vulkan → CoreML → Cpu. Explicit: `"cpu"`, `"cuda"`, `"vulkan"`, `"coreml"`. The selected runtime is logged at startup.
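
The lazy loading and idle unloading described for Whisper above (load on first request, unload after `ModelTtlMinutes` without use) follows a common cache pattern. The sketch below shows that pattern in isolation; `TtlModelCache`, its `loader` callback, and the explicit `evict_idle` call are hypothetical names, and the real service may schedule eviction differently.

```python
import time


class TtlModelCache:
    """Load models on first use and drop them after a period of inactivity."""

    def __init__(self, loader, ttl_minutes=30, clock=time.monotonic):
        self._loader = loader          # callable: model name -> loaded model
        self._ttl = ttl_minutes * 60   # seconds of inactivity before unloading
        self._clock = clock
        self._entries = {}             # name -> (model, last_used)

    def get(self, name):
        entry = self._entries.get(name)
        model = entry[0] if entry else self._loader(name)  # lazy load
        self._entries[name] = (model, self._clock())       # refresh last-use time
        return model

    def evict_idle(self):
        """Unload every model idle for at least the TTL; return the names removed."""
        now = self._clock()
        idle = [n for n, (_, used) in self._entries.items()
                if now - used >= self._ttl]
        for name in idle:
            del self._entries[name]
        return idle


cache = TtlModelCache(loader=lambda name: f"<loaded {name}>", ttl_minutes=30)
print(cache.get("whisper-small"))  # <loaded whisper-small>
print(cache.evict_idle())          # [] (the model was used a moment ago)
```

Passing the clock in as a parameter keeps the eviction logic testable without waiting for real time to pass.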

Text-to-Speech

Piper TTS

Piper · ONNX format · language-specific voices

Runs locally via ONNX Runtime. Each voice is a separate ONNX model trained for one language. Text is phonemised by espeak-ng (must be installed separately) before inference. The model is loaded lazily on first request and unloaded after ModelTtlMinutes of inactivity.

License:

- Piper voices: MIT ✅ Unrestricted commercial use.
- espeak-ng (system dependency): GPL v3. It is called as a subprocess, so it does not affect your code’s license, but espeak-ng binaries must be distributed under GPL v3 terms.

System dependency: espeak-ng

Platform	Install command
Debian / Ubuntu (Docker)	`apt-get install -y espeak-ng`
Windows	Download MSI from GitHub Releases — add to PATH
macOS	`brew install espeak-ng`

Download voices

Voices are hosted at lopatnov/piper-voices:

# English (en_US-joe-medium)
hf download lopatnov/piper-voices \
  en_US/en_US-joe-medium.onnx en_US/en_US-joe-medium.onnx.json \
  --local-dir ./models/text-to-audio/piper-voices

# Russian (ru_RU-ruslan-medium)
hf download lopatnov/piper-voices \
  ru_RU/ru_RU-ruslan-medium.onnx ru_RU/ru_RU-ruslan-medium.onnx.json \
  --local-dir ./models/text-to-audio/piper-voices

# Ukrainian (uk_UA-ukrainian_tts-medium, 3 speakers: lada / mykyta / tetiana)
hf download lopatnov/piper-voices \
  uk_UA/uk_UA-ukrainian_tts-medium.onnx uk_UA/uk_UA-ukrainian_tts-medium.onnx.json \
  --local-dir ./models/text-to-audio/piper-voices

Each voice requires both the .onnx model file and its .onnx.json sidecar (phoneme map, sample rate, speaker info).

appsettings.json

"Models": {
  "piper-en-US": {
    "Type": "Piper",
    "Path": "./models/text-to-audio/piper-voices/en_US/en_US-joe-medium.onnx"
  },
  "piper-ru-RU": {
    "Type": "Piper",
    "Path": "./models/text-to-audio/piper-voices/ru_RU/ru_RU-ruslan-medium.onnx"
  },
  "piper-uk-UA": {
    "Type": "Piper",
    "Path": "./models/text-to-audio/piper-voices/uk_UA/uk_UA-ukrainian_tts-medium.onnx"
  }
},
"Translation": {
  "TextToAudio": {
    "en": "piper-en-US",   // ISO 639-1 code → model key
    "ru": "piper-ru-RU",
    "uk": "piper-uk-UA"
  }
}

TextToAudio is a dictionary of ISO 639-1 language codes → model keys. If empty or absent, SynthesizeSpeech returns FAILED_PRECONDITION.

Multi-speaker voices

The Ukrainian voice (uk_UA-ukrainian_tts-medium) has 3 speakers. Select a speaker via the voice field in SynthesizeSpeechRequest:

`voice` value	Speaker
`lada`	Female (default when omitted)
`mykyta`	Male
`tetiana`	Female (different style)

Properties for Piper:

Property	Required	Description
`Path`	✅	Path to the `.onnx` voice model file. The companion `.onnx.json` must exist at the same path + `.json`.
`ExecutionProvider`	—	ONNX execution provider. `""` or `"auto"` (default) = probe best available. Explicit: `"cpu"`, `"directml"`, `"cuda"`.
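
Because a missing sidecar is an easy mistake to make, a pre-flight check for the `Path` rule above may be useful. This is a sketch only: `load_voice_sidecar` is a hypothetical helper, and the JSON keys it reads (`audio.sample_rate`, `espeak.voice`, `speaker_id_map`) are assumptions about the sidecar layout that should be verified against your voice files.

```python
import json
from pathlib import Path


def load_voice_sidecar(onnx_path):
    """Check that the voice and its sidecar exist, then summarise the sidecar."""
    model = Path(onnx_path)
    sidecar = Path(str(model) + ".json")  # same path + ".json"
    missing = [str(p) for p in (model, sidecar) if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing voice files: " + ", ".join(missing))
    config = json.loads(sidecar.read_text(encoding="utf-8"))
    return {
        # Assumed key names; absent keys yield None or an empty list.
        "sample_rate": config.get("audio", {}).get("sample_rate"),
        "espeak_voice": config.get("espeak", {}).get("voice"),
        "speakers": sorted(config.get("speaker_id_map") or {}),
    }
```

For example, `load_voice_sidecar("./models/text-to-audio/piper-voices/en_US/en_US-joe-medium.onnx")` would raise `FileNotFoundError` if only the `.onnx` file had been downloaded.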

Self-hosted gRPC service for speech-to-text, text translation, and language detection. Runs entirely offline — no cloud, no API keys. Powered by Whisper, NLLB-200, and M2M-100 via ONNX Runtime on .NET 10. Deploy with a single Docker Compose command.
