Lopatnov.Translate

Self-hosted speech and text translation service. .NET 10 · gRPC · ONNX Runtime · Docker.

A self-hosted gRPC service for speech-to-text transcription, text translation, text-to-speech synthesis, and end-to-end speech-to-speech translation. All models run locally — no cloud dependencies. Multiple models can be configured by name and selected per request. Optional GPU/NPU acceleration via DirectML (Windows) or CUDA (Linux).

Getting Started

1. Clone

git clone https://github.com/lopatnov/translate.git
cd translate

2. Download models with hf (the Hugging Face CLI)

# Translation model (MIT, 100 languages)
hf download lopatnov/m2m100_418M-onnx --local-dir ./models/translate/m2m100_418M

# Language detection — required for auto-detect and DetectLanguage RPC (CC-BY-SA 3.0)
hf download lopatnov/fasttext-language-id lid.176.bin --local-dir ./models/detect-lang/fasttext-language-id

# Speech-to-text — Whisper small (~500 MB, MIT)
hf download lopatnov/whisper.cpp ggml-small.bin --local-dir ./models/audio-to-text/whisper.cpp

# Text-to-speech — Piper English voice (MIT)
hf download lopatnov/piper-voices \
  en_US/en_US-joe-medium.onnx en_US/en_US-joe-medium.onnx.json \
  --local-dir ./models/text-to-audio/piper-voices

See docs/models.md for all available models, voices, and language detection options.
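Before starting the service, it can help to confirm that the downloads landed where the commands above put them. The function below is a minimal sketch, not part of the project: check_models is a hypothetical helper name, and the paths assume that hf keeps each file's repository-relative path under --local-dir. The M2M-100 repository is downloaded whole, so the sketch only checks that its directory is non-empty rather than guessing file names.

```shell
# Report which of the files downloaded in step 2 are missing under a models root.
# Usage: check_models [models_root]   (defaults to ./models)
# check_models is a hypothetical helper, not something the project ships.
check_models() {
  root=${1:-./models}
  missing=0
  for f in \
    detect-lang/fasttext-language-id/lid.176.bin \
    audio-to-text/whisper.cpp/ggml-small.bin \
    text-to-audio/piper-voices/en_US/en_US-joe-medium.onnx \
    text-to-audio/piper-voices/en_US/en_US-joe-medium.onnx.json
  do
    # -s is true only for files that exist and are not empty
    if [ ! -s "$root/$f" ]; then
      echo "missing: $root/$f"
      missing=$((missing + 1))
    fi
  done
  # Whole-repository download: only require a non-empty directory.
  if [ -z "$(ls -A "$root/translate/m2m100_418M" 2>/dev/null)" ]; then
    echo "missing: $root/translate/m2m100_418M"
    missing=$((missing + 1))
  fi
  # Exit status: 0 when everything is present, non-zero otherwise
  [ "$missing" -eq 0 ]
}

# Example:
#   check_models ./models && echo "all models present"
```

The function prints one line per missing item and returns a non-zero status if anything is absent, so it can gate the docker compose step in a script.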

3. Start

docker compose -f docker/docker-compose.yml up --build

The gRPC server listens on port 5100 without TLS, which is why the grpcurl examples below pass -plaintext.
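Because the first start includes an image build, the port may not accept requests right away. The sketch below polls an arbitrary command until it succeeds or a timeout elapses. wait_until is a hypothetical helper name, and the suggested probe (grpcurl ... list) assumes the server exposes gRPC reflection, which the grpcurl examples in this README already rely on since they pass no proto files.

```shell
# Poll a command once per second until it succeeds or the timeout elapses.
# Usage: wait_until <timeout_seconds> <command> [args...]
# wait_until is a hypothetical helper, not something the project ships.
wait_until() {
  timeout_s=$1
  shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1   # gave up: the command never succeeded in time
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Example (assumes grpcurl is installed and reflection is enabled):
#   wait_until 120 grpcurl -plaintext localhost:5100 list && echo "service is up"
```

Keeping the probe as a parameter means the same loop works with any readiness check, for example a health RPC if the service provides one.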

4. Translate text

grpcurl -plaintext \
  -d '{"text":"Hello","source_language":"en","target_language":"uk"}' \
  localhost:5100 lopatnov.translate.v1.TranslateService/TranslateText
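Hand-written JSON breaks as soon as the text contains quotes or backslashes. One way around that, sketched below, is to let jq build the request body so escaping is handled for you. translate_request is a hypothetical helper name; the field names are the ones used in the request above.

```shell
# Build a TranslateText request body with jq so the text is escaped correctly.
# Usage: translate_request <text> <source_language> <target_language>
# translate_request is a hypothetical helper, not something the project ships.
translate_request() {
  jq -cn \
    --arg text "$1" \
    --arg src "$2" \
    --arg tgt "$3" \
    '{text: $text, source_language: $src, target_language: $tgt}'
}

# Example (requires a running service):
#   grpcurl -plaintext \
#     -d "$(translate_request 'She said "hello"' en uk)" \
#     localhost:5100 lopatnov.translate.v1.TranslateService/TranslateText
```

jq is already used elsewhere in this README to read responses, so this adds no new dependency.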

5. Transcribe audio

# Linux (GNU base64)
grpcurl -plaintext \
  -d "{\"audio_data\": \"$(base64 -w0 my-audio.wav)\", \"language\": \"auto\"}" \
  localhost:5100 lopatnov.translate.v1.TranslateService/TranscribeAudio

# macOS (BSD base64 has no -w flag)
grpcurl -plaintext \
  -d "{\"audio_data\": \"$(base64 my-audio.wav | tr -d '\n')\", \"language\": \"auto\"}" \
  localhost:5100 lopatnov.translate.v1.TranslateService/TranscribeAudio

# PowerShell (Windows)
$b = [Convert]::ToBase64String([IO.File]::ReadAllBytes("my-audio.wav"))
grpcurl -plaintext -d "{`"audio_data`":`"$b`",`"language`":`"auto`"}" `
  localhost:5100 lopatnov.translate.v1.TranslateService/TranscribeAudio
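The Linux and macOS variants differ only in how they produce unwrapped base64. A small wrapper, sketched here under the assumption that base64 reads standard input when given no file argument (true for both GNU and BSD versions), lets one command line serve both. b64_oneline is a hypothetical helper name.

```shell
# Print a file as single-line base64 on both GNU and BSD userlands.
# Reading from stdin sidesteps the differing flags, and tr strips the
# line wrapping that GNU base64 adds by default.
# b64_oneline is a hypothetical helper, not something the project ships.
b64_oneline() {
  base64 < "$1" | tr -d '\n'
}

# Example (requires a running service):
#   grpcurl -plaintext \
#     -d "{\"audio_data\": \"$(b64_oneline my-audio.wav)\", \"language\": \"auto\"}" \
#     localhost:5100 lopatnov.translate.v1.TranslateService/TranscribeAudio
```

For long recordings the encoded payload can exceed the shell's argument length limit. grpcurl accepts -d @ to read the request from standard input, which avoids passing the audio on the command line.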

6. Synthesize speech

# Request speech, extract the base64 audioData field, and decode it to a WAV file
grpcurl -plaintext \
  -d '{"text":"Hello, world!","language":"en"}' \
  localhost:5100 lopatnov.translate.v1.TranslateService/SynthesizeSpeech \
  | jq -r '.audioData' | base64 -d > output.wav
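If the request fails, the pipeline above can still leave an empty or garbage output.wav behind, because jq and base64 run regardless. A quick header check catches that. This is a sketch that assumes the service returns a RIFF/WAVE container, as the .wav file name suggests; is_wav is a hypothetical helper name.

```shell
# Succeed only if the file starts with a RIFF header whose form type is WAVE
# (bytes 0-3 are "RIFF", bytes 8-11 are "WAVE").
# is_wav is a hypothetical helper, not something the project ships.
is_wav() {
  [ "$(dd if="$1" bs=1 count=4 2>/dev/null)" = "RIFF" ] &&
    [ "$(dd if="$1" bs=1 skip=8 count=4 2>/dev/null)" = "WAVE" ]
}

# Example:
#   is_wav output.wav || echo "output.wav is not a WAV file; inspect the raw grpcurl response" >&2
```

The check only looks at the container header. It does not validate the sample rate or the audio data itself.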

7. Speech-to-speech translation

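# Transcribe + translate + synthesize in one call
# (GNU base64 shown; on macOS replace `base64 -w0 speech.wav` with `base64 speech.wav | tr -d '\n'`)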
grpcurl -plaintext \
  -d "{\"audio_data\": \"$(base64 -w0 speech.wav)\", \"source_language\": \"uk\", \"target_language\": \"en\"}" \
  localhost:5100 lopatnov.translate.v1.TranslateService/TranslateAudio \
  | jq -r '.translatedAudio' | base64 -d > translated.wav

See docs/api.md for the full API reference.

Documentation

Doc                  Description
docs/api.md          gRPC API reference: RPCs, messages, examples
docs/models.md       Model setup: download, configuration, licenses
docs/deployment.md   Docker deployment
docs/development.md  Local dev, build, testing

Project Structure

src/
  Lopatnov.Translate.Grpc/           # gRPC server, DI wiring, model registry
  Lopatnov.Translate.Core/           # interfaces, language detection, JSON localization
  Lopatnov.Translate.Nllb/           # NLLB-200 translator (ONNX Runtime)
  Lopatnov.Translate.M2M100/         # M2M-100 translator (ONNX Runtime)
  Lopatnov.Translate.Whisper/        # Whisper speech-to-text (Whisper.net)
  Lopatnov.Translate.Piper/          # Piper text-to-speech (ONNX Runtime + espeak-ng)
  Lopatnov.Translate.LibreTranslate/ # LibreTranslate HTTP client (optional)

tests/
  Lopatnov.Translate.Grpc.Tests/     # service dispatch, model session manager
  Lopatnov.Translate.Core.Tests/     # language detection, JSON localization
  Lopatnov.Translate.Nllb.Tests/     # tokenizer, translator, integration
  Lopatnov.Translate.M2M100.Tests/   # tokenizer, translator, integration
  Lopatnov.Translate.Whisper.Tests/  # audio resampling, recognizer, integration
  Lopatnov.Translate.Piper.Tests/    # phonemizer, synthesizer, integration

models/                              # gitignored — populate via hf (see docs/models.md)
  translate/                         # M2M-100, NLLB ONNX files
  detect-lang/                       # FastText LID-176, GlotLID
  audio-to-text/                     # Whisper ggml files
  text-to-audio/                     # Piper voice files

clients/
  translate-angular/                 # Angular web UI (7 pages: translate, detect, localize,
                                     #   transcribe, synthesize, speech-to-speech, live)
  translate-mcp/                     # MCP server — integrates the service as an AI tool

docker/
  Dockerfile
  docker-compose.yml

Contributing

Contributions are welcome. Please read CONTRIBUTING.md before opening a pull request.

Bug reports → open an issue
Found it useful? A star on GitHub helps others discover the project
