feat(insights): add local (self-hosted) transcription & translation provider by xynstr · Pull Request #844 · mynaparrot/plugNmeet-server

xynstr · 2026-04-24T11:50:35Z

Motivation

Adds a local provider to the Insights system for self-hosted,
privacy-preserving transcription and translation. Useful for:

- GDPR-sensitive deployments where sending audio to Azure/Google is
  blocked by policy.
- Air-gapped / on-premise installations.
- Cost-sensitive operators running on CPU-only hardware.

Design

- Implements the existing insights.Provider interface, with no changes
  to existing interfaces or protobuf messages.
- Communicates with a separate companion service over a minimal
  WebSocket protocol (documented in docs/providers/local.md).
- Anyone can replace the reference companion (faster-whisper + NLLB)
  with another backend (whisper.cpp, Vosk, Deepgram self-hosted, …)
  without touching any Go code.

The reference companion service lives in a separate repository
(https://github.com/xynstr/plugnmeet-local-insights, MIT licensed)
so it can have its own Python-native lifecycle and CI.

Scope

pkg/insights/providers/local/client.go       (provider + translation)
pkg/insights/providers/local/transcribe.go   (WebSocket stream client)
pkg/insights/providers/local/languages.go    (supported language list)
pkg/services/insights/insights_service.go    (+3 lines: register "local")
config_sample.yaml                           (+ commented example block)
docs/providers/local.md                      (new: setup guide)

Purely additive. No behaviour change for existing azure / google
users.

Testing

- End-to-end smoke-tested on ARM64 (Neoverse-N1, 10 cores) and x86_64.
- Handshake → audio streaming → partial/final transcription →
  per-language translation round-trip verified.
- faster-whisper small int8, VAD filter enabled, 500 ms chunks.
- Multi-language translation is batched in a single CTranslate2 call: 3
  target languages take roughly the same wall time as 1.

Compatibility & Licenses

- This PR is MIT-licensed, like the rest of the project.
- The reference companion's translation model (NLLB-200) is
  CC-BY-NC 4.0. The companion README documents this explicitly and
  points commercial operators at permissive alternatives
  (e.g. opus-mt-*). Transcription (Whisper) is MIT-compatible.
- If only transcription is configured (no translate_url), the
  translation model never loads and its license is not triggered.

Out of scope for this PR

- Speech synthesis (SynthesizeText) and AI chat (AITextChat*) are
  stubbed with explicit "not supported" responses; the local provider
  focuses on STT and MT only.
- Batch file summarisation is likewise out of scope.

Happy to iterate on naming, packaging, or documentation based on review
feedback.

…rovider

Adds a new 'local' provider type implementing insights.Provider. The
provider talks to a separate companion service over a minimal WebSocket
protocol (PCM16 audio in, partial/final text out) and an Azure-compatible
HTTP endpoint for translation.

Reference implementation: https://github.com/xynstr/plugnmeet-local-insights (MIT).

Useful for GDPR-sensitive or air-gapped deployments where sending audio to
Azure/Google is not an option, and for cost-sensitive operators running on
CPU-only hardware.

- Purely additive: no changes to existing interfaces or protobuf.
- config_sample.yaml: commented example block.
- docs/providers/local.md: setup guide and protocol documentation.

CLAassistant · 2026-04-24T11:50:42Z

All committers have signed the CLA.

gemini-code-assist

Code Review

This pull request introduces a new 'local' insights provider for self-hosted transcription and translation, utilizing a companion service over WebSockets. The implementation includes a new documentation file, configuration samples, and the Go client logic. Feedback focuses on optimizing the HTTP client usage for translation requests, preventing potential memory exhaustion when reading response bodies, and improving the robustness of the transcription stream's read loop and memory management during audio sample processing.

gemini-code-assist · 2026-04-24T11:53:35Z

+type LocalProvider struct {
+	account *config.ProviderAccount
+	service *config.ServiceConfig
+	logger  *logrus.Entry
+}
+
+// NewProvider creates a new LocalProvider.
+func NewProvider(providerAccount *config.ProviderAccount, serviceConfig *config.ServiceConfig, log *logrus.Entry) (insights.Provider, error) {
+	return &LocalProvider{
+		account: providerAccount,
+		service: serviceConfig,
+		logger:  log.WithField("service", "local"),
+	}, nil


Creating a new http.Client for every translation request is inefficient as it prevents connection reuse (TCP/TLS keep-alive). It is recommended to initialize the client once in the LocalProvider struct and reuse it.

Suggested change

type LocalProvider struct {
	account    *config.ProviderAccount
	service    *config.ServiceConfig
	logger     *logrus.Entry
	httpClient *http.Client
}

// NewProvider creates a new LocalProvider.
func NewProvider(providerAccount *config.ProviderAccount, serviceConfig *config.ServiceConfig, log *logrus.Entry) (insights.Provider, error) {
	return &LocalProvider{
		account:    providerAccount,
		service:    serviceConfig,
		logger:     log.WithField("service", "local"),
		httpClient: &http.Client{Timeout: 15 * time.Second},
	}, nil
}

gemini-code-assist · 2026-04-24T11:53:35Z

+	client := &http.Client{Timeout: 15 * time.Second}
+	resp, err := client.Do(req)


Use the shared httpClient instead of creating a new one for each request.

Suggested change

	resp, err := p.httpClient.Do(req)

gemini-code-assist · 2026-04-24T11:53:36Z

+	defer resp.Body.Close()
+
+	if resp.StatusCode != http.StatusOK {
+		bodyBytes, _ := io.ReadAll(resp.Body)


Using io.ReadAll on an HTTP response body without a limit can lead to excessive memory consumption if the server returns a large response. Consider using io.LimitReader to read only a reasonable amount of data for error messages.

Suggested change

		bodyBytes, _ := io.ReadAll(io.LimitReader(resp.Body, 1024))
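To show the effect of the bounded read in isolation, here is a minimal standalone sketch. The names readErrorBody, oversizedBody and maxErrBody are made up for this illustration and do not come from the PR; only the io.LimitReader call mirrors the suggestion.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// maxErrBody caps how much of an error response is read (1 KiB, as suggested).
const maxErrBody = 1024

// readErrorBody reads at most maxErrBody bytes from r.
// Anything beyond the cap is left unread.
func readErrorBody(r io.Reader) string {
	b, _ := io.ReadAll(io.LimitReader(r, maxErrBody))
	return string(b)
}

// oversizedBody is a hypothetical stand-in for a response body of n bytes.
func oversizedBody(n int) io.Reader {
	return strings.NewReader(strings.Repeat("x", n))
}

func main() {
	// A 1 MiB body is truncated to the cap.
	fmt.Println(len(readErrorBody(oversizedBody(1 << 20)))) // prints 1024
	// A short body is returned unchanged.
	fmt.Println(readErrorBody(oversizedBody(3))) // prints xxx
}
```

The cap only bounds what is copied into memory for the error message; the rest of the body is simply not read before the deferred Close.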

gemini-code-assist · 2026-04-24T11:53:36Z

+				resultsChan <- &insights.TranscriptionEvent{
+					Type:  insights.EventTypeError,
+					Error: err.Error(),
+				}


Sending to resultsChan can block indefinitely if the consumer stops reading from the channel. This would hang the readLoop goroutine. Use a select block with ctx.Done() to ensure the goroutine can exit even if the channel is full and the consumer is gone. This pattern should be applied to all sends to resultsChan within the loop.

select {
case resultsChan <- &insights.TranscriptionEvent{
	Type:  insights.EventTypeError,
	Error: err.Error(),
}:
case <-ctx.Done():
	return
}

gemini-code-assist · 2026-04-24T11:53:36Z

+		return fmt.Errorf("stream is closed")
+	}
+
+	buf := make([]byte, len(sample)*2)


WriteSample is called frequently for every audio chunk. Allocating a new byte slice buf on every call increases GC pressure. Consider using a sync.Pool or a pre-allocated buffer protected by the existing mutex to improve performance.
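One way the sync.Pool option could look is sketched below. This is not the PR's code: it assumes samples arrive as []int16 and are sent as little-endian PCM16, and the names bufPool and encodePCM16, the send callback, and the starting capacity are all hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sync"
)

// bufPool holds reusable scratch buffers. Storing *[]byte instead of []byte
// avoids allocating a new slice header each time a buffer is put back.
var bufPool = sync.Pool{
	New: func() any {
		b := make([]byte, 0, 16000) // hypothetical starting capacity
		return &b
	},
}

// encodePCM16 encodes the samples as little-endian 16-bit PCM into a pooled
// buffer and passes it to send. The buffer returns to the pool once send
// returns, so send must not keep a reference to the slice.
func encodePCM16(sample []int16, send func([]byte) error) error {
	bp := bufPool.Get().(*[]byte)
	defer bufPool.Put(bp)

	need := len(sample) * 2
	if cap(*bp) < need {
		*bp = make([]byte, need) // grow once; later calls reuse the larger buffer
	}
	buf := (*bp)[:need]
	for i, s := range sample {
		binary.LittleEndian.PutUint16(buf[i*2:], uint16(s))
	}
	return send(buf)
}

func main() {
	_ = encodePCM16([]int16{1, -2}, func(b []byte) error {
		fmt.Println(b) // prints [1 0 254 255]
		return nil
	})
}
```

Pooling is only safe if the write call has finished with the slice by the time it returns. If the transport may retain the buffer, a pre-allocated buffer guarded by the existing mutex has the same constraint.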

- Reuse http.Client across translation requests (TCP/TLS keep-alive).
- Bound error response body reads with io.LimitReader (1 KiB) to avoid
  unbounded memory consumption on malformed companion responses.
- Send all readLoop events through a select with ctx.Done() so the
  goroutine can unwind when the consumer stops reading.
- Pool PCM-encoding scratch buffers with sync.Pool to reduce GC pressure
  on the hot WriteSample path.

xynstr · 2026-04-24T12:10:14Z

Thanks for the review! All five points are valid — pushed a follow-up commit (57d7f9f) that addresses them:

- Reusable http.Client on the LocalProvider struct (combined with point 2).
- io.LimitReader bounded to 1 KiB when reading error response bodies.
- select with ctx.Done() on every send to resultsChan in readLoop via a small emit closure, so the goroutine can unwind even if the consumer stopped reading.
- sync.Pool of []byte buffers for WriteSample to reduce GC pressure on the hot audio-frame path.
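For readers who have not seen the pattern, an emit closure of the kind described above might look roughly like the following. This is a sketch under assumptions, not the code from commit 57d7f9f: Event is a hypothetical stand-in for insights.TranscriptionEvent, and newEmit and runDemo are invented names.

```go
package main

import (
	"context"
	"fmt"
)

// Event is a hypothetical stand-in for insights.TranscriptionEvent.
type Event struct {
	Type string
	Text string
}

// newEmit returns a closure that sends ev on out unless ctx is cancelled
// first. It reports false when the caller should stop its loop.
func newEmit(ctx context.Context, out chan<- *Event) func(*Event) bool {
	return func(ev *Event) bool {
		select {
		case out <- ev:
			return true
		case <-ctx.Done():
			return false
		}
	}
}

// runDemo exercises both branches: a send with room in the channel, and a
// send after the consumer has gone away and the context was cancelled.
func runDemo() (delivered, afterCancel bool) {
	ctx, cancel := context.WithCancel(context.Background())
	out := make(chan *Event, 1)
	emit := newEmit(ctx, out)

	delivered = emit(&Event{Type: "partial", Text: "hel"}) // buffer has room
	cancel()                                               // consumer is gone
	afterCancel = emit(&Event{Type: "final", Text: "hello"}) // buffer full, ctx done
	return delivered, afterCancel
}

func main() {
	fmt.Println(runDemo()) // prints true false
}
```

In a read loop, each send would then become something like `if !emit(ev) { return }`, which keeps the cancellation check in one place.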

Local verification: gofmt clean, go vet ./pkg/insights/providers/local/... clean, full go build produces a working binary.

xynstr · 2026-04-28T21:14:23Z

Superseded by #848 — reworked along the lines you suggested (pure Go, on top of openai-go, no extra language). The diff there has nothing in common with this branch any more, so a new PR felt cleaner than a force-push. Closing this one to keep the review surface focused. Thanks for the steer!

gemini-code-assist Bot reviewed Apr 24, 2026


xynstr mentioned this pull request Apr 28, 2026

feat(insights): add OpenAI provider for transcription and translation #848

Open

xynstr closed this Apr 28, 2026

xynstr wants to merge 2 commits into mynaparrot:main from xynstr:feat/local-insights-provider
