packages/provider-local is a provider adapter for local LLMs running an OpenAI-compatible chat completions API. It supports Ollama, LM Studio, llama.cpp, vLLM, and any other server that implements POST /chat/completions per the OpenAI spec.
Unlike the cloud providers (Gemini, OpenAI, Claude), local LLMs have no built-in web search. Citation detection is therefore heuristic — the adapter scans the model’s answer text for domain mentions rather than reading structured grounding metadata.
validateConfig(config: LocalConfig): LocalHealthcheckResult
Validates that baseUrl is present and non-empty. Returns the model name that will be used (default: llama3).
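The validation step can be sketched as follows. This is a minimal illustration, not the package's actual code: the result fields (ok, model, message) are assumptions, since the documentation only says that the result reports the model name.

```typescript
// Config fields taken from the configuration table in this document.
interface LocalConfig {
  baseUrl: string;
  apiKey?: string;
  model?: string;
}

// Hypothetical result shape; the real LocalHealthcheckResult may differ.
interface LocalHealthcheckResult {
  ok: boolean;
  model: string;
  message: string;
}

const DEFAULT_MODEL = "llama3";

function validateConfig(config: LocalConfig): LocalHealthcheckResult {
  // Fall back to the documented default when no model is configured.
  const model = config.model ?? DEFAULT_MODEL;

  // baseUrl must be present and contain more than whitespace.
  if (!config.baseUrl || config.baseUrl.trim() === "") {
    return { ok: false, model, message: "baseUrl is required" };
  }
  return { ok: true, model, message: "config is valid" };
}
```

Calling validateConfig({ baseUrl: "http://localhost:11434/v1" }) in this sketch reports the default model, llama3.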
healthcheck(config: LocalConfig): Promise<LocalHealthcheckResult>
Calls GET /models on the configured endpoint to verify connectivity. Returns ok/error with the number of models detected. Does not require a specific model to be present.
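A connectivity check along these lines might look like the sketch below. The helper names (modelsUrl, countModels, healthcheckWith), the injected fetch-like function, and the result fields are all assumptions made for illustration. The sketch also assumes the server returns the OpenAI-style list shape { data: [...] } from GET /models.

```typescript
interface LocalConfig {
  baseUrl: string;
  apiKey?: string;
  model?: string;
}

// Hypothetical result shape for this sketch.
interface LocalHealthcheckResult {
  ok: boolean;
  message: string;
  modelCount: number;
}

// Minimal fetch-like signature so the sketch can be exercised without a network.
type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string> }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build the models endpoint URL, tolerating trailing slashes on baseUrl.
function modelsUrl(baseUrl: string): string {
  return `${baseUrl.replace(/\/+$/, "")}/models`;
}

// Count entries in an OpenAI-style list response; anything else counts as 0.
function countModels(body: unknown): number {
  if (typeof body === "object" && body !== null && "data" in body) {
    const data = (body as { data: unknown }).data;
    return Array.isArray(data) ? data.length : 0;
  }
  return 0;
}

async function healthcheckWith(
  config: LocalConfig,
  fetchFn: FetchLike
): Promise<LocalHealthcheckResult> {
  try {
    const headers: Record<string, string> = {};
    if (config.apiKey) headers["Authorization"] = `Bearer ${config.apiKey}`;

    const res = await fetchFn(modelsUrl(config.baseUrl), { headers });
    if (!res.ok) {
      return { ok: false, message: `HTTP ${res.status}`, modelCount: 0 };
    }
    const modelCount = countModels(await res.json());
    return { ok: true, message: `${modelCount} model(s) detected`, modelCount };
  } catch (err) {
    // Connection refused, DNS failure, invalid JSON, and similar errors land here.
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, message, modelCount: 0 };
  }
}
```

On Node 18 or later, the global fetch could be passed as fetchFn. Injecting it keeps the check easy to test against a stubbed server response.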
executeTrackedQuery(input: LocalTrackedQueryInput): Promise<LocalRawResult>
Sends a chat completion request to the local server with a system prompt instructing the model to include domain names in its answer. Returns:
- rawResponse: the full chat completions API response
- groundingSources: always [] (local LLMs have no structured grounding)
- searchQueries: always []
- model: the model name used

normalizeResult(raw: LocalRawResult): LocalNormalizedResult
Extracts analyst-relevant fields:
- answerText: the first choice's message content
- citedDomains: domains extracted from the answer text via heuristic scanning (see below)
- groundingSources: pass-through of []
- searchQueries: pass-through of []

| Field | Required | Description |
|---|---|---|
| baseUrl | Yes | Base URL of the OpenAI-compatible server, e.g. http://localhost:11434/v1 |
| apiKey | No | API key if the server requires one (most local servers do not) |
| model | No | Model name to use. Default: llama3 |
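To illustrate how the three configuration fields above could map onto a chat completions request, here is a minimal sketch. The function name buildChatRequest, the plain keyword string parameter, and the system prompt wording are assumptions; the adapter's real prompt and the LocalTrackedQueryInput fields are not shown in this document.

```typescript
interface LocalConfig {
  baseUrl: string;
  apiKey?: string;
  model?: string;
}

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: { model: string; messages: ChatMessage[] };
}

// Hypothetical wording; the adapter's actual system prompt is not documented here.
const SYSTEM_PROMPT =
  "Answer the question and mention the domain names of websites you would recommend.";

function buildChatRequest(config: LocalConfig, keyword: string): ChatRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Only send an Authorization header when a key is configured.
  if (config.apiKey) headers["Authorization"] = `Bearer ${config.apiKey}`;

  return {
    // POST target: <baseUrl>/chat/completions, tolerating trailing slashes.
    url: `${config.baseUrl.replace(/\/+$/, "")}/chat/completions`,
    headers,
    body: {
      model: config.model ?? "llama3",
      messages: [
        { role: "system", content: SYSTEM_PROMPT },
        { role: "user", content: keyword },
      ],
    },
  };
}
```

The returned object can be sent with any HTTP client by serializing body as JSON. Building the request as plain data keeps it inspectable without a running server.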
Because local LLMs cannot search the web, citedDomains is built by scanning the answer text with two patterns:
- A URL pattern that matches https?://example.com/... and extracts the hostname
- A bare-domain pattern that matches example.com adjacent to whitespace or punctuation, for common TLDs (.com, .org, .net, .io, .co, .dev, .ai, .app, .edu, .gov, .health, .dental, .legal, .law, .med)

Both patterns strip www. and lowercase the result. Duplicates are removed.
Reliability caveat: the extracted domains reflect the model’s training data, not a live web search. A model may mention well-known domains regardless of whether they currently rank for the keyword. Treat local provider citation data as a rough signal, not a ground truth.
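The two-pattern scan described above can be approximated with the sketch below. The regular expressions and the function name extractCitedDomains are illustrative assumptions, not the adapter's actual patterns. Skipping email addresses and URL path segments in the bare-domain pass is a choice made in this sketch only.

```typescript
const TLDS = [
  "com", "org", "net", "io", "co", "dev", "ai", "app",
  "edu", "gov", "health", "dental", "legal", "law", "med",
];

// Lowercase, drop a trailing dot picked up from sentence punctuation, strip "www.".
function normalizeHost(host: string): string {
  return host.toLowerCase().replace(/\.+$/, "").replace(/^www\./, "");
}

function extractCitedDomains(answerText: string): string[] {
  const found = new Set<string>(); // a Set removes duplicates and keeps first-seen order
  let m: RegExpExecArray | null;

  // Pattern 1: full URLs; capture only the hostname characters after the scheme.
  const urlPattern = /https?:\/\/([a-z0-9.-]+)/gi;
  while ((m = urlPattern.exec(answerText)) !== null) {
    const host = m[1];
    if (host) found.add(normalizeHost(host));
  }

  // Pattern 2: bare domains ending in a known TLD. The lookbehind skips matches
  // that begin mid-word, after "@" (emails), or after "." or "/" (URL parts).
  const barePattern = new RegExp(
    `(?<![\\w@./-])((?:[a-z0-9-]+\\.)+(?:${TLDS.join("|")}))(?![\\w-])`,
    "gi"
  );
  while ((m = barePattern.exec(answerText)) !== null) {
    const host = m[1];
    if (host) found.add(normalizeHost(host));
  }

  return Array.from(found);
}
```

Because URL matches are collected first, a domain that appears both as a link and as a bare mention is reported once, in the position of its link.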
Default: llama3. Configurable via LocalConfig.model. Any model name accepted by the local server can be used — the adapter passes the value directly to the completions request.
Quota policy is passed via LocalConfig.quotaPolicy, but enforcement is handled by the job runner rather than by the adapter.
The job runner stores the following in query_snapshots.raw_response as JSON:
{
  "model": "llama3",
  "groundingSources": [],
  "searchQueries": [],
  "apiResponse": {
    "id": "...",
    "choices": [{ "message": { "role": "assistant", "content": "..." } }],
    "usage": { "prompt_tokens": 42, "completion_tokens": 180 }
  }
}
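As an illustration of reading this stored JSON back, the sketch below types the snapshot and pulls out the first choice's message content, which is the value normalizeResult exposes as answerText. The interface and function names are hypothetical. The fields mirror only what the example above shows.

```typescript
// Subset of the chat completions response that appears in the stored example.
interface ChatCompletionResponse {
  id: string;
  choices: { message: { role: string; content: string | null } }[];
  usage?: { prompt_tokens: number; completion_tokens: number };
}

// Hypothetical type for the JSON stored in query_snapshots.raw_response.
interface RawSnapshot {
  model: string;
  groundingSources: unknown[];
  searchQueries: unknown[];
  apiResponse: ChatCompletionResponse;
}

// Parse a stored snapshot and return the first choice's text, or "" if absent.
function answerTextFromSnapshot(rawJson: string): string {
  const snapshot = JSON.parse(rawJson) as RawSnapshot;
  return snapshot.apiResponse.choices[0]?.message.content ?? "";
}
```

Returning an empty string when there are no choices is a choice made in this sketch. Real code might prefer to surface that case as an error.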
Phase 2: live local LLM calls are implemented using the openai npm package pointed at a configurable baseUrl. Cloud API keys are not required.