# Configuration Reference
ALUE uses pydantic-settings for configuration management. All settings can be provided via:
- Environment variables
- A .env file at the project root (recommended)
This page provides the complete reference for all configuration variables.
## Configuration Method

Create a `.env` file from the example:

```bash
cp .env.example .env
```

Then edit `.env` with your settings. The application will automatically load these values on startup.
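Note that with pydantic-settings, a variable exported in the environment takes precedence over the same key in `.env`. A minimal sketch of that precedence (the file path and values here are illustrative, written to a scratch directory so no real `.env` is touched):

```bash
# Demo in a scratch directory so we don't overwrite a real .env
tmp=$(mktemp -d)

# Value from the file (lower precedence)
printf 'ALUE_ENDPOINT_TYPE=openai\n' > "$tmp/.env"

# Exported environment variable: overrides the .env entry at load time
export ALUE_ENDPOINT_TYPE=vllm
```
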
## Complete Variable Reference

### Inference Backend Configuration

These settings control the primary LLM used for task inference (MCQA, RAG, Summarization, etc.).

| Variable | Type | Options | Description |
|---|---|---|---|
| `ALUE_ENDPOINT_TYPE` | Required | `openai`, `vllm`, `tgi`, `ollama`, `transformers`, `vllm-offline` | Backend type for inference |
| `ALUE_OPENAI_API_KEY` | Conditional | API key string | Required when `ALUE_ENDPOINT_TYPE=openai` |
| `ALUE_ENDPOINT_URL` | Conditional | URL | Required for `vllm`, `tgi`, `ollama`, or any OpenAI-compatible endpoint |
| `HF_TOKEN` | Conditional | Token string | Required for the `transformers` backend or when downloading Hugging Face models |
Example configurations:
```bash
# OpenAI
ALUE_ENDPOINT_TYPE=openai
ALUE_OPENAI_API_KEY=sk-...

# vLLM server
ALUE_ENDPOINT_TYPE=vllm
ALUE_ENDPOINT_URL=http://localhost:8000/v1
HF_TOKEN=hf_...

# Local Ollama
ALUE_ENDPOINT_TYPE=ollama
ALUE_ENDPOINT_URL=http://localhost:11434

# Transformers (offline)
ALUE_ENDPOINT_TYPE=transformers
HF_TOKEN=hf_...
```
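Before launching a full run, it can help to smoke-test the server you just configured. Servers that implement the OpenAI-compatible API (such as vLLM) typically expose a `/v1/models` route; the URL below is an example and should match your `ALUE_ENDPOINT_URL`:

```bash
# List the models exposed by an OpenAI-compatible server.
# Falls back to a message if nothing is listening on the port.
curl -s --max-time 5 http://localhost:8000/v1/models \
  || echo "endpoint not reachable"
```
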
### LLM Judge Configuration

These settings control the LLM used for evaluation in RAG and Summarization tasks. The judge is used for metrics like Context Relevancy, Composite Correctness, and Claim Decomposition.

| Variable | Type | Options | Description |
|---|---|---|---|
| `ALUE_LLM_JUDGE_ENDPOINT_TYPE` | Optional | Same as `ALUE_ENDPOINT_TYPE` | Backend type for the LLM judge |
| `ALUE_LLM_JUDGE_OPENAI_API_KEY` | Conditional | API key string | Required when the judge uses OpenAI |
| `ALUE_LLM_JUDGE_ENDPOINT_URL` | Conditional | URL | Required when the judge uses `vllm`, `tgi`, `ollama`, or an OpenAI-compatible endpoint |
Note: LLM Judge is only required for RAG and Summarization evaluation. MCQA and Extractive QA do not use it.
Important: Using the same model for both inference and judging is not recommended, as it can introduce evaluation bias. Mixing backends is explicitly supported and encouraged.
Example configurations:
```bash
# Same backend, different model (if running multiple vLLM servers)
ALUE_LLM_JUDGE_ENDPOINT_TYPE=vllm
ALUE_LLM_JUDGE_ENDPOINT_URL=http://localhost:8001/v1

# Different backend entirely (recommended)
ALUE_LLM_JUDGE_ENDPOINT_TYPE=openai
ALUE_LLM_JUDGE_OPENAI_API_KEY=sk-...
```
### Embedding Configuration

These settings control the embedding provider used for RAG tasks (vector database creation and retrieval).

| Variable | Type | Options | Description |
|---|---|---|---|
| `EMBEDDING_ENDPOINT_TYPE` | Optional | `openai`, `ollama`, `hf`, `local`, `openai-compatible` | Embedding provider type. Default: `local` |
| `EMBEDDING_API_KEY` | Conditional | API key string | Required for `openai` embeddings |
| `EMBEDDING_ENDPOINT_URL` | Conditional | URL | Required for `ollama`, `hf`, or `openai-compatible` |
| `HF_TOKEN` | Conditional | Token string | Required for Hugging Face embedding models (`hf`) |
Example configurations:
```bash
# OpenAI embeddings
EMBEDDING_ENDPOINT_TYPE=openai
EMBEDDING_API_KEY=sk-...

# Local embeddings (default, no external API needed)
EMBEDDING_ENDPOINT_TYPE=local

# Hugging Face embeddings
EMBEDDING_ENDPOINT_TYPE=hf
HF_TOKEN=hf_...
EMBEDDING_ENDPOINT_URL=http://localhost:8080

# Ollama embeddings
EMBEDDING_ENDPOINT_TYPE=ollama
EMBEDDING_ENDPOINT_URL=http://localhost:11434
```
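To verify an Ollama embedding setup, you can call Ollama's embeddings API directly. This is an illustrative sketch: `nomic-embed-text` is a placeholder model name that you would need to pull first (`ollama pull nomic-embed-text`):

```bash
# Request a single embedding from a local Ollama instance.
# Prints a fallback message if Ollama is not running.
curl -s --max-time 5 http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "hello world"}' \
  || echo "Ollama not reachable"
```
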
## Backend-Specific Notes

### OpenAI
- Requires valid API key with appropriate model access
- Supports all OpenAI chat models (GPT-4, GPT-3.5, etc.)
- Embedding support includes `text-embedding-ada-002` and newer models
### vLLM

- Online mode (`vllm`): requires a running vLLM server with an OpenAI-compatible API; the server must be started separately: `vllm serve <model_name> --port 8000`
- Offline mode (`vllm-offline`): runs vLLM directly in-process (no server needed)
### TGI (Text Generation Inference)
- Requires a running TGI server
- Must expose OpenAI-compatible endpoint
- See HuggingFace TGI documentation
### Ollama

- Requires Ollama running locally or on an accessible host
- Default URL: `http://localhost:11434`
- Models must be pulled first: `ollama pull <model_name>`
- Supports both inference and embeddings
### Transformers

- Runs models directly via the Hugging Face `transformers` library
- No external server required
- Requires sufficient GPU/CPU memory for the model
- Model weights are downloaded automatically on first use (requires `HF_TOKEN` for gated models)
## Complete `.env.example`
Here's a comprehensive example showing all available settings:
```bash
# === Inference Backend Configuration ===
# Options: openai | vllm | tgi | ollama | transformers | vllm-offline
ALUE_ENDPOINT_TYPE=openai

# Required for OpenAI
ALUE_OPENAI_API_KEY=sk-...

# Required for vLLM, TGI, Ollama, or openai-compatible endpoints
ALUE_ENDPOINT_URL=http://localhost:8000/v1

# Required for Hugging Face models (transformers backend)
HF_TOKEN=hf_...

# === LLM Judge Configuration (for RAG and Summarization evaluation) ===
# Options: same as ALUE_ENDPOINT_TYPE
ALUE_LLM_JUDGE_ENDPOINT_TYPE=openai

# Required if judge uses OpenAI
ALUE_LLM_JUDGE_OPENAI_API_KEY=sk-...

# Required if judge uses vLLM, TGI, Ollama, or openai-compatible
ALUE_LLM_JUDGE_ENDPOINT_URL=http://localhost:8001/v1

# === Embedding Configuration (for RAG tasks) ===
# Options: openai | ollama | hf | local | openai-compatible
EMBEDDING_ENDPOINT_TYPE=local

# Required for OpenAI embeddings
EMBEDDING_API_KEY=sk-...

# Required for ollama, hf, or openai-compatible embeddings
EMBEDDING_ENDPOINT_URL=http://localhost:11434

# Required for Hugging Face embedding models
HF_TOKEN=hf_...
```
## Common Configuration Patterns

### All-OpenAI (Simplest)

```bash
ALUE_ENDPOINT_TYPE=openai
ALUE_OPENAI_API_KEY=sk-...
ALUE_LLM_JUDGE_ENDPOINT_TYPE=openai
ALUE_LLM_JUDGE_OPENAI_API_KEY=sk-...
EMBEDDING_ENDPOINT_TYPE=openai
EMBEDDING_API_KEY=sk-...
```
### All-Local (No External APIs)

```bash
ALUE_ENDPOINT_TYPE=ollama
ALUE_ENDPOINT_URL=http://localhost:11434
ALUE_LLM_JUDGE_ENDPOINT_TYPE=ollama
ALUE_LLM_JUDGE_ENDPOINT_URL=http://localhost:11434
EMBEDDING_ENDPOINT_TYPE=local
```
### Mixed (Recommended for Production)

```bash
# Fast inference with vLLM
ALUE_ENDPOINT_TYPE=vllm
ALUE_ENDPOINT_URL=http://localhost:8000/v1
HF_TOKEN=hf_...

# Separate judge to reduce bias
ALUE_LLM_JUDGE_ENDPOINT_TYPE=openai
ALUE_LLM_JUDGE_OPENAI_API_KEY=sk-...

# Local embeddings (no API costs)
EMBEDDING_ENDPOINT_TYPE=local
```
## Troubleshooting

### "Missing API key" errors

- Verify the appropriate `*_API_KEY` variable is set for your backend
- Check for typos in the `.env` file
- Ensure `.env` is in the project root directory
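A small pre-flight check can catch a missing key before a run starts. The script below is an illustrative helper (not part of ALUE) and only covers the `openai` backend case; extending it to the other backends follows the same pattern:

```bash
# Fail fast if the key required by the chosen backend is unset.
ALUE_ENDPOINT_TYPE="${ALUE_ENDPOINT_TYPE:-openai}"
if [ "$ALUE_ENDPOINT_TYPE" = "openai" ] && [ -z "$ALUE_OPENAI_API_KEY" ]; then
  echo "error: ALUE_OPENAI_API_KEY is required when ALUE_ENDPOINT_TYPE=openai"
else
  echo "inference key check: OK"
fi
```
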
### "Connection refused" errors

- Verify the server is running (for vLLM, TGI, Ollama)
- Check that `*_ENDPOINT_URL` points to the correct host and port
- Ensure no firewall is blocking the connection
### "Model not found" errors

- For Ollama: run `ollama pull <model_name>` first
- For Transformers: verify `HF_TOKEN` is set and has access to the model
- For vLLM/TGI: ensure the server was started with the correct model
### "HuggingFace token invalid"
- Generate a new token at https://huggingface.co/settings/tokens
- Ensure the token has read access
- For gated models, accept the model's license agreement first
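You can also check a token against the Hugging Face Hub's `whoami` API: a valid token returns your account details, an invalid one returns an error. This requires network access, and the token below is a placeholder:

```bash
# Verify an HF token against the Hub; prints a fallback
# message if huggingface.co is unreachable.
HF_TOKEN="${HF_TOKEN:-hf_placeholder}"
curl -s --max-time 10 \
  -H "Authorization: Bearer $HF_TOKEN" \
  https://huggingface.co/api/whoami-v2 \
  || echo "huggingface.co not reachable"
```
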
## See Also
- Getting Started — Quick setup guide
- Models & Backends — Detailed backend comparison
- Tasks — Task-specific configuration requirements