Getting Started¶
This section describes how to install dependencies, configure the environment, and verify that ALUE is set up correctly.
ALUE has been tested primarily on Python 3.10 (with partial testing on 3.11).
Other versions may work, but are not officially supported.
1. Installation¶
ALUE supports two installation methods.
We recommend uv for reproducibility and speed.
Using uv (preferred)¶
# install dependencies into a managed virtual environment
uv sync
This creates a .venv directory automatically.
No manual venv creation is needed.
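You can either activate the created .venv as usual or prefix commands with uv run, which executes them inside the managed environment without activation. For example, using the verification command from step 3 below:
# run commands inside the uv-managed environment without activating it
uv run python --version
uv run pytest tests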
Using pip (fallback)¶
# create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# install dependencies
pip install -r requirements.txt
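Whichever installation method you use, you can confirm that the active interpreter matches a tested version:
# confirm the environment's interpreter version
python --version # expect Python 3.10.x (3.11 is only partially tested)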
2. Environment Configuration¶
ALUE uses a .env file for configuration. Create one from the example:
cp .env.example .env
Then edit .env to add your API keys and endpoints. At minimum, you'll need to configure an endpoint type and its credentials or URL:
For OpenAI (simplest):
ALUE_ENDPOINT_TYPE=openai
ALUE_OPENAI_API_KEY=sk-...
For local Ollama:
ALUE_ENDPOINT_TYPE=ollama
ALUE_ENDPOINT_URL=http://localhost:11434
For vLLM:
ALUE_ENDPOINT_TYPE=vllm
ALUE_ENDPOINT_URL=http://localhost:8000/v1
HF_TOKEN=hf_...
Note: Different tasks require different configuration. RAG and Summarization tasks require additional LLM Judge and embedding settings. See the Configuration Reference for complete details on all available settings and backend-specific requirements.
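Before running a task, it can help to sanity-check the configuration. The commands below assume the default ports shown in the snippets above; adjust them if your endpoints differ:
# confirm the endpoint settings made it into .env
grep -E '^ALUE_ENDPOINT' .env
# if you use a local backend, confirm the server is reachable
curl http://localhost:11434          # Ollama: replies "Ollama is running"
curl http://localhost:8000/v1/models # vLLM: lists the served model(s)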
3. Quick Verification¶
To check that ALUE is installed and functional:
# run the test suite
pytest tests
# or run a sample task (MCQA)
python -m scripts.mcqa inference \
-i data/aviation_knowledge_exam/3_1_aviation_test.json \
-o runs/mcqa \
-m gpt-4o-mini \
--task_type aviation_exam \
--num_examples 3
If successful, predictions will be written to runs/mcqa_<timestamp>/predictions.json.
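The exact contents of predictions.json depend on the task, but pretty-printing the file is a quick way to confirm the run produced output (the directory name below is a placeholder for your actual timestamped run):
# list runs and pretty-print the predictions file
ls runs/
python -m json.tool runs/mcqa_<timestamp>/predictions.json | head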
Next Steps¶
- Configuration Reference — Complete .env variables reference with troubleshooting
- Models & Backends — Overview of supported inference and embedding engines
- Tasks — Detailed guides per task:
- MCQA — Multiple choice question answering
- RAG — Retrieval-augmented generation
- Summarization — Narrative summarization
- Extractive QA — Span extraction
- Creating Datasets — Dataset format specifications for each task type