
Ora Documentation

The FastAPI of NL2SQL — speak to any database with any LLM in 3 lines.

Quickstart

The simplest possible Ora query — no schema setup, no vector store, no API key tricks:

python
import ora

db = ora.connect("postgresql://localhost/mydb")
result = db.query("Show me top 10 customers by revenue")

# result.dataframe → pandas DataFrame
# result.sql       → the generated SQL
# result.summary   → plain-English explanation
Local-only demo: works with SQLite and Ollama, so no API key is required. Run pip install ora-sql and use sqlite:///path/to/db.sqlite as the connection string.

Installation

shell
# Core install
pip install ora-sql

# With database extras
pip install "ora-sql[postgres]"     # asyncpg
pip install "ora-sql[snowflake]"    # snowflake-connector
pip install "ora-sql[bigquery]"     # google-cloud-bigquery
pip install "ora-sql[all]"          # everything

# With Langfuse tracing
pip install "ora-sql[langfuse]"

Requirements: Python 3.10+. Core dependencies: litellm, sqlalchemy 2.0, sqlglot, chromadb, pydantic 2.

LLM Providers

Ora uses LiteLLM under the hood — every provider works identically:

| Provider | Models | Config |
|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini | ORA_LLM_PROVIDER=openai |
| Anthropic | claude-sonnet-4, claude-haiku-3 | ORA_LLM_PROVIDER=anthropic |
| AWS Bedrock | Nova Pro, Claude on Bedrock | ORA_LLM_PROVIDER=bedrock |
| Google Gemini | gemini-2.5-pro, flash | ORA_LLM_PROVIDER=gemini |
| Ollama | llama3.3, deepseek-r1 | ORA_LLM_PROVIDER=ollama |
| vLLM | Any HuggingFace model | ORA_LLM_PROVIDER=vllm |
python
import ora

# Anthropic Claude
db = ora.connect("postgresql://...", model="claude-sonnet-4")

# Local Ollama — no API key
db = ora.connect("sqlite:///mydb.db", model="ollama/llama3.3")

# Via env vars
# ORA_LLM_MODEL=gpt-4o-mini ORA_LLM_PROVIDER=openai

Database Support

| Database | Connection String | Status |
|---|---|---|
| PostgreSQL | postgresql://user:pass@host/db | Stable |
| SQLite | sqlite:///path/to/db.sqlite | Stable |
| MySQL | mysql://user:pass@host/db | Stable |
| DuckDB | duckdb:///path/to/db.duckdb | Stable |
| Snowflake | snowflake://user@account/db | Beta |
| BigQuery | bigquery://project/dataset | Beta |
| Redshift | redshift+psycopg2://user@host/db | Preview |
| CSV / XLSX / Parquet | path/to/file.csv | Stable |

The Pipeline

Every query runs through a 6-stage pipeline. Each stage is independently configurable and testable:

1. Intent Classification: routes simple queries to the fast path and complex or cross-source queries to the accurate path. Uses a lightweight classifier — no LLM call.
2. Schema Linking (CHESS LSH): prunes your schema from hundreds of columns to ~8 relevant ones using locality-sensitive hashing. Scales to 1,000+ table schemas without token overflow.
3. SQL Generation: three generators run in parallel (fewshot, plan-based, decompose). A pairwise LLM selector picks the best candidate — the CHASE-SQL technique.
4. Self-Correction (ReFoRCE): four correction layers: syntax → schema validation → execution errors → semantic sanity. Up to 3 attempts per layer before escalation.
5. Execution: sandboxed, read-only, timeout-enforced. The AST firewall rejects any non-SELECT statement before it reaches the database.
6. Result Formatting: returns a PipelineResult with .dataframe, .sql, .summary (NL), .chart_config, and .follow_ups.
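The read-only guarantee in stage 5 can be illustrated with the authorizer hook in Python's stdlib sqlite3. This is a minimal sketch of engine-level read-only enforcement, not Ora's actual firewall (which rejects non-SELECT statements at the AST level before they reach the database):

```python
import sqlite3

def read_only_connection(path=":memory:"):
    """Open a connection that denies everything except reads.

    Illustrative stand-in for an AST firewall: sqlite3 calls the
    authorizer for every operation a statement would perform, so
    writes and DDL are blocked before they execute.
    """
    con = sqlite3.connect(path)

    def authorizer(action, *_):
        # Allow SELECT statements and column reads; deny everything else.
        if action in (sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ):
            return sqlite3.SQLITE_OK
        return sqlite3.SQLITE_DENY

    con.set_authorizer(authorizer)
    return con

con = read_only_connection()
row = con.execute("SELECT 1").fetchone()   # reads succeed
# con.execute("DROP TABLE x") would raise sqlite3.DatabaseError
```

A real AST firewall makes the same decision one step earlier, by inspecting the parsed statement type before any engine is involved.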

Schema Linking

Ora's schema linker uses the CHESS LSH algorithm (Stanford, 2024) to prune massive schemas before the LLM ever sees them. This is why Ora works at enterprise scale where other tools fail.

python
# Schema pruning happens automatically — but you can inspect it:
result = db.query("top customers by revenue")
print(result.schema_used)
# → SchemaSnapshot(tables=['customers','orders'], columns=8, pruned_from=347)

# Add business context to improve linking accuracy:
db.add_description("customers.lifetime_value", "Total revenue from customer since sign-up")
db.add_metric("revenue", "SUM(orders.total_amount) WHERE status = 'paid'")
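The LSH idea behind the pruning step can be sketched in plain Python: hash each column's name/description tokens into a MinHash signature, then keep the columns whose signatures most resemble the question's. A toy illustration with made-up column names, not the CHESS implementation:

```python
import hashlib

def minhash_signature(tokens, num_hashes=32):
    # One minimum per seeded hash function; similar token sets share minima.
    return [
        min(int(hashlib.md5(f"{seed}:{t}".encode()).hexdigest(), 16) for t in tokens)
        for seed in range(num_hashes)
    ]

def similarity(sig_a, sig_b):
    # Fraction of shared minima approximates Jaccard similarity of the sets.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

question = {"top", "customers", "revenue"}
columns = {
    "customers.lifetime_value": {"customers", "lifetime", "revenue", "value"},
    "orders.total_amount": {"orders", "total", "amount", "revenue"},
    "audit.log_id": {"audit", "log", "id"},
}
q_sig = minhash_signature(question)
scores = {name: similarity(q_sig, minhash_signature(toks))
          for name, toks in columns.items()}
# Keep only the top-k columns; unrelated ones like audit.log_id score ~0.
```

In production, banding the signatures turns this into true LSH bucket lookups instead of pairwise comparisons, which is what lets the technique scale to thousands of tables.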

Self-Correction

The correction loop runs automatically. It catches ~95% of generation errors before they reach your users:

| Layer | What it catches | Mechanism |
|---|---|---|
| Syntax | Parse errors, bad SQL structure | SQLGlot parse + error message fed back to LLM |
| Schema | Hallucinated table/column names | Verify all identifiers exist in DB metadata |
| Execution | Type mismatches, division by zero | Run query, catch runtime errors, retry with context |
| Semantic | Logically wrong but valid SQL | LLM checks if result plausibly answers the question |
python
# Configure correction depth
db = ora.connect("postgresql://...",
    max_corrections=5,         # default: 3
    strategy="accurate",        # fast | balanced | accurate
)
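At its core, each correction layer is a bounded retry loop. A schematic version, with hypothetical execute/revise callables standing in for Ora's internals:

```python
def self_correct(sql, execute, revise, max_attempts=3):
    """Run the SQL; on failure, feed the error back to a reviser and retry.

    `execute` and `revise` are hypothetical stand-ins: in a real pipeline
    each layer (syntax, schema, execution, semantic) supplies its own pair,
    and `revise` wraps an LLM call with the error message as context.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return execute(sql)
        except Exception as err:
            last_error = err
            sql = revise(sql, str(err))  # error fed back for the next attempt
    raise RuntimeError(f"correction budget exhausted: {last_error}")
```

The escalation behaviour described above corresponds to the final `raise`: once a layer's attempts are spent, the failure is handed to the next mechanism rather than retried forever.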

Multi-Source Queries

The feature nobody else has shipped. Ora decomposes cross-source questions into sub-queries, runs them in parallel, then synthesises the results inside DuckDB in-memory — no data leaves your environment.

python
with ora.connect(
    sales="postgresql://prod/sales",
    inventory="snowflake://warehouse/inventory",
    staff="headcount.csv",
) as db:
    result = db.ask("which stores have highest revenue per employee?")
    print(result.dataframe)
    # → One DataFrame. Three databases. Zero manual join keys.
How it works: Ora's DecomposeAgent splits the query into sub-problems, runs each against the appropriate source in parallel, then SynthesisAgent joins the results using DuckDB's in-memory engine. The join keys are inferred — you never write them.
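The synthesis step can be illustrated with an in-memory engine: each sub-query's result is loaded as a local table, and the final join happens there. Ora uses DuckDB; stdlib sqlite3 stands in below so the sketch runs anywhere, and the table names are invented for the example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Pretend these rows came back from two different sources in parallel:
con.execute("CREATE TABLE revenue (store TEXT, total REAL)")       # e.g. Postgres
con.execute("CREATE TABLE headcount (store TEXT, employees INT)")  # e.g. a CSV
con.executemany("INSERT INTO revenue VALUES (?, ?)",
                [("north", 120000.0), ("south", 90000.0)])
con.executemany("INSERT INTO headcount VALUES (?, ?)",
                [("north", 12), ("south", 6)])

# The synthesis query joins the intermediate results locally:
rows = con.execute("""
    SELECT r.store, r.total / h.employees AS revenue_per_employee
    FROM revenue r JOIN headcount h ON r.store = h.store
    ORDER BY revenue_per_employee DESC
""").fetchall()
```

Because the join runs in a local in-memory engine, the raw sub-query results never leave the process, which is what makes the "no data leaves your environment" claim possible.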

Security

Security defaults are non-negotiable and cannot be accidentally disabled:

| Control | Default | Override |
|---|---|---|
| Read-only enforcement | Always on | Not possible by design |
| AST SQL firewall | SELECT only | Not possible by design |
| Row limit | 1,000 rows | max_rows=5000 |
| Query timeout | 30 seconds | timeout_seconds=60 |
| Input sanitization | Always on | Not configurable |
| PII masking | Off | mask_pii=True |

Training & Feedback

Ora gets smarter with every query. Successful NL→SQL pairs are stored in the vector store and surfaced as few-shot examples for similar future queries.

python
# Train a specific pair (thumbs-up equivalent)
await db.train_sql(
    question="top 10 stores by revenue last quarter",
    sql="SELECT store, SUM(total) FROM sales WHERE ...",
)

# Bootstrap from real query logs
from ora.integrations.query_log import SnowflakeHarvester
result = await SnowflakeHarvester(db).harvest(lookback_days=30)
# → HarvestResult(queries_scanned=847, examples_added=312, duplicates_skipped=535)
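The retrieval side of this loop can be sketched without a vector store: rank stored NL→SQL pairs by overlap with the incoming question and surface the top matches as few-shot examples. Ora uses embeddings for this; plain token overlap stands in here:

```python
def retrieve_examples(question, store, k=2):
    """Rank stored NL→SQL pairs by Jaccard overlap with the new question."""
    q = set(question.lower().split())

    def score(pair):
        t = set(pair["question"].lower().split())
        return len(q & t) / len(q | t) if q | t else 0.0

    return sorted(store, key=score, reverse=True)[:k]

store = [
    {"question": "top 10 stores by revenue last quarter", "sql": "SELECT ..."},
    {"question": "new signups per week", "sql": "SELECT ..."},
]
examples = retrieve_examples("top stores by revenue this quarter", store)
# The revenue example ranks first and would be injected as a few-shot prompt.
```

Embedding-based retrieval replaces the token-overlap score with vector similarity, so paraphrases ("best performing shops") still match.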

SOUL Layer

After 20+ queries, Ora builds a SOUL profile — a persistent mental model of how each user thinks about their data. This improves generation accuracy for that specific user over time.

SOUL is always in a try/except. It never breaks a query. If SOUL inference fails, it's silently skipped and the pipeline continues normally.
python
# Inspect what SOUL has inferred
profile = await db.get_soul(user_id="ash")
print(profile.cadence)      # "weekly"
print(profile.entity_focus) # ["store", "revenue", "employee"]
print(profile.vocabulary)  # {"revenue": "SUM(total_amount)", ...}
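The fail-open guarantee reduces to a wrapper like the following (hypothetical function names; a sketch of the pattern, not Ora's internals):

```python
def run_with_soul(pipeline, infer_soul, question):
    """Apply SOUL hints only when inference succeeds; never fail the query."""
    try:
        hints = infer_soul(question)
    except Exception:
        hints = None  # SOUL silently skipped; the pipeline continues normally
    return pipeline(question, hints)
```

The design choice is deliberate: personalisation is an accuracy bonus, so any error in it degrades gracefully to the unpersonalised pipeline instead of surfacing to the user.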

dbt Integration

Load column descriptions, metric definitions, and lineage directly from your dbt project. No dbt-core import required.

python
from ora.integrations.dbt import load_dbt_docs

result = await load_dbt_docs(db, dbt_project_path="/path/to/dbt")
print(result.models_loaded)      # 47
print(result.columns_described)  # 284
print(result.examples_added)    # 47 (auto-generated questions)

MCP Server

Expose Ora as an MCP server — lets Claude, Cursor, and Windsurf query your database directly from the chat interface.

shell
pip install ora-sql
ora serve --mcp --db postgresql://localhost/mydb --port 8080

Then add to your Claude Desktop / Cursor config:

json
{
  "mcpServers": {
    "ora": {
      "command": "ora",
      "args": ["serve", "--mcp", "--db", "postgresql://localhost/mydb"]
    }
  }
}

Workspace UI

shell
ora serve --db postgresql://localhost/mydb --port 8080
# Open http://localhost:8080

The workspace gives you:

  • Ask view — chat-first interface with live execution traces (node-by-node)
  • Data view — interactive knowledge map with FK relationships
  • Tasks view — complete history of every query, with execution traces
  • Learn view — training progress, accuracy trend, improvement suggestions

Benchmarks

Ora's target scores with GPT-4o (launch day results):

| Benchmark | Launch score | Target |
|---|---|---|
| Spider 1.0 EX | 85%+ | >85% |
| BIRD EX | 65%+ | >65% |
| Spider 2.0 EX | 40%+ | >40% |
CHESS LSH schema pruning is the key accuracy driver — it ensures the LLM receives only the 8 most relevant columns out of potentially thousands, eliminating the #1 failure mode (wrong table/column selection) in production NL2SQL.

Eval Harness

python
from ora.eval import Evaluator

evaluator = Evaluator(engine=db, dataset="bird")  # avoid shadowing builtin eval
results = evaluator.run(model="gpt-4o", n=100)

print(results.execution_accuracy)  # 0.67
print(results.latency_p95)         # 2.4s
print(results.cost_per_query)      # $0.006
results.save_benchmark_card()      # → benchmark.md (README-ready)

API Reference — connect()

python
ora.connect(
    connection_string: str,         # or named kwargs for multi-source
    model: str = "gpt-4o-mini",
    strategy: str = "balanced",    # fast | balanced | accurate
    max_rows: int = 1000,
    timeout_seconds: int = 30,
    max_corrections: int = 3,
    trace: bool = False,
    mask_pii: bool = False,
) → OraEngine

db.query() / db.ask()

python
result = db.query(
    question: str,
    strategy: str | None = None,  # overrides engine default
) → PipelineResult

# PipelineResult attributes:
result.dataframe    # pandas DataFrame
result.sql          # generated SQL string
result.summary      # plain-English answer
result.follow_ups   # list[str] — suggested next questions
result.chart_config # Chart.js config if applicable
result.trace        # PipelineTrace — timing per stage
result.succeeded    # bool
result.row_count    # int
result.cost_usd     # float — LLM cost for this query

Quick one-liner: ask()

python
from ora import ask

result = ask("top customers by revenue", db="postgresql://localhost/mydb")
print(result.dataframe)

train_sql() / train_docs()

python
await db.train_sql(question="...", sql="...")
await db.train_docs(documentation="...")  # free-text business context

REST API

| Method | Endpoint | Description |
|---|---|---|
| POST | /query | Run a NL query; returns a JSON result |
| POST | /query/stream | SSE stream of live execution trace events |
| GET | /schema | Full schema metadata |
| GET | /schema/graph | Knowledge map (nodes + edges) |
| POST | /schema/refresh | Force re-introspection of the database |
| POST | /train/sql | Add a NL→SQL training pair |
| GET | /health | Health check + system status |
| GET | /metrics | Prometheus metrics endpoint |
Apache 2.0 · GitHub · Discord