The open-source context layer for data agents

Kaelio is the company behind ktx, the open-source context layer that makes data agents reliable. Data agents today fail because they lack structured context and determinism. Metrics are defined in five places, governed in none. Teams spend months on semantic layer projects that never finish. ktx fixes that. Connect your warehouse, BI tools, and docs. ktx builds the context layer automatically. Your team governs it. Any agent can query it. One context layer across your entire data stack.

Active Founders

Luca Martial

Co-founder & CEO

Co-founder & CEO at Kaelio. Previously data scientist and NLP engineer, with expertise in enterprise data systems and AI safety

Andrey Avtomonov

Co-founder & CTO

Co-founder & CTO at Kaelio. Repeat AI safety founder. Previously software & AI engineer at CERN & Dataiku

Company Launches

Kaelio: Open-source context layer for data agents

Hey everyone! We’re Luca and Andrey, the founders of Kaelio.

TL;DR:

Kaelio is the company behind ktx, the open-source context engine for data agents. Claude Code, Codex, and custom data agents can write SQL that looks reasonable, runs fine, and still returns the wrong number.

ktx gives them a context layer: Markdown wiki pages for business knowledge, YAML files for executable metric definitions, joins, grain, measures, dimensions, filters, and segments. Agents ask ktx for the metric they need; ktx plans the query and compiles SQL.

https://www.youtube.com/watch?v=5V4TuzYVlrA

The problem

Agents are good at exploring schemas and writing SQL that looks correct and runs fine, but that SQL often ends up using the wrong joins, filters, or metric logic.

To cite a few examples of “agents gone wrong”:

Stale column + hidden business rule: when preparing a board report, a finance analyst asks Claude Code for “ARR by customer segment”. It derives ARR from multiple tables (subscriptions, plans, accounts), then groups by accounts.industry. But Claude Code doesn’t know that this industry column was deprecated a few months prior, or that past board reports excluded paused subscriptions from the ARR calculation.
Join fanout: a data analyst at a retailer uses their company’s internal agent to prep a product revenue deck for a QBR. The agent joins orders to order_items, then sums orders.total_amount_cents grouped by order_items.product_id. The SQL runs fine, but each order’s revenue is repeated once per line item, which is easy to miss when most orders have only one item.
Missing attribution logic: a marketing analyst asks Codex “Which campaigns drove the most revenue?” Codex joins marketing_touches to users to orders and groups by utm_campaign. But since each order can have multiple touches before purchase, the same order can be credited to first touch, last touch, every touch, or every campaign the user clicked before buying. If the agent chooses a method that doesn’t match the team’s attribution logic, the team will make decisions based on misattributed revenue.
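
To make the join fanout case concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the example above, except item_amount_cents, which is a hypothetical per-line-item column assumed here so that a correct product-level sum exists. The data is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, total_amount_cents INTEGER);
    -- item_amount_cents is a hypothetical column, assumed for this sketch
    CREATE TABLE order_items (order_id INTEGER, product_id TEXT, item_amount_cents INTEGER);
    INSERT INTO orders VALUES (1, 1000), (2, 3000);
    INSERT INTO order_items VALUES
        (1, 'A', 1000),
        (2, 'A', 1000),
        (2, 'B', 2000);
""")

# Fanned-out query: order 2 has two line items, so its 3000-cent total
# is counted once for product A and once again for product B.
naive = dict(conn.execute("""
    SELECT oi.product_id, SUM(o.total_amount_cents)
    FROM orders o
    JOIN order_items oi ON oi.order_id = o.order_id
    GROUP BY oi.product_id
""").fetchall())

# Aggregating at the line-item grain avoids the duplication.
correct = dict(conn.execute("""
    SELECT product_id, SUM(item_amount_cents)
    FROM order_items
    GROUP BY product_id
""").fetchall())

true_total = conn.execute("SELECT SUM(total_amount_cents) FROM orders").fetchone()[0]

print("naive:  ", naive, "sum =", sum(naive.values()))
print("correct:", correct, "sum =", sum(correct.values()))
print("actual revenue:", true_total)
```

Both queries run without error. Only the second one sums to the actual revenue, which is why this class of bug tends to go unnoticed.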

The issue is that schema access doesn’t tell an agent which metric definition is approved, which dimension is stale, or what jargon the company uses internally.

How ktx works

ktx splits context into 2 parts:

Business context: Markdown wiki pages (definitions, conventions, jargon, gotchas).
Executable definitions: YAML files declaring tables, row grain, joins, measures, dimensions, filters, and filter groups.

Both are plain files in git.

When an agent needs a metric, it asks ktx for a measure + dimensions + filters instead of writing SQL itself. ktx’s planner picks the join path, uses grain and relationship metadata, catches issues like join fanout and chasm joins, and compiles the warehouse SQL.
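
The idea of requesting a measure plus dimensions rather than writing SQL can be sketched with a toy planner. This is not ktx's implementation or API: every class and function name below (Measure, Dimension, Join, FanoutError, compile_query) is hypothetical, as is the item_amount_cents column, and the planner only handles a single direct join. It shows how relationship metadata lets a compiler refuse a query that would fan out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    name: str
    table: str  # table whose grain the aggregate is defined at
    sql: str    # aggregate expression, with qualified column names


@dataclass(frozen=True)
class Dimension:
    name: str
    table: str
    column: str


@dataclass(frozen=True)
class Join:
    one: str   # table on the "one" side of a one-to-many relationship
    many: str  # table on the "many" side
    key: str   # join column shared by both tables


class FanoutError(ValueError):
    """Raised when a join would duplicate rows of the measure's table."""


def compile_query(measure, dimension, joins):
    # Same table: no join needed, so no fanout is possible.
    if measure.table == dimension.table:
        return (
            f"SELECT {dimension.column} AS {dimension.name}, "
            f"{measure.sql} AS {measure.name} "
            f"FROM {measure.table} GROUP BY {dimension.column}"
        )
    for j in joins:
        if {j.one, j.many} == {measure.table, dimension.table}:
            # Measure on the "one" side, dimension on the "many" side:
            # each measure row would repeat once per matching child row.
            if j.one == measure.table:
                raise FanoutError(
                    f"{measure.name} is defined on {j.one}; joining to "
                    f"{j.many} would repeat each row"
                )
            return (
                f"SELECT {j.one}.{dimension.column} AS {dimension.name}, "
                f"{measure.sql} AS {measure.name} "
                f"FROM {j.many} JOIN {j.one} "
                f"ON {j.many}.{j.key} = {j.one}.{j.key} "
                f"GROUP BY {j.one}.{dimension.column}"
            )
    raise ValueError(f"no join path from {measure.table} to {dimension.table}")


joins = [Join(one="orders", many="order_items", key="order_id")]
product = Dimension("product_id", "order_items", "product_id")
item_revenue = Measure(
    "item_revenue_cents", "order_items", "SUM(order_items.item_amount_cents)"
)
order_revenue = Measure(
    "order_revenue_cents", "orders", "SUM(orders.total_amount_cents)"
)

sql = compile_query(item_revenue, product, joins)
print(sql)

try:
    compile_query(order_revenue, product, joins)
    fanout_caught = False
except FanoutError as exc:
    fanout_caught = True
    print("rejected:", exc)
```

Raising an error is the simplest possible response. A real planner could instead pre-aggregate the measure at its own grain before joining, or pick a different join path.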

ktx can ingest context from:

Warehouses: Postgres, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, SQLite (more coming)
Modeling tools: dbt, MetricFlow, LookML (more coming)
BI tools: Looker, Metabase (more coming)
Docs: Notion (more coming)
Live corrections from users during agent sessions

How we got here

While building data agents for dozens of companies, from SMBs to enterprises, we learned 2 valuable lessons.

Giving agents more context through prompts, skills, or Markdown docs helps them navigate the schema, but the final step is still pretty much “write the SQL from scratch.” Since the entities written to these docs are rigid, the agents still have to decide which joins or definitions to use, how to aggregate, and whether a result is safe to trust.
Semantic layers solve the executable part, but they’re extremely painful to build and maintain. Also, a lot of useful context lives outside the semantic layer: dbt, dashboards, query history, warehouse metadata, Notion pages, Slack threads, and corrections from analysts.

ktx combines the best of both worlds: the breadth of a knowledge base + the SQL safety of a semantic layer, optimized for agent use and maintenance.

Try it out

GitHub: https://github.com/Kaelio/ktx

Install manually:

npm install -g @kaelio/ktx
ktx setup

Or tell your agent to do it for you:

Run npx skills add Kaelio/ktx --skill ktx and use ktx skill to install and configure ktx

If you’d like help managing context for your data agents, book a demo for ktx’s managed version: https://www.kaelio.com/products/ktx-cloud