Tuning a Genie Space until analysts trusted it: how Spaces ended up coexisting with Power BI

Author: Tyler Faulkner

22 June, 2026

The Reporting Gap Between Business Questions and Dashboards

As a Data Engineer, I spend far more time working as the glue between systems than actually implementing new logic. Business wants a question answered. Analysts go check the semantic layer to see if the answer already exists, and a lot of the time it doesn’t, let alone exist as a Power BI dashboard somewhere. The seat math is worth mentioning here too. At our client, a big chunk of the Power BI Pro licenses belong to people on the data side who almost never open the tool (myself included). The real usage sits with analysts and business users, and the questions that kicked off this whole evaluation were the ones no dashboard was ever going to answer anyway.

Those questions usually mean new data needs to be ingested, which means I need to understand both ends of the pipeline. What is the analyst actually trying to achieve, and what format does the incoming data show up in? Databricks Auto Loader turbocharges this part by turning unstructured data into Delta tables that get exposed directly in Snowflake. The catch is that Auto Loader only mimics the original structure, and that doesn’t inherently make the data approachable or understandable. It’s the age-old philosophy of garbage in, garbage out, and sometimes you’re the one who has to take out the garbage. I’ve built a fixed-width file to Delta converter in Spark, dealt with external Delta tables storing XML as string types, and reverse engineered a complex JSON API payload back into a relational structure. All of this work generates a ton of tribal knowledge, and that knowledge usually ends up in wiki pages, Word docs, and team chats that get buried and forgotten over time.

Scoping the First Genie Space

I recently spent some time assessing whether Genie Spaces could act as a conversational host for that knowledge instead. The theory we proposed was that if you produce an agent that truly understands the data well enough, it becomes an exploration engine for combining datasets together. Get one Space tried and tested, then make copies and transfer that foundation onto new third-party data. Step one of that plan is a Space that gives correct results when an actual analyst audits it.

From previous work on knowledge base chatbots and assistants, I knew we needed to narrow the scope. LLMs are glorified probability machines. Context is everything, and ambiguity is where they roll the dice in the wrong direction and hallucinate. At an insurance client there are a lot of interwoven systems at play, and the closer you get to the reporting layer the more tangled it all becomes. So we scoped down to the 7 most vital claims tables: a core star schema, with one fact table surrounded by six dimensions covering the claim itself, what’s covered, who’s handling it, and when and where it happened. That leaves out nearly 100 other tables in the model (police reports, workers’ comp, litigation, the list goes on). Fewer branches to traverse, less likelihood of hallucination. The other reason we picked these 7 is that analysts could name them offhand during planning meetings, so we knew they were known and used often. Setup took under half a day. The client ingests through Databricks but does all warehousing and modeling in Snowflake, and a federated external catalog connection was already wired into our Unity Catalog from another initiative, so attaching the 7 tables was a few clicks.

Before adding any context at all, we ran some dry runs to see how the model behaves with zero assistance. The results honestly surprised us. We asked it to rank adjusters from most claims handled to fewest, and Genie inferred the joins purely from table and column names, wrote a solid aggregate query, and even auto-plotted the results without being asked. It wasn’t all clean though. Ask about claims in a specific state and it filtered on the full state name, when our data stores abbreviations, so the query ran fine and returned nothing. An empty result instead of an error. That one stuck with us. (Quick tip: when testing LLM responses, always start from a fresh chat so your context window is actually reset.)
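The silent empty result is easy to reproduce outside of Genie. The sketch below uses an in-memory SQLite table as a stand-in for the claims fact table. The table name, column names, and rows are made up for illustration and are not the client’s model.

```python
import sqlite3

# Hypothetical miniature of a claims table that stores state abbreviations.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (claim_id INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO claims VALUES (?, ?)",
    [(1, "WI"), (2, "WI"), (3, "IL")],
)


def count_claims(state_literal: str) -> int:
    """Count claims whose state column equals the given literal."""
    row = conn.execute(
        "SELECT COUNT(*) FROM claims WHERE state = ?", (state_literal,)
    ).fetchone()
    return row[0]


full_name_count = count_claims("Wisconsin")  # valid SQL, no error, matches nothing
abbreviation_count = count_claims("WI")      # matches the stored abbreviation

print(full_name_count, abbreviation_count)
```

Both calls are perfectly valid SQL, which is the problem. Nothing in the engine can tell you that the first literal was never going to match.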

Tuning the Space with Examples, Entity Matching, and Context

Then we started tuning, incrementally, because if you change five things at once you’ll never figure out which change did what. We had ten question-and-SQL pairs gathered from stakeholders, so we gave Genie five and held five back for testing. Our process was honestly more vibe-based than statistical. Each test answer got an initial rank of perfect, good, ok, weak, or wrong. Make one change, run the five questions again, mark each one better, same, or worse. Small sample size and a short deadline, but it turned vibes into something repeatable that surfaces patterns, and the ratings can be turned into numbers if you ever want to scale it.
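Here is one way the rating scheme above could be turned into numbers. This is a sketch, not the spreadsheet we actually used. The 0 to 4 scale, the question ids, and the ratings themselves are assumptions made up for the example.

```python
from collections import Counter

# Assumed numeric scale for the five ratings; any monotonic scale would do.
RATING_SCORE = {"wrong": 0, "weak": 1, "ok": 2, "good": 3, "perfect": 4}


def compare_runs(baseline: dict[str, str], after: dict[str, str]) -> dict[str, str]:
    """Label each question better, same, or worse relative to the baseline run."""
    verdicts = {}
    for question, old_rating in baseline.items():
        delta = RATING_SCORE[after[question]] - RATING_SCORE[old_rating]
        verdicts[question] = "better" if delta > 0 else "worse" if delta < 0 else "same"
    return verdicts


def mean_score(ratings: dict[str, str]) -> float:
    """Average numeric score across all rated questions."""
    return sum(RATING_SCORE[r] for r in ratings.values()) / len(ratings)


# Hypothetical ratings for the five held-back questions, before and after one change.
baseline = {"q1": "wrong", "q2": "ok", "q3": "good", "q4": "weak", "q5": "perfect"}
after_change = {"q1": "good", "q2": "ok", "q3": "ok", "q4": "good", "q5": "perfect"}

verdicts = compare_runs(baseline, after_change)
summary = Counter(verdicts.values())
print(verdicts)
print(dict(summary), mean_score(baseline), mean_score(after_change))
```

Keeping the per-question verdicts next to the average matters with a sample this small. One regression can hide behind two improvements if you only look at the mean.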

Adding the five question-and-SQL pairs as example queries got us modest gains. Questions that sat close to an example improved noticeably; the rest stayed roughly where they were. The real win was entity matching on our state and line of business columns. That empty state result from the dry runs came down to the model filtering on full state names while our data stores abbreviations. We fed it the distinct values of those columns, both low cardinality with under a hundred values each, and the gap closed. Questions like claims grouped by state that never used to work started coming back correct. The prompt carried the rest: date conventions for reporting, where the data originated, plus the tribal knowledge our team accumulated while building the original model. That last part mattered more than I expected. With the right context Genie stopped just answering and started framing answers the way an analyst would hand them to business. Right terminology, numbers grouped the way leadership actually wants to see them, not a raw query result somebody still has to translate.
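The sketch below shows the idea behind feeding in distinct values. It is not how Genie implements entity matching internally. It only illustrates collecting the value list for a low-cardinality column and using it to catch a filter literal that can never match. The table, columns, rows, and the cutoff of 100 are assumptions for the example.

```python
import sqlite3

MAX_DISTINCT = 100  # assumed cutoff for "low cardinality"

# Hypothetical stand-in for the claims fact table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims (claim_id INTEGER, state TEXT, line_of_business TEXT)"
)
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?)",
    [(1, "WI", "AUTO"), (2, "IL", "HOME"), (3, "WI", "HOME")],
)


def distinct_values(table: str, column: str) -> list[str]:
    """Return sorted distinct non-null values, refusing high-cardinality columns."""
    # Identifiers come from a fixed list in this sketch, never from user input.
    rows = conn.execute(f"SELECT DISTINCT {column} FROM {table}").fetchall()
    values = sorted(r[0] for r in rows if r[0] is not None)
    if len(values) >= MAX_DISTINCT:
        raise ValueError(f"{column} has too many distinct values to list")
    return values


known = {col: distinct_values("claims", col) for col in ("state", "line_of_business")}


def filter_can_match(column: str, literal: str) -> bool:
    """True if the literal is one of the values actually stored in the column."""
    return literal in known[column]


print(known)
print(filter_can_match("state", "Wisconsin"), filter_can_match("state", "WI"))
```

With the value list in hand, a full state name stops being a silent miss and becomes something that can be mapped to the stored abbreviation before the query runs.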

Where Genie Fits Alongside Power BI

Then we proved it against people. Three analysts spent a day querying the Space with questions they ask frequently and some they rarely ask, rating each response with the built in thumbs up and down. We did a final tuning pass off their feedback. The reviews were good, and the more telling part was they started naming other datasets they wanted this same treatment on.

The honest caveat is the upfront cost. You need engineers who know the system well to supply the right context, plus the time to validate, and that investment is the hardest part to sell to a stakeholder. What you get back is the knowledge that used to disappear into Teams threads and old emails, now permanent, centralized, and queryable. The Power BI relationship sorted itself out over time without anyone forcing it. Nobody treated the Space like a dashboard replacement. It’s where questions get explored. When an ad hoc question starts getting asked more than once, that’s the signal it should graduate into a proper Power BI dashboard for the business. Genie sits upstream. Power BI sits downstream.
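The "asked more than once" signal can be sketched as a simple count over a log of questions. Where that log comes from is left open here, and the sample questions are invented. Matching on normalized text is a crude assumption too, since two differently worded questions can still mean the same thing.

```python
import re
from collections import Counter


def normalize(question: str) -> str:
    """Lowercase and collapse punctuation and whitespace so near-identical phrasings match."""
    return re.sub(r"[^a-z0-9]+", " ", question.lower()).strip()


def dashboard_candidates(questions: list[str], min_count: int = 2) -> list[str]:
    """Return normalized questions asked at least min_count times."""
    counts = Counter(normalize(q) for q in questions)
    return sorted(q for q, n in counts.items() if n >= min_count)


# Hypothetical question log.
asked = [
    "Claims by state this quarter?",
    "claims by state, this quarter",
    "Top adjusters by open claims",
]

candidates = dashboard_candidates(asked)
print(candidates)
```

In practice the list is a conversation starter for the analysts, not an automatic trigger. Somebody still decides whether a repeated question deserves a dashboard.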

Genie isn’t a magic wand. You have to point it at the right context for the right job, and somebody who actually knows the data has to put in the hours up front. But that’s kind of the whole trick. The tuning work we did on this Space is the same documentation work we were never going to get around to anyway, except now it lives somewhere an analyst can ask it questions. We built this once against seven tables. The next one will be better and faster.

If you’re curious about Genie Spaces, or have other questions about Databricks solutions, we’d love to have that conversation. Please reach out.
