The Future of BI: Will AI Replace BI Developers?

I have been asked this question at every conference I have attended in the last two years. Not always directly. Sometimes it arrives as “what do you think about Copilot” or “is there still a point in learning DAX properly.” But the underlying question is always the same: is my job going to exist in five years?

It is a fair question. When you watch GitHub Copilot complete a CALCULATE function before you have finished typing the first argument, or paste a business requirement into Claude and get back a working Power Query transformation, it is easy to understand why people are asking it.

Here is my honest take, as someone who has been building BI solutions since 2006 and who has spent the last year testing these tools in real work, not in demos.

What These Tools Can Actually Do Today

I have been using ChatGPT, Microsoft Copilot and Claude regularly over the last eighteen months, in actual client work and personal projects. Not in theory. In my editor, on real data models, with real business requirements.

ChatGPT is strong at generating DAX and SQL when you give it enough context. If you describe the table schema, the business logic and the expected output format, the first attempt is often close enough to use with minor adjustments. I have used it to draft measures I would have spent an hour on, in five minutes. It is not always right on the first pass, but it moves fast enough that iteration costs less than starting from scratch.

Microsoft Copilot inside Fabric and Power BI has improved noticeably over the last year. The report creation assistant went from generating generic placeholder visuals in early 2024 to producing layouts I would actually take and refine for production use in 2025. It will not replace design judgment, but it removes the blank canvas problem. For report creation at scale, that matters.

Claude has been the most useful for reasoning about model design. I used it to reverse-engineer a semantic model from an annotated schema diagram and it handled the relationships and measure dependencies better than I expected. I also hit real limits: it had no knowledge of my actual data, made assumptions that were wrong for our domain, and I needed five or six iterations before the output was usable. The hiccups were real. I am not smoothing those over.

The pattern that holds across all three: if the task is well-defined, the context is fully provided, and the output can be verified quickly, these tools are fast and genuinely useful. That describes a significant portion of the day-to-day coding work in a typical BI role.

Where They Still Fall Short

None of these tools know your business.

That sentence sounds simple, but it is where the real gap sits in practice. Generating a CALCULATE function is not difficult once you know the filter context. The difficult part is knowing that in your organization, “active customers” means accounts with at least one transaction in the last 90 days, but only in the consumer segment, and that this definition changed in Q2 2024 and needs to be handled differently across historical and current period comparisons.
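
To see how much context that one sentence carries, here is roughly what the definition looks like once it is written down. This is a sketch only: the table and column names are invented for illustration and the dialect details will vary. The point is how much of this no prompt supplies for free:

  -- "Active customer": at least one transaction in the last 90 days,
  -- consumer segment only. The definition changed in Q2 2024, so
  -- historical comparisons need the pre-change branch as well.
  SELECT c.customer_id
  FROM   dim_customer AS c
  WHERE  c.segment = 'Consumer'
    AND EXISTS (
          SELECT 1
          FROM   fact_transactions AS t
          WHERE  t.customer_id = c.customer_id
            AND  t.transaction_date >= CURRENT_DATE - INTERVAL '90' DAY
        );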

That context lives in your head, in Confluence pages nobody reads, in a meeting that happened eighteen months ago, and in an email from a finance analyst who has since left the company. No AI tool picks that up from a prompt. You bring it, or it does not exist in the output.

The same gap shows up in data quality judgment. AI tools will generate a transformation pipeline from a spec without questioning whether the spec is correct. They will not notice that the order date column has 8% null values, or that this matters for the revenue calculation, or that the exceptions exist because of a legacy system migration that your company did in 2019. You notice, because you have seen the data before and know what those patterns mean.
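
That kind of judgment starts with a check that is cheap to run and expensive to skip. A minimal profiling query, again with invented names:

  -- How bad is the null problem, and where does it cluster?
  SELECT source_system,
         COUNT(*) AS total_rows,
         COUNT(*) - COUNT(order_date) AS null_order_dates,
         ROUND(100.0 * (COUNT(*) - COUNT(order_date)) / COUNT(*), 1) AS null_pct
  FROM   fact_orders
  GROUP BY source_system
  ORDER BY null_pct DESC;
  -- If the nulls cluster in one legacy source, that is the migration
  -- pattern; the tool will not ask that question unless you do.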

Report creation has the same ceiling. The AI can build a layout, suggest a chart type and write a title. It cannot make the call that this dashboard will be shown on a 40-inch screen in a warehouse and that the font size from the default template will be unreadable from three meters away. That judgment comes from having sat with the people who use the reports.

What This Actually Means, by Role

The impact is not uniform. It depends on how much of your current working week involves mechanical repetition.

For BI developers, the most immediate change is in code generation. Boilerplate DAX, repetitive ETL transformations, standard report templates: these are exactly the tasks AI accelerates most. If this work is a large part of your week, your output volume will increase and expectations around it will rise with it. That is not a threat if you understand it early enough to stay ahead of it.

For data analysts, the change is most visible in exploration speed. Asking Claude or Copilot to produce a first-pass analysis of a dataset, flag anomalies or suggest groupings gets you to a hypothesis faster. The interpretation of that hypothesis, and the judgment about whether it is the right hypothesis for the actual decision being made, remains yours.

For data engineers, pipeline boilerplate generation and schema documentation are straightforward wins. Debugging complex transformation failures with opaque error messages is another area where these tools provide real value, particularly if you can paste the full stack trace and table definitions into the prompt.

For data architects, the tools work well as thinking partners for structured design questions. Talking through a proposed model, generating documentation drafts, checking naming conventions across a schema. The decisions about what to govern, where domain boundaries sit, and how to design for the organizational reality rather than the theoretical ideal still require judgment that comes from knowing the context.

For data governance specialists, the upside is in documentation and lineage drafting. The real governance work, defining what quality means for a specific data product and making that stick across teams, is still a people and process problem that no AI tool solves for you.

My Perspective as a Practitioner

I am not going to tell you AI will not change BI work. It already has. But the question of whether it replaces BI developers wholesale is the wrong framing.

The more useful question is: which parts of your current work are mechanical enough that AI tools can do them faster and cheaper? Be honest about that list. If it is long, that is information worth having now rather than in two years when the market has already adjusted.

Here is what I have observed across the work I do and the people I talk to at conferences: the work clients are most anxious about, most uncertain about, and most willing to pay for is not the part AI is good at. It is the part that requires knowing their business, understanding who trusts what report and why, having the conversation with the finance director who distrusts the new model, and standing next to the operations manager while they explain what the dashboard actually needs to show.

That is still practitioner work. AI does not do it. It accelerates the technical work that happens around it, which gives you more time for the part that matters most.

There is one shift I have started to notice though, and it is worth naming. The finance director on the other side of the table is also using these tools. They arrive at the meeting having already asked Claude to explain the variance, or having had Copilot summarize the dashboard. They come in with a better baseline understanding than they had two years ago, and that makes for a more productive conversation. You spend less time on mechanics and more time on the actual decision. That is a good thing, not a threat.

The hallucination problem is real and it is the most reasonable objection anyone raises to trusting AI output in production work. But it is also improving faster than most people expected a year ago. The gap between what these tools got wrong in early 2024 and what they get wrong now is measurable. I expect that to continue. The sensible approach is to verify outputs where the stakes are high and to track how often you have to correct them over time. That number has been going down in my experience.

The BI developers I have watched get uncomfortable recently are the ones who built careers around syntax knowledge, template maintenance and format conversion. Those jobs are genuinely changing. The developers who are doing well are the ones who understood that the syntax was never the point: the business problem was the point, and tools that help with the syntax give them more time for the problem.

My advice is straightforward: learn the tools, use them on real work rather than tutorial datasets, and find out where they fail on your actual data in your actual domain. That is the only way to know where your judgment matters more than their suggestion.

For me, the answer to whether AI will replace BI developers is: not the ones who are clear about what they are actually being hired to do.

What has your experience been with AI tools in your daily BI work? I would like to know what people are actually finding useful versus what sounds better in a conference session than it turns out to be in practice.

Never say never again: Keeping Your SSMS Server List Safe in 2026

Twelve years ago 🤯 I wrote a quick post about never losing your server list in SSMS again. The short version was: copy one file, stay sane. The file was called SqlStudio.bin, and the trick still works today if you are on an old enough version.

But if you are running SSMS 19, 20, 21 or 22 on Windows 11, the file is gone. The settings have moved, and the format has changed. The principle is the same, but the file you need to grab is different.

Here is the updated version for 2026.


What Changed

In the older versions of SSMS, the registered server list was stored in a binary file:

C:\Users\<PROFILE>\AppData\Roaming\Microsoft\SQL Server Management Studio\<VERSION>\SqlStudio.bin

From SSMS 19 onwards, the file you want is:

C:\Users\<PROFILE>\AppData\Roaming\Microsoft\SQL Server Management Studio\<VERSION>\UserSettings.xml

Same idea, different file, plain XML instead of binary. That last part is actually an improvement: you can open it, read it, and understand what is in there before you copy it anywhere.


Where to Find It

On Windows 11, with SSMS installed and a server list you want to keep, navigate to:

C:\Users\<your Windows username>\AppData\Roaming\Microsoft\SQL Server Management Studio\

You will see one folder per installed SSMS version. I currently have four:

  • 19\
  • 20\
  • 21\
  • 22\

Inside each one, look for UserSettings.xml. That is the file that holds your registered servers.

The AppData folder is hidden by default in Windows 11. If you do not see it in Explorer, type the full path into the address bar, or go to View > Show > Hidden items.


What to Do With It

The same logic as 2014 applies:

  1. Make a copy of UserSettings.xml from your current machine.
  2. Store it somewhere you can get to it: OneDrive, a network share, a USB drive, wherever works for you.
  3. On a fresh install, copy the file into the correct version folder before opening SSMS.

SSMS reads the file on startup. If the file is in place when SSMS opens, your registered servers appear as if you never left.

If you have multiple SSMS versions installed side by side and maintain different server lists in each, you need a separate copy per version. They do not share the file.


One Thing Worth Knowing

The file is XML, so you can also merge server lists manually if you need to combine two environments into one. I have not done this in a scripted way, but looking at the structure it is readable enough that it would not take long to figure out.

If you are already using Azure Data Studio instead of or alongside SSMS, the settings are stored differently again. That is a separate post.


The short version: the file changed, the folder structure stayed the same, and the trick still works. Copy UserSettings.xml before you reinstall anything.

Azure Databricks at FabCon 2026: What Got Announced and What It Actually Means

FabCon 2026 in Atlanta last week was bigger in many ways. For the first time, the Microsoft Fabric Community Conference ran alongside SQLCon, which meant two large communities shared the same convention center, the same coffee queues and the general atmosphere of too-many-sessions-not-enough-sleep that defines any conference worth attending.

Databricks showed up with several announcements. Some of them are incremental. A few are worth more than a passing glance, depending on where you sit on the data and BI stack.

One thing that still catches people off guard: Azure Databricks has been a first-party Azure service since 2017. Not a partner product, not a third-party integration. A first-party Azure service, alongside Power BI, Excel, Teams, Azure OpenAI, Copilot Studio and the Power Platform. When Microsoft talks about a unified data and AI platform on Azure, Databricks is part of the architecture. The announcements this week make that more visible.

Here is what they shipped, and what I think it means in practice.


Lakeflow Connect Free Tier: 100 Million Records a Day, at No Cost

Bad pipelines are one of the constants of BI work. The data arrives late, the connectors are fragile, someone is maintaining a web of custom scripts because there was never time to do it properly, and the people building reports spend half their week cleaning up after problems they did not cause.

Databricks announced a Lakeflow Connect Free Tier, and the headline number is worth taking seriously: 100 million records per workspace per day, at no charge. Under the hood that is 100 free DBUs per day, included with every workspace, before standard Lakeflow Connect pricing applies.

What it connects to out of the box:

  • Databases: SQL Server, Oracle, Teradata, PostgreSQL, MySQL, Snowflake, Redshift, Synapse, BigQuery
  • SaaS applications: Dynamics 365, Salesforce, ServiceNow, Workday, Google Analytics

For databases, the ingestion runs on Change Data Capture, which means it reads the transaction log incrementally rather than scanning full tables. Data lands in Delta format on Azure Data Lake Storage. Unity Catalog governance applies from the moment the first record arrives, so access control and lineage are not something to sort out later.

Databricks quotes 25x faster pipeline builds and 83% ETL cost reduction. I would take vendor benchmarks with the usual scepticism, but the direction is clear: the intent is to make data ingestion a problem you configure rather than one you maintain. For a BI team currently paying for third-party Dynamics or Salesforce connectors, or running CSV exports on a schedule, this is worth a practical test.


Lakebase Is Generally Available: A Postgres Database Inside the Lakehouse

This one sits closer to architecture and engineering than it does to daily BI work, but it changes some assumptions that are worth understanding.

Azure Databricks Lakebase is now generally available in 14 Azure regions. It is a managed, serverless Postgres service that runs inside your lakehouse, on the same storage as your Delta tables.

The problem it addresses is one data architects have been working around for years: operational data and analytical data have historically lived on separate platforms, connected by pipelines that were always someone’s responsibility and frequently nobody’s priority. Lakebase puts an operational database directly in the same governed environment as the rest of the data platform.

Key characteristics:

  • Full Postgres compatibility, with support for extensions including pgvector and PostGIS
  • Compute and storage separated, with scale-to-zero and sub-second startup
  • Branching and instant restore for development and testing workflows
  • High availability with automatic failover across availability zones

The use cases Databricks highlights: transactional analytics on operational data, AI agent state management, and customer personalization and feature serving. For data engineers building pipelines that feed AI applications, Lakebase removes the need to run a separate operational database outside the Databricks platform just to give an agent somewhere to write state.
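
Because it is Postgres, the standard pgvector syntax should carry over directly. A minimal sketch of the agent-state pattern, with invented table and column names and a toy embedding size:

  CREATE EXTENSION IF NOT EXISTS vector;

  -- Conversation state plus an embedding column for similarity lookups
  CREATE TABLE agent_memory (
      session_id uuid,
      created_at timestamptz DEFAULT now(),
      content    text,
      embedding  vector(3)  -- real embeddings are far wider
  );

  -- Nearest-neighbour search over stored context
  SELECT content
  FROM   agent_memory
  ORDER  BY embedding <-> '[0.1, 0.2, 0.3]'::vector
  LIMIT  5;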

It is available to test today in 14 regions. If you have been looking for a Postgres layer that sits inside the lakehouse without architectural compromise, now is a reasonable time to look at the documentation.


The Excel Add-in Is in Public Preview: Governed Lakehouse Data in the Tool Most People Actually Use

This is the announcement that will get the most immediate attention from analysts and business users, and probably the one that causes the most internal conversations about data governance.

The Azure Databricks Excel Add-in is in public preview. It connects Excel directly to Unity Catalog tables and Metric Views. From inside Excel, you can browse the catalog, build pivot tables from governed semantic definitions, and filter and analyze data without writing SQL. It works on Excel for Windows, macOS and the web.

The problem it addresses is one every BI developer and governance specialist knows well: business users need data in Excel. So someone exports a CSV. Or the business user pulls their own export. Within 24 hours there are four versions of the file in four different places, none of them current, all of them cited in separate meetings. The analyst who originally produced them has no idea which version is being used.

The add-in replaces that pattern with a live connection to the same tables that power your Power BI reports and your analytics models. The data is current. The access rules in Unity Catalog apply here too, so a user who cannot query a table in Databricks cannot query it through the add-in either.

For analysts who work primarily in Excel, this is a genuine change in how a typical Tuesday works. For governance teams, it removes a whole class of ungoverned data copy that currently exists because there was no better option.


Genie Gets More Capable: Agent Mode, Genie Code and Databricks One

Genie is Databricks’ conversational analytics experience: you ask a data question in plain language and get back a chart, a table or a narrative answer. Databricks reported this week that 98% of Databricks SQL warehouse customers are using AI/BI, with monthly active Genie users up more than 300% year-over-year. The numbers are moving fast enough to suggest this has passed the experimental phase.

Three updates this week.

Genie Agent Mode

Standard Genie answers one question at a time. Genie Agent Mode takes a more complex business question, builds a research plan, runs multiple queries, tests intermediate results, refines its approach and then delivers a complete answer with supporting tables, charts and narrative context.

The difference becomes concrete quickly. Standard Genie handles: “What were total sales in Q3?” Genie Agent Mode handles: “Revenue in the Southeast dropped in Q3. Why did that happen, and what does the pattern suggest for Q4?” That is not a single query. It is an investigation, and Agent Mode runs it without someone having to direct every step.

For analytics managers sitting on a queue of complex ad hoc requests that only a senior analyst can currently answer, this is the update worth spending time with.

Genie Code

Genie Code is aimed at data practitioners, not end users. It is an agentic development assistant that runs inside Databricks notebooks, SQL editors and Lakeflow pipelines.

The distinction from a general-purpose AI coding assistant is that Genie Code understands your data context through Unity Catalog. It knows your tables, your lineage, your governance policies and your business semantics. With that, it can build pipelines and dashboards from natural language prompts, debug Lakeflow failures, generate queries grounded in your actual schema, and handle routine operational monitoring.

For senior BI developers and data engineers who spend part of every week on repetitive work that requires knowing the platform well, having an assistant that actually knows prod.gold.customer_activity is a different experience from hitting tab on a general-purpose tool that has never seen your schema.
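
The difference is easiest to see in the output. The query below is trivial, but the three-part name and the column choices are exactly the part a generic assistant has to guess at. The columns here are my invention; only the table name comes from the example above:

  -- Grounded in the actual catalog rather than a guessed schema
  SELECT region,
         COUNT(DISTINCT customer_id) AS active_customers
  FROM   prod.gold.customer_activity
  WHERE  last_activity_date >= CURRENT_DATE - INTERVAL '90' DAY
  GROUP BY region;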

Databricks One and Databricks One Mobile

Databricks One now includes a unified multi-agent chat experience powered by Genie. Business users can ask questions across the full data estate without needing to know which Genie space to route to. When a question goes beyond what existing spaces can answer, Databricks One can bring in additional agents to investigate. AI/BI dashboards and Databricks Apps are surfaced in the same interface.

Databricks One Mobile brings this to iOS and Android: Genie, dashboards and apps from a phone. Business users can ask data questions without being at a desk.


Genie in Microsoft Teams: Data Answers Where the Decisions Actually Happen

For organizations already using Microsoft 365, this is probably the most immediately deployable announcement.

You can now connect Genie to Microsoft Teams via Copilot Studio. The setup connects a Genie space to a Teams agent through the Copilot Studio connector, which handles the API and MCP logic. Once connected, users can ask data questions directly in a Teams conversation and get answers backed by your lakehouse data.

The part that makes this credible to security teams and BI leaders: every conversation runs through OAuth, authenticated against the user’s own identity. If a user does not have SELECT access to a table in Unity Catalog, Genie will not surface that data in Teams. The access model you already manage in Unity Catalog carries through to every Teams conversation.
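
The enforcement point is the one you already manage. This is the standard Unity Catalog grant syntax; the group name and table are placeholders:

  -- Without this grant, the same user gets no answer from this table,
  -- whether they ask in Databricks, in Excel or in Teams
  GRANT SELECT ON TABLE prod.gold.customer_activity
  TO `data-analysts`;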

For data governance managers who have spent years explaining why pasting screenshots of reports into Teams messages is not the same as having a governed answer, this changes the practical alternative. The question gets answered where it was asked, with the right access controls applied, and nothing leaves the governed environment.

For business users, it means getting a trusted data answer without leaving the tool they already have open.


What I Am Taking Away From This Week

The pattern across all of these announcements is one I have been watching build for a couple of years. Operational data, analytical data and AI have historically lived on separate platforms, and the work connecting them got called integration. That work is expensive, slow, and usually the first thing cut when a project runs over budget.

What Databricks is building is a single platform where all of it sits together, governed by Unity Catalog, accessible from Excel, Teams, a notebook, a mobile app or a SQL query. Whether the individual pieces fit together as neatly in production as they do in the announcement demos is something I will be watching as they move from preview toward GA.

If you were at FabCon this week, the Databricks session was Thursday March 19th in room C302 and should be available on demand if you missed it.

The next major Databricks gathering is Data + AI Summit, June 15 to 18, 2026, in San Francisco. 25,000 attendees, 800+ sessions, and the most complete view of where the platform is heading. Worth putting on the calendar.


What caught your attention this week at FabCon? Drop a comment below. I would like to hear what people are actually planning to test.

MVP Summit 2026

Today is day one of the Microsoft MVP Summit 2026. The event runs until the 26th, and the core of it happens on the Microsoft campus in Redmond, Washington. For the second year in a row, I’m joining from my home office.

The Summit is an invitation-only event, open to active Microsoft MVPs and Regional Directors. You sign an NDA and spend a few days getting direct access to the product teams building the tools you use and advocate for every day. Real roadmap conversations, early previews, and the chance to make your voice heard in rooms where decisions are still being made. Around 3,000 MVPs attend, from all over the world.

It is a great event and I wouldn’t want to miss out. I should say that plainly, because what follows is honest and not a complaint.

The remote experience

Attending remotely works. The virtual sessions run well, the content is real, and I come away with things worth knowing. I’m not going to pretend otherwise.

But here’s the thing: the sessions are maybe half of what makes the Summit worth attending. The other half is the people. Three thousand of the most experienced, most generous, most technically opinionated people in the Microsoft ecosystem, in the same place for three days. The conversations that happen between sessions, at dinner, in the corridors, over a coffee at the venue. That is where a lot of the real value is, and it almost does not exist in a virtual format. That is not the organizers’ fault; it is inherent to the format.

Product Group Day

The specific part that really stings to miss is Product Group Day.

It is in-person only. No virtual stream, no recording, no alternative. It is where MVPs get direct, unscripted time with the engineering teams, and where the feedback that actually matters gets delivered face-to-face. It is the most distinctive piece of the whole event.

Still here

The time zone gap between Copenhagen and Redmond means most sessions land in the late evening and push into the early hours. That cuts both ways: the regular work day stays intact, which is good, and somewhere around the second session after midnight the tiredness kicks in, which is less good.

But I will be there, taking notes, looking for the things worth acting on.

And I’m already making a mental note about 2027.

Exploring Fabric Ontology

Note: The Fabric Ontology is currently in preview as part of the Fabric IQ workload. Features and behaviour may change before general availability.

I have been spending a little time with the Microsoft Fabric data agent documentation lately, and one pattern keeps showing up, not just in the official guidance but in community posts from people who have actually tried to deploy these things: the demo runs beautifully. The AI answers questions in plain English, leadership gets excited, the pilot gets approved. Then it hits production. Real users send real questions. The answers start drifting. Numbers that should match do not. The same question returns different results on different days. Trust evaporates faster than it was built.

And almost every time, the root cause is the same thing: the semantic foundation was not solid enough before anyone pointed an agent at it.

That is exactly the problem the Fabric Ontology is designed to address. It is the piece I think most teams will underestimate right up until the moment they need it.


Why the Data Agent Gets It Wrong

Generative AI is genuinely good at working with language and meaning. What it cannot do is fill in documentation that was never written.

Most enterprise databases were built for systems, not for consumption. Column names follow technical conventions an engineer settled on years ago. Business logic lives in a stored procedure nobody has touched since SQL Server 2014. Which customer table is the authoritative customer table? Documented nowhere. The abbreviation cust_rev_ytd_adj was obvious to the person who named it. To everyone else, including an AI agent, it is a puzzle.

When you connect an agent to that data and ask it to answer business questions, you are asking it to decode a language it was never given a dictionary for. It is not going to find meaning that was never documented. Someone has to build that foundation deliberately, before the agent gets anywhere near it.

This is not a new problem. It is the same problem that made undocumented semantic models painful for analysts, made onboarding new BI developers slow, and made “what does ‘active customer’ mean?” a recurring meeting agenda item. The AI just made it impossible to paper over.


What the Fabric Ontology Actually Is

The Fabric Ontology operates above the table and column level, at the concept level, the level where business people actually think and where agreements actually need to live.

Three building blocks:

Entity types are the real-world objects your business runs on. Customer, Order, Product, Shipment, Store. Defined once, with a stable name, description, and identifiers. Not four slightly different customer tables with different primary keys depending on which source system populated them first.

Properties are named, typed facts about an entity. Instead of a column called cust_rev_ytd_adj, you publish a Customer property called Adjusted Year-to-Date Revenue with a declared unit, a data type, and a binding to the underlying source column. Something a new analyst can understand without asking someone who remembers the original intent. Something an AI agent can reason about without guessing.

Relationships are explicit, directional, typed links between entities with cardinality rules. Customer places Order. Order contains Product. Shipment originates from Plant. Made reusable and visible, rather than buried in join logic spread across three different pipelines and a Power BI measure that no one wants to open.

Those concept definitions then bind to your actual data in OneLake: lakehouse tables, Eventhouse streams, Power BI semantic models. The data bindings handle schema drift, enforce data quality checks, and track provenance at the concept layer.
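
The ontology itself is authored in Fabric, not in SQL, but the move it formalizes can be sketched at the SQL layer. The names below are hypothetical:

  -- The binding publishes a cryptic physical column under a declared
  -- business name; the ontology adds type, unit and provenance on top
  CREATE VIEW customer_semantics AS
  SELECT cust_id          AS customer_id,
         cust_rev_ytd_adj AS adjusted_ytd_revenue  -- unit: reporting currency
  FROM   crm.customers;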

The result: a shared vocabulary that both people and AI agents can reason over. When an agent is grounded in a well-defined ontology, it is not reverse-engineering meaning from raw tables. It is working from a context that someone owns and maintains.


The Ontology Graph: Relationships as Queryable Data

The Fabric Ontology also builds an ontology graph from your data bindings and relationship definitions: a queryable instance graph where entity instances are nodes and relationships are edges, each carrying metadata and data source lineage, refreshed on a schedule.

For anyone who has spent time making implicit relationships explicit and queryable, this is worth understanding. Context that previously existed only as join logic (which customers are tied to which orders, which products trace back to which suppliers) becomes something you can traverse, analyze, and govern. Path finding, centrality analysis, community detection: graph algorithms applied to your actual business data.

On top of that sits a Natural Language to Ontology (NL2Ontology) query layer that converts plain-language questions into structured queries across your bound sources, routing automatically to GQL for graph queries or KQL for Eventhouse. Not a best-effort guess at what a column might mean. Consistent answers that follow the definitions you published in your ontology.


Three Things That Actually Matter Before You Build the Agent

I have not shipped a production data agent grounded in a Fabric Ontology end-to-end yet. The feature is still in preview and I am still working through it. But the guidance is consistent enough across documentation and early community experience that I think these three things are worth naming before you start.

Build the semantic foundation first

This is the step that gets rushed. The agent is only as reliable as the context it has to work with. If your semantic model has undocumented measures, ambiguous column names, and definitions that three different teams would answer three different ways, an ontology built on top of that inherits all of it.

Before connecting an agent to your data:

  • Audit your semantic model. Are measure names self-explanatory? Are the terms your business uses defined anywhere?
  • Generate a Fabric Ontology from your semantic model as a starting point. Fabric can auto-generate one to give you something concrete to refine, rather than starting from a blank canvas.
  • Write descriptions for the columns and measures that currently only make sense to the person who created them.
  • Resolve the definitions that lack consensus before rollout, not after. “What counts as an active customer?” is not a technical question. It is a business alignment question. It needs an answer before the agent encounters it at 9am from a business stakeholder.

Keep each agent’s scope narrow

The temptation is to build one agent that answers everything. It almost always underperforms. The more data an agent has to reason over per question, the harder it is for it to return consistent answers.

A sales agent. An inventory agent. A finance agent. Each one is easier to configure, easier to test, and easier for the people who rely on it to trust, because the scope is legible and the owner is clear.

Start with one domain. The one where trust matters most and the semantic definitions are clearest. Do it properly. Let that one earn credibility before expanding.

Write the instructions like you are briefing a smart new colleague

Data agents are probabilistic: they use statistical reasoning to determine the most likely answer. Business users expect deterministic behavior, meaning the same question should return the same answer every time. Detailed agent instructions are the primary lever for closing that gap.

Think of it as the standing brief you would write for a new analyst on their first day: here is what matters, here is how we define things, here is what belongs out of scope, and here is what to do when a question is ambiguous.

For your most critical business questions, Fabric data agents support sample questions with pre-defined SQL, DAX, or KQL behind them, removing the probabilistic element entirely for those specific scenarios. Use it. Treat the instructions as a living document and update them as you learn how people in your organization actually phrase questions.
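
A sample question pairs a phrasing users actually type with the exact query that should answer it. A sketch, with invented table names and hard-coded quarter boundaries for clarity:

  -- Sample question: "What were total sales in Q3?"
  -- Pinned query: no probabilistic interpretation involved
  SELECT SUM(sales_amount) AS total_sales
  FROM   gold.fact_sales
  WHERE  sale_date >= DATE '2025-07-01'
    AND  sale_date <  DATE '2025-10-01';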


Where I Am On This

The hard part of building reliable AI over enterprise data is not the model. It is the semantic gap between raw data structures and the meaning business users expect the model to already know. The Fabric Ontology looks like the most direct thing Microsoft has shipped to address that gap at the platform level. That is what makes it worth paying attention to, even while it is in preview.

I am still early in exploring this and plan to dig further as it moves toward GA. If you have already started building with it, whether you found a workflow that clicked, hit a wall, or worked around something unexpected, I would genuinely like to hear about it in the comments.