Developer Experience Challenge – DAX Dependency Graph

The Fabric Semantic Link Developer Experience Challenge gave me a concrete reason to build something I had been meaning to build for a while. The result is a Fabric notebook that extracts every DAX measure from a Power BI semantic model, maps measure-to-measure dependencies into a directed graph, and produces a ranked complexity audit with risk ratings. This post is about what it does, how it works, and why the approach is useful beyond the challenge context.

The problem it solves

If you have worked on a Power BI semantic model that has grown organically over a few years, you already know the problem. Measures reference other measures. Those measures reference other measures. Nobody wrote it down. The person who built the original [Gross Margin Adj YTD] has since moved on, and the name was self-explanatory at the time.

Power BI Desktop shows you a measure’s DAX expression when you click on it. It does not show you which other measures that expression depends on, how deep the dependency chain goes, or whether any measure sits in a circular reference loop that the engine has been quietly working around. Tabular Editor helps, but it still requires manual navigation. There is no built-in view that answers “what are the ten most complex measures in this model, and which ones does everything else depend on?”

That is what this notebook answers.

Getting the measures out

The notebook uses sempy.fabric.list_measures() from Semantic Link to pull every measure from the target model in a single call. It returns a pandas DataFrame with measure name, parent table, DAX expression, visibility, and description per row.

import sempy.fabric as fabric
measures_df = fabric.list_measures(dataset=DATASET_NAME, workspace=WORKSPACE_NAME)

Under the hood, Semantic Link connects via the Tabular Object Model (TOM) over the XMLA endpoint. Fabric handles authentication from the notebook’s identity. Two config values are all the setup needed:

WORKSPACE_NAME = None                      # None = current workspace
DATASET_NAME   = "YourSemanticModelName"

Then Run All.

Parsing measure references: three passes

The interesting part is working out which measures each expression actually references. DAX uses square bracket notation for both measures and columns: [Total Revenue] is a measure reference, Sales[Amount] is a column reference. The parser has to distinguish them correctly.

It does this in three passes:

  1. Strip string literals ("...") to avoid false positives. A literal such as "compare with [Total Revenue]" inside a FORMAT or IF call would otherwise incorrectly register a dependency on [Total Revenue].
  2. Strip single-line and multi-line comments (// and /* */).
  3. Extract all [Name] patterns where the opening bracket is not preceded by a letter, digit, underscore, or apostrophe. That lookbehind excludes table-qualified references like Sales[Amount] and 'My Table'[Column].

The extracted names are then cross-referenced against the full set of known measure names. Anything that is not a measure name is discarded.

pattern = r"(?<![a-zA-Z0-9_'])\[([^\]]+)\]"
matches = re.findall(pattern, cleaned)
return list({m for m in matches if m in measure_names})

This correctly handles the common edge cases in real models. The one known limitation: measures referenced via SELECTEDMEASURE() or through a disconnected table SWITCH pattern cannot be resolved statically. If your model uses those patterns heavily, some dependencies will be missing from the graph.
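
The three passes can be put together in one small function. This is a minimal sketch rather than the notebook's actual code: the function name extract_measure_refs and the two stripping regexes are assumptions for illustration, and only the final bracket pattern is taken from the snippet above.

```python
import re

# Pattern from the post: [Name] not preceded by a letter, digit, underscore or apostrophe.
MEASURE_REF = re.compile(r"(?<![a-zA-Z0-9_'])\[([^\]]+)\]")
# Assumed helpers: DAX string literals ("" escapes a quote) and // or /* */ comments.
STRING_LITERAL = re.compile(r'"(?:[^"]|"")*"')
COMMENT = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)


def extract_measure_refs(expression, measure_names):
    """Return the sorted measure names referenced by one DAX expression."""
    cleaned = STRING_LITERAL.sub('""', expression)   # pass 1: blank out strings
    cleaned = COMMENT.sub(" ", cleaned)              # pass 2: drop comments
    matches = MEASURE_REF.findall(cleaned)           # pass 3: unqualified [Name]
    return sorted({m for m in matches if m in measure_names})


known = {"Total Revenue", "Total Cost", "Margin"}
dax = '''
// legacy version used [Total Cost]
DIVIDE([Total Revenue] - SUM(Sales[Amount]), [Margin]) & " vs [Total Cost]"
'''
print(extract_measure_refs(dax, known))  # ['Margin', 'Total Revenue']
```

[Total Cost] appears only in a comment and in a string, so it is not reported. The sketch returns a sorted list instead of list(set) purely to make the output deterministic.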

Building the graph

Once the parser has run on every expression, the dependencies go into a NetworkX directed graph. Each measure is a node. An edge A -> B means “A’s DAX expression references measure B” — A depends on B.

The graph direction is important. It lets the tool compute:

  • In-degree (fan-in): how many measures depend on this one. High fan-in marks a “hub” measure: breaking it cascades everywhere.
  • Out-degree (fan-out): how many measures this one calls. High fan-out means complex composition.
  • Longest path from any node: the transitive dependency depth.
  • Cycles: circular reference chains.

Those four properties feed directly into the five analyses below.
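
To make the direction convention concrete, here is a small sketch that computes the same quantities with plain dictionaries instead of NetworkX, so it runs without any dependencies. The measure names and the deps mapping are made up for illustration, and the depth helper assumes the graph has no cycles.

```python
from functools import lru_cache

# Hypothetical parser output: measure -> measures its expression references (A -> B).
deps = {
    "Total Revenue": [],
    "Total Cost": [],
    "Gross Margin": ["Total Revenue", "Total Cost"],
    "Gross Margin %": ["Gross Margin", "Total Revenue"],
    "Gross Margin % YTD": ["Gross Margin %"],
}

# Out-degree (fan-out): how many measures this one references directly.
fan_out = {m: len(set(refs)) for m, refs in deps.items()}

# In-degree (fan-in): how many measures reference this one.
fan_in = {m: 0 for m in deps}
for refs in deps.values():
    for ref in set(refs):
        fan_in[ref] += 1


@lru_cache(maxsize=None)
def depth(measure):
    """Length of the longest reference chain below a measure (assumes no cycles)."""
    return max((1 + depth(ref) for ref in deps[measure]), default=0)


unreferenced = sorted(m for m in deps if fan_in[m] == 0)
roots = sorted(m for m in deps if fan_out[m] == 0)

print(fan_in["Total Revenue"])      # 2
print(depth("Gross Margin % YTD"))  # 3
print(unreferenced)                 # ['Gross Margin % YTD']
print(roots)                        # ['Total Cost', 'Total Revenue']
```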

Five analyses

Dead measures. In-degree of zero means no other measure references this one. It might be a top-level report measure used directly in a visual, or it might be genuinely unused. The notebook flags all of them; cross-referencing with report usage is a follow-up step.

Root measures. Out-degree of zero means no dependencies on other measures. These are the foundation: the SUM(Sales[Amount]) base calculations that everything else builds on. Errors in root measures propagate silently through every measure above them.

Circular references. The notebook runs Johnson’s algorithm via nx.simple_cycles() to find every elementary cycle in the graph. In a well-designed model the result is: “No circular dependencies detected.” When it is not, the full chain is printed — A -> B -> C -> A — so you know exactly what to untangle.
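
For readers who want to see what a cycle check looks like without NetworkX, the sketch below uses a plain depth-first search. It is deliberately simpler than Johnson's algorithm: it returns the first cycle it finds rather than enumerating every elementary cycle, and the function name find_cycle and the sample data are assumptions for illustration.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of measure names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the current path, finished
    state = {m: WHITE for m in deps}
    stack = []                            # measures on the current DFS path

    def visit(measure):
        state[measure] = GREY
        stack.append(measure)
        for ref in deps.get(measure, []):
            if state.get(ref) == GREY:    # back edge: ref is already on the path
                return stack[stack.index(ref):] + [ref]
            if state.get(ref) == WHITE:
                found = visit(ref)
                if found:
                    return found
        stack.pop()
        state[measure] = BLACK
        return None

    for measure in deps:
        if state[measure] == WHITE:
            found = visit(measure)
            if found:
                return found
    return None


# Hypothetical model with one loop: A -> B -> C -> A.
deps = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]}
cycle = find_cycle(deps)
print(" -> ".join(cycle) if cycle else "No circular dependencies detected.")
# A -> B -> C -> A
```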

Complexity scoring. Each measure gets a weighted composite score across six dimensions:

Dimension                                        | Weight | Rationale
CALCULATE / CALCULATETABLE count                 | 3      | Context transitions are the primary source of subtle DAX bugs
Max parenthesis nesting depth                    | 1      | Readability proxy
Branching (IF / SWITCH)                          | 2      | Code path count
Filter functions (FILTER, ALL, ALLEXCEPT, etc.)  | 2      | Filter-context manipulation
Dependency depth (longest downstream path)       | 2      | Transitive error amplification
Fan-out (direct measure references)              | 1      | Composition width

The weights are the part most open to debate. I gave CALCULATE the highest weight because context-transition confusion is, in my experience, where DAX models accumulate the most invisible risk. The depth and branching weights reflect that those properties make a measure harder to verify than to write.
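
As a rough illustration of how such a weighted score can be computed, here is a minimal sketch. The weights come from the table above; the helper names, the regex-based call counting, and the assumption that each dimension is a raw count multiplied by its weight are mine, not necessarily how the notebook does it. The filter list contains only the three functions named in the table.

```python
import re

# Weights from the table above.
WEIGHTS = {"calculate": 3, "nesting": 1, "branching": 2,
           "filters": 2, "depth": 2, "fan_out": 1}


def count_calls(expression, function_names):
    """Count calls to any of the given DAX functions, case-insensitively."""
    pattern = r"\b(?:" + "|".join(function_names) + r")\s*\("
    return len(re.findall(pattern, expression, flags=re.IGNORECASE))


def max_nesting(expression):
    """Deepest level of nested parentheses in the expression."""
    level = deepest = 0
    for ch in expression:
        if ch == "(":
            level += 1
            deepest = max(deepest, level)
        elif ch == ")":
            level = max(level - 1, 0)
    return deepest


def complexity_score(expression, dependency_depth, fan_out):
    """Weighted sum over the six dimensions; expects strings and comments already stripped."""
    parts = {
        "calculate": count_calls(expression, ["CALCULATE", "CALCULATETABLE"]),
        "nesting": max_nesting(expression),
        "branching": count_calls(expression, ["IF", "SWITCH"]),
        # Only the functions named in the table; the "etc." is left unfilled.
        "filters": count_calls(expression, ["FILTER", "ALL", "ALLEXCEPT"]),
        "depth": dependency_depth,
        "fan_out": fan_out,
    }
    return sum(WEIGHTS[name] * value for name, value in parts.items())


dax = "CALCULATE([Total Revenue], FILTER(ALL(Sales), Sales[Amount] > 0))"
print(complexity_score(dax, dependency_depth=1, fan_out=1))  # 13
```

The example scores 13: one CALCULATE (3), nesting depth 3 (3), no branching (0), two filter functions (4), dependency depth 1 (2), and fan-out 1 (1).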

Visual dependency DAG. The notebook renders the graph with matplotlib. Node color encodes complexity score on a green-to-red scale. Node size encodes in-degree, so hub measures are physically larger. An optional pyvis interactive version renders inline in the notebook via an iframe: zoomable, draggable, with hover tooltips showing measure name, table, and score.

The audit report

The final step consolidates everything into a single DataFrame sorted by risk rating, then by descending complexity score within each tier.

Risk logic:

  • Critical: participates in any circular reference
  • High: complexity score ≥ 20
  • Medium: score between 10 and 19
  • Low: everything else
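
The tiering and sort order above can be sketched in a few lines. The thresholds are the ones listed; the function name risk_rating, the tuple layout, and the sample measures are hypothetical.

```python
RISK_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def risk_rating(score, in_cycle):
    """Map a complexity score and cycle membership to a risk tier."""
    if in_cycle:
        return "Critical"
    if score >= 20:
        return "High"
    if score >= 10:
        return "Medium"
    return "Low"


# Hypothetical measures: (name, complexity score, participates in a cycle).
measures = [("Margin %", 12, False), ("Loop A", 4, True),
            ("Revenue YTD", 27, False), ("Total Cost", 1, False),
            ("Forecast", 21, False)]

rows = [(name, risk_rating(score, in_cycle), score)
        for name, score, in_cycle in measures]
rows.sort(key=lambda row: (RISK_ORDER[row[1]], -row[2]))  # tier first, then score descending

for row in rows:
    print(row)
```

Note that a measure in a cycle is Critical regardless of its score, which is why the low-scoring "Loop A" sorts to the top.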

The summary banner above the table looks like this:

============================================================
  DAX DEPENDENCY GRAPH - AUDIT SUMMARY
============================================================
  Total measures:            147
  Dependency edges:          312
  Unreferenced measures:     38
  Root measures (no deps):   22
  Circular references:       0
  High complexity (>=20):    11
  Medium complexity (10-19): 29
============================================================

That is the starting point for any refactoring conversation. Eleven measures scoring High is a concrete prioritisation signal: start there, not with the 38 unreferenced ones.

Limitations worth knowing

The parser only resolves measure-to-measure dependencies. Column-level lineage is out of scope. Calculation groups are not modeled as nodes, so models that use them heavily will have gaps. Cross-model references (composite model / DirectQuery to external datasets) are not in scope either.

The “unreferenced” flag does not mean unused. It means not referenced by other measures. A measure used directly in 15 report visuals will still show as unreferenced in this graph, because the tool has no report-level visibility. That cross-reference is worth doing separately with sempy.fabric.list_reports() if you are planning to delete anything.

Getting the notebook

The notebook is on GitHub: github.com/vestergaardj/DDG-DAX-Dependency-Graph. Upload it to any Fabric workspace, set the two configuration values in the Configuration cell, and run all. Everything else is automatic.

It requires semantic-link-sempy, networkx, and matplotlib, all of which come pre-installed in Fabric. pyvis is optional; the static graph still renders without it.

Are you still the air traffic controller?

In February 2025 I wrote about building an event-driven ETL system in Microsoft Fabric. The metaphor was air traffic control: notebooks as flights, Azure Service Bus as the control tower, the Bronze/Silver/Gold medallion layers as the runway sequence. The whole system existed because Fabric has core-based execution limits that throttle how many Spark jobs run simultaneously on a given capacity SKU.

The post was about working around a constraint. You could not just fire all your notebooks at once. You needed something to manage the queue.

More than a year on, it is worth being honest about what held up and what has changed.

The original architecture, briefly

Three components:

Azure Service Bus acted as the message queue. When a source system finished loading raw data, it dropped a message onto the bus. Each message represented one flight waiting for clearance.

A capacity monitor notebook ran on a short schedule. It checked how many notebooks were currently executing and compared that count against the available core capacity. If capacity was available, it pulled the next message from the queue and triggered the appropriate notebook via the Fabric REST API.

The processing notebooks were standard Bronze, Silver, and Gold Spark notebooks. They ran as normal Fabric notebooks with no awareness of the orchestration layer above them. On completion, they acknowledged the Service Bus message.

The deliberate design choice was to keep the notebooks clean and put the complexity in the orchestration layer. A notebook should not need to know whether it is being called by a scheduled job, a pipeline, or a service bus monitor. That separation held up well.
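
The core of the capacity monitor is a small admission decision. The sketch below shows one monitor tick with in-memory stand-ins: a deque plays the role of the Service Bus queue and a plain callable plays the role of the Fabric REST API trigger. The function name, the simplification of "available core capacity" to a maximum job count, and the choice to start several notebooks per tick are all assumptions for illustration.

```python
from collections import deque


def monitor_tick(queue, active_jobs, max_concurrent, trigger):
    """Start queued notebooks until the assumed concurrency limit is reached."""
    started = []
    while queue and active_jobs + len(started) < max_concurrent:
        message = queue.popleft()   # next flight waiting for clearance
        trigger(message)            # stand-in for the REST call that runs the notebook
        started.append(message)
    return started


# Hypothetical state: three jobs already running, room for five in total.
waiting = deque(["bronze_orders", "bronze_customers", "silver_orders"])
launched = []
started = monitor_tick(waiting, active_jobs=3, max_concurrent=5, trigger=launched.append)

print(started)        # ['bronze_orders', 'bronze_customers']
print(list(waiting))  # ['silver_orders']
```

Because the notebooks never see the queue, the same tick logic could be swapped for a different trigger mechanism without touching the processing code.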

What has changed in Fabric

Two relevant changes since February 2025:

Native job queueing

Fabric now queues Spark jobs automatically when capacity is exhausted rather than rejecting them outright. Jobs queue in FIFO order and wait up to 24 hours before expiring. The platform starts them automatically as capacity becomes available.

There is a hard constraint that limits how far this goes: the queue depth is bounded by the SKU’s CU allocation. It is not unlimited. A sudden burst of 100+ notebooks would exceed both the concurrent execution limit and the queue depth, and the excess jobs would be rejected rather than queued — the same failure mode as before native queueing existed.

So native FIFO queueing helps if your workload arrives gradually. It does not change the original problem if your trigger pattern involves large simultaneous batches. The Service Bus buffer sits outside Fabric and has no queue depth constraint. That distinction is why the architecture is still relevant.
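
A quick back-of-the-envelope calculation shows why the burst case still fails. The limits below are invented round numbers, not the actual values for any Fabric SKU, and the function assumes the capacity is idle when the burst arrives.

```python
def admit_burst(burst_size, concurrent_limit, queue_depth):
    """Split a simultaneous burst into running, queued and rejected jobs."""
    running = min(burst_size, concurrent_limit)
    queued = min(burst_size - running, queue_depth)
    rejected = burst_size - running - queued
    return running, queued, rejected


# Hypothetical limits: 8 concurrent jobs and a queue depth of 8.
print(admit_burst(120, concurrent_limit=8, queue_depth=8))  # (8, 8, 104)
print(admit_burst(12, concurrent_limit=8, queue_depth=8))   # (8, 4, 0)
```

With these made-up limits, a gradual trickle of a dozen jobs is absorbed entirely, while a burst of 120 loses 104 of them unless something outside Fabric holds them back.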

Job-level bursting controls

Fabric capacities support bursting at up to 3x the nominal CU allocation. You can now disable bursting for specific Spark jobs, giving finer-grained control over which jobs are allowed to consume burst headroom. Useful for ring-fencing critical workloads. This is an additive improvement to the platform regardless of which orchestration approach you use.

What held up

The decoupled architecture held up. Keeping orchestration logic out of the processing notebooks made them easier to test, modify, and redeploy independently. A notebook that does not know how it was triggered is easier to reason about than one that contains scheduling and queueing logic alongside its data transformation. Nothing about that changed.

Azure Service Bus held up as a reliable messaging backbone. At-least-once delivery, dead-letter queues, message peek-lock, and configurable time-to-live are production-grade features. There were no reliability issues with the messaging layer over the period.

The Bronze, Silver, Gold medallion structure held up. That is sound data architecture independent of the orchestration tool above it.

What I would do differently today

Not much, given the workload pattern. The original system was designed for burst scenarios, and the burst scenario has not changed. The native queue depth limit tied to CU allocation means there is still no Fabric-native replacement for an external buffer that absorbs an unbounded number of incoming messages and feeds them into Fabric at a controlled rate.

The one thing I would reconsider is the capacity monitor notebook’s polling loop. It runs on a short schedule and checks active job counts before pulling the next message. That works, but it adds latency and a scheduling dependency. Whether the Fabric REST API now exposes enough observability to build a leaner version of that loop is worth investigating — but that is an implementation detail, not a reason to replace the architecture.

The honest production reality

The system is still running. A working production system with understood failure modes and people who know how to debug it has a high replacement threshold. The architecture from February 2025 has not needed replacing because the problem it solved has not stopped existing.

The question I get asked: is there a more native way to do this now? The answer is no, not for the burst-buffering problem specifically. Fabric pipelines handle sequential orchestration well. Native FIFO queueing handles gradual workloads within its depth limit. Neither absorbs an unbounded burst and controls submission into a capacity-constrained runtime. Service Bus still does that job, and it still does it well.

The air traffic controller is still in the tower. The runway is a little wider than it was, but the traffic has grown too.

The Future of BI: Will AI Replace BI Developers?

I have been asked this question at every conference I have attended in the last two years. Not always directly. Sometimes it arrives as “what do you think about Copilot” or “is there still a point learning DAX properly.” But the underlying question is always the same: is my job going to exist in five years?

It is a fair question. When you watch GitHub Copilot complete a CALCULATE function before you have finished typing the first argument, or paste a business requirement into Claude and get back a working Power Query transformation, it is easy to understand why people are asking it.

Here is my honest take, as someone who has been building BI solutions since 2006 and who has spent the last year testing these tools in real work, not in demos.

What These Tools Can Actually Do Today

I have been using ChatGPT, Microsoft Copilot and Claude regularly over the last eighteen months, in actual client work and personal projects. Not in theory. In my editor, on real data models, with real business requirements.

ChatGPT is strong at generating DAX and SQL when you give it enough context. If you describe the table schema, the business logic and the expected output format, the first attempt is often close enough to use with minor adjustments. I have used it to draft measures I would have spent an hour on, in five minutes. It is not always right on the first pass, but it moves fast enough that iteration costs less than starting from scratch.

Microsoft Copilot inside Fabric and Power BI has improved noticeably over the last year. The report creation assistant went from generating generic placeholder visuals in early 2024 to producing layouts I would actually take and refine for production use in 2025. It will not replace design judgment, but it removes the blank canvas problem. For report creation at scale, that matters.

Claude has been the most useful for reasoning about model design. I used it to reverse-engineer a semantic model from an annotated schema diagram and it handled the relationships and measure dependencies better than I expected. I also hit real limits: it had no knowledge of my actual data, made assumptions that were wrong for our domain, and I needed five or six iterations before the output was usable. The hiccups were real. I am not smoothing those over.

The pattern that holds across all three: if the task is well-defined, the context is fully provided, and the output can be verified quickly, these tools are fast and genuinely useful. That describes a significant portion of the day-to-day coding work in a typical BI role.

Where They Still Fall Short

None of these tools know your business.

That sentence sounds simple, but it is where the real gap sits in practice. Generating a CALCULATE function is not difficult once you know the filter context. The difficult part is knowing that in your organization, “active customers” means accounts with at least one transaction in the last 90 days, but only in the consumer segment, and that this definition changed in Q2 2024 and needs to be handled differently across historical and current period comparisons.

That context lives in your head, in Confluence pages nobody reads, in a meeting that happened eighteen months ago, and in an email from a finance analyst who has since left the company. No AI tool picks that up from a prompt. You bring it, or it does not exist in the output.

The same gap shows up in data quality judgment. AI tools will generate a transformation pipeline from a spec without questioning whether the spec is correct. They will not notice that the order date column has 8% null values, or that this matters for the revenue calculation, or that the exceptions exist because of a legacy system migration that your company did in 2019. You notice, because you have seen the data before and know what those patterns mean.

Report creation has the same ceiling. The AI can build a layout, suggest a chart type and write a title. It cannot make the call that this dashboard will be shown on a 40-inch screen in a warehouse and that the font size from the default template will be unreadable from three meters away. That judgment comes from having sat with the people who use the reports.

What This Actually Means, by Role

The impact is not uniform. It depends on how much of your current working week involves mechanical repetition.

For BI developers, the most immediate change is in code generation. Boilerplate DAX, repetitive ETL transformations, standard report templates: these are exactly the tasks AI accelerates most. If this work is a large part of your week, your output volume will increase and expectations around it will rise with it. That is not a threat if you understand it early enough to stay ahead of it.

For data analysts, the change is most visible in exploration speed. Asking Claude or Copilot to produce a first-pass analysis of a dataset, flag anomalies or suggest groupings gets you to a hypothesis faster. The interpretation of that hypothesis, and the judgment about whether it is the right hypothesis for the actual decision being made, remains yours.

For data engineers, pipeline boilerplate generation and schema documentation are straightforward wins. Debugging complex transformation failures where the error is opaque is an area where these tools also provide real value, particularly if you can paste the full stack trace and table definitions into the prompt.

For data architects, the tools work well as thinking partners for structured design questions. Talking through a proposed model, generating documentation drafts, checking naming conventions across a schema. The decisions about what to govern, where domain boundaries sit, and how to design for the organizational reality rather than the theoretical ideal still require judgment that comes from knowing the context.

For data governance specialists, the upside is in documentation and lineage drafting. The real governance work, defining what quality means for a specific data product and making that stick across teams, is still a people and process problem that no AI tool solves for you.

My Perspective as a Practitioner

I am not going to tell you AI will not change BI work. It already has. But the question of whether it replaces BI developers wholesale is the wrong framing.

The more useful question is: which parts of your current work are mechanical enough that AI tools can do them faster and cheaper? Be honest about that list. If it is long, that is information worth having now rather than in two years when the market has already adjusted.

Here is what I have observed across the work I do and the people I talk to at conferences: the work that clients are most anxious about, most uncertain about, and most willing to pay for is not the part AI is good at. It is the part that requires knowing their business, understanding who trusts what report and why, having the conversation with the finance director who distrusts the new model, and standing next to the operations manager while they explain what the dashboard actually needs to show.

That is still practitioner work. AI does not do it. It accelerates the technical work that happens around it, which gives you more time for the part that matters most.

There is one shift I have started to notice though, and it is worth naming. The finance director on the other side of the table is also using these tools. They arrive at the meeting having already asked Claude to explain the variance, or having had Copilot summarize the dashboard. They come in with a better baseline understanding than they had two years ago, and that makes for a more productive conversation. You spend less time on mechanics and more time on the actual decision. That is a good thing, not a threat.

The hallucination problem is real and it is the most reasonable objection anyone raises to trusting AI output in production work. But it is also improving faster than most people expected a year ago. The gap between what these tools got wrong in early 2024 and what they get wrong now is measurable. I expect that to continue. The sensible approach is to verify outputs where the stakes are high and to track how often you have to correct them over time. That number has been going down in my experience.

The BI developers I have watched get uncomfortable recently are the ones who built careers around syntax knowledge, template maintenance and format conversion. Those jobs are genuinely changing. The developers who are doing well are the ones who understood that the syntax was never the point: the business problem was the point, and tools that help with the syntax give them more time for the problem.

My advice is straightforward: learn the tools, use them on real work rather than tutorial datasets, and find out where they fail on your actual data in your actual domain. That is the only way to know where your judgment matters more than their suggestion.

For me, the answer to whether AI will replace BI developers is: not the ones who are clear about what they are actually being hired to do.

What has your experience been with AI tools in your daily BI work? I would like to know what people are actually finding useful versus what sounds better in a conference session than it works in practice.

Never say never again: Keeping Your SSMS Server List Safe in 2026

Twelve years ago 🤯 I wrote a quick post about never losing your server list in SSMS again. The short version was: copy one file, stay sane. The file was called SqlStudio.bin, and the trick still works today if you are on an old enough version.

But if you are running SSMS 19, 20, 21 or 22 on Windows 11, the file is gone. The settings have moved, and the format has changed. The principle is the same, but the file you need to grab is different.

Here is the updated version for 2026.


What Changed

In the older versions of SSMS, the registered server list was stored in a binary file:

C:\Users\<PROFILE>\AppData\Roaming\Microsoft\SQL Server Management Studio\<VERSION>\SqlStudio.bin

From SSMS 19 onwards, the file you want is:

C:\Users\<PROFILE>\AppData\Roaming\Microsoft\SQL Server Management Studio\<VERSION>\UserSettings.xml

Same idea, different file, plain XML instead of binary. That last part is actually an improvement: you can open it, read it, and understand what is in there before you copy it anywhere.


Where to Find It

On Windows 11, with SSMS installed and a server list you want to keep, navigate to:

C:\Users\<your Windows username>\AppData\Roaming\Microsoft\SQL Server Management Studio\

You will see one folder per installed SSMS version. I currently have four:

  • 19\
  • 20\
  • 21\
  • 22\

Inside each one, look for UserSettings.xml. That is the file that holds your registered servers.

The AppData folder is hidden by default in Windows 11. If you do not see it in Explorer, type the full path into the address bar, or go to View > Show > Hidden items.


What to Do With It

The same logic as 2014 applies:

  1. Make a copy of UserSettings.xml from your current machine.
  2. Store it somewhere you can get to it: OneDrive, a network share, a USB drive, wherever works for you.
  3. On a fresh install, copy the file into the correct version folder before opening SSMS.

SSMS reads the file on startup. If the file is in place when SSMS opens, your registered servers appear as if you never left.

If you have multiple SSMS versions installed side by side and maintain different server lists in each, you need a separate copy per version. They do not share the file.
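
If you want to script the copy, a small Python sketch along these lines would collect one UserSettings.xml per installed version. The function name and the backup folder layout (one subfolder per version) are my assumptions; nothing here is an official SSMS tool.

```python
import shutil
from pathlib import Path


def backup_user_settings(ssms_root, backup_root):
    """Copy <version>/UserSettings.xml from ssms_root into backup_root/<version>/."""
    copied = []
    for settings in sorted(Path(ssms_root).glob("*/UserSettings.xml")):
        target_dir = Path(backup_root) / settings.parent.name
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(settings, target_dir / settings.name)  # keeps file timestamps
        copied.append(settings.parent.name)
    return copied


# Example call on Windows (both paths are assumptions, adjust to your setup):
# import os
# backup_user_settings(
#     Path(os.environ["APPDATA"]) / "Microsoft" / "SQL Server Management Studio",
#     Path.home() / "OneDrive" / "ssms-backup",
# )
```

Version folders without a UserSettings.xml are simply skipped, and the function returns the list of versions it copied so you can check nothing was missed.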


One Thing Worth Knowing

The file is XML, so you can also merge server lists manually if you need to combine two environments into one. I have not done this in a scripted way, but looking at the structure it is readable enough that it would not take long to figure out.

If you are already using Azure Data Studio instead of or alongside SSMS, the settings are stored differently again. That is a separate post.


The short version: the file changed, the folder structure stayed the same, and the trick still works. Copy UserSettings.xml before you reinstall anything.

Azure Databricks at FabCon 2026: What Got Announced and What It Actually Means

FabCon 2026 in Atlanta last week was bigger in many ways. For the first time, the Microsoft Fabric Community Conference ran alongside SQLCon, which meant two large communities shared the same convention center, the same coffee queues and the general atmosphere of too-many-sessions-not-enough-sleep that defines any conference worth attending.

Databricks showed up with several announcements. Some of them are incremental. A few are worth more than a passing glance, depending on where you sit on the data and BI stack.

One thing that still catches people off guard: Azure Databricks has been a first-party Azure service since 2017. Not a partner product, not a third-party integration. A first-party Azure service, alongside Power BI, Excel, Teams, Azure OpenAI, Copilot Studio and the Power Platform. When Microsoft talks about a unified data and AI platform on Azure, Databricks is part of the architecture. The announcements at FabCon make that more visible.

Here is what they shipped, and what I think it means in practice.


Lakeflow Connect Free Tier: 100 Million Records a Day, at No Cost

Bad pipelines are one of the constants of BI work. The data arrives late, the connectors are fragile, someone is maintaining a web of custom scripts because there was never time to do it properly, and the people building reports spend half their week cleaning up after problems they did not cause.

Databricks announced a Lakeflow Connect Free Tier, and the headline number is worth taking seriously: 100 million records per workspace per day, at no charge. That is 100 free DBUs per day, included with every workspace, before standard Lakeflow Connect pricing applies.

What it connects to out of the box:

  • Databases: SQL Server, Oracle, Teradata, PostgreSQL, MySQL, Snowflake, Redshift, Synapse, BigQuery
  • SaaS applications: Dynamics 365, Salesforce, ServiceNow, Workday, Google Analytics

For databases, the ingestion runs on Change Data Capture, which means it reads the transaction log incrementally rather than scanning full tables. Data lands in Delta format on Azure Data Lake Storage. Unity Catalog governance applies from the moment the first record arrives, so access control and lineage are not something to sort out later.

Databricks quotes 25x faster pipeline builds and 83% ETL cost reduction. I would take vendor benchmarks with the usual scepticism, but the direction is clear: the intent is to make data ingestion a problem you configure rather than one you maintain. For a BI team currently paying for third-party Dynamics or Salesforce connectors, or running CSV exports on a schedule, this is worth a practical test.


Lakebase Is Generally Available: A Postgres Database Inside the Lakehouse

This one sits closer to architecture and engineering than it does to daily BI work, but it changes some assumptions that are worth understanding.

Azure Databricks Lakebase is now generally available in 14 Azure regions. It is a managed, serverless Postgres service that runs inside your lakehouse, on the same storage as your Delta tables.

The problem it addresses is one data architects have been working around for years: operational data and analytical data have historically lived on separate platforms, connected by pipelines that were always someone’s responsibility and frequently nobody’s priority. Lakebase puts an operational database directly in the same governed environment as the rest of the data platform.

Key characteristics:

  • Full Postgres compatibility, with support for extensions including pgvector and PostGIS
  • Compute and storage separated, with scale-to-zero and sub-second startup
  • Branching and instant restore for development and testing workflows
  • High availability with automatic failover across availability zones

The use cases Databricks highlights: transactional analytics on operational data, AI agent state management, and customer personalization and feature serving. For data engineers building pipelines that feed AI applications, Lakebase removes the need to run a separate operational database outside the Databricks platform just to give an agent somewhere to write state.

It is available to test today in 14 regions. If you have been looking for a Postgres layer that sits inside the lakehouse without architectural compromise, now is a reasonable time to look at the documentation.


The Excel Add-in Is in Public Preview: Governed Lakehouse Data in the Tool Most People Actually Use

This is the announcement that will get the most immediate attention from analysts and business users, and probably the one that causes the most internal conversations about data governance.

The Azure Databricks Excel Add-in is in public preview. It connects Excel directly to Unity Catalog tables and Metric Views. From inside Excel, you can browse the catalog, build pivot tables from governed semantic definitions, and filter and analyze data without writing SQL. It works on Excel for Windows, macOS and the web.

The problem it addresses is one every BI developer and governance specialist knows well: business users need data in Excel. So someone exports a CSV. Or the business user pulls their own export. Within 24 hours there are four versions of the file in four different places, none of them current, all of them cited in separate meetings. The analyst who originally produced them has no idea which version is being used.

The add-in replaces that pattern with a live connection to the same tables that power your Power BI reports and your analytics models. The data is current. The access rules in Unity Catalog apply here too, so a user who cannot query a table in Databricks cannot query it through the add-in either.

For analysts who work primarily in Excel, this is a genuine change in how a typical Tuesday works. For governance teams, it removes a whole class of ungoverned data copy that currently exists because there was no better option.


Genie Gets More Capable: Agent Mode, Genie Code and Databricks One

Genie is Databricks’ conversational analytics experience: you ask a data question in plain language and get back a chart, a table or a narrative answer. Databricks reported this week that 98% of Databricks SQL warehouse customers are using AI/BI, with monthly active Genie users up more than 300% year-over-year. The numbers are moving fast enough to suggest this has passed the experimental phase.

Three updates this week.

Genie Agent Mode

Standard Genie answers one question at a time. Genie Agent Mode takes a more complex business question, builds a research plan, runs multiple queries, tests intermediate results, refines its approach and then delivers a complete answer with supporting tables, charts and narrative context.

The difference becomes concrete quickly. Standard Genie handles: “What were total sales in Q3?” Genie Agent Mode handles: “Revenue in the Southeast dropped in Q3. Why did that happen, and what does the pattern suggest for Q4?” That is not a single query. It is an investigation, and Agent Mode runs it without someone having to direct every step.
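
To show the shape of that kind of investigation, here is a toy sketch in plain Python. It is not Genie's implementation; the sales table, the sample figures and the investigate_drop function are all made up for illustration, and sqlite3 stands in for a SQL warehouse. The function checks the premise first, stops if the premise does not hold, and only then drills down to find the biggest contributor.

```python
import sqlite3

# Made-up sample data; sqlite3 stands in for a SQL warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (region TEXT, quarter TEXT, product TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Southeast", "Q2", "Widgets", 500.0),
        ("Southeast", "Q2", "Gadgets", 300.0),
        ("Southeast", "Q3", "Widgets", 480.0),
        ("Southeast", "Q3", "Gadgets", 120.0),
        ("Northeast", "Q2", "Widgets", 400.0),
        ("Northeast", "Q3", "Widgets", 450.0),
    ],
)


def investigate_drop(region, prev_q, curr_q):
    """Hypothetical two-step investigation: confirm the drop, then find its main driver."""
    # Step 1: test the premise before doing anything else.
    totals = dict(
        conn.execute(
            "SELECT quarter, SUM(revenue) FROM sales "
            "WHERE region = ? AND quarter IN (?, ?) GROUP BY quarter",
            (region, prev_q, curr_q),
        )
    )
    change = totals.get(curr_q, 0.0) - totals.get(prev_q, 0.0)
    result = {"confirmed": change < 0, "change": change}
    if not result["confirmed"]:
        return result  # Premise is false, so there is nothing to explain.

    # Step 2: refine by breaking the change down per product.
    # Ordering by delta ascending puts the largest decline first.
    product, delta = conn.execute(
        "SELECT product, "
        "SUM(CASE WHEN quarter = ? THEN revenue ELSE 0 END) - "
        "SUM(CASE WHEN quarter = ? THEN revenue ELSE 0 END) AS delta "
        "FROM sales WHERE region = ? AND quarter IN (?, ?) "
        "GROUP BY product ORDER BY delta LIMIT 1",
        (curr_q, prev_q, region, prev_q, curr_q),
    ).fetchone()
    result["top_driver"] = product
    result["driver_share"] = delta / change  # Fraction of the drop it explains.
    return result


report = investigate_drop("Southeast", "Q2", "Q3")
print(report)
```

With this sample data the Southeast premise holds and Gadgets accounts for most of the decline, while the same call for the Northeast stops after step one because revenue there went up. A real agent would plan many more steps, but the check-then-drill-down structure is the part that separates an investigation from a single query.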

For analytics managers sitting on a queue of complex ad hoc requests that only a senior analyst can currently answer, this is the update worth spending time with.

Genie Code

Genie Code is aimed at data practitioners, not end users. It is an agentic development assistant that runs inside Databricks notebooks, SQL editors and Lakeflow pipelines.

The distinction from a general-purpose AI coding assistant is that Genie Code understands your data context through Unity Catalog. It knows your tables, your lineage, your governance policies and your business semantics. With that, it can build pipelines and dashboards from natural language prompts, debug Lakeflow failures, generate queries grounded in your actual schema, and handle routine operational monitoring.

For senior BI developers and data engineers who spend part of every week on repetitive work that requires knowing the platform well, having an assistant that actually knows prod.gold.customer_activity is a different experience from hitting tab on a general-purpose tool that has never seen your schema.

Databricks One and Databricks One Mobile

Databricks One now includes a unified multi-agent chat experience powered by Genie. Business users can ask questions across the full data estate without needing to know which Genie space to route to. When a question goes beyond what existing spaces can answer, Databricks One can bring in additional agents to investigate. AI/BI dashboards and Databricks Apps are surfaced in the same interface.

Databricks One Mobile brings this to iOS and Android: Genie, dashboards and apps from a phone. Business users can ask data questions without being at a desk.


Genie in Microsoft Teams: Data Answers Where the Decisions Actually Happen

For organizations already using Microsoft 365, this is probably the most immediately deployable announcement.

You can now connect Genie to Microsoft Teams via Copilot Studio. The setup connects a Genie space to a Teams agent through the Copilot Studio connector, which handles the API and MCP logic. Once connected, users can ask data questions directly in a Teams conversation and get answers backed by your lakehouse data.

The part that makes this credible to security teams and BI leaders: every conversation runs through OAuth, authenticated against the user’s own identity. If a user does not have SELECT access to a table in Unity Catalog, Genie will not surface that data in Teams. The access model you already manage in Unity Catalog carries through to every Teams conversation.
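
The idea is simple enough to model in a few lines. This is a toy sketch, not the Unity Catalog or Genie API: the SELECT_GRANTS mapping, the sample rows, the user names and the query_as_user function are all hypothetical. The point it illustrates is that the decision depends only on who is asking and which table they want, never on the surface the question came from.

```python
# Hypothetical grants: which principals hold SELECT on which tables.
SELECT_GRANTS = {
    "prod.gold.customer_activity": {"ana@example.com"},
}

# Made-up table contents.
TABLE_ROWS = {
    "prod.gold.customer_activity": [{"customer_id": 1, "orders_last_30d": 3}],
}


class AccessDenied(Exception):
    """Raised when the caller lacks SELECT on the requested table."""


def query_as_user(user, table, surface):
    """Return rows only if `user` holds SELECT; `surface` never affects the decision."""
    if user not in SELECT_GRANTS.get(table, set()):
        raise AccessDenied(f"{user} has no SELECT on {table} (asked via {surface})")
    return TABLE_ROWS[table]


# Same user, same answer, whichever tool the question came from.
for user in ("ana@example.com", "ben@example.com"):
    for surface in ("teams", "excel"):
        try:
            rows = query_as_user(user, "prod.gold.customer_activity", surface)
            print(user, surface, "->", len(rows), "row(s)")
        except AccessDenied as err:
            print("denied:", err)
```

In the real platform the check lives in Unity Catalog and the identity comes from the OAuth flow, so there is no second permission model to keep in sync for Teams.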

For data governance managers who have spent years explaining why pasting screenshots of reports into Teams messages is not the same as having a governed answer, this changes the practical alternative. The question gets answered where it was asked, with the right access controls applied, and nothing leaves the governed environment.

For business users, it means getting a trusted data answer without leaving the tool they already have open.


What I Am Taking Away From This Week

The pattern across all of these announcements is one I have been watching build for a couple of years. Operational data, analytical data and AI have historically lived on separate platforms, and the work connecting them got called integration. That work is expensive, slow, and usually the first thing cut when a project runs over budget.

What Databricks is building is a single platform where all of it sits together, governed by Unity Catalog, accessible from Excel, Teams, a notebook, a mobile app or a SQL query. Whether the individual pieces fit together as neatly in production as they do in the announcement demos is something I will be watching as they move from preview toward GA.

The Databricks session at FabCon this week was on Thursday, March 19th in room C302, and it should be available on demand if you missed it.

The next major Databricks gathering is Data + AI Summit, June 15 to 18, 2026, in San Francisco. 25,000 attendees, 800+ sessions, and the most complete view of where the platform is heading. Worth putting on the calendar.


What caught your attention this week at FabCon? Drop a comment below. I would like to hear what people are actually planning to test.