Stop Waiting for a Winner

May 19, 2026
| Jeff Johnston
Last updated on May 19, 2026

The Fork in the Road Everyone Pretends Is a Roundabout 

This is Part 2 of a two-part series on Microsoft Fabric and Azure Databricks. Part 1 covers the decision framework: when each platform fits and how to think about using both. This post assumes you’ve crossed that line and are ready for the unvarnished version.

I’ve been in the room when this decision gets made. Not the vendor presentation but the real meeting, the one after the vendor leaves, where a Chief Data Officer, a Principal Architect, and a BI Director try to reconcile three different things they heard in the same demo. The engineers heard “Spark, MLflow, Unity Catalog.” The BI team heard “Direct Lake, Copilot, zero refresh lag.” Finance heard “single platform, consolidated spend.”

They all heard what they wanted. And that’s where the trouble starts. 

The data team in question was doing the right things. They’d built a medallion Lakehouse, implemented governance, and pushed data access down to business users who’d never needed a SQL client before. They weren’t behind. They were, in fact, exactly where every data strategy presentation says you should be. And they’d spent the better part of two years debating whether that success belonged on Microsoft Fabric or Databricks because, the thinking went, one of them was surely going to win, and when it did, you’d want to be on the right side of it.

But what happens when you keep waiting for that winner to emerge? 

Every Day Feels Like a Tie 

The productive paralysis of a platform debate this close is hard to describe from the outside. But if you’ve lived it, you know the story. Every architecture review reopens the question. Engineers push back on Fabric’s Spark capabilities: it’s capable, not best-in-class, and the distance between it and Databricks has years of compounding behind it. BI leads won’t hear of abandoning Power BI: 35 million active users, a decade of DAX investment, and Copilot embedded in M365 are not things you migrate away from on principle. Finance is paying for both in parallel, which feels wrong, but nobody can make the case for fully abandoning either one.

Every vendor QBR restarts the debate with new slides. Databricks announces something like Iceberg REST Catalog support in Unity Catalog, with reads available today and managed Iceberg table writes in Public Preview. Then Microsoft announces Direct Lake on OneLake going GA in March 2026. Both announcements are real. Both are genuinely valuable capabilities. Neither one ends the argument; they just give each side fresh ammunition.

The false belief underneath all of it is this: that there is a right answer, that the right answer is singular, and that the team that picks wrong will pay for it. That belief turns every architectural decision into a zero-sum argument. And zero-sum arguments in data teams tend to produce exactly one outcome: paralysis, followed by the same debate eighteen months later.

Until One Morning at 3 A.M. 

Here’s the inciting incident, and I’m combining a few real conversations into one composite because the pattern is that consistent. 

The team had a Direct Lake semantic model sitting over a gold-layer Delta table. Everything worked beautifully in dev. In production, on an F32 capacity, the model hit the row limit: 300 million rows per table is the ceiling on F2 through F32, and there’s no graceful fallback on Direct Lake on OneLake. Unlike the older Direct Lake on SQL, which would quietly fall back to DirectQuery when guardrails were exceeded, Direct Lake on OneLake simply errors. The model didn’t degrade. It stopped, at 3 a.m., ahead of a 7 a.m. executive dashboard review.
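One way to catch this before 3 a.m. is a pre-deployment check that compares gold-table row counts against the guardrail. The sketch below is a minimal illustration, not a Fabric API: the function name, the 80% warning threshold, and the input dictionary are all assumptions, and the only number taken from this post is the 300-million-row ceiling for F2 through F32. How you collect the row counts is left to you.

```python
# Hypothetical pre-flight check; nothing here calls Fabric or Power BI.
ROW_LIMIT_F2_TO_F32 = 300_000_000  # per-table ceiling cited in the text
SMALL_SKUS = {"F2", "F4", "F8", "F16", "F32"}


def direct_lake_headroom(sku, row_counts, warn_ratio=0.8):
    """Classify each table as 'ok', 'warn', or 'over' against the row ceiling.

    sku        -- capacity SKU name, e.g. "F32"
    row_counts -- mapping of table name to current row count
    warn_ratio -- assumed early-warning threshold (fraction of the ceiling)
    """
    if sku not in SMALL_SKUS:
        # Larger SKUs have different guardrails that this sketch does not encode.
        raise ValueError(f"No limit encoded for {sku}; look it up first")

    status = {}
    for table, rows in row_counts.items():
        if rows > ROW_LIMIT_F2_TO_F32:
            # Direct Lake on OneLake would error here rather than fall back.
            status[table] = "over"
        elif rows >= warn_ratio * ROW_LIMIT_F2_TO_F32:
            status[table] = "warn"
        else:
            status[table] = "ok"
    return status


if __name__ == "__main__":
    counts = {"gold.sales": 310_000_000, "gold.customers": 4_000_000}
    print(direct_lake_headroom("F32", counts))
```

Run on a schedule against production counts, a "warn" result buys you weeks to aggregate, partition, or resize, instead of hours.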

Meanwhile, on the Databricks side, All-Purpose clusters that had been set up for notebook development were still running scheduled ETL jobs, a classic anti-pattern that compounds DBU billing quietly and ruthlessly. The month came in 40% over forecast. Not because of bad data engineering, but because nobody moved scheduled jobs to Jobs Compute or serverless. Auto-termination helps, but All-Purpose clusters still bill at 2–4× the per-DBU rate of Jobs Compute for the same workload. The cluster type, not just the termination policy, is the larger lever.
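The arithmetic behind an overrun like that is easy to sketch. The example below is a back-of-envelope estimate under stated assumptions: the job inventory, the `compute_type` labels, and the per-DBU rate are hypothetical, DBU consumption is assumed to be identical on both compute types, and the 2–4× multiplier is the range quoted above rather than a published price list.

```python
from dataclasses import dataclass


@dataclass
class ScheduledJob:
    name: str
    compute_type: str    # hypothetical labels: "all_purpose" or "jobs"
    monthly_dbus: float  # DBUs the job consumed last month


def overspend_range(jobs, jobs_rate_per_dbu, low_mult=2.0, high_mult=4.0):
    """Estimate (low, high) monthly overspend from scheduled jobs on All-Purpose.

    The overspend is the All-Purpose cost minus what the same DBUs would have
    cost at the Jobs Compute rate, using the 2-4x multiplier from the text.
    """
    misplaced_dbus = sum(
        j.monthly_dbus for j in jobs if j.compute_type == "all_purpose"
    )
    baseline = misplaced_dbus * jobs_rate_per_dbu
    return (baseline * (low_mult - 1), baseline * (high_mult - 1))


if __name__ == "__main__":
    inventory = [
        ScheduledJob("nightly_gold_refresh", "all_purpose", 1000),
        ScheduledJob("hourly_ingest", "jobs", 4000),
    ]
    # 0.25 per DBU is an illustrative rate, not a real price.
    low, high = overspend_range(inventory, jobs_rate_per_dbu=0.25)
    print(f"Estimated monthly overspend: {low:.2f} to {high:.2f}")
```

Even a rough inventory like this is usually enough to decide which scheduled jobs to move first.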

Two different platforms, two different failure modes, same root cause: the team had been debating which one to keep instead of learning to run both deliberately.

The Honest Ledger 

When you’re forced out of the binary war framing, you can finally look at each platform honestly. 

Fabric’s killer feature is real. Direct Lake on OneLake delivers Import-class DAX performance over live Delta with no scheduled refresh. For a Power BI shop serving hundreds of business users, nothing in the market comes close. The SaaS simplicity is genuine: no clusters to size, no infrastructure to manage, and a capacity admin handles the rest. OneLake shortcuts and mirroring also reduce the data copies that have been the quiet tax on every multi-cloud enterprise. ADLS Gen2, S3, GCS, Snowflake, Databricks, and Dataverse all flow into one logical lake without moving bytes.

But the failure modes are real too. Fabric capacity smoothing is not a footnote; it’s the architecture. Background workloads like Spark jobs smooth over a 24-hour window, which means a single misconfigured pipeline can put a small SKU in the penalty box for an entire business day. Power BI users don’t know why their reports are failing. They don’t see the smoothing math. They just see failures. The first-line defense is Surge Protection (a free capacity-admin setting), which caps CU consumption by background jobs so interactive Power BI queries stay responsive. Microsoft’s Capacity Overage feature, which entered Preview at FabCon 2026, lets you pay through the throttle: a pragmatic safety valve, but also a quiet admission that the smoothing model has an edge case that organizations keep finding. The real fix is not an opt-in overage switch. It is right-sizing the capacity and using Autoscale Billing for Spark, which moves Spark entirely off the shared CU pool onto separate PAYG-billed compute and eliminates the contention at its source.
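The smoothing math that report users never see can be approximated in a few lines. This is a deliberately simplified model, assuming a background job’s CU-seconds are spread evenly across the 24-hour window and that an F-SKU’s number equals its capacity units (F2 has 2 CUs, F64 has 64). It ignores bursting, overage burn-down, and the staged throttling policy, and the function name is hypothetical.

```python
SMOOTHING_WINDOW_S = 24 * 60 * 60  # 24-hour background smoothing window, in seconds


def smoothed_share(job_cu_seconds, capacity_cus):
    """Fraction of a capacity that one background job occupies while smoothed.

    job_cu_seconds -- total CU-seconds the job consumed
    capacity_cus   -- capacity units of the SKU (e.g. 2 for F2, 64 for F64)
    """
    average_cu_draw = job_cu_seconds / SMOOTHING_WINDOW_S
    return average_cu_draw / capacity_cus


if __name__ == "__main__":
    runaway_job = 100_000  # CU-seconds, an illustrative figure
    for sku_cus in (2, 8, 64):
        share = smoothed_share(runaway_job, sku_cus)
        print(f"F{sku_cus}: {share:.1%} of capacity for the next 24 hours")
```

Under these assumptions, a single 100,000 CU-second job occupies roughly 58% of an F2 for a full day and under 2% of an F64, which is why the same misconfigured pipeline is a non-event on one capacity and an outage on another.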

Databricks’ strengths are equally genuine. This is the best Spark engineering experience in the market, not by a small margin, but by years of compounding investment in the runtime, the toolchain, and the operational model. Unity Catalog is the most opinionated, capable open-lake governance layer in the industry: column-level lineage, attribute-based access control (Public Preview as of early 2026), row filters, column masks, Delta Sharing, and credential vending to external engines. For ML and AI workloads, you have MLflow, Mosaic AI, Vector Search, Agent Bricks, and Lakeflow declarative pipelines, which are now GA. There is no serious competitor on Azure.

And the failure modes are equally real. Cost is the number-one production complaint for a reason: DBUs plus cloud VMs plus storage plus the serverless premium plus the Photon multiplier compound fast. The shallow clone problem still bites practitioners working with non-UC-managed tables or external-table clones, where VACUUM on the source past the 7-day default retention window can leave the clone with FileNotFound exceptions. Note that UC managed shallow clones (DBR 13.3+) are reference-tracked across the source-clone graph, so VACUUM on a UC managed source will not delete files referenced by a UC managed clone. The failure mode is real, but it is narrower than it used to be.

The Plot Twist Nobody Wanted to Hear 

Here’s the thing vendors don’t put on their slides: the file format war everyone expected never happened. 

For years, data teams absorbed the Iceberg vs. Delta narrative as though they were picking sides in a Kimball vs. Inmon conflict. The actual outcome was that both vendors decided to support both formats. Databricks shipped managed Iceberg tables in Unity Catalog with full read-and-write via the Iceberg REST Catalog API (in Public Preview in 2025). Microsoft shipped automatic Iceberg-to-Delta translation, OneLake shortcuts to Snowflake Iceberg, and OneLake Table APIs compatible with the Iceberg REST spec.

The competition moved up the stack. The real question isn’t which table format wins; it’s which catalog governs your lake. Databricks is betting that Unity Catalog wins the governance layer while storage remains open. Microsoft is betting that OneLake wins the physical lake in Microsoft estates because the gravity of M365 and Power BI is too strong to escape. Both bets can be simultaneously correct, and in most large enterprises, they are.

What “Better Together” Actually Looks Like 

The reference architecture that works is as follows. Databricks handles bronze, silver, and gold medallion ETL on customer-owned ADLS Gen2, with Unity Catalog as the engineering governance plane. ML training and streaming via Lakeflow live here too. This is the engineering platform.

The Mirrored Databricks Catalog, which went GA in July 2025 and reached GA over private endpoints with a VNet data gateway later that year, brings Unity Catalog Delta external tables into OneLake as zero-copy shortcuts. No bytes move. Fabric reads the same Parquet files sitting in your Databricks storage account. Direct Lake on OneLake semantic models sit over the mirrored gold layer and deliver Power BI dashboards with Import-class performance. Copilot in M365, Teams, and Excel all reach the data through this path.

The native UC read of OneLake, which reached Public Preview at FabCon March 2026, closes the loop in the other direction: Databricks workloads that need Fabric-curated gold data or Dataverse data can reach back through the same physical files. 

This is the architecture. Not “better together” as a marketing phrase. An actual integration story that eliminates duplicate ETL into Fabric while keeping engineering and consumption on the platforms each one was built for. 

The friction that survives is real and worth naming before you commit to it. Unity Catalog privileges do not replicate to OneLake when you mirror Databricks into Fabric. This is not a small-print item; it’s an operational trap. You reapply access on the Fabric side using OneLake Security, workspace roles, and Power BI RLS. Three policy engines. You’ll need to script the governance bridge; pretending it doesn’t exist is how you end up with a security gap six months after go-live. End-to-end column lineage from a Databricks notebook to a Power BI visual is still aspirational: you’ll see the source in Fabric’s lineage view, but you won’t see the column transformation. And “better together” can become “expensive together” fast; a mid-size org with a thin FinOps practice may genuinely be better served by committing fully to one platform and accepting its constraints.
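Scripting the governance bridge can start as something as small as a worklist generator. The sketch below is one possible shape, not a reference implementation: the `UcGrant` record, the mapping rules, and the action strings are all hypothetical, and it calls neither the Unity Catalog nor the Fabric APIs. It assumes you have already exported your Unity Catalog grants into simple records, and that only read access matters on the consumption side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UcGrant:
    principal: str                # group or user name
    privilege: str                # e.g. "SELECT"
    table: str                    # catalog.schema.table
    has_row_filter: bool = False  # True if a UC row filter applies to the table


def bridge_worklist(grants):
    """Turn exported UC grants into to-do items for the three Fabric-side engines.

    Returns a list of (engine, principal, table, action) tuples.
    The mapping rules are illustrative assumptions, not product behavior.
    """
    items = []
    seen_principals = set()
    for g in grants:
        if g.privilege != "SELECT":
            continue  # assumption: only read grants are relevant for consumption
        if g.principal not in seen_principals:
            seen_principals.add(g.principal)
            items.append(("workspace role", g.principal, "*",
                          "confirm workspace membership"))
        items.append(("OneLake Security", g.principal, g.table,
                      "grant read on mirrored table"))
        if g.has_row_filter:
            items.append(("Power BI RLS", g.principal, g.table,
                          "re-express row filter as an RLS role"))
    return items


if __name__ == "__main__":
    exported = [
        UcGrant("analysts", "SELECT", "main.gold.sales", has_row_filter=True),
        UcGrant("engineers", "MODIFY", "main.gold.sales"),
    ]
    for item in bridge_worklist(exported):
        print(item)
```

The value is less in the code than in the diff: rerun it whenever grants change, and anything new on the worklist is access that exists in Unity Catalog but has not yet been reapplied in Fabric.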

The Org Design Conversation Nobody Wants to Have 

There’s a reason the platform debate keeps reopening, and it isn’t the platforms. 

A Spark engineer and a Power BI/DAX modeler are not the same person. They’re different professional identities, with different conference circuits, different career paths, and different communities of practice. Forcing one platform to serve both is often, though not always, an attempt to avoid having the organizational conversation about whether the team is actually one team or two teams with adjacent missions. The architecture debate is sometimes a proxy for the org design debate.

The best data organizations I’ve worked with resolved the platform question not by picking a winner, but by getting clear on the seam. Who owns bronze and silver? What does the handoff to the semantic layer look like? Who governs the mirror? Who responds at 3 a.m. when Direct Lake errors? Those questions are organizational before they’re architectural. The architecture just reflects the answers. 

The Teams That Will Win 

The teams that stop waiting for one platform to win, and instead build deliberately across the engineering-consumption divide, will outcompete the teams still running the debate. Not because “better together” is a clean story; it isn’t, at least not yet. It has two governance planes, two FinOps practices, real lineage gaps, and a skill split you have to staff for. But it is an honest story, and honest stories age better than vendor narratives.

The teams still waiting for one platform to win will keep re-platforming. Every 18 months, a new vendor announcement will restart the argument. And while they’re arguing, the teams that committed to a seam and scripted the bridge will be shipping. 

The question was never which platform wins. The question was always which workloads belong where, and whether you have the organizational clarity to stop pretending that’s the same question.

Need the decision framework first?

Read Part 1 for a practical guide to where Fabric, Databricks, or both fit best.

