You can read the first part of the series via the link below: What is Microsoft Fabric? Part 1: From the past to the present.
We established that Microsoft Fabric brings the needs of a modern data warehousing and reporting platform together into a single cloud service. Well, it’s great and clear that one product, available as a service, covers the varied needs of data warehousing and reporting, but is there some genuinely new technical innovation in Fabric? What’s the point?
In the big picture, data warehousing and reporting are quite simple. We collect data from different sources into a data warehouse, where we model, clean and combine it. From the data warehouse, the data is loaded into a data model for reporting, with metrics and hierarchies defined, among other things. From the data model, the data flows on to reports and dashboards. All of this has been possible for decades, so where is the need for improvement?
I happened to catch a television re-run of a car series recently, which featured the 1961 Jaguar E-Type, a huge success that, while moderately priced, offered the performance of the supercars of its time. The story goes that the E-Type was, in the opinion of Enzo Ferrari himself, “the most beautiful car in the world”.
In the previous Fabric article, we reviewed the past years and mentioned Microsoft’s competitors Qlik and Snowflake. If we put ourselves in Enzo’s shoes, Qlik brought unprecedented performance to the data model, and beautiful reports. The underlying data warehouse, however, was not part of that story. Later, Snowflake brought astonishing performance to the database level with its platform and engine solution. I remember realizing at the time that, in principle, there was no longer any real need to duplicate large data sets into a separate data model for reporting purposes. Reporting tools such as Power BI can query Snowflake directly, but the problem is DirectQuery (the powertrain), which imposes many restrictions on reporting.
What kind of Jaguar is Fabric then? Fabric is built on a new platform. Fabric’s platform is no state secret: technically, it is built on top of OneLake storage, where data lives in Delta format tables that are in turn stored on disk as compressed Parquet files. The revolutionary idea behind Fabric is that the same platform is made compatible with the different engines that are needed.
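To make that concrete, here is a minimal sketch of what working against that platform looks like with the Spark engine. It assumes a Fabric notebook with a lakehouse attached; the table names sales_raw and sales_clean and the column names are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

# In a Fabric notebook a Spark session already exists; getOrCreate() reuses it.
spark = SparkSession.builder.getOrCreate()

# Read a (hypothetical) raw sales table from the attached lakehouse in OneLake.
sales = spark.read.table("sales_raw")

# A light modelling / cleanup step.
clean = (
    sales
    .filter(F.col("order_date").isNotNull())
    .withColumn("net_amount", F.col("gross_amount") - F.col("discount"))
)

# Write the result back as a Delta table. On disk the table is a set of
# compressed Parquet files plus a Delta transaction log in OneLake.
clean.write.format("delta").mode("overwrite").saveAsTable("sales_clean")
```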
“Scheduled transfers grind sales data in the data warehouse into the format required for business monitoring. This takes time. Once everything is ready, Power BI semantic models are refreshed in order to update the reports. This takes even more time.”
Technically, the quoted scenario means that data compiled with, say, the SQL engine is still duplicated into a format suitable for the Power BI Analysis Services engine. In the Fabric ideology, this step is redundant.
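Continuing the earlier sketch, and still only as an illustration with hypothetical names: once sales_clean exists as a Delta table, it can be queried through the SQL surface without any export or refresh step, and a Power BI semantic model in Direct Lake mode reads the same underlying Parquet files rather than an imported copy.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# The same Delta table, queried via SQL — no separate copy is loaded anywhere.
top_customers = spark.sql("""
    SELECT customer_id, SUM(net_amount) AS total_sales
    FROM sales_clean
    GROUP BY customer_id
    ORDER BY total_sales DESC
    LIMIT 10
""")
top_customers.show()
```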
In addition to saving time in the data warehouse, the shared platform allows different data specialists to work with the same data. Other common problems include:
“New data is needed for the sales data of the data warehouse, but a different team is responsible for the data lake, and just accessing the data takes an awful lot of time.”
“A new forecast is needed for the sales data – someone should provide the data scientist with the transfer files.”
“The forecasting model is ready, and it should be integrated into the data warehouse.”
“Can I get production failure data and sales data in the same view?”
Technically, the challenge in these imaginary situations is the siloing of engines (and of data and experts) such as SQL, Spark, Kusto and, finally, Power BI Analysis Services. With Fabric, the platform is shared, management is centralized, and even the data itself can be shared across engines when necessary.
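To make the data scientist scenario above a little more concrete: because the table is ordinary Delta/Parquet data in OneLake, it can also be opened outside the Spark and SQL engines. The sketch below uses the open-source deltalake (delta-rs) Python package; the workspace, lakehouse and table names are hypothetical, and authentication and storage options are omitted for brevity.

```python
from deltalake import DeltaTable

# Hypothetical OneLake path to the table written earlier; in practice the
# appropriate Azure credentials / storage options must also be supplied.
table_uri = (
    "abfss://my_workspace@onelake.dfs.fabric.microsoft.com/"
    "my_lakehouse.Lakehouse/Tables/sales_clean"
)

# Open the Delta table directly from its files and pull it into pandas —
# no transfer files, no separate extract for the forecasting work.
forecast_input = DeltaTable(table_uri).to_pandas()
print(forecast_input.head())
```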
Many of these things have been possible for a long time, but in practice their continuous development or integration into business processes has too often been derailed by various platform, engine, or powertrain problems. Fabric offers a built-in advantage in this regard, and if it succeeds, Microsoft is about to shift the entire data industry into a higher gear.
This article is the second part of a blog post series on Microsoft Fabric. In the already published first part, I discussed the history of data processing leading up to Fabric. In the next part of the series, we will look at Fabric’s performance with large data sets.