PowerBI.tips

Mailbag Hot Takes – Ep. 265

Ep. 265 is a mailbag of quick questions and even quicker opinions.

After a short opener on Fabric’s ‘catching up’ quality-of-life improvements (table maintenance, Delta table performance) and a quick OpenAI DevDay tangent, the crew tackles four real-world topics: incremental refresh, Report Server vs Service, workspace governance, and what changes in the Fabric era architecture.

News & Announcements

Main Discussion

The questions in this mailbag are all variations of the same tension: make things fast today without creating a refresh/governance mess you can’t maintain next quarter.

Here are the takeaways worth stealing:

  • Stop reloading what didn’t change. Look for incremental patterns in both the warehouse ingestion (ETL/ELT) and the Power BI semantic model refresh.
  • Use incremental refresh with a rolling ‘change window.’ Reload the last N days/months to catch late-arriving updates while keeping refresh times and costs predictable.
  • Turn on ‘Detect data changes’ when you can. If you have reliable updated timestamps, you can avoid touching older partitions unless something actually changed.
  • Deletes are the hard part. If the source doesn’t expose deletes (or change tracking), you may need wider reload windows, soft-delete patterns, or a different ingestion strategy.
  • Treat Power BI Report Server as a transition state. The Service is ahead on features and iteration speed; invest in tenant/admin settings and security posture so ‘cloud’ is a managed decision, not a fear response.
  • Govern workspaces with standards + education, not bureaucracy. Start controlled, define naming/ownership, promote certified content, then open creation as adoption grows.
  • Remember: Fabric doesn’t change the architecture—only the options. You still create curated tables → build semantic models → publish reports; Fabric just adds engines (lakehouse, warehouse, notebooks, pipelines) to produce those tables.
  • Choose tools by scale. Dataflows are approachable; Spark/pipelines are often the next step for higher volume and performance.
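The rolling 'change window' and 'detect data changes' patterns in these takeaways can be sketched in plain Python. This is illustrative selection logic only — real incremental refresh is configured per-table in Power BI Desktop — and the partition dates, dictionary shapes, and function name are all invented for the example:

```python
from datetime import date

# Hypothetical monthly partitions for one table:
# partition start date -> max "updated" timestamp seen at the last refresh
stored = {
    date(2023, 7, 1): date(2023, 8, 2),
    date(2023, 8, 1): date(2023, 9, 1),
    date(2023, 9, 1): date(2023, 10, 3),
    date(2023, 10, 1): date(2023, 10, 30),
}

def partitions_to_refresh(stored, source_max_updated, window_months=2):
    """Reload the last N months unconditionally (the rolling change window),
    plus any older partition whose max-updated timestamp in the source moved
    since last refresh (the 'detect data changes' idea)."""
    months = sorted(stored)
    hot = set(months[-window_months:])  # always reloaded
    changed = {m for m in months[:-window_months]
               if source_max_updated.get(m, stored[m]) > stored[m]}
    return sorted(hot | changed)

# A late-arriving update landed in the August partition, so August is
# reloaded alongside the two "hot" months; July is left untouched.
plan = partitions_to_refresh(stored, {date(2023, 8, 1): date(2023, 10, 31)})
```

The payoff is exactly what the episode describes: with three years of history, each refresh touches two or three partitions instead of the whole table, and older partitions are only reopened when a timestamp proves something actually changed.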

Looking Forward

If you want self-service to scale, invest early in incremental patterns, shared models, and lightweight governance—then let creators move faster inside the guardrails.

Episode Transcript

0:30 good morning everyone welcome back to the Explicit Measures podcast with Tommy Seth and Mike good morning and happy Tuesday gentlemen how are you this morning there it is that's what we're looking for oh that's perfect man perfect we're back at it again another week what they say another day another dollar so mo money mo money and here we go the circle turns right wait As the World Turns yes do that one do the reference

1:00 old soap opera man that's an old soap opera I believe like the soap operas they would run all day long sorry wow I would actually not watch them I would not pay any attention to those things neither would I but it was like what your moms used to watch growing up couple things in the news any newsworthy items actually before we do that let's do a quick topic today our topic today will be all around grabbing a whole bunch of items from the mailbag and we're just going to do a hot take and quick run through them in an

1:32 expedient way so Tommy's going to list us some topics he's going to ask us some questions and then from there we'll just take a quick answer because we've actually had a lot of mailbag items pop up so that'll be our episode for today we'll be talking about a lot of random questions that'll probably be all over the board we will likely get stuck on one of them and discuss and argue about one topic but usually that's how it works with us we have this grand idea of planning out what these episodes are going to look like and we just wind up bickering about something

2:04 small I don't know what you're talking about the hours of preparation oh hours and hours yes we prepare for this in news any other articles and or newsworthy items that came out this week for everyone I would say there's lots of stuff coming out but there is not anything of substance I guess that I would call out specifically I would say there's a lot of what I would maybe call quality of life type items that are

2:35 happening inside Fabric and it seems like Fabric's getting a lot more of the investment right now there's a lot more warehouse support more other features that are Fabric related one of the ones that I saw come out this week that I'm actually quite excited about is question no I wouldn't call it quality of life I'd call it all of the features in the product yeah in the product being announced prior to GA because they exist in the product but just not the

3:05 Fabric product oh that's true okay yeah so not quality of life just trying to get back to ground zero again yes agree with that one that is a fair assessment a lot of these features and this is what I've been saying from day one my clients are like well we want to know should we move over to Fabric is this great should we just start well if you already have something built in like Azure Data Factory somewhere else it's probably a bit more feature-rich to stay in Azure Data Factory for

3:36 now before you move over to Fabric because it's still a little bit hard to work with certainly want to test a lot of things a lot of testing take your workloads from your old environment and try and move them over to the new environment because it is distinctly different just because they have the same name like SQL Warehouse doesn't mean they have the same capability quite yet we're being a little glib obviously but I did

4:06 find so there was an announcement yesterday and what we're talking about is things like in the Lakehouse you can now do sp_rename yep that's a function in SQL been there for many many years now it's in Fabric so it's an announcement hey feature like it's so the nice thing is if you're not familiar with that now the new function exactly it's like a rehash of like oh my gosh it's amazing I miss [Laughter] SSIS one of the features that I'm

4:38 actually very happy that has come out recently is if you were going to create tables in the lakehouse you previously had to write a notebook you had to run a SQL command called OPTIMIZE and/or VACUUM to clean stuff up you now have the ability to do the optimize command directly on the table so now it's a right-click menu option on top of the tables that are in your lakehouse this is amazing and very good for when you have tables where you have lots of tiny files or you're loading lots of things it definitely just

5:08 compacts them and makes them optimized for read and write it will definitely speed up your performance things like Data Factory I think Data Factory doesn't have that exact option if you write things into a table as well as Dataflows Gen 2 do not have an automatic optimize function in them so having that as a part of the right-click menu options I think will be helpful so I'm very excited about that as well it'd be nice if they also had an optimize and vacuum command as well because if you're going to optimize the table it does you

5:39 no good just to optimize it only and not clean out all the old files you don't need what is the order of operations like vacuum and then optimize right well I think you want to optimize then vacuum because optimize basically cleans and rolls them together and says look you had a thousand files I can do this in three files it will compact them down to three files and then when you do the optimize it says we're no longer using

6:09 these other files then when you run the vacuum step it says these files are to be deleted in seven days so it marks the files for deletion and then seven days later if you run another vacuum step it will physically say okay we can now remove those files which I found something very interesting in my Fabric tenant usage where I ran an optimize I ran a vacuum I waited seven days I ran another vacuum so the files would actually be removed the files were

6:40 no longer being referenced inside the Lakehouse however my Lakehouse report the Fabric capacity Lakehouse reporting didn't actually show a drop until like 30 days later so there must be some retention policy that lives inside the OneLake environment that I wasn't aware of or you can't see or adjust that was keeping those files around a little bit longer than even my environment specified so it took a while for it to actually delete the files but after eventually this huge

7:11 drop came in my lakehouse like oh look the files have finally been removed so that was good to know anyways I'm not sure how all this works a little bit yet so I'm still a little bit leery on what Microsoft is putting together here because I don't feel like I have full comfort yet about what's happening anyways good stuff did you guys see the bringing your own library into Microsoft Fabric especially I don't know if you guys saw but yesterday was OpenAI's

7:42 conference or their Dev Day and apparently what they announced this idea called Assistants which anyone can do without any coding I've already been tinkering or playing around with how could that work with everything already in like a notebook right because rather than having to deal with all this code dude what OpenAI's been doing is going to go perfectly with the semantic model and actually just grabbing a few columns and I really do

8:14 love that you can upload your own requirements your own environment to Fabric I'm a little bit hesitant on this because I've gotten burned a number of times by adding my own libraries into a Spark engine now I believe this is all my understanding is this is Python code running in the example they gave you Python code running in a local VS Code instance and they're

8:45 doing what they call a pip install which allows you to install libraries or things on top of the Spark engine the Fabric engine is how I read it I could be wrong on this but this was very heavily dependent on whoever makes the package making sure that your Fabric has the right runtime so they work together yeah if you ever run a package that is out of support or the Spark engine updates and you don't have your package updated you will be forced it will just break it'll

9:16 stop working and you won't be able to use it not everything is backwards compatible to some degree some of those breaks are insanely hard to find too so hard like anyway and it's just changing the runtime on a cluster or something like that yep and I don't even know honestly I don't know in the Fabric world if you can pick which runtime you're running on yeah it's a little bit trickier so anyways I like the idea that you can do it I think it will definitely solve some use cases where people who need to have

9:46 some custom code cool I would just caution you very much against it because I have gotten burned by it a number of

9:52 times where things just totally stopped working and it wasn't very easy to get things fixed again so anyways just be aware I think it's cool I have gotten personally burned by it I will not use it yeah it is still pretty buggy in terms of you try to import a package and it will tell you there's an error and it tells you to copy the error log which then gives you another error so you're like okay cool I don't know if you guys saw this but there was a tweet sent out

10:22 by someone I don't know if it was off of the Microsoft team but they had made a comment it was on Twitter I believe you made a comment somewhere are you kidding really well it was around this whole idea of I think it was around the statement around the optimize now you can optimize on a table itself and I said really the challenge is and I was looking at the Databricks environment this is going to

10:52 maybe go too far off on a tangent well pull me back in here if I'm going too far out here but the comment was hey look we have now a right-click optimize on these tables that's the comment I made earlier but with this I said look this is really fine this is very cool I love the fact that we can optimize via the right-click menu it'll definitely help out a lot for some of our business users but I said really the challenge here is how do you design the right partition strategy and how do you design the right size of files

11:22 like there's a whole bunch of things you can do to optimize how many files you have what are the size of those files what's the partitioning strategy for this table because how you read and write that table is highly dependent on your partition strategy and I just watched a Databricks session they're using AI to help you build optimization on top of your Delta tables in Databricks and I was like this is where AI should be applied yeah because if you have really large tables and you're not

11:53 quite sure what is the partition strategy they talked about this idea of like the large customer I have a lot of records and I have one customer that's really large but a lot of other small customers when you partition by customer you get this one massive partition for one person and all these really small partitions for everyone else so the AI they've been able to make a model using millions of Delta tables that are being split up all across Databricks and basically optimize it for you I'm like that's what I want Microsoft to do

12:23 like where are the hardest challenges working with data yeah it's this idea of I have data that's coming in that's being restated which makes it challenging so how do I handle restatements of data I have data that comes in incrementally day by day but I need to have a full year view of that how do you make that easy for me and I have this challenge around well how do I pick the right design of the Delta table yes it's very cheap to store things fine whatever I can get away with

12:53 a lot of not-well-designed things but if I want it to be fast everything we do in the lakehouse it's all pay per use so I want everything to be fast and efficient therefore I want Microsoft to figure that out like hey I'll throw you a table Microsoft you just figure out the best way to store it most efficiently so that way all my reads on it are fast and optimized all my writes on it are fast and optimized and there's all this really interesting tech that they were building out there so I'm hoping Microsoft's listening to what others are doing in this industry because I think Delta is the way to go it will be the

13:24 format that everyone's going to use and it's becoming more universal now it's going to be who can optimize it better yeah I'll tell you though that's a compelling story if that's a marketing pitch or like something they're going to ride that's very compelling if I have a choice between one service that just charges me right and without the granularity of understanding what things I can pick or a service that is telling me

13:54 listen this is a service we're going to make as optimal as possible but as things change it's automatically going to adjust and remain optimal and there's a cost difference in there like because now everything's running so much more efficiently like all these things run really well right now right well totally it's not until you run into this buffer of like why did this jump why am I looking at a cost increase somewhere along the lines or it's like oh we

14:24 didn't vacuum and clean up our tables and there's just an immense amount of extra data that we're not doing anything with oh okay why isn't this running as efficiently as it possibly could like oh it's because we need to repartition things and the further down those rabbit holes of hardcore data engineering you need to go the more time-consuming they are right yes so if you're going to solve those problems and make your service more like I can extend it further I'd say that's compelling and I put

14:56 the link to the session in there on the Twitter thread Justina who's the PM for a lot of the AI and Microsoft big data stuff around Spark she's like yeah we're definitely looking into it this is the next step for them I think but to me the whole point is Microsoft is definitely in this catch-up mode they will probably build a very easy UI on top of these very complicated things it won't all have to be done in a notebook I like where they're going with this I just also think Microsoft will need a

15:26 bit more time to continue to round out and refine the experience of yeah this whole lakehouse thing and it will get really good it's already getting really good I really love the experience of making Delta tables and having datasets already built on top of it that's a great experience I really like it so they're doing a lot of good things it's just going to be we need them to continue refining the edges and sanding it down and making it more clean over time indeed anyways enough of our intros

15:56 Tommy I'm not seeing any topics come through so I'm going to push it to you oh yeah I'm ready okay so let's go through just a couple random topics these are all from the mailbag Seth you want to take the first one here and read us out the first topic idea this is from we have no idea so add your name at the end of your comment thank you for your show I've been enjoying it for a few months now I'd like to say I'm new to data work but I have been working as an analyst for a few years now I've observed that my data warehouse goes out and grabs a massive chunk of data from source databases and brings it into the

16:27 DW and overwrites 99% of the exact same data that is already in the warehouse then my on-prem Power BI reports do the same thing bringing massive data from the data warehouse 99% of which was brought in yesterday this feels wrong is there any way to just grab the new data is this the way it has to be or is it a bad data smell being relatively new I don't know how but I think there must be a better way yeah this was my comment just like two seconds ago like

16:58 this is the stuff that should make it easier yes there's definitely a better way Tommy how would you kick this one off what would your first part of response be here for this question yeah this goes exactly with what we've talked about where we're opening up to the masses data for everyone Fabric everything but a lot of people don't know where to start and you can't blame anyone oh here's a data warehouse now it's going to have all the configuration

17:28 with all the settings and all the bells and whistles and there's really no helper tools there it's like okay it looks fine today it looks fine right now but these are the things where you never know what happened it's not like where they introduce a new application or you get a new app and all these helper popup messages come up like hey notice that you're trying to create a view in the data warehouse this is what can happen or hey

18:00 I think this is a Fabric question though yeah but it's still dealing with the data warehouse though no it's just talking generally data I'm bringing the data warehouse is coming from the source system so they're potentially using like a dataflow or the operational system is pulling all this net new data into the data warehouse they didn't say where the data warehouse lives I'm assuming that's not in Fabric yet still new but then all the datasets go back and do full refreshes on everything as

18:31 well so go ahead Tommy did you want to keep going keep going okay I think in this recommendation what I would say here is yes there is a feature that you're looking for and it's called incremental refresh so the idea is you're only looking for the last day or two of information and there are two challenges that I think come along with this one is there is data that has to be restated so for example today I refresh all the data from yesterday usually or in some

19:02 systems they will let you update records from yesterday or update the version of those records so there's always this concept of there's potentially after-the-day-occurred updates in the system that need to be made what I find is there's a window of time when those updates start getting less and less frequent over time so a week two weeks a month two months there is a cycle by which the data starts getting static and so they have in the

19:32 incremental refresh process of Power BI Desktop so you actually can do incremental refreshing of data you're looking for good patterns where you're always appending net new data so

19:43 you can just delete the last N number of days and replace them or the last N number of months so an incremental refresh would be very good in this situation where you could say look I will always reload the last two months of data every single day so it will just drop those two months it'll then reload that data and that way if you had three years of data in there you're not reloading the entire table all the time I would also agree you're right this is a wasteful process but there are tables that you have in your data

20:14 system that you just need to do a full reload no matter what and where I found the challenge lies is if you can't track your deletes in a table from the source system if there's not a way of tracking those deleted records that's where all the stuff falls apart and so again it really depends on your data systems some systems have here's a record I created it here's the created date here's the updated date and here's the deleted date of that record and so the record never actually

20:45 disappears it actually tells you when it was created when it was updated and when it was removed because then you can actually use that information to do filtering or whatever you want in your dataset to get rid of records that are no longer required in your final dataset so that would be my answer use incremental refresh but you're going to have to go through and evaluate all your tables and figure out what you can actually incrementally load and what you have to full load every time well the keyword being incremental right so if they're also full

21:16 reloading everything in the warehouse take a look at that there are plenty of different methods or there should be by which you can create processes that incrementally add data to certain tables and not have to reload everything so I think that's the key word on both those points but it's two touch points I'd look at your ETL into the data warehouse and how the data warehouse is structured because if it is structured right like I'm not

21:47 aware of methods where I would have to reload every single table every time but in both those areas and Mike's description of incremental refresh both those are areas that save an immense amount of money even in the Power BI refreshing like not having to do a complete model reload every day yeah versus a few thousand few hundred thousand few million depending on what size of business you are could be a substantial cost savings yeah and then you look amazing

22:18 then you look like you're saving a ton of money again it's just being smart with your data like what part of your data is new and even in incremental refresh you can do auto detection on top of that you can auto detect when the partitions change so if you have records in your system that have a creation date and an update date you can load data by creation of records because that way any new records appear and then you can verify if the record was updated you can use that as your detect data changes feature because what that

22:49 does is it will let you go through and slowly load older partitions in time when only one record or something has been replaced so again this is a lot more management on what's happening inside the dataset so right good question there move on to the next random question Tommy yes this is from I'll see how well I can narrate like Seth no pressure thanks for your great efforts I've been listening to your podcast for

23:19 more than nine months we got a nine-monther I wish you could cover support for PBI Report Server tweaks tricks hints etc my company is hesitant to load its data to the cloud we installed the Report Server and currently use it for our internal use awesome my hint and trick will be kill that thing as soon as possible and get it into the cloud I know most organizations can't immediately do that it's definitely a

23:49 good first step I will say this I honestly got to be honest I don't frankly spend a ton of time on Power BI Report Server it's so far behind on features compared to the Service so I'll have to lean on maybe Tommy do you have any other tips or tricks that you have found when working with Report Server it's really hard to customize it except for probably the folder aspect that's a good point yeah the only thing is folders I

24:19 had a really good conversation with some people around this particular topic going we're used to using Report Server and in Report Server you have a different security mechanism because you can secure each of the folders and nest those folders in hierarchies right you could have a main folder and you can do subfolders and you can give very metered access to where reports are going to be accessed for those users but

24:49 essentially it's just a file store right that's how it works would you guys agree that that's how you could set it up sure for sure so the challenge was in thinking about so the hint or tip or trick here as you're thinking about migrating from Report Server into like workspaces it's not quite a one-to-one of like a workspace is a folder you could potentially do some of that but I think it's the combination of a workspace is

25:20 for your creators of content and then your folder structure becomes a combination of apps and app audiences yep great recommendation I think I would really if you're going to be using Report Server and again I would also echo too look at the documentation from Microsoft about pushing to the cloud into powerbi.com I would highly recommend if you're going to go to powerbi.com make sure you understand what's in the admin settings of your tenant there are some things you will want to adjust you'll need to go either

25:50 study up on it yourself or go get some training around what's inside the admin center and make sure you go through those in a very detailed way think through what your data policies will be with Power BI because you can shut off a lot of things and reduce what you would call business risk around making sure you have the right settings checked in the admin side of things so I would definitely recommend that for sure if you're going to go to the cloud as well I would not stay in Report Server yeah not unless you have to like

26:20 I know government but even government has its own online I think they got their own cloud service their own cloud right yeah they got their own so many benefits especially if you're using Power BI Microsoft is probably the best company bar none that takes data security seriously it's top of mind in everything they do right like I would recommend moving into the cloud there's no reason to be where we were 5 10 years ago people this

26:52 newfangled thing that people don't understand everybody clearly understands it and if you're not utilizing it and using it you're actually I would argue putting yourself at a disadvantage with other companies I'll also just throw out my personal preference here on this one many companies that I see staying on-prem they're not even up to date on the existing hardware they have they're more vulnerable yes so in my opinion like what I've seen is one

27:23 of the major moves or one of the major reasons why companies try to move away from on-prem into cloud is because the cloud is software as a service it's always up to date any security patches are coming multiple times per month Desktop and powerbi.com are getting multiple updates per month little things here and there and tweaks here and there so in my opinion I think it's actually better to lean on Microsoft and let them own the infrastructure and now with Fabric it's amazing because I don't have to manage the blob storage I don't have to manage

27:53 the Spark clusters I don't have to manage the notebooks like all this stuff that part I could argue like well and to some degree though I don't want to manage it I just want to work I get it but and I agree 100% if underneath the covers like we talked about in the opener I know everything is being optimized in the most efficient way and now I don't know that and that would be my hesitation but for someone who's if you're going from Report Server into

28:23 the cloud for the first time, this is a good place to start. I do think there are better tools for data engineering than what Microsoft has provided. You can do really well, but Dataflows Gen 2 is quite buggy for me; it's been getting better, but it still feels buggy. I'm going to lean on something Tommy and I have talked about in the past: Dataflows Gen 2 feels a lot like metrics (initially called goals when it came out). It was a really neat feature; I was very

28:54 excited about it, and then I realized it doesn't really work, it doesn't do what I want, the experience doesn't make me happy to work with. Well, a year and a half later, they kept adding features, developing and refining it, and now you look at it going, oh dang, this is a pretty solid feature; this can really do some work. I feel like it's going to be very similar with Dataflows Gen 2. It's just a little too far behind Dataflows Gen 1 right now, but it's getting way better, and the more they invest in fixing it, I think it'll be a

29:26 very solid tool moving forward. Yeah, and I think my final comment here is that our recommendations are all to go cloud, and the reason for that, though, is

29:36 that Microsoft made Power BI Report Server work because of the on-prem push right out of the gate, but they're not investing in it. There's no innovation happening against it, from my understanding, or at least nowhere near what's going to happen in the cloud and in the Service, with all the attention on Fabric. So you're stuck with an ecosystem. Enjoy it, but do we spend a lot of time in there enhancing it and creating new things? No. It's a

30:07 platform that, if it works for your organization, that's great. One other thing I'll put out here: Microsoft came out with a really good layer of documentation. Melissa Coates did an incredible job rolling out the adoption roadmap, and there is now another section inside the documentation called implementation planning. It talks about your BI strategy, your tenant settings, your tools and devices, and how you set up workspaces and security. It goes one layer deeper

30:37 than the adoption roadmap, which is very high level and talks about data culture. I think your question is really sitting around "our data culture says we need to stay on-prem," and reading between the lines a little, you're saying we should be able to change our data culture, help them be comfortable moving into more of a cloud and PowerBI.com environment. So I think you actually have more of a people problem here, where you're going to need to manage the story. And the goal of this, in my

31:07 opinion, if I'm going to solve this, is to go make sure we have an executive sponsor and make sure they buy into going to PowerBI.com. Because if you don't have that at the top, you're going to stay with Power BI Report Server, and you probably won't be able to get the adoption and buy-in that you need. There is a level of education, and/or a push across the organization, required to have them understand how this new ecosystem works inside PowerBI.com.

31:37 So that would be another link; I'll throw it here in the window. I'll grab the introduction to the implementation planning docs, because I think that's another great place for you to read up on this stuff. All right, I guess I'll take the third one. We've got another question around Power BI, now Fabric: workspace creation policy for self-service BI. What should an organization's policy be for creating Power BI workspaces in order to support a

32:08 self-service Power BI within the organization? Here are some of my factors. One: I want to prevent an explosion in the number of workspaces. If there's a problem in a workspace, will those problems increase exponentially because I have a lot more workspaces to deal with? Then again, we don't worry about the explosion of child folders on an organization's share drive, so why is this a problem for Power BI workspaces? That's a good question on part one.

32:39 that’s a good question on like part one a number two can only one app be published from a workspace therefore a single team will need multiple workspaces to contribute to the explosion which will be contributing to the explosion of workspaces question number three around this one who should be allowed to create a a workspace another subsequent question on there does it create workspaces and for for requested users and teams number four should there be a naming standard and or pattern to the

33:10 naming of your workspaces, and if so, why? All right, this is a great question. Workspace management is, I think, where this is coming from. All right, gentlemen, tear this one apart. I've got thoughts on this one; what do you got for us? I'm going to see if I'm going to argue with you. So I'm going to lean on this area of: is it certified or is it not? I think there is a barrier: if data is

33:40 coming out of the central team and we're expecting people to trust it, you're talking about certified workspaces, certified datasets, and certified reports. So first and foremost, the organization should be able to create a lot of workspaces. Now, this will depend on your organization. Some companies will want to be more hands-off and will allow users to create their own workspaces as needed; other organizations don't want to do that, and their internal policy... this is a

34:10 policy, I think, that needs to be made organization by organization. However, if you're going to lock off or shut off the ability for anyone to create workspaces in your organization, I think at a minimum you need a policy, or a justification process, for a business unit to come back and say, "Hey, I need a workspace, and here's what I'm going to use it for." So there are two ends of the spectrum. One: there should be a small number of certified workspaces, with good data in them, being served to the broader organization. Part two of this is to

34:41 decide whether your organization is allowing anyone to create a workspace, or whether you need that closed off. If you have it closed off, make sure you document the process: put a form on a SharePoint page and explain how people go get access to a workspace. That way you can slowly meter the rollout of workspaces. Those are my initial thoughts. I have more thoughts around apps and app audiences, and I think those will also help keep the proliferation of workspaces down. But where does your opinion lie? What do you

35:11 think, how should it be applied? Well, I don't think I have an opinion in one direction or the other, because I see it working well in both directions. I see some organizations finding value in letting it go wide open; innovation runs rampant and wild. However, with that comes a bit more management effort on your IT or central BI team, right? If you let the organization build whatever workspace they want, now you have to be more

35:43 rigorous around identifying what content is certified; I think that's an increased effort. If you have it managed, I still don't think you ever close it off and never give anyone an extra workspace, but I do also see value in IT playing a central role in managing those workspaces for teams of people. So I see it both ways, and I've actually seen it working successfully in both types of environments. It really depends on: does

36:13 your culture (this is a data culture question), does the culture in the company require that there's a central team, and is that central team large or not? If it's a very large central team that can handle building a lot of these things for you, or you can offload this to a ticketing system where it can happen in an automated way, then I would say yes, you let the central team control more of the workspace creation. If you don't have that capability, and you're just looking to utilize Power BI, in very big

36:44 organizations I've seen, they don't have enough central team to actually manage all this for you, so they have to think of another way to give a bit more control to the business. Yeah, I lean more towards the control side, but not to the point where it's all handled by a team. So I guess more upfront thinking, right? Guidance for best use cases. And one of the reasons for that is, think about the proliferation of folders in

37:15 SharePoint, right? "How do I find things?" becomes the question a lot of the time. Or: we have so many things within the ecosystem; didn't somebody create this report? They can share it with you from over here. It becomes quite a mess. So at minimum I would say: here are some good practices. You can have the freedom to build out workspaces and use them in your groups as you need, but here's

37:45 the purpose of a workspace, here's how we share things; aim for these goals, because it's going to help the organization in these ways, and this is how the team is going to be interacting with you. But yeah, I agree, unless you have the one person who's a constant support for "hey, I need this workspace created," doing that diligence against defined guidelines and best practices for the organization, or governance rules

38:15 like that. That'd be a boring job, but at the same time, this can get out of hand very quickly, whatever size the organization is. And again, even if you're a small business, I'm completely worried about this. Even if you have managed self-service, if you have the hub-and-spoke approach, think of the amount of replication that you have to do. One, you'd better know how to use the API or have

38:46 some app to create your dev and prod workspaces, and then try to manage all of that. Unfortunately, too, the only way to actually see all the data that's going on is the Scanner API; however, it only runs in batches of like 100 workspaces. And then there's the idea that every team, every department, has three or four workspaces for a deployment pipeline, and the roles that go with them. You're making some very broad assumptions there.
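The 100-workspace batch limit on the Scanner API, mentioned above, is a real constraint when auditing a large tenant. As a rough illustration of how you might work around it, here is a minimal Python sketch that chunks workspace IDs into Scanner-sized batches and builds one `workspaces/getInfo` POST per batch. The workspace IDs and token are placeholders, and the query-string parameters should be verified against the Power BI Admin REST API documentation before use.

```python
import json
from urllib import request

# Admin Scanner API endpoint (see the Power BI Admin REST API docs).
SCAN_URL = "https://api.powerbi.com/v1.0/myorg/admin/workspaces/getInfo"

def batch(ids, size=100):
    """Split workspace IDs into Scanner-sized batches (max 100 per call)."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]

def build_scan_request(workspace_ids, token):
    """Build one getInfo POST for a single batch of <= 100 workspace IDs."""
    if len(workspace_ids) > 100:
        raise ValueError("Scanner API accepts at most 100 workspaces per call")
    body = json.dumps({"workspaces": workspace_ids}).encode()
    return request.Request(
        SCAN_URL + "?lineage=True&datasourceDetails=True",
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )

# 250 hypothetical workspaces -> 3 scan calls (100 + 100 + 50).
requests_to_send = [build_scan_request(b, "<admin-token>")
                    for b in batch([f"ws-{n}" for n in range(250)])]
```

Each built request would then be submitted and its scan result polled; the point is simply that the batching loop, not the API, has to absorb the 100-workspace ceiling.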

39:17 I don't think I'm giving a deployment pipeline to a business unit, frankly. To me, that's a certified layer of data, and I would say

39:26 a deployment pipeline is Premium, too; either you're buying Premium or Premium Per User. To me, a deployment pipeline is starting to feel more like a certified dataset area, and that, I would agree, should be managed by a central team. You're going to have more people in that team who are developer-centric, as opposed to "hey, I'm just a business person, I'm going to randomly grab some data, stick it together in a Power BI report, and be done with it." Does that make sense?
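For the central teams that do own deployment pipelines, promotion can be automated rather than clicked through the portal. Below is a hedged sketch of the Deployment Pipelines REST API's deployAll call; the pipeline ID and token are hypothetical placeholders, and the option names should be checked against the official API reference before relying on them.

```python
import json
from urllib import request

def build_deploy_all(pipeline_id, token, source_stage=0, note=""):
    """Build the 'deploy everything from one stage to the next' POST.
    Endpoint shape per the Power BI Deployment Pipelines REST API;
    pipeline_id and token here are hypothetical placeholders."""
    url = f"https://api.powerbi.com/v1.0/myorg/pipelines/{pipeline_id}/deployAll"
    body = json.dumps({
        "sourceStageOrder": source_stage,  # 0 = Dev -> Test, 1 = Test -> Prod
        "note": note,
        "options": {
            # Let the deploy create missing items and overwrite existing ones
            # in the target stage (verify option names against the API docs).
            "allowCreateArtifact": True,
            "allowOverwriteArtifact": True,
        },
    }).encode()
    return request.Request(url, data=body, method="POST", headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    })

req = build_deploy_all("00000000-0000-0000-0000-000000000000", "<token>",
                       note="nightly dev to test promotion")
```

A scheduled job calling this endpoint replaces the "upload to dev, delete, republish" routine the conversation pokes at below.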

39:56 You're touching on a bit more of the higher-end, Premium features of Power BI, and I would not expect a normal business user to understand or use them. Yeah, but at the same time, you're still dealing... okay, so maybe the business is not going to have deployment pipelines. But if you're in an enterprise organization, you're going to have teams who are more than just high-level business users creating a report here and there. If you really, truly have that hub and

40:26 spoke approach, those spokes are on their own, with their own deployments and their own process of creating reports, so they're going to need some of that configuration. Are you going to just give them a dev and a prod workspace where they do it the old-fashioned way, uploading to dev, then deleting and republishing? I don't think so. Do I care, though? That's their data. To me it lives more with who's going to own it, right? So if you

40:56 who’s going to own it right so if you think about this is and this is some of the diagrams that come off of the Microsoft PB adoption road map there’s a great diagram on there that describes the blend of is every is the data set and Report going to be made by the business is the data set going to be made by it and reports made by the business or is the report and the data set going to be made by it or the central bi team whatever that is right so depending on the mix of what you think is going to to be built here I would argue the the the brains of your operation should focus on building solid

41:28 models that can be reused in multiple ways. So when the business is asking for access to data, you want to centralize the hard stuff, and the hard stuff is making the model that can be reused in many different situations. My opinion here is that more organizations should focus the modeling exercise on a central BI team (one, it's a bit more fun, because I enjoy the modeling side of things) and then offload more of the reporting pieces. There may be a handful of reports that the central

41:59 team does, but really push that reporting element onto the business side and let them own it. I think this is where data stewardship and the content management piece come into play, because I don't want to own everything unless I have enough budget to put a really big team around it, right? And Steph is making a comment here on LinkedIn. Steph, I would agree with you: it is difficult to put the genie back in the bottle in this situation. If you just open it wide

42:30 open, with no policy and no planning, it's almost impossible to pull back together, or you'll spend a lot of effort doing it. So I really do think you should roll it out slowly over time, because there's an education pattern that needs to happen with the business as they come to understand what a workspace is, how to make an app, and who owns what inside these things. Yeah, as we're talking, I think this falls into the same thing: start out slow, probably more controlled, with a bent

43:02 towards hub and spoke. It works for you too, Tommy, right? You don't just open up the doors. Correct: you control it for a while, educate as much as possible, create guidance and guardrails, and then I would say open up the doors, because you don't want to manage all that. No, as a central team you're not going to want to be in the business of creating workspaces, but you also don't want the adverse side effects of just letting people do things without any understanding of what the purpose of

43:33 these objects is, creating a mess. So we found some agreement, yes. The only other one I'll note here: they asked about only one app published per workspace. I would highly recommend you look into app audiences, because app audiences allow you to take a single app and break it into chunks so that different people can view different things inside that app. So yes, it is still only one app

44:03 per workspace, but you no longer need seven workspaces just because the audience changes based on those groupings of reports; use app audiences to do that instead. Great question. All right, Seth, I believe it's over to you for maybe our last question; we've got a couple minutes left. Want to take our next one, Seth? Yeah. Man, this one's going to be difficult, because there's actually a name, and it's a hard one. So: "Hi Mike, Seth, and Tommy. In the pre-Fabric era, as far as I know, the most optimal

44:35 architecture for dataset and report development was the following: collect data sources into dataflows; create shared datasets, which load data from the dataflows; create reports built upon the shared datasets. In the Fabric era, how does the optimal architecture look for dataset and report development? Is the pre-Fabric architecture described above still the most optimal, and if not, what is your idea on it? Will Power BI Desktop remain the primary tool for dataset development, or will the Power BI Service take its place?" So there's

45:05 a side question. "Thanks for your time. Regards, Balázs." Blaze; I'll go with it. Looks good to me. Great question. So first and foremost, the Power BI Service will definitely take over from Desktop at some point; it's already becoming a better experience. I will say this: my initial reaction to this one is, does the pattern

45:37 change? Dataflows, tables, dataset, report: my answer is going to be no, because the pattern is still the same. You're still making tables of data, except now you've been equipped with a lakehouse, a Spark notebook, Dataflows Gen 2, a Kusto engine, a SQL server. So I don't think it's changing; in reality, you're doing the same thing. Nothing's changed per se; it's now a matter of which compute engine you

46:07 need to use in order to make the table, and I think that's how I would summarize it. I don't think you're changing the architecture; it just grows. You're bringing a lot of newer tools, or different tools, into how you create those tables. That's the only thing I see changing.
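That "same pattern, different engine" idea can be sketched as a simple routing rule. The engine names come from the discussion above, but the row-count thresholds below are invented placeholders for illustration, not Microsoft guidance.

```python
def pick_engine(rows, code_first=False):
    """Illustrative routing only: the 'make a curated table' step is the same,
    just served by a different engine. Thresholds are invented placeholders."""
    if rows < 5_000_000 and not code_first:
        return "Dataflows Gen 2"        # low-code, fine for smaller tables
    if rows < 100_000_000:
        return "Spark notebook"         # code-first, scales out
    return "Pipeline copy + Spark"      # bulk movement first, then transform

choices = [pick_engine(100_000),
           pick_engine(50_000_000),
           pick_engine(1_000_000_000)]
```

Whatever engine the rule picks, the output is the same curated table that a semantic model and report then sit on top of.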

46:39 Yeah, I wouldn't say we're ready to say things are going to change, but with everything getting rolled into the workspace, and all those features and capabilities available (granted, they're heavier on engineering), does it challenge dataflows? Do I need dataflows if I have access to other paths of ETL and can create objects that I can directly plug into? I think the answer is yes, you could definitely start with dataflows, especially if you have smaller data. But I've seen

47:11 some notes here on Twitter, and other people saying, that the Spark engine is really fast, and when you're doing larger amounts of data, tens of millions or hundreds of millions of rows, you will likely graduate out of dataflows. Dataflows is very... I love it; it's a very UI-driven, click-and-drag, easy-to-use situation, and I really like that. The other thing I'd also point out is that alongside dataflows, you're now seeing that Microsoft

47:42 flows you’re also seeing now Microsoft is trying to build this thing called pipelines sometimes you just need to get the data to your lake or get the data into a table format the pipelines experience is designed to be that at scale large amounts move movement of data at high speeds and very efficient so data flows is good but it’s still single threaded it doesn’t really it doesn’t always run as fast as it needs to so what I see Happening Here is you’re getting more tools that are more capable around data movements or data engineering pieces that are going to

48:13 help you handle larger amounts of data, or the same amount of data in a faster way. Yeah, I think that's one of the most compelling things in the Fabric offering, the one I really like and am very interested to see evolve: the graduation from dataflows to these other services, systems that can handle higher volumes if that's the direction you have to head. But

48:43 now that those are all within this same path, I think it brings the level of introduction into a realm that otherwise may not have been there. Outside of that: Tommy, how do you feel about the Service taking over from Power BI Desktop? How do I feel? I'm trying to find the article... I'm at acceptance. Acceptance? Tommy's in the denial phase right now. There's a

49:13 right now it’s is there’s a there’s a there’s a moment like there’s grieving the steps of grieving Tommy’s still in some sections of grieving

49:18 We're well past grieving now; we hit anger. Yeah, anger has happened; we had a lot of things. Honestly, it's actually more enjoyable now, and I'm diving into the dev container thing, creating notebooks in the browser when it's not going to bog down my computer. Tommy's in the bargaining stage right now: "if only I could have some other things." But

49:50 there's actually a really interesting article (I'll have to find it and send it after) about querying data. They tested dataflows, pipelines, and Jupyter notebooks. It's from mim, I think, who does a lot of testing across different systems. Another one I found very interesting: there's this thing called DuckDB, or something, that's apparently very fast for querying data. I think it still uses Delta tables. And mim, I think, also sent out a query: if you

50:20 import this library and use it inside Fabric, inside the notebooks experience, you can cut your query time by like half. I'm like, what the heck, this thing's really fast. So it opens up the world. To your point, Tommy, all we're getting is more variety of tools. From a developer standpoint, or from a business user's, the business user is getting way more value now with these new tools, tons more stuff, which is awesome.
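Assuming the library in question is DuckDB, a sketch of what such a notebook query might look like is below. The lakehouse path is a made-up placeholder, and `delta_scan` is the table function from DuckDB's delta extension, which has to be installed and loaded first; only the query string is built and checked here, not executed.

```python
def delta_top_n(table_path, n=10):
    """Build a DuckDB query over a Delta table using the delta extension's
    delta_scan() table function. The path passed in is a placeholder."""
    return f"SELECT * FROM delta_scan('{table_path}') LIMIT {n}"

sql = delta_top_n("/lakehouse/default/Tables/sales")

# Inside a Fabric notebook, the (untested) usage would look roughly like:
#   import duckdb
#   duckdb.sql("INSTALL delta; LOAD delta;")
#   duckdb.sql(sql).show()
```

Because DuckDB reads the Delta files directly, this kind of query skips the Spark session entirely, which is presumably where the speedup the hosts mention comes from.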

50:50 On the flip side, as an admin, we now have way more stuff to manage: who's using what, where is this data going, how are things going to be shared. So I'm taking the same approach as what we said earlier about the workspaces. I think it's going to be very valuable for us to use these things; however, I want to start small, with a very small team of people figuring out how the tool works and where the best place is to do certain things, because there's going to be an education curve

51:20 there’s going to be an education curve for every organization to figure out okay well we used to do this stuff in a SQL server using ssis packages what’s the equivalent of that pattern now inside fabric we were doing everything with Azure data Factory before what is the equivalent of that pattern that we can now do inside fabric so it’s it’s a lot of things that you you already have tools that do the data engineering you need the idea is where’s the best most optimal way of using the fabric tools to use them do the same stuff you’re doing

51:52 before. So anyway, I really like it. I'm most worried about Fabric on the capacity and usage side of things; that's what I'm most concerned about. It's all fine and dandy: before, with Power BI Premium, it was just loading data into dataflows and running a dataset refresh. That's all the compute we really needed, a front end and a back end, very simple. Now we've got Kusto and SQL serverless, and now we've got Spark, and we've got Dataflows Gen 2,

52:23 spark and we’ve got data flows Gen 2 these are all different things that are all they’re going to do is consume more time on the processing time on your capacity units every one of these things are just going to consume more stuff so what does that mean how can I it’s going to be more important than ever to understand what is the right compute engine for the right type of job data engineering job you need because if I can run a 30 minute window of time on a data flow but I can do it in 15 minutes

52:53 or 10 minutes in a Spark notebook, well, I should learn Spark notebooks, because then I can do the same data, the same compute, in half or a third of the time. There's an opportunity here to optimize your existing efforts using these new tools. Does that make sense? It does, and it sparks this other thought. Oh, "sparks," I like the word you used there.
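The runtime comparison above is really just capacity arithmetic. Here is a toy calculation with invented per-engine CU rates, purely for illustration; real consumption depends on your SKU and Microsoft's published capacity metrics.

```python
# Invented CU-per-second rates, purely for illustration; real rates depend on
# your SKU and Microsoft's capacity metrics, not on these numbers.
RATES = {"dataflow_gen2": 12, "spark_notebook": 10}

def cu_seconds(engine, minutes):
    """Capacity consumed = runtime in seconds * the engine's CU rate."""
    return minutes * 60 * RATES[engine]

dataflow = cu_seconds("dataflow_gen2", 30)   # the 30-minute dataflow run
spark = cu_seconds("spark_notebook", 10)     # same table built in 10 minutes
savings = 1 - spark / dataflow               # fraction of capacity saved
```

Even if the per-second rates were identical, finishing in a third of the time cuts the total capacity draw by roughly the same factor, which is the whole "right engine for the right job" argument.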

53:23 Well, for one, Microsoft usually leaves it up to companies' individual use cases to do the comparison of costs. Where it's interesting to me, and where I would be cautious for them, is that hopefully Fabric performs better than those individualized services. If people want to get onto Fabric and then realize, or see, that it doesn't perform as well... right? If I had an ADF pipeline that runs faster

53:54 than that, then the pains of going through what Fabric solves, combining all these services together, are still not worth it for some organizations, if they have the skill sets to just say, "hey, these services already exist independently, I have more control, and we verified that it's going to cost us less." Yes, that would be a big barrier to entry for people adopting Fabric. Which is where I go:

54:26 well, then why aren't you giving some recommendations around it? Because Microsoft doesn't play the game of "hey, you need to migrate onto Fabric"; they're always about the net-new build. Yes, but the same principles apply, right? If I'm doing a net-new build and I'm doing my due diligence, comparing things, you can guarantee companies are going to say, okay, we're going to POC this: let's try it in Fabric, and let's try it with these services that we run independently, in an

54:57 ecosystem where we get much more control. Which one's more performant? Which one costs X amount of dollars, and can we see how much is being spent in these particular areas? You're technically challenging yourself against your own products from a cost perspective, so it'll be interesting to see how that plays out. But it's a thought that sparked in my head while you were talking through that area. I will say this: I think there's a large value-add for companies being able

55:29 to see that there is now less time required to integrate multiple services together. I think this is a huge value-add: I don't have to worry about a lot of the virtual networks, I don't have to worry about things talking to each other; you can literally just use it out of the box. That saves an immense amount of time managing connections, service principals, and all this other super-technical stuff. But to your point, Seth, I never thought of that. I never thought of what happens if I run

56:00 the same pipeline inside Fabric as I do inside ADF. Do they run for the same amount of time? I haven't done that comparison yet, but that would be great: if you have ADF stuff running already, do the diff between what that is doing versus what's happening inside the new Fabric world. Does it run faster? Have they been able to add more bits to make it more efficient? Maybe, maybe not. I have been hearing that Data Factory Gen 2 is faster than Data Factory Gen 1, so you could

56:30 potentially rebuild your Data Factory pipelines in Fabric, test them side by side, and you should see some performance improvement there as well. So it'll be interesting to see how this all pans out. Excellent. Well, we only got through, what, four questions? That's actually not bad. I thought we were going to rip through them, which means you folks get more podcasts in the future where we're just ripping through mailbag questions. We

57:00 certainly do appreciate you sending in your questions. It serves multiple purposes, and we get to riff on how you're thinking about your worlds and throw our spin on things, or share where we have some experience. With that, we appreciate your listenership. Thank you very much for taking an hour of your time and letting us ramble about mailbag questions from the audience. We hope you found one of these resonated with you,

57:31 and that this was maybe a bit educational, or at least gave you some additional elements to think about. Our only request: if you like this content, if you like what you heard here, please share it with somebody else; let them know you found some interesting insights in the podcast. This was our candid conversation around Fabric, and how Tommy hates the deprecation of Desktop and is going to cry himself to sleep, and it'll just be a bad day. We'll enter the denial stage here at

58:01 some point for Tommy, and we'll have to get him on board with using only the Service at some point. With that said, Tommy, where else can you find the podcast? Well, you can still find us online; you can't download an app to find us, however. You can find us on Apple, Spotify, wherever you get your podcasts. Make sure to subscribe and leave a rating; it helps us out a ton. Do you have a question, idea, or topic that you want us to talk about in a future episode? Head over to the PowerBI.tips podcast page, leave your name and a great

58:31 question, and join us live every Tuesday and Thursday a.m. Central to join the conversation. Thank you so much; we appreciate you all very much, and we'll see you next time.


Thank You

Thanks for listening to the Explicit Measures Podcast. If you have a question or topic idea, drop it in the mailbag and you might hear it on a future episode.
