Tag: Data Connections

  • Power BI datasets: A Method to the Madness

    Power BI datasets: A Method to the Madness

    As report authors we sometimes get caught up in how easy it is to create a report and provide value to the business. Each report is an opportunity to make a big contribution to the organization. Power BI makes it easier than ever to turn many of those reports around quickly. This is a good thing of course. But, sometimes we can get caught up in the madness of turning out another report with only a flash of recall that we could have used the same or similar model done in a different report. The internal monologue kicks in.

    “Pffft! Re-use a model, that is way to much work! Why do that?, when we can just create a copy of the PBIX (Desktop file) and have two models to manage with the two reports for the same business area! That sounds like fun!”

    Hmm… Or does this actually sound awfully similar to the challenges one might face with sprawling Excel solution. Where variants of logic are buried in different files. Only the composer knows how to bring order from the chaos. Might I suggest that we spare ourselves that sort of pain. Just learn how we can easily leverage our already hard fought model work. Avoid tedious updates without having to over complicate our nirvana of sticking to the business world, but how would we do that?

    The Answer

    Power BI datasets. Power BI datasets allow us to re-use our model across multiple reports. This simplifies and speeds up future report authoring. Also this gives us the building blocks for sharing that dataset to a wider audience.

    How do we create one?

    Technically, we’ve likely already created many. You see, when we publish a Power BI report, we publish a dataset along with it. This dataset is stored in the Power BI Service, and our deployed report relies on it now. But the connector “Power BI datasets” allows us to connect directly to any of these datasets that we have permission to edit. This means that we have the ability to extend a single model across multiple reports without the need of standing up a separate Analysis Services server anywhere. This is a big deal, this allows the everyday business user to leverage a reusable model. A single change or update to a calculation can update multiple reports at the same time. One measure or calculation addition can be done in one place instead of many.

    All we need to do to create a dataset that we can connect to is publish a PBIX file that contains data. I’ve adopted a practice recently and rather than generating my first report and reusing that model, I now upload a PBIX file that ONLY contains the model and I name it something like “Sales-Model”. Now I have an object that I’ll know serves the purpose of just being a model instead of a report. This makes it easier from a trace-ability standpoint when looking at the related objects in the Service or selecting it from my list of options when choosing my dataset.

    How do we use one?

    Using the Power BI dataset is one of the most straightforward connections in Power BI. Selecting Get Data -> “Power BI datasets”

    This brings up the menu of all the datasets in the Power BI Service. The list that is shown are the datasets that our user account has access to use. The great thing about these datasets are we now have the ability to connect to and use a dataset from a different workspace provided we have permissions to edit them. This feature is called a Shared Dataset. Select the dataset and your report is automatically connected to dataset.

    Now, what we’ll notice here is that using this feature automatically pulls in a model for us and we can start building our report. This data source connection behaves exactly the same as if we created our own “live” connection to an Analysis Services instance we would set up. Probably not shocking to any reader here. But, that is exactly what is happening in this case as well. We get the benefit of Microsoft handling all that painful work for us while we reap the benefits of a streamlined process.

    As with any “easy button” solution, there are pro’s and con’s. What I mean is that in our new reports we do have easy access to the model. Now you can start building reports immediately. We don’t have the ability to modify the model or the ETL processes. If we want to edit then we need to go back to the original dataset to make those changes.

    But the minor inconvenience of having 2 PBIX files open if we need to in order to make updates to the model is trivial compared to being able to connect many reports to that single model. The live connection does still allow the report author the ability to create measures. So, if there are measures that are only suited to our report and not the overall model we still have the ability to add them.

    Once we’ve completed our report, we just publish as we normally would, only this time the dataset is already out in the Service and only our report is published. There are so many things we can now do to share that dataset, but we’ll leave that to another article.

    If you’ve never used this method before, I would highly encourage you to try it out. Any time you can save yourself now with reducing the number of models you maintain, the faster you can produce more reports. You now spend less time maintaining all the reports you are publishing.

    Happy report building!

    If you like the content from PowerBI.Tips, please follow us on all the social outlets to stay up to date on all the latest features and free tutorials.  Subscribe to our YouTube Channel, and follow us on Twitter where we will post all the announcements for new tutorials and content. Alternatively, you can catch us on LinkedIn (Seth) LinkedIn (Mike) where we will post all the announcements for new tutorials and content.

    As always, you’ll find the coolest PowerBI.tips SWAG in our store. Check out all the fun PowerBI.tips clothing and products:
    Store Merchandise

  • Power BI Connections: Direct Query

    Power BI Connections: Direct Query

    Now that we’ve solidly gotten through the basics in terms of what connection types are in the opening blog, found here, and detailed out what is included in the default connection type of Import found here, let’s get on with some of the more interesting connections.

    Connection Type Outline
    Connection Type Outline

    Direct Query is the first connection type that we will discuss that extends, but at the same time limits functionality in the tool itself. In terms of data access this connection type allows us to access our data in the source system. The supported data sources for direct query can be found here. This is distinctly different than what we observed in the import method.  When using Import, the data is a snapshot and refreshed on a periodic basis, but with Direct Query it is live. “Live” means that with Direct Query the data stays in the source system and Power BI sends queries to the source system to return only the data it needs in order to display the visualizations properly. There are some pros and cons in using this connection so it is important to understand when you might use it, and when you should probably avoid this connection.

    Cons:

    • When Direct Query is used you can no longer do many of the data mashup actions in the “Edit Queries” section of Power BI. It is assumed that you will have already done this in the backend. You can do simple actions such as removing columns, but don’t expect to be able to manipulate the data much. The Query Editor will allow you to make transformations, but when you try to load to the model you will most likely get an error that looks something like this

    Direct Query Error
    Direct Query Error 

    • The data tab is also disabled in the model layer and thus you need to make sure that all the formatting and data transformations are completed in the source.

    Data Tab no longer present
    Data Tab No Longer Present

    • You can do some minor adjustments to format, but this could be a heavy restriction if you don’t have access to the data source.
    • There are performance impacts to the report that need to be taken into consideration. How large is the audience that will be interacting with the report? How busy is the source system, are there other processes that could be impacted?
    • Troubleshooting skills in source system language
    • Multiple applications required to adjust data ingestion and formatting

    Pros:

    • The Direct Query connection does not store any data. It constantly sends queries to the source to display the visuals with the appropriate filter contexts selected.
    • In the November 2017 release there is a new capability in Power BI allows you to reduce the traffic and enhance this connection method exponentially. The feature is called Query reduction, and allows you to enable an “apply” button on a slicer or filter. The benefit with this option is that you can set all your selections of a filter prior to Power BI executing the query. Before this feature was available, every selection you made would fire off a query to the source database. To enable this feature, go to File -> Options and Settings -> Options -> Query Reduction you will find these options to help with Direct Query Performance.

    Query Reduction Options
    Query Reduction Options

    Note: This enhancement greatly increases the performance of a Power BI report against the data source, but be aware that there could be poor query performance, or aspects of the solution that would require troubleshooting in the data source depending on what queries are being passed. This would require understanding of how to performance tune the source.

    • Deployment of the Direct Query connection requires the use of the gateway previously called the Enterprise Gateway. Note that the Enterprise Gateway is different than the personal Gateway.
    • No data is ingested into the model using Direct Query thus, there is no need to schedule a refresh. Once the dataset is connected to the Gateway, the data source feeds information to the report as the user interacts with the report.
    • It will always show the latest information when you are interacting with the report.

    Direct Query is a powerful connection type in that it produces the most up to date data. However, as we have seen, it does come with some considerations that need to be taken into account. The Pros and Cons of the connection mostly revolve around whether or not the end user can understand and deal with potential performance issues, updating data retrieval processes, and understand the downstream implications of a wider audience. Typically, Direct Query is used in extremely large datasets, or in reports that require the most up to date information. It will most likely always perform slower than an import connection and requires an understanding of tuning and troubleshooting of the data source to alleviate any performance issues.