Author: mike carlo

  • Fixing Measure Madness

    Fixing Measure Madness

    Often times when you’re working with large data models you will have multiple tables with many relationships.  It could be complex maybe you’ve seen something like the following:

    Large Data Model
    Large Data Model – Photo Credit ( www.biinsight.com )

    Once all the tables have been loaded the manic measure building begins to support all the different visuals.  A couple of sums here, a number of calculates over there, and boom, a beautiful report.  You stand back and survey the work and realize you’ve built measures all over the place, in different tables, maybe even stuck a couple of measures in the wrong place.  Whoops.

    Tons of Measures
    Tons of Measures

    Maybe we should think about cleaning things up a bit, if only there was a way to group the measures.  How do I group my measures?  I’m glad you asked.  With a little trickery we can make a measure table.  Let’s begin.

    First we will load a little data.  For this tutorial we will simply copy and paste in some data.

    Note: For the full tutorial on manually entering in data visit this page.

    On the Home ribbon click the Enter Data button.  Copy in the table below into the Create Table window.  Rename the table Sales Data and click Load to exit.

    Salesman Item Unit Sales Revenue
    Salesman 3 Item 4 405 1357
    Salesman 1 Item 3 339 1649
    Salesman 1 Item 3 315 1332
    Salesman 3 Item 3 418 1531
    Salesman 1 Item 3 482 1633
    Salesman 2 Item 4 448 1676
    Salesman 1 Item 4 391 1432
    Salesman 2 Item 1 341 1539
    Salesman 3 Item 1 419 1482
    Salesman 2 Item 4 414 1610
    Salesman 1 Item 4 351 1670
    Salesman 3 Item 3 449 1795
    Manually Enter Data
    Manually Enter Data

    Upon loading our data table we now have the following fields.

    Fields of Data
    Fields of Data

    Now, let’s make a measure that calculates the revenue per unit.  On the Home ribbon click the New Measure button and enter the following DAX measure.

    Revenue Per Unit = SUM('Sales Data'[Revenue]) / SUM('Sales Data'[Unit Sales])

    Next, make a table with the following fields.

    Salesman Table
    Salesman Table

    Great! but, as we all know this is how the measure madness begins.  From here we refine and finesse the data to craft the data story, and end up with tons of additional tables and measures.

    Pro Tip: You can use the search window at the top of the Fields window to help you find buried measures or fields of data. 

    Using Search in Fields
    Using Search in Fields

    Let’s make the measure table.  Start by clicking Enter Data on the Home ribbon.  Rename the new table to My Calcs, and rename Column1 to Calcs.   You don’t have to re-name column1, but since I’m OCD about my data I like to rename the column to the same name as the table.  Then click Load to exit the screen.

    Measure Table Load
    Measure Table Load.

    We now have a new table labeled My Calcs with one column labeled Calcs.  Next highlight the measure we created Revenue Per Unit.  Then on the Modeling ribbon change the home table from Sales Data to My Calcs.  This will move the measure.

    Home Table for Measure
    Home Table for Measure

    Right click on the Calcs column in the My Calcs table and then select Hide.

    Hide Calcs Column
    Hide Calcs Column

    Next Save and then reopen the document (it’s a Microsoft thing I guess).  After the document has reopened the My Calcs table has changed it’s icon from a table to a Measure icon.

    Completed Measures Table
    Completed Measures Table

    For kicks and giggles add the following measure to the My Calcs table.

    Total Revenue = SUM('Sales Data'[Revenue])

    Ok, one more.

    Total Unit Sales = SUM('Sales Data'[Unit Sales])

    There you go.  A very straight forward approach to cleaning up all the random measures in your data model.  I have found that when others team members are working with your data model this helps other people understand which fields have been calculate and which ones were imported via a query.  This also helps you group logical calculations, further creating clarity within your data model.

    If you want to read up more on making measure tables check out this great site (also linked below).  In addition to walking you through creating a measure table it also explains how to make a measure table when using direct query mode.  As the article explains, while you are in direct query mode you are unable to manually enter data.  Nice job, Soheil Bakhshi, well done.

    How to Define A Measure Table in Power BI Desktop

    If you want to take your DAX skills to the next level, try jumping into this book by Rob Collie and Avichal Singh.  It’s an easy read but very insightful.

    If you liked this tutorial make sure you share.  See you next week!

     

  • Intro to Guy In a Cube

    Intro to Guy In a Cube

    Kicking off my video series for PowerBI.Tips I have to give incredible props to Adam Saxton.  Adam is a Microsoft Employee who creates THE BEST content for PowerBI.  There are a lot of videos out on the internet that are interesting and helpful to learn from.  However, Adam takes it to another level.  The videos are relatively short and packed with useful information.  Below is the introduction video to Adam who is The Guy in a Cube.

    I will be definitely sharing more videos from Adam in the future!

  • Using Advanced Mapping in ArcGIS Preview

    Using Advanced Mapping in ArcGIS Preview

    In the September 2016 release of PowerBI, Microsoft introduced a new visual called the ArcGIS Maps preview.  For more information on the maps integration you can read the following post from Microsoft.  This tutorial will review how to load data using Latitude and Longitude data and map those points on the ArcGIS map.

    First, we need to open PowerBI Desktop and then we will load some data.  The version of PowerBI Desktop for this tutorial is 2.39.4526.362 64-bit (September, 2016).  You can download the latest version of the software here.

    On the Home ribbon click on the Get Data button and from the Get Data window select Blank Query.  Click Connect to proceed.

    Now you will be in the Query Editor, click on the View ribbon and select the Advanced Editor button.  The Advanced Editor will now open.

    Enter the following code into the Advanced Editor: (you can copy and paste the code directly from this site)  Click Done to load the data.

    let
     Source = Excel.Workbook(Web.Contents("https://powerbitips03.blob.core.windows.net/blobpowerbitips03/wp-content/uploads/2016/09/Locations.xlsx"), null, true),
     Locations_Table = Source{[Item="Locations",Kind="Table"]}[Data],
     #"Changed Type" = Table.TransformColumnTypes(Locations_Table,{{"Event", type text}, {"Attenders", Int64.Type}, {"Zip", Int64.Type}, {"Latitude", type number}, {"Longitude", type number}})
    in
     #"Changed Type"

    Note: this will load an excel file that is hosted on PowerBI.Tips, so make sure you have an internet connection.

    Load Map Data
    Load Query

    Re-name your query to Map Data and then on the Home ribbon click Close & Apply.

    Load Map Data In PBI
    Load Map Data In PBI

    Before working on this tutorial, you will want to make sure you have enabled the ArcGIS map which is in preview.

    Click the Menu button to open up the menu options.

    PBI Menu Button
    PBI Menu Button

    This will expose the menu.  With the menu open click on Options and Settings and then click on Options.

    Selecting Options
    Selecting Options

    Once the Options menu is open, click on Preview Features and then make sure the ArcGIS Maps for PowerBI preview feature is check.  Then click OK to close the options menu.

    Options Menu
    Options Menu

    You should now see a new bright blue icon listed in the Visualizations window.

    ArcGIS Maps Icon
    ArcGIS Maps Icon

    Click on the ArcGIS visualization and then add the following following columns of data from the Fields window into the visual.

    Fields for ArcGIS Map
    Fields for ArcGIS Map

    OK, Wow, seems like a normal map.  So, why all the hype?  Well, unlike other mapping visualizations, this map enhances the selection methods for points on a map.

    By clicking on the square with the black mouse arrow (highlighted with a green box here because the selection tool for the visual uses a red box)  You can then click drag a red box across the map to select multiple geographical points on the map.

    Highlighting Points on Map
    Highlighting Points on Map

    Selecting points on the map will filter other visuals on the page.

    Add a Table visual with the following fields:

    Table Visual Fields
    Table Visual Fields

    Now click the Multi-Select button and highlight some points on the map.

    Multi-Select Button
    Multi-Select Button

    Notice how only the selected points are highlighted on the map and the table filters to only those points.

    To enhance the map further click the In-Focus Edit Mode button.

    In-Focus Edit Mode
    In-Focus Edit Mode

    Now, the map editor opens.  This allows you to change the basemap view, the theme of the map, symbols on the map and adds other data to enhance the coloring of the map.

    Click on the Basemap button and then select the Dark Gray Canvas.  We have turned the map in to a sort of night mode.

    Basemap Change
    Basemap Change

    Have fun here and explore a couple of the other map types.

    Next Click on the Map Theme then click on the Heat Map.  Alright, this is getting pretty cool.

    Heat Map Selection
    Heat Map Selection

    In the next section Symbol Style you can change the properties of the points on the map.  For the heat map you can change the Transparency and the Area of Influence of the points.  Each map theme, Location, Heat Map, Size, and Clustering have different Symbol Style properties.  So you might want to select a couple different Map Themes and try adjusting the Symbol Styles to see how they change.

    Now finally, the best part of the ArcGIS mapping, the Reference Layer.  This will blow your mind!

    Click the Reference layer button then select a layer to add from the Demographics tab.  For this example, I chose the USA Average Household Income.

    Household Income Layer
    Household Income Layer

    To return to the Report click the Back to Report button in the upper left hand corner of the page view.

    Back to Report
    Back to Report

    The layer feature is by far the most helpful part of this tool.  Imagine the time required to collect all that regional demographics data, model it and then to apply it to the mapping visual.  The ArcGIS mapping tool is quite impressive.

    One other note before we leave.  Now that you are back on the report level view.  Use your mouse scrolling wheel and zoom in and out on the map visual.  Notice the closer you zoom into the data points the more detailed the regional views become. See comparison below:

    Zoomed Views
    Zoomed Views

    Thanks for following along.  Remember to share if you liked this tutorial.  See you next week.

  • HexBin Plot using R

    HexBin Plot using R

    Continuing on the theme with R this month, this week tutorial will be to design a hexagonal bin plot.  At first you may say what in the world is a hexagonal bin plot.  I’m glad you asked, behold a sweet honey comb of data:

    Hexbin Plot
    Hexagonal Bin Plot

    The hexagonal bin plot looks just like a honey comb with different shading.   In this plot we have a number of data points with are graphed in two dimensions (Dimension 1, x-axis and Dimension 2, y-axis).  Each hexagon square represents a collection of points.  Now, if we plot only the points on the same graph we have the following.

    Scatter Plot
    Scatter Plot

    In the scatter plot, it’s difficult to see the concentration of points and if there is any correlation between the first dimension and the second dimension.  By comparison, the hex bin plot counts all the points and plots a heat map.  And, if you ask me the hexagonal bin plot just looks better visually.  To bring this all together, if we overlay the scatter plot on top of the hexagonal bin plot you can see that the higher concentration of dots are in the shaded areas with darker red.

    Plot Overlay
    Plot Overlay

    Cool, now lets build some visuals.  Lets begin.  Tutorial <- Hexagonal Bin Plot   (sorry had to interject a bit of R humor here, ignore if you don’t like code humor)

    The very first step will be to open the R console and to install a new library called HexBin.  Run the following code in the Mircosoft RGui.

    install.packages("hexbin")

    This will load the correct library for use within PowerBI.

    Install hexbin
    Install hexbin

    Start by opening up PowerBI.  Click on the Get Data button on the home ribbon, then select Blank Query.  In the Query editor click on the View ribbon and click on the Advanced Editor.  Enter the following query into the Advanced Editor:

    let
     Source = Csv.Document(Web.Contents("https://powerbitips03.blob.core.windows.net/blobpowerbitips03/wp-content/uploads/2016/09/Hexabin-Data.csv"),[Delimiter=",", Columns=3, Encoding=1252, QuoteStyle=QuoteStyle.None]),
     #"Promoted Headers" = Table.PromoteHeaders(Source),
     #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"SampleID", Int64.Type}, {"Xvalues", type number}, {"Yvalues", type number}})
    in
     #"Changed Type"

    This query loads a csv file of data into PowerBI.

    Note:  For more information on how to open and copy and paste M language into the Advanced Editor you can follow this tutorial, which will walk you though the steps.

    After the clicking Done in the Advanced Editor the data will load.  Next rename the query to Hexabin Data and then on the Home ribbon click Close & Apply.

    Save Query
    Save Query

    Next click on the R visual in the Visualizations bar on the right side of the screen.  There will likely be a pop up warning you about enabling R Scripts.  Click Enable to activate the R script editor.  With the R script visual selected on the page add the following columns to the Values field selector.

    R Visual Fields
    R Visual Fields

    Notice that the R visual is blank at this time.  Next add the following R code in the R script editor window.  This will tell PowerBI Desktop to load the ggplot2 library and define all the parameters for the plot.  I’ve added comments to the code using # symbols.

    library(ggplot2) #load ggplot2 package
    
    # define the data inputs to ggplot
     # set data for x and y values x=, and y=
     # set the min and max for both the x and y axis, xmin=, xmax=, ymin= and ymax=
     ggplot(dataset, aes(x=Xvalues,y=Yvalues, xmin=40, xmax=90, ymin=10, ymax=30)) +
    
    # define the color of the outline of the hexagons with color=c()
     # using c(#"809FFF") allows for the usage of hexadecimal color codes
     stat_binhex(bins=15, color=c("#D7DADB")) +
    
    # set the graph theme to classic, provides a white background and no grid lines
     # Change font size to 18 by using base_size = 18
     theme_classic(base_size=18) +
    
    # Apply lables to the graph for x and y
     labs(x = "Dimension 1", y = "Dimension 2")+
    
    # change the gradient fill to range from grey to Red
     scale_fill_gradient(low = "grey", high = "red")

    Click the run button and the code will execute revealing our new plot.

    R Script Code
    R Script Code

    One area of the code that is interesting to change is the section talking about the number of bins.  In the code pasted above the code states there are 15 bins.

    stat_binhex(bins=15, color=c("#D7DADB")) +

    Try increasing this number and decreasing this number to see what happens with the plot.

    Five Bins
    Five Bins
    stat_binhex(bins=5, color=c("#D7DADB")) +
    Thirty Bins
    Thirty Bins
    stat_binhex(bins=30, color=c("#D7DADB")) +

    Well that is it.  Thanks for reading through another tutorial.  I hope you had fun.

    Want to see more R checkout the Microsoft R Script Showcase.  If you want to download the PBIX file used to create this visual you can download the file here.

    If you want to learn more about R and the different visuals you can build within R check out this great book which helped me learn plotting with R.

  • Digging Deeper with R Visuals for PowerBI

    Digging Deeper with R Visuals for PowerBI

    Back by popular demand, we have another great tutorial on using R visuals.  There are a number of amazing visuals that have been supplied with the PowerBI desktop tool.  However, there are some limitations.  For example you can’t merge a scatter plot with a bar chart or with a area chart.  In some cases it may be applicable to display one graph with multiple plot types.  Now, to be fair Power BI desktop does supply you with a bar chart and line chart, Kudos Microsoft, #Winning…. but, I want more.

    This brings me to the need to learn R Visuals in PowerBI.  I’ve been interested in learning R and working on understanding how to leverage the drawing capabilities of R inside PowerBI.  Microsoft recently deployed the R Script Showcase, which has excellent examples of R scripts.  I took it upon myself to start learning.  Here is what I came up with.

    R Plot in PowerBI Desktop
    R Plot in PowerBI Desktop

    This is an area plot in the background, a bar chart as a middle layer and dots for each bar.  The use case for this type of plot would be to plot sales by item number,  sales are in the dark blue bars, and the price is shown as the light blue dots.  The area behind the bars represent a running total of all sales for all items.  Thus, when you reach item number 10, the area represents 100% of all sales for all items listed.

    If you want to download my R visual script included in the sample pbix file you can do so here.

    Great, lets start the tutorial.

    First you will need to make sure you have installed R on your computer.  To see how to do this you can follow my earlier post about installing R from Microsoft Open R project.  Once you’ve installed R open up the R console and enter the following code to install the ggplot2 package.

    install.packages("ggplot2")
    Install ggplot2 Code
    Install ggplot2 Code

    Once complete you can close the R console and enter PowerBI Desktop.  First, we will acquire some data to work with.  Click on the Home ribbon and then  select Enter Data.  You will be presented with the Create Table dialog box.  Copy and paste the following table of information into the dialog box.

    Item Sales Price Customer
    1 100 20 Customer A
    2 75 25 Customer A
    3 20 30 Customer A
    4 18 15 Customer A
    5 34 26 Customer A
    6 12 23 Customer A
    7 20 22 Customer A
    8 15 19 Customer A
    9 10 17 Customer A
    10 8 26 Customer A
    1 120 21 Customer B
    2 80 24 Customer B
    3 62 33 Customer B
    4 10 15 Customer B
    5 12 26 Customer B
    6 60 24 Customer B
    7 20 23 Customer B
    8 10 20 Customer B
    9 8 16 Customer B
    10 7 20 Customer B

    Rename your table to be titled Data Sample.

    datatable
    Data Sample Table

    Click Load to bring in the data into PowerBI.

    Next, we will need to create a cumulative calculated column measure using DAX.  On the home ribbon click the New Measure button and enter the following DAX expression.

    Cumulative = CALCULATE(  sum('Data Sample'[Sales] ) ,   FILTERS(  'Data Sample'[Customer] ) ,  FILTER( all( 'Data Sample' )  ,  'Data Sample'[Item] <= MAX( 'Data Sample'[Item] ) ) )

    This creates column value that adds all the sales of the items below the selected row.  For example if I’m calculating the cumulative total for item three, the sum() will add every item that is three and lower.

    Now, add the R visual by clicking on the R icon in the Visualizations window.

    Note: There will be an approval window that will require you to enable the R script visuals.  Click Enable to proceed.

    Enable R Visuals
    Enable R Visuals

    While selecting the R visual add the following columns to the Values field in the Visualization window.

    Add Column Data
    Add Column Data

    Note: After you add the columns to the Values the R visual renders a blank image.  Additionally, there is automatic comments entered into the R Script Editor (the # sign is a designation that denotes a text phrase).

    Next, enter the following R code into the script editor.

    library(ggplot2)   # include this package to use Graphing functions below
    
    ggplot(dataset, aes(xmin=1, x=Item)) +    # Initialize ggplot function, define the x axis with Item data
     geom_ribbon(fill=c("#D7DDE2"),           # Set the color of the Area Plot,
     aes( ymin=0, ymax=Cumulative )) +        # Define the Y-Axis data
     geom_bar(fill=c("#21406D") ,             # Define the color of the Bars
     stat = "identity" ,      # Define the Statatics property of the bars - This is a required field
     width=.6 ,               # Change the bar width to 60% - 1 would be full bar width
     aes( x=Item, y=Sales )) +          # Define the X and Y axis for bars
     geom_point( color=c("#809FFF"),    # Define the color of the dots
     size=4,                  # Define the dot size
     aes( x=Item, y=Price )) +          # Define the X and Y axis values
     theme_classic(base_size=18) +      # Remove unwanted items from plot area such as grid lines and X and Y axis lines, Change font size to 18
     theme( axis.title.x = element_text(colour = "dark grey"),     # Define the X axis text color
     axis.title.y = element_text(colour = "dark grey")) +          # Define the Y axis text color
     labs( x="Item Number", y="Sales")                             # Define the labels of the X and Y Axis

    Press the execute R Script button which is located on the right side of the R Script Editor bar.

    Execute R Script Editor Button
    Execute R Script Editor Button

    The R Script will execute and the plot will be generated.

    R Plot Generation
    R Plot Generation

     

    Great, we have completed a R visual.  So what, why is this such a big deal.  Well, it is because the R Script will execute every time a filter is applied or changed.  Lets see it in action.

    Add a slicer with the Customer column.

    Add Customer Slicer
    Add Customer Slicer

    Notice when you select the different customers, either A or B the R script Visual will change to reflect the selected customer.

    Customer B Selected
    Customer B Selected

    Now you can write the R script code once and use the filtering that is native in PowerBI to quickly change the data frame supporting the R Visuals.

    As always, thanks for following along.  Don’t forget to share if you liked this tutorial.

    Want to learn more about PowerBI and Using DAX.  Check out this great book from Rob Collie talking the power of DAX.  The book covers topics applicable for both PowerBI and Power Pivot inside excel.  I’ve personally read it and Rob has a great way of interjecting some fun humor while teaching you the essentials of DAX.

  • Using R Visuals in Power BI

    Using R Visuals in Power BI

    For those of you who have been hanging around PowerBI for a while you have likely heard about integration with R visuals.   No, this isn’t a twisted dream where Power BI now ships with Pirates… Rather, this has been a highly untapped feature.

    In a brief summary R or as it is known on its site R Project for Statistical Computing, is a statistical open source software package that enables mathematicians, statisticians, or data scientists to quickly calculate complex analysis.  It is the tool of us super nerds.  Now R by it’s self isn’t super powerful, it’s the numerous packages that have been developed by people way smarter than me that can do very amazing functions.  Packages include functions for forecasting, math functions, statistic functions and best of all charting functions.  Well, this may be fine and dandy so what?  Well here is the best part.  Microsoft has chosen to integrate and support various releases of R into it’s tools.  For example R can now be leveraged within SQL server 2016, and now visuals built in R can be leveraged in Power BI Desktop and PowerBI.com.  R can also be used to transform and prepare data during a date set load.

    The important note here is that Microsoft has released it’s own open version of R.  This distribution is called MRAN, and can be found at this site.  The MRAN has been slightly tweaked from the R Project.  In the Microsoft version of R, (which I will refer to as MRAN) there has been stability fixes and the improved performance (added Multi threaded Performance).

    So enough back ground lets fire this thing up.

    First you will need to install the latest version of MRAN.

    Navigate to the following address https://mran.microsoft.com/ Click the Download button found  at the top middle of the page.

    mran-download-page
    mran download page

    Note: At the time of this Tutorial the current version of MRAN is 3.3.1, it is likely that this will change since Microsoft is constantly updating this site and releasing new stabilized & enhanced performance versions of R.

    Select the platform that you will be using to install MRAN on.  I’m using windows, thus I’ll be downloading and installing the top installation version.

     

    windows-platform-of-mran
    Windows Platform of MRAN

    Note: If you need additional installation help you can follow / read the documentation provided by Microsoft.  It can be found here.

    In order to keep this tutorial brief I will assume you know how to install software and have made it through the MRAN installation successfully.  Once installed you should have the following program installed in your start menu.

    Installation of R
    Installation of R

    Run the new installation of R.  The R installation will open up a console window.

    R Console
    R Console

    At the bottom of the console window is a red line where you enter commands.  Enter the following code and press enter.

    install.packages(“corrplot”)

    This will install the proper R package that we will use later in PowerBI.  After running this line of code the console will download the correct package and install it on your computer.

    Install corrplot Function
    Install corrplot Function

    At this time you can close the R console program.

    Now, open up PowerBI Desktop.  Once in PowerBI desktop click on the File Button at the top left hand part of the screen.  Next, Click Options and Settings.

    Powerbi Options and Settings
    PowerBI Options and Settings

    Then click on the Options button.

    Options Button
    Options Button

    Under the Global options menu on the left verify that your new installation of MRAN is listed.  PowerBI should automatically detect the installation and show the installation with the current version number in the home directory:

    R Home Directory
    R Home Directory

    Seeing the listed installation in the Home Directory verifies that R has been properly installed on your computer. Clicking OK will close the window.

    Data Time!!  Below is the M Language that can be used in your Query Editor.  Copy the code below and enter it into the Advanced Editor found in the Query Editor.

    let
     Source = Excel.Workbook(Web.Contents("https://powerbitips03.blob.core.windows.net/blobpowerbitips03/wp-content/uploads/2016/09/CarDetails.xlsx"), null, true),
     CarData_Table = Source{[Item="CarData",Kind="Table"]}[Data],
     #"Changed Type" = Table.TransformColumnTypes(CarData_Table,{{"Year", Int64.Type}, {"Make", type text}, {"Model", type text}, {"Liters", type number}, {"Hp", Int64.Type}, {"Cylinders", Int64.Type}, {"MPG City", Int64.Type}, {"MPG Hwy", Int64.Type}})
    in
     #"Changed Type"

    Note: If you want to learn how to enter M language code into the Query Editor follow this Tutorial.

    Once you have pasted the code above into the Query Editor it should look like the following:

    Advanced Editor
    Advanced Editor

    Clicking Done will close the Advanced Editor and you will have data loaded into the Query Editor.  You must have an internet connection to connect to this data.  Rename your query to Car Data.  Then on the Home ribbon click Close & Apply to load the data into the data model.

    Car Data in Query Editor
    Car Data in Query Editor

    Generate a simple table visual to see our data in table form:

    Table Visual
    Table Visual

    Add an R visual by clicking the R inside the Visualizations bar.  When you click on the R visual you will see a pop-up, click Enable to proceed.

    Enable R Visuals
    Enable R Visuals

    Doing this will open up a visual pane on the page and reveal an R script editor at the bottom of the page window.

    R Script Editor
    R Script Editor

    While keeping the R visual selected add the following fields to the visual under the Values field:

    Add Columns to R Visual
    Add Columns to R Visual

    After adding these fields the R Script Editor will update and reveal code which informs you that your data from the selected columns will be added to a dataset.

    R Code Script Editor
    R Code Script Editor

    Next add the following code into the white area below the #dataset <- unique(dataset) statement.

    require(“corrplot”)
    library(corrplot)

    M <- cor(dataset)

    corrplot(M, method = “circle”, tl.cex=0.6, tl.srt = 45, tl.col = “black”, type= “upper”, order=”hclust”)

    This loads a package called corrplot which allows you to apply a graph that has a correlation plot between metrics.  The M <- cor(dataset), takes your data runs a function called cor and then saves the results into a new variable called M.

    Next click the Play button icon found on the right of the grey bar on the R Script Editor.

    Running the R Script
    Running the R Script

    Success! You have completed a correlation plot using R within PowerBI.  Nice job.

    Final Plot
    Final Plot

    Bonus:

    If you want to get fancy with this correlation plot you can change the circles to the actual correlation values.  Change the last line of the R Script Editor code to the following and press the run script button:

    corrplot(M, method = “number”, tl.cex=0.6, tl.srt = 45, tl.col = “black”, type= “upper”, order=”hclust”)

    This removes the circles and then populates the correlation plot with numerical values representing the correlation between the various data features.

    Correlation Numbers
    Correlation Numbers

    The blue numbers represent values that have a positive correlation, while the red numbers represent a negative correlation.  In practical terms the higher the Horsepower  (HP) of the vehicle the lower the Miles per Gallon (MPG) that are realized.

     

  • Loading Excel Files from Sharepoint

    Loading Excel Files from Sharepoint

    This is a quick tutorial on how to load Excel files from a SharePoint page.  SharePoint is a nice landing place for your data because it can be connected to the PowerBI.com service and thus can be used to schedule refreshes of data within your company (if you already have a SharePoint o365 account).

    This tutorial will be a slightly different than my previous tutorials as I don’t have a publicly available SharePoint site that can be used to connect to.  So you will have to slightly adapt what I’m presenting to you to fit your particular SharePoint needs.

    First you must start off with a SharePoint with a document library that includes an Excel file.

    Sharepoint Location
    Sharepoint Location

    The document library is titled SampleDocs, and the file we want to bring into PowerBI is called SampleData.

    Clicking on the Home in the left navigation will take you to the home location of the SharePoint site.  Copy down the HTML site address from your browser of this location it should look similar to the following:

    https://partner.onmicrosoft.com/sites/[Your Site Name]/SitePages/Home.aspx

    Open up PowerBI Desktop and on the home ribbon click Get Data.  Highlight the SharePoint Folder and click Connect to continue.

    SharePoint Folder Connection
    SharePoint Folder Connection

    Upon clicking connect you will be presented with another screen asking for the SharePoint folder location. In the URL window you will add the SharePoint site that we identified above.  However, it is important to note that you don’t need the entire web address.  Rather PowerBI only needs the specific site name, thus all that needs to be inserted into the URL field is highlighted below in Red.

    https://partner.onmicrosoft.com/sites/[Your Site Name]/SitePages/Home.aspx

    The ending “Sitepages/Home.aspx” can be removed.

    Enter Shortened Site URL
    Enter Shortened Site URL

    Clicking ok will present a authentication screen.  Depending on your company or SharePoint authentication you will need to enter the credentials to log into the SharePoint Site.  You may have to try a couple different connection methods until you are able to properly connect to the SharePoint site.  In my example I had to select Organization Account then click the Sign in.  I signed in with my credentials given me via my I.T. group.  Also, I had to use the drop down to select the proper level to apply the settings.  I used the same address as listed above: https://partner.onmicrosoft.com/sites/[Your Site Name]/

    User Sign In Page
    User Sign In Page

    After signing in click Connect to proceed.  PowerBI Desktop will then load all the files from the SharePoint site in a preview window. Click Edit to modify the query.

    Query Editor View
    Query Editor View

    We can now see our SampleData File and the folder path.  Each document library will be a separate folder path, thus if you have multiple document libraries then you will have all the files in those different folder paths.

    Next click the double down arrows to load the excel file.

    Load File
    Load File

    Power BI Desktop will then go to the SharePoint site and download the information inside your excel file.  For my data I have all the information retained in a table within my excel document.  The table name is call MyDataTable.  Thus, clicking on the Table link in the MyDataTable row I will be able to open all the data within this table.

    Load Table of Data from Excel File
    Load Table of Data from Excel File

    Finally the data is loaded from the excel table.  Click Close & Apply on the Home ribbon to load the data into PowerBI.

    Note: It is always important to check your columns and verify that your data types are correct.  Highlight each column and make sure you select the proper Data Type for each column.  Data Type can be found on the Home ribbon. 

    Final Load Data
    Final Load Data

    Thanks for visiting.  Make sure you stop by again for more great tutorials.

     

  • Building Date Table from Scratch

    Building Date Table from Scratch

    Recently at work I’ve been working with a number of large data warehouses with time series data.  Often when working on such data you need to incorporate a data calendar to compute date ranges.  So, for this tutorial we will build a custom date table directly inside PowerBI.

    Start by opening up power BI and clicking Get Data on the home ribbon, then select Blank Query.  Like always make sure you start by re-naming the query into something meaningful.  Change the name of the Query to Date List.  Next enter the following equation into the formula bar:

    List.Dates( #date(2016,1,1), 10, #duration(1,0,0,0))

    Note:  For more information on the M language you can visit here.  Also, here is the link to the List.Dates function found here.

    Once we enter the formula into the formula bar the list of dates will appear below.

    Date List
    Date List

    The quick explanation about the List.Dates function is below.  I’ve simplified the variables below:

    List.Dates(  Start Date  ,   Number of intervals   ,  Type of interval  )

    While this is interesting it does not help us make a report that updates the date range dynamically.  The real world use case for this would be you have a report with data that is being generated daily, say for example a website.  Maybe you want a custom date range that automatically changes every day you log into PowerBI.  For example if today is 08-20-2016, I want the first date to be today and then list the dates that previous 10 days.

    Now change the formula to the following formula:

    = List.Dates(  DateTime.Date(  DateTime.FixedLocalNow() ) ,  10 ,  #duration(-1,0,0,0)  )

    Note:  In this equation we have changed the duration to -1.  This is important to note because now our date table returns older dates.  In our previous equation we used a positive 1 and we return future dates.

    In this new equation we have defined the Start Date to the following statement : DateTime.Date(  DateTime.FixedLocalNow() )  This is tricky because if you only use DateTime.FixedLocalNow() the statement will error out.  The error occurs because the DateTime.FixedLoaclNow() is a date and time.  The List.Dates function is expecting a Date only value.  Hence why we use the DateTime.Date() function to remove the time stamp and only return today’s date.

    Date List Using Date of Today
    Date List Using Date of Today

    It is most likely your date ranges will be different than the ones in the example because the DateTime.FixedLocalNow() function will be pulling in your computer’s current date.

    Next modify the equation to now pull the last 90 days (highlighted in red below)

    List.Dates(DateTime.Date(DateTime.FixedLocalNow()), 90, #duration(-1,0,0,0))

    The list of dates is just that a list.  We really can’t do to many other enhancements to our data with only a list of dates.  Now transform the list into a table.  Click on the Transform ribbon and select To Table.  Notice now that we have a new column and a new applied step.

    New Column
    New Column

    The code for the new applied steps is as follows:

    Table.FromList(Source , Splitter.SplitByNothing() , null , null , ExtraValues.Error)

    I colored the first null in the equation.  This is actually a parameter that you can use to name the new column we just made.  Tricky, Tricky, PowerBI.  Modify the equation to the following:

    Table.FromList(Source , Splitter.SplitByNothing() , {"Date"} , null, ExtraValues.Error)

    Our table is updated and now has the name Date.  Nice work!

    Now lets make our date list useful.  Click on the ribbon labeled Add Column and then the button labeled Add Custom Column.  Add the following equation to the new column and name it Week #, then click OK, to continue.

    Number.RoundDown( Number.From(Date.AddDays( List.Max( Table.Column(#"Converted to Table", "Date" ) ) , -1 * Number.From( List.Max( Table.Column(#"Converted to Table", "Date" ) ) - Date.StartOfWeek( List.Max( Table.Column( #"Converted to Table", "Date" ) ) , Day.Saturday ) ) ) -[Date] ) / 7 + 1 , 0)

    This equation defines the start of the week highlighted in RED.  Since today is Tuesday 8/30/16, then the days 8/30 (Tues), 8/29 (Mon), 8/27 (Sunday) are considered week 0 or the current week.  All dates prior will start with weekly increment.

    Date List
    Date List

    Now we can add some logic to define week variables.  Click on the Add Column ribbon and select the Conditional Column button.  Using the drop downs in Column Name, Operator, Value and Output enter the following:

    Current Week Logic
    Current Week Logic

    Click OK to proceed.  We have now added an additional column with a text description of the week.

    Current Week Column
    Current Week Column

    Following the add column steps mentioned above we will now add more week descriptions.  Add the following conditional column for Last Week:

    Last Week Logic
    Last Week Logic

    From here you can make custom columns for how you want to describe your data.  In this example we will build last 2 weeks, 3 weeks and last 4 weeks.  See the add conditional column logic for each of those respective weeks.

    Conditional Column Logic for last 2 weeks:

    Last 2 Weeks Logic
    Last 2 Weeks Logic

    Note: When we added this conditional column we label week 0 as last 2 weeks.  See image below as an example:

    Last 2 Weeks Column
    Last 2 Weeks Column

    To fix this we modify the code that generated this column.  The code initially states the following:

    = Table.AddColumn(#"Added Conditional Column1", "Last 2 Weeks", each if [Week Number] < 3 then "Last 2 Weeks" else null )

    We modify this code to the following: (changes highlighted in bold)

    = Table.AddColumn(#"Added Conditional Column1", "Last 2 Weeks", each if [Week Number] < 3 and [Week Number] > 0 then "Last 2 Weeks" else null )

    This now removes the first three days from our Last 2 Weeks column reflecting a more accurate picture of our time ranges.

    Corrected Last 2 Weeks Column
    Corrected Last 2 Weeks Column

    Next we will add the Last 3 Weeks column and the Last 4 weeks column.  Each time we will modify the add column code to remove the first three dates of the current week.

    Last 3 Weeks Logic
    Last 3 Weeks Logic

    Last 3 Weeks auto generated code:

    = Table.AddColumn(#"Added Conditional Column2", "Last 3 Weeks", each if [Week Number] < 4 then "Last 3 Weeks" else null )

    We modify to the following to achieve the correct Last 3 Weeks data range: (changes highlighted in bold)

    = Table.AddColumn(#"Added Conditional Column2", "Last 3 Weeks", each if [Week Number] < 4 and [Week Number] > 0 then "Last 3 Weeks" else null )

    Add the Last 4 Weeks column:

    Last 4 Weeks Logic
    Last 4 Weeks Logic

    Last 4 Weeks auto generated code:

    = Table.AddColumn(#"Added Conditional Column3", "Last 4 Weeks", each if [Week Number] < 5 then "Last 4 Weeks" else null )

    Modify the code the following to correct the column: (changes highlighted in bold)

    = Table.AddColumn(#"Added Conditional Column3", "Last 4 Weeks", each if [Week Number] < 5 and [Week Number] > 0 then "Last 4 Weeks" else null )

    Nice job so far.  We are almost to the end now.  After all those additional columns you should have something that looks similar to the following:

    Date Table
    Date Table

    Next we will pivot all the data down to one column.  This will enable us to select a time period and automatically have our date table update to the specific range.

    First, shift select the following columns, Current Week, Last Week, Last 2 Weeks, Last 3 Weeks, and Last 4 Weeks.  Then on the Transform ribbon click the Unpivot Columns button.

    Unpivot Columns Command
    Unpivot Columns Command

    Next delete the Attribute column using a right click on the Attribute column and selecting Remove Columns.

    Remove Attribute Column
    Remove Attribute Column

    Rename the Value column to Selector by right clicking on the Value column.

    Rename the Value Column
    Rename the Value Column

    Modify each column to have the correct Data Type on the Home ribbon.

    Date column data type should be Date

    Week Number column data type should be Whole Number

    Selector column data type should be Text

    Note: It is important to always check your data types for each column before you leave the Query Editor.  If you don’t you’ll find that the visuals that your trying to build later on on the page view will not work as expected.

    Next, click the Home ribbon and select Close & Apply.  You can now build the following visuals:

    A slicer for the Selector column:

    Selector Column as a Slicer
    Selector Column as a Slicer

    Table visual for the Date column:

    Note: When you use the Date Column as the data source for the Table Visual the data will automatically be added as a Date Hierachy.  This does not work well with our data so you will need to change the date from a Date Hierarchy to a standard Date.  To do this click the little triangle next to the Date in the Values box.  Then select Date.

    Date Table
    Date Table

    Now you can finally play around with your data and by selecting different items in the Selector slicer you can filter down to different date ranges.  Below I selected the Last Week item, which filters down my dates to only the 7 days from last week.

    Last Week Slicer Selected
    Last Week Slicer Selected

    Nice job making a custom date table in PowerBI.  The nice part about this table is that it will always refresh with the latest dates whenever the queries are refreshed for this PowerBI file.

    Bonus:  For those of you who want to cheat and just have the M code to generate this custom date table it can be used from here:

    let
     Source = List.Dates(DateTime.Date(DateTime.FixedLocalNow()), 90, #duration(-1,0,0,0)),
     #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), {"Date"}, null, ExtraValues.Error),
     #"Added Custom1" = Table.AddColumn(#"Converted to Table", "Week Number", each Number.RoundDown( Number.From(Date.AddDays( List.Max( Table.Column(#"Converted to Table", "Date" ) ) , -1 * Number.From( List.Max( Table.Column(#"Converted to Table", "Date" ) ) - Date.StartOfWeek( List.Max( Table.Column( #"Converted to Table", "Date" ) ) , Day.Saturday ) ) ) -[Date] ) / 7 + 1 , 0)),
     #"Added Conditional Column" = Table.AddColumn(#"Added Custom1", "Current Week ", each if [Week Number] = 0 then "Current Week" else null ),
     #"Added Conditional Column1" = Table.AddColumn(#"Added Conditional Column", "Last Week", each if [Week Number] = 1 then "Last Week" else null ),
     #"Added Conditional Column2" = Table.AddColumn(#"Added Conditional Column1", "Last 2 Weeks", each if [Week Number] < 3 and [Week Number] > 0 then "Last 2 Weeks" else null ),
     #"Added Conditional Column3" = Table.AddColumn(#"Added Conditional Column2", "Last 3 Weeks", each if [Week Number] < 4 and [Week Number] >0 then "Last 3 Weeks" else null ),
     #"Added Conditional Column4" = Table.AddColumn(#"Added Conditional Column3", "Last 4 Weeks", each if [Week Number] < 5 and [Week Number] > 0 then "Last 4 Weeks" else null ),
     #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Added Conditional Column4", {"Date", "Week Number"}, "Attribute", "Value"),
     #"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"}),
     #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Value", "Selector"}}),
     #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Date", type date}, {"Week Number", Int64.Type}, {"Selector", type text}})
    in
     #"Changed Type"
  • Make Calendars Using DAX – Curbal

    Make Calendars Using DAX – Curbal

    Often you will need to create some custom calendars within your PowerBI reports.  Ruth Pozuelo from Curbal does a great video tutorial on using Calendar() and CalendarAuto().  I have use the Calendar() DAX function many times and find it very helpful.  The following videos are built directly within DAX.  This approach is one of many different methods that can be used to generate a list of dates.  In a previous tutorial I talked about how to build a date table within the Query Editor (build date table in the Query Editor).

    One method that Ruth talks about is the ability to use the CalendarAuto().  I have not used this expression in any previous reports, but seeing how simple it is to implement this will definitely have to be added to the toolbox.

    Microsoft Docs on Calendar()

    See Calendar DAX example below:

    = CALENDAR (DATE (2005, 1, 1), DATE (2015, 12, 31))

    Microsoft Docs on CalendarAuto()

    See CalendarAuto DAX Example below:

    = CALENDARAUTO()

    Here are the highlights from Ruth’s video:

    Using Calendar:

    Using CalendarAuto:

    Full Video:

    Curbal has been generating a lot of great content.  To learn about for more information you can visit the website found here, or visit the YouTube Channel.

    For more great videos about Power BI click the image below:

    PBI Videos

  • Load Multiple Excel (xls or xlsx) Files

    Load Multiple Excel (xls or xlsx) Files

    Previously we’ve done a tutorial on loading multiple text files within one query.  This is nice, however we will also need to import multiple Excel files.  First, to understand the procedure of querying multiple excel files you have to understand the basics between the CSV (comma separated values) file and an excel (.xls or .xlsx) files.  In a CSV file you have only one data set.  The beginning of the file starts with values and separates each file with a “,”  a carriage return starts a new row of data.  This is an easy and efficient way to store millions of rows of data.  By contrast the excel file is way more complicated.  Excel files can have multiple sheets of tables of data.  Think of this as a stack of CSV type files.  For example if you have an excel workbook with three sheets of data, Sheet 1, Sheet 2, Sheet 3.  You can think of those three sheets as grid of data, similar to the CSV file.  The multiple sheet aspects of an excel file makes the data ingestion into PowerBI a little bit more complicated.  To add to the complication, when you loading data from either multiple sheets, or selecting a specific out of many sheets of data.  For illustration purposes imagine working with two excel files with three sheets each, 2 x 3 = 6, a total of 6 sheets of data, or what I will call “pages” of data.  This is why it is more complex to load excel files than CSV files.

    Note: If you want to learn how to load multiple CSV files visit this tutorial.

    Not only do you have to figure out what data you want to ingest on the page you must all tell PowerBI which sheets do you want to look at, and from which excel file.  If that was to many words think of loading the following data sample:

    Workbook 1 – Year 2000 Olympic Medals

    • Sheet 1
      • Olympic Medals Table
        • Rank
        • Country
        • Gold
        • Silver
        • Bronze
        • Total
    • Sheet 2
    • Sheet 3

    WorkBook 2 – Year 2004 Olympic Medals

    • Sheet 1
      • Olympic Medals Table
        • Rank
        • Country
        • Gold
        • Silver
        • Bronze
        • Total
    • Sheet 2
    • Sheet 3

    The data structure for both workbook 1 and 2 are similar but the names of the files are different and there can be multiple pages.

    To resolve this we will have to write a M language function that will load each file as a function.  This will be done in later in the tutorial.

    Here is the data source information for Olympic medals won by each country from 2000 to 2012, download here.  Inside the Medal Count zip file are four xlsx files, extract them to your desktop.  Move the files into a folder on your desktop labeled Medals.

     

    Medals Folder
    Medals Folder

    Now, open up PowerBI,  We will begin shaping our data to load all the excel files.  On the Home ribbon click on the Get Data button.  Select Folder on the right side and click Connect.

    Get Folder Data
    Get Folder Data

    Next select the folder path that you want acquire the files from, Click OK to continue.

    Load Folder Screen
    Load Folder Screen

    Next we are presented with the loaded files within our selected folder.  Click Edit at the bottom of the screen to proceed.  The Query Editor window will now open.  Select the first two columns labeled Content, and Name.  With those two columns selected right click on the header and select Remove Other Columns. This will remove all the useless data associated with the files.

    Remove Other Columns
    Remove Other Columns

    Click the Add Column ribbon and press the Add Custom Column on the left side of the ribbon.

    Add Custom Column
    Add Custom Column

    Name the new column ExcelFileLoad and enter the following equation.

    Excel.Workbook Equation
    Excel.Workbook Equation

    Note: Once you type “Excel.Workbook(” you can click on the column labeled Content on the right side of the screen to have the name automatically added.  This is useful when you have many many columns to choose from or if there naming of those columns becomes complex.  This way you won’t type in the column name incorrectly.

    Click OK to proceed.  Notice we now have a new column called ExcelFileLoad.  Next click the Expand button (the one with the arrows) located at the right of our newly added column. Click OK to proceed.

    Expand Column Button
    Expand Column Button

    Now we have a new column labeled ExcelFileLoad.Data, which is the data contained in our excel files.  Now click in the Grey Area next to the word labeled Table.  This will open up the file and reveal the information present in the file.  Notice that we can see the headers and the data in our file.  Row 1 contains the headers of each column.  Rows after row 1 contains the medal data.

    View Data of File
    View Data of File

    Next select the columns labeled Name and ExcelFileLoad.Data and right click on the column header, then select Remove Other Columns

    Remove Other Columns Again
    Remove Other Columns Again

    On the Add Column ribbon click Add Custom Column again.  Name the column PromoteHeaders and enter the following formula. Click OK to proceed.

    Promote Headers Step
    Promote Headers Step

    Clicking again on the grey area in our newly created column reveals our tables with promoted headers.

    View of Data with Promoted Headers
    View of Data with Promoted Headers

    Next click the Expand Button, un-check the Use original column name as prefix and click the OK button to proceed.

    Expand Data
    Expand Data

    Remove the following columns, ExcelFileLoad.Data, Rank,  and Total, bu right selecting the columns and right clicking on the header and selecting Remove Columns.  Now we want to parse out the year name from the Name column.  To do this click on Name Column.  Then click the Transform ribbon and click the Extract button, then select First Characters from the drop down menu.

    Extract First Characters
    Extract First Characters

    In the Extract First Characters menu enter the number 4 and click OK to proceed.

    Extract First 4 Characters
    Extract First 4 Characters

    Change the following columns to whole numbers: Name, Gold, Silver, Bronze.  Do this on the Transform ribbon in the Data Type drop down.

    Change Data Types
    Change Data Types

    We are now ready to load all the data.  Rename the Query to Medals, click the Home ribbon and select Close & Apply.

    Name Query
    Name Query

    And there you have it.  We have successfully loaded four excel files into one query.

    Bonus: for added flare add the following measure.

    Total Medal Count = sum(Medals[Gold]) + sum(Medals[Silver]) + sum(Medals[Bronze])

    Now you can add the following Visualizations.

    Bar Chart Visual
    Bar Chart Visual
    Stacked Bar Chart
    Stacked Bar Chart
    Map Visual
    Map Visual