Month: August 2016

  • Building Date Table from Scratch

    Recently at work I’ve been working with a number of large data warehouses with time series data.  Often when working on such data you need to incorporate a date calendar to compute date ranges.  So, for this tutorial we will build a custom date table directly inside PowerBI.

    Start by opening Power BI and clicking Get Data on the Home ribbon, then select Blank Query.  As always, start by renaming the query to something meaningful.  Change the name of the query to Date List.  Next enter the following equation into the formula bar:

    List.Dates( #date(2016,1,1), 10, #duration(1,0,0,0))

    Note:  More information on the M language and on the List.Dates function is available in Microsoft’s Power Query M documentation.

    Once we enter the formula into the formula bar the list of dates will appear below.

    Date List

    A quick explanation of the List.Dates function follows, with simplified argument names:

    List.Dates(  Start Date  ,   Number of intervals   ,  Type of interval  )
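
    As a worked illustration of the three arguments, a seven-day step produces one date per week.  The start date, count and step below are made-up values chosen only to show the pattern:

    ```powerquery
    let
        // Start on 1 Jan 2016, return 4 dates, stepping forward 7 days each time
        WeeklyDates = List.Dates(#date(2016, 1, 1), 4, #duration(7, 0, 0, 0))
        // Result: 1/1/2016, 1/8/2016, 1/15/2016, 1/22/2016
    in
        WeeklyDates
    ```

    The #duration value takes days, hours, minutes and seconds, so #duration(1,0,0,0) in the formula above is a step of one day.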

    While this is interesting, it does not help us make a report that updates the date range dynamically.  The real-world use case would be a report with data that is generated daily, for example from a website.  Maybe you want a custom date range that automatically changes every day you open PowerBI.  For example, if today is 08-20-2016, I want the first date to be today, followed by the preceding days, for 10 dates in total.

    Now change the formula to the following:

    = List.Dates(  DateTime.Date(  DateTime.FixedLocalNow() ) ,  10 ,  #duration(-1,0,0,0)  )

    Note:  In this equation we have changed the duration to -1.  This is important to note because now our date table returns older dates.  In our previous equation we used a positive 1, which returned future dates.

    In this new equation we have defined the Start Date as DateTime.Date(  DateTime.FixedLocalNow() ).  This is tricky because if you only use DateTime.FixedLocalNow() the statement will error out.  The error occurs because DateTime.FixedLocalNow() returns a date and time, while the List.Dates function expects a date-only value.  This is why we use the DateTime.Date() function to remove the time stamp and return only today’s date.

    Date List Using Date of Today

    Your dates will most likely differ from the ones in the example because DateTime.FixedLocalNow() pulls in your computer’s current date.
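
    The same formula can be split into named steps to make the conversion from a date and time to a date visible.  This is only a sketch; the step names Now, Today and LastTenDays are arbitrary and do not appear in the original query:

    ```powerquery
    let
        // Date and time of the current refresh (a datetime value)
        Now = DateTime.FixedLocalNow(),
        // Strip the time portion so List.Dates receives a date value
        Today = DateTime.Date(Now),
        // Today plus the 9 days before it, newest first
        LastTenDays = List.Dates(Today, 10, #duration(-1, 0, 0, 0))
    in
        LastTenDays
    ```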

    Next modify the equation to pull the last 90 days by changing the count from 10 to 90:

    List.Dates(DateTime.Date(DateTime.FixedLocalNow()), 90, #duration(-1,0,0,0))

    The list of dates is just that, a list.  We can’t make many other enhancements to our data with only a list of dates, so now transform the list into a table.  Click on the Transform ribbon and select To Table.  Notice that we now have a new column and a new applied step.

    New Column

    The code for the new applied step is as follows:

    Table.FromList(Source , Splitter.SplitByNothing() , null , null , ExtraValues.Error)

    The first null in the equation is actually a parameter that you can use to name the new column we just made.  Tricky, tricky, PowerBI.  Modify the equation to the following:

    Table.FromList(Source , Splitter.SplitByNothing() , {"Date"} , null, ExtraValues.Error)

    Our table is updated and the column now has the name Date.  Nice work!

    Now let’s make our date list useful.  Click on the ribbon labeled Add Column and then the button labeled Add Custom Column.  Add the following equation to the new column and name it Week Number, then click OK to continue.

    Number.RoundDown( Number.From(Date.AddDays( List.Max( Table.Column(#"Converted to Table", "Date" ) ) , -1 * Number.From( List.Max( Table.Column(#"Converted to Table", "Date" ) ) - Date.StartOfWeek( List.Max( Table.Column( #"Converted to Table", "Date" ) ) , Day.Saturday ) ) ) -[Date] ) / 7 + 1 , 0)

    The Day.Saturday argument in this equation defines where one week ends and the next begins.  Since today is Tuesday 8/30/16, the days 8/30 (Tues), 8/29 (Mon), and 8/28 (Sun) are considered week 0, or the current week.  All earlier dates are numbered in weekly increments, so Saturday 8/27 back to Sunday 8/21 is week 1.

    Date List
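
    The long formula can be easier to follow once its repeated parts are named.  The sketch below is an equivalent rewrite rather than the formula used in this tutorial: it fixes the newest date at 8/30/2016 and uses only 14 dates so the result can be checked by hand, and the step names MaxDate and Boundary are hypothetical.

    ```powerquery
    let
        Source = List.Dates(#date(2016, 8, 30), 14, #duration(-1, 0, 0, 0)),
        #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), {"Date"}, null, ExtraValues.Error),
        // Most recent date in the table (8/30/2016, a Tuesday)
        MaxDate = List.Max(#"Converted to Table"[Date]),
        // Saturday on or before MaxDate (8/27/2016)
        Boundary = Date.StartOfWeek(MaxDate, Day.Saturday),
        // Days between the boundary and each date, divided into weeks, plus one, rounded down
        // 8/28 to 8/30 give 0, 8/21 to 8/27 give 1, 8/17 to 8/20 give 2
        #"Added Custom" = Table.AddColumn(#"Converted to Table", "Week Number",
            each Number.RoundDown(Duration.Days(Boundary - [Date]) / 7 + 1))
    in
        #"Added Custom"
    ```

    Naming MaxDate once also avoids writing the List.Max expression three times, which is the main difference from the original formula.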

    Now we can add some logic to define week variables.  Click on the Add Column ribbon and select the Conditional Column button.  Using the drop downs in Column Name, Operator, Value and Output enter the following:

    Current Week Logic
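
    The dialog screenshot is not reproduced here.  Judging from the M code listed at the end of this post, the entries are most likely Column Name = Week Number, Operator = equals, Value = 0 and Output = Current Week, with the new column named Current Week.  A minimal sketch of the step this produces, using a stand-in table in place of the real date table:

    ```powerquery
    let
        // Stand-in for the table built so far; only the column the rule reads is included
        #"Added Custom" = #table({"Week Number"}, {{0}, {1}, {2}}),
        // Label week 0 rows; every other row gets null
        #"Added Conditional Column" = Table.AddColumn(#"Added Custom", "Current Week",
            each if [Week Number] = 0 then "Current Week" else null)
    in
        #"Added Conditional Column"
    ```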

    Click OK to proceed.  We have now added an additional column with a text description of the week.

    Current Week Column

    Following the same Add Column steps, we will now add more week descriptions.  Add a conditional column named Last Week that outputs Last Week when Week Number equals 1:

    Last Week Logic

    From here you can make custom columns for however you want to describe your data.  In this example we will build Last 2 Weeks, Last 3 Weeks and Last 4 Weeks.  See the conditional column logic for each of those respective weeks.

    Conditional Column Logic for last 2 weeks:

    Last 2 Weeks Logic

    Note: When we add this conditional column, week 0 is also labeled as Last 2 Weeks.  See the image below as an example:

    Last 2 Weeks Column

    To fix this we modify the code that generated this column.  The code initially states the following:

    = Table.AddColumn(#"Added Conditional Column1", "Last 2 Weeks", each if [Week Number] < 3 then "Last 2 Weeks" else null )

    We modify this code to the following, adding the condition and [Week Number] > 0:

    = Table.AddColumn(#"Added Conditional Column1", "Last 2 Weeks", each if [Week Number] < 3 and [Week Number] > 0 then "Last 2 Weeks" else null )

    This removes the three days of the current week from our Last 2 Weeks column, giving a more accurate picture of our time ranges.

    Corrected Last 2 Weeks Column

    Next we will add the Last 3 Weeks column and the Last 4 Weeks column.  Each time we will modify the generated code to exclude the dates of the current week.

    Last 3 Weeks Logic

    Last 3 Weeks auto generated code:

    = Table.AddColumn(#"Added Conditional Column2", "Last 3 Weeks", each if [Week Number] < 4 then "Last 3 Weeks" else null )

    We modify it to the following to achieve the correct Last 3 Weeks date range, again adding and [Week Number] > 0:

    = Table.AddColumn(#"Added Conditional Column2", "Last 3 Weeks", each if [Week Number] < 4 and [Week Number] > 0 then "Last 3 Weeks" else null )

    Add the Last 4 Weeks column:

    Last 4 Weeks Logic

    Last 4 Weeks auto generated code:

    = Table.AddColumn(#"Added Conditional Column3", "Last 4 Weeks", each if [Week Number] < 5 then "Last 4 Weeks" else null )

    Modify the code to the following to correct the column, again adding and [Week Number] > 0:

    = Table.AddColumn(#"Added Conditional Column3", "Last 4 Weeks", each if [Week Number] < 5 and [Week Number] > 0 then "Last 4 Weeks" else null )
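
    Because the three corrected steps differ only in the number of weeks, the pattern could also be written once as a small function.  The following is a sketch under assumptions and not part of the walkthrough: AddLastNWeeks is a hypothetical helper name, and Sample is a stand-in table holding only the Week Number column.

    ```powerquery
    let
        // Stand-in for the date table; one row per week number 0 through 5
        Sample = #table({"Week Number"}, {{0}, {1}, {2}, {3}, {4}, {5}}),
        // Hypothetical helper: adds a "Last n Weeks" column that skips week 0
        AddLastNWeeks = (tbl as table, n as number) as table =>
            let
                label = "Last " & Number.ToText(n) & " Weeks"
            in
                Table.AddColumn(tbl, label,
                    each if [Week Number] > 0 and [Week Number] <= n then label else null),
        // Apply the helper for 2, 3 and 4 weeks in turn
        Result = List.Accumulate({2, 3, 4}, Sample, (state, n) => AddLastNWeeks(state, n))
    in
        Result
    ```

    In the result, week 0 and week 5 carry no labels, weeks 1 and 2 carry all three, week 3 carries Last 3 Weeks and Last 4 Weeks, and week 4 carries only Last 4 Weeks.  The condition [Week Number] <= n is the same test as the < 3, < 4 and < 5 comparisons in the generated code.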

    Nice job so far.  We are almost to the end now.  After all those additional columns you should have something that looks similar to the following:

    Date Table

    Next we will unpivot all the data down to one column.  This will enable us to select a time period and automatically have our date table filter to the specific range.

    First, shift select the following columns, Current Week, Last Week, Last 2 Weeks, Last 3 Weeks, and Last 4 Weeks.  Then on the Transform ribbon click the Unpivot Columns button.

    Unpivot Columns Command
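
    To see what the unpivot does to the label columns, here is a small sketch on a made-up three-row table (the rows and the step names Sample and Unpivoted are illustrative only).  Unpivoting skips null cells, so each date ends up with one row per label it carries:

    ```powerquery
    let
        Sample = #table({"Date", "Week Number", "Current Week", "Last Week"}, {
            {#date(2016, 8, 30), 0, "Current Week", null},
            {#date(2016, 8, 26), 1, null, "Last Week"},
            {#date(2016, 7, 1), 9, null, null}
        }),
        // Keep Date and Week Number as they are; fold every other column into Attribute/Value pairs
        Unpivoted = Table.UnpivotOtherColumns(Sample, {"Date", "Week Number"}, "Attribute", "Value")
        // Result: two rows, one for 8/30 (Current Week) and one for 8/26 (Last Week);
        // the 7/1 row has only nulls in the label columns and drops out
    in
        Unpivoted
    ```

    This is also why a date can appear more than once in the finished table: a week 1 date carries the Last Week, Last 2 Weeks, Last 3 Weeks and Last 4 Weeks labels, so it yields four rows.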

    Next delete the Attribute column using a right click on the Attribute column and selecting Remove Columns.

    Remove Attribute Column

    Rename the Value column to Selector by right clicking on the Value column.

    Rename the Value Column

    Modify each column to have the correct Data Type on the Home ribbon.

    Date column data type should be Date

    Week Number column data type should be Whole Number

    Selector column data type should be Text

    Note: It is important to always check the data type of each column before you leave the Query Editor.  If you don’t, you’ll find that the visuals you’re trying to build later on in the report view will not work as expected.

    Next, click the Home ribbon and select Close & Apply.  You can now build the following visuals:

    A slicer for the Selector column:

    Selector Column as a Slicer

    Table visual for the Date column:

    Note: When you use the Date column as the data source for the Table visual the data will automatically be added as a Date Hierarchy.  This does not work well with our data so you will need to change the date from a Date Hierarchy to a standard Date.  To do this click the little triangle next to the Date in the Values box.  Then select Date.

    Date Table

    Now you can finally play around with your data.  By selecting different items in the Selector slicer you can filter down to different date ranges.  Below I selected the Last Week item, which filters my dates down to only the 7 days from last week.

    Last Week Slicer Selected

    Nice job making a custom date table in PowerBI.  The nice part about this table is that it will always refresh with the latest dates whenever the queries are refreshed for this PowerBI file.

    Bonus:  For those of you who want to cheat and just have the M code, the following query generates this custom date table:

    let
        // Last 90 days, newest first, starting from today's date
        Source = List.Dates(DateTime.Date(DateTime.FixedLocalNow()), 90, #duration(-1,0,0,0)),
        #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), {"Date"}, null, ExtraValues.Error),
        // Week 0 is the current week; Day.Saturday sets the week boundary
        #"Added Custom1" = Table.AddColumn(#"Converted to Table", "Week Number", each Number.RoundDown( Number.From(Date.AddDays( List.Max( Table.Column(#"Converted to Table", "Date" ) ) , -1 * Number.From( List.Max( Table.Column(#"Converted to Table", "Date" ) ) - Date.StartOfWeek( List.Max( Table.Column( #"Converted to Table", "Date" ) ) , Day.Saturday ) ) ) -[Date] ) / 7 + 1 , 0)),
        // One label column per selectable range
        #"Added Conditional Column" = Table.AddColumn(#"Added Custom1", "Current Week", each if [Week Number] = 0 then "Current Week" else null ),
        #"Added Conditional Column1" = Table.AddColumn(#"Added Conditional Column", "Last Week", each if [Week Number] = 1 then "Last Week" else null ),
        #"Added Conditional Column2" = Table.AddColumn(#"Added Conditional Column1", "Last 2 Weeks", each if [Week Number] < 3 and [Week Number] > 0 then "Last 2 Weeks" else null ),
        #"Added Conditional Column3" = Table.AddColumn(#"Added Conditional Column2", "Last 3 Weeks", each if [Week Number] < 4 and [Week Number] > 0 then "Last 3 Weeks" else null ),
        #"Added Conditional Column4" = Table.AddColumn(#"Added Conditional Column3", "Last 4 Weeks", each if [Week Number] < 5 and [Week Number] > 0 then "Last 4 Weeks" else null ),
        // Collapse the label columns into a single Selector column
        #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Added Conditional Column4", {"Date", "Week Number"}, "Attribute", "Value"),
        #"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"}),
        #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Value", "Selector"}}),
        // Set the data types the visuals rely on
        #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Date", type date}, {"Week Number", Int64.Type}, {"Selector", type text}})
    in
        #"Changed Type"
  • Make Calendars Using DAX – Curbal

    Often you will need to create some custom calendars within your PowerBI reports.  Ruth Pozuelo from Curbal does a great video tutorial on using Calendar() and CalendarAuto().  I have used the Calendar() DAX function many times and find it very helpful.  Her videos build the calendars directly within DAX.  This approach is one of many different methods that can be used to generate a list of dates.  In a previous tutorial I talked about how to build a date table within the Query Editor.

    One method that Ruth talks about is CalendarAuto().  I have not used this function in any previous reports, but seeing how simple it is to implement, it will definitely be added to the toolbox.

    Microsoft Docs on Calendar()

    See Calendar DAX example below:

    = CALENDAR (DATE (2005, 1, 1), DATE (2015, 12, 31))

    Microsoft Docs on CalendarAuto()

    See CalendarAuto DAX Example below:

    = CALENDARAUTO()

    Curbal has been generating a lot of great content.  For more information you can visit the Curbal website or YouTube channel.

  • Load Multiple Excel (xls or xlsx) Files

    Previously we’ve done a tutorial on loading multiple text files within one query.  This is nice, however we will also need to import multiple Excel files.  To understand the procedure of querying multiple Excel files you first have to understand the basic differences between a CSV (comma separated values) file and an Excel (.xls or .xlsx) file.

    In a CSV file you have only one data set.  The file starts directly with values, separates each value with a “,”, and starts a new row of data at each carriage return.  This is an easy and efficient way to store millions of rows of data.

    By contrast the Excel file is way more complicated.  Excel files can have multiple sheets of tables of data.  Think of this as a stack of CSV type files.  For example, if you have an Excel workbook with three sheets of data, Sheet 1, Sheet 2 and Sheet 3, you can think of each of those three sheets as a grid of data, similar to the CSV file.  The multiple sheet aspect of an Excel file makes data ingestion into PowerBI a little bit more complicated, whether you are loading data from multiple sheets or selecting one specific sheet out of many.  For illustration purposes imagine working with two Excel files with three sheets each: 2 x 3 = 6, a total of 6 sheets of data, or what I will call “pages” of data.  This is why it is more complex to load Excel files than CSV files.

    Note: If you want to learn how to load multiple CSV files visit this tutorial.

    Not only do you have to figure out what data you want to ingest on the page, you must also tell PowerBI which sheets you want to look at, and from which Excel file.  If that was too many words, think of loading the following data sample:

    Workbook 1 – Year 2000 Olympic Medals

    • Sheet 1
      • Olympic Medals Table
        • Rank
        • Country
        • Gold
        • Silver
        • Bronze
        • Total
    • Sheet 2
    • Sheet 3

    Workbook 2 – Year 2004 Olympic Medals

    • Sheet 1
      • Olympic Medals Table
        • Rank
        • Country
        • Gold
        • Silver
        • Bronze
        • Total
    • Sheet 2
    • Sheet 3

    The data structure of workbooks 1 and 2 is similar, but the names of the files are different and there can be multiple pages.

    To resolve this we will write M language expressions that load each file.  This will be done later in the tutorial.

    The data source is the Olympic medals won by each country from 2000 to 2012 (download here).  Inside the Medal Count zip file are four xlsx files; extract them to your desktop, then move the files into a folder on your desktop labeled Medals.

    Medals Folder

    Now, open up PowerBI.  We will begin shaping our data to load all the Excel files.  On the Home ribbon click on the Get Data button.  Select Folder on the right side and click Connect.

    Get Folder Data

    Next select the folder path that you want to acquire the files from, then click OK to continue.

    Load Folder Screen

    Next we are presented with the loaded files within our selected folder.  Click Edit at the bottom of the screen to proceed.  The Query Editor window will now open.  Select the first two columns labeled Content, and Name.  With those two columns selected right click on the header and select Remove Other Columns. This will remove all the useless data associated with the files.

    Remove Other Columns

    Click the Add Column ribbon and press the Add Custom Column on the left side of the ribbon.

    Add Custom Column

    Name the new column ExcelFileLoad and enter the following equation.

    Excel.Workbook Equation
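
    The screenshot with the equation is not reproduced here.  Going by the note that follows, what gets typed into the Custom Column dialog is most likely Excel.Workbook([Content]).  A sketch of the query up to this point, where the folder path is a hypothetical placeholder and the step names are the ones the editor typically generates:

    ```powerquery
    let
        // Hypothetical path; point this at your own Medals folder
        Source = Folder.Files("C:\Users\YourName\Desktop\Medals"),
        #"Removed Other Columns" = Table.SelectColumns(Source, {"Content", "Name"}),
        // Open each file's binary Content as a workbook; the result is a nested table of its sheets
        #"Added Custom" = Table.AddColumn(#"Removed Other Columns", "ExcelFileLoad", each Excel.Workbook([Content]))
    in
        #"Added Custom"
    ```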

    Note: Once you type “Excel.Workbook(” you can click on the column labeled Content on the right side of the screen to have the name automatically added.  This is useful when you have many columns to choose from or if the naming of those columns becomes complex.  This way you won’t type in the column name incorrectly.

    Click OK to proceed.  Notice we now have a new column called ExcelFileLoad.  Next click the Expand button (the one with the arrows) located at the right of our newly added column. Click OK to proceed.

    Expand Column Button

    Now we have a new column labeled ExcelFileLoad.Data, which is the data contained in our Excel files.  Now click in the grey area next to the word Table.  This will open up the file and reveal the information present in the file.  Notice that we can see the headers and the data in our file.  Row 1 contains the headers of each column.  The rows after row 1 contain the medal data.

    View Data of File

    Next select the columns labeled Name and ExcelFileLoad.Data and right click on the column header, then select Remove Other Columns.

    Remove Other Columns Again

    On the Add Column ribbon click Add Custom Column again.  Name the column PromoteHeaders and enter the following formula. Click OK to proceed.

    Promote Headers Step
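
    The formula screenshot is not reproduced here.  Given the column name and the step that follows, the formula is presumably Table.PromoteHeaders([ExcelFileLoad.Data]).  The sketch below shows what that function does on a tiny stand-in table; the values are placeholders, not real medal data:

    ```powerquery
    let
        // Stand-in for one sheet as loaded above: the headers sit in the first row
        Sheet = #table({"Column1", "Column2"}, {{"Country", "Gold"}, {"Country A", 1}}),
        // The first row becomes the column names, leaving one data row
        Promoted = Table.PromoteHeaders(Sheet)
    in
        Promoted
    ```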

    Clicking again on the grey area in our newly created column reveals our tables with promoted headers.

    View of Data with Promoted Headers

    Next click the Expand Button, un-check the Use original column name as prefix and click the OK button to proceed.

    Expand Data

    Remove the following columns: ExcelFileLoad.Data, Rank, and Total, by selecting the columns, right clicking on the header and selecting Remove Columns.  Now we want to parse out the year from the Name column.  To do this click on the Name column.  Then click the Transform ribbon and click the Extract button, then select First Characters from the drop down menu.

    Extract First Characters

    In the Extract First Characters menu enter the number 4 and click OK to proceed.

    Extract First 4 Characters

    Change the following columns to whole numbers: Name, Gold, Silver, Bronze.  Do this on the Transform ribbon in the Data Type drop down.

    Change Data Types

    We are now ready to load all the data.  Rename the Query to Medals, click the Home ribbon and select Close & Apply.

    Name Query

    And there you have it.  We have successfully loaded four excel files into one query.
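
    For reference, the clicks above can be approximated as a single M query.  This is a sketch under assumptions rather than the exact code the editor generates: the folder path is a hypothetical placeholder, the sheet column names are taken from the workbook outline earlier in this post, and the step names are arbitrary.

    ```powerquery
    let
        // Hypothetical path; point this at your own Medals folder
        Source = Folder.Files("C:\Users\YourName\Desktop\Medals"),
        KeptColumns = Table.SelectColumns(Source, {"Content", "Name"}),
        // Open every workbook, then keep only the sheet data from each one
        Loaded = Table.AddColumn(KeptColumns, "ExcelFileLoad", each Excel.Workbook([Content])),
        Expanded = Table.ExpandTableColumn(Loaded, "ExcelFileLoad", {"Data"}, {"ExcelFileLoad.Data"}),
        NameAndData = Table.SelectColumns(Expanded, {"Name", "ExcelFileLoad.Data"}),
        // Turn the first row of every sheet into column headers
        Promoted = Table.AddColumn(NameAndData, "PromoteHeaders", each Table.PromoteHeaders([ExcelFileLoad.Data])),
        ExpandedData = Table.ExpandTableColumn(Promoted, "PromoteHeaders", {"Rank", "Country", "Gold", "Silver", "Bronze", "Total"}),
        Trimmed = Table.RemoveColumns(ExpandedData, {"ExcelFileLoad.Data", "Rank", "Total"}),
        // The file names start with the year, so the first 4 characters are the year
        YearOnly = Table.TransformColumns(Trimmed, {{"Name", each Text.Start(_, 4), type text}}),
        Typed = Table.TransformColumnTypes(YearOnly, {{"Name", Int64.Type}, {"Gold", Int64.Type}, {"Silver", Int64.Type}, {"Bronze", Int64.Type}})
    in
        Typed
    ```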

    Bonus: for added flair add the following measure.

    Total Medal Count = sum(Medals[Gold]) + sum(Medals[Silver]) + sum(Medals[Bronze])

    Now you can add the following Visualizations.

    Bar Chart Visual
    Stacked Bar Chart
    Map Visual
  • Using Variables for File Locations

    This week I encountered an issue when working with multiple queries in my data model.  Here are the source files in case you want to follow along.

    Here’s what happened.  I had a PBIX file that had four queries in it, one for the Summer Olympic medal count of each of the following years: 2000, 2004, 2008, and 2012.

    Olympic Medal Count

    After a bit of work I figured that my desktop screen was going to get too cluttered if I continued to collect Olympic medal data.  Thus, I moved the Excel files that were my source data into a folder called Olympic Medals.

    File Move

    By doing this I broke all the links for all four files.  This was discovered when I tried to refresh my queries and noticed that all the queries failed. Power BI gave me a nice little message notifying me that there was a data source error.

    DataSource.Error: Could not find the file:
    Missing File Error

    To fix this I had to open the query editor and change each file’s location to the new folder that I just made.  Seeing that this is not an efficient use of my time, I decided to spend more time to figure out a way to make a variable that would be my file location for all my queries.

    Let’s begin by making a new blank query by clicking on the bottom half of the New Source button on the Home ribbon.  Then click the item labeled Blank Query.

    Start Blank Query

    With the new query open, type in the file location where all your working files are stored.  My files were on my desktop, so my file location is the one shown below.  Rename the new query to Folder.

    Folder Query

    Note: Since we are building a file path for Power BI to load the Excel files, be extra careful to add a “\” (backslash) at the end of the file location.
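
    One way to make the trailing backslash hard to forget is to let the Folder query add it when it is missing.  This is a sketch rather than the query shown in the screenshot: the path is the example path used in this post, and RawPath is a hypothetical step name.

    ```powerquery
    let
        // Example path from this post; replace it with your own folder
        RawPath = "C:\Users\Mike\Desktop\Olympic Medals",
        // Append the trailing backslash only when it is missing
        Source = if Text.EndsWith(RawPath, "\") then RawPath else RawPath & "\"
    in
        Source
    ```

    The simplest form of the Folder query is just the path as a quoted text value that already ends in a backslash.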

    Next on the query for Medals 2000, we click the Source under the applied steps window on the right.  This will expose the code in the formula bar at the top of the window.

    Select the Source Applied Step

    Note: If you don’t see the formula bar as I have illustrated in the image above, you can turn this feature on by clicking the View ribbon and checking the box next to the words Formula Bar.  This will expose the formula bar so you can edit the source step.

    This is where the magic happens.  We can now insert our new blank query into this step.  Our current Source step looks like the following:

    = Excel.Workbook(  File.Contents(  "C:\Users\Mike\Desktop\Olympic Medals\2000 Medals.xlsx") ,  null ,  true )

    Now remove the first part of the file location and make the equation match the following:

    = Excel.Workbook(  File.Contents(   Folder  &   "2000 Medals.xlsx") ,   null , true )

    Not only does this shorten our equation, it now uses the folder location we identified earlier and then picks up the file name 2000 Medals.xlsx.  This makes it very easy to add additional queries with the same steps.  Also, if you move your files to a new folder location, you only have to change the Folder query to reflect the new file location.

    To test this, make a new folder on your desktop called New Folder and move all the Olympic medal files into it.  Now in Power BI Desktop press Refresh on the Home ribbon.  This should result in the DataSource.Error that we saw earlier.  To fix this click Edit Queries on the Home ribbon, select the Folder query and change the file directory to the new folder that you made on your desktop.  It should look similar to the following:

    New Folder Image

    Once you’ve modified the Folder query, click Close & Apply on the Home ribbon and all your queries will now reload.  Success!!

    New Queries Loaded
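
    Since every query now builds its path from Folder, the Source step could also be wrapped in a small function so that each new file only needs its name.  This goes one step beyond the tutorial and is only a sketch: LoadWorkbook is a hypothetical helper name, and the query assumes the Folder query described above exists.

    ```powerquery
    let
        // Hypothetical helper: opens one workbook from the location held in the Folder query
        LoadWorkbook = (fileName as text) as table =>
            Excel.Workbook(File.Contents(Folder & fileName), null, true),
        // Same result as the edited Source step shown earlier
        Medals2000 = LoadWorkbook("2000 Medals.xlsx")
    in
        Medals2000
    ```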

     

    Hope this tutorial helps and solves some of the problems when moving data files and storing information for Power BI desktop.  Please Share if you like the tutorials. Thanks.