Tag: Table of Data

  • Building Date Table from Scratch

    Building Date Table from Scratch

    Recently at work I’ve been working with a number of large data warehouses with time series data.  Often when working on such data you need to incorporate a data calendar to compute date ranges.  So, for this tutorial we will build a custom date table directly inside PowerBI.

    Start by opening up power BI and clicking Get Data on the home ribbon, then select Blank Query.  Like always make sure you start by re-naming the query into something meaningful.  Change the name of the Query to Date List.  Next enter the following equation into the formula bar:

    List.Dates( #date(2016,1,1), 10, #duration(1,0,0,0))

    Note:  For more information on the M language you can visit here.  Also, here is the link to the List.Dates function found here.

    Once we enter the formula into the formula bar the list of dates will appear below.

    Date List
    Date List

    The quick explanation about the List.Dates function is below.  I’ve simplified the variables below:

    List.Dates(  Start Date  ,   Number of intervals   ,  Type of interval  )

    While this is interesting it does not help us make a report that updates the date range dynamically.  The real world use case for this would be you have a report with data that is being generated daily, say for example a website.  Maybe you want a custom date range that automatically changes every day you log into PowerBI.  For example if today is 08-20-2016, I want the first date to be today and then list the dates that previous 10 days.

    Now change the formula to the following formula:

    = List.Dates(  DateTime.Date(  DateTime.FixedLocalNow() ) ,  10 ,  #duration(-1,0,0,0)  )

    Note:  In this equation we have changed the duration to -1.  This is important to note because now our date table returns older dates.  In our previous equation we used a positive 1 and we return future dates.

    In this new equation we have defined the Start Date to the following statement : DateTime.Date(  DateTime.FixedLocalNow() )  This is tricky because if you only use DateTime.FixedLocalNow() the statement will error out.  The error occurs because the DateTime.FixedLoaclNow() is a date and time.  The List.Dates function is expecting a Date only value.  Hence why we use the DateTime.Date() function to remove the time stamp and only return today’s date.

    Date List Using Date of Today
    Date List Using Date of Today

    It is most likely your date ranges will be different than the ones in the example because the DateTime.FixedLocalNow() function will be pulling in your computer’s current date.

    Next modify the equation to now pull the last 90 days (highlighted in red below)

    List.Dates(DateTime.Date(DateTime.FixedLocalNow()), 90, #duration(-1,0,0,0))

    The list of dates is just that a list.  We really can’t do to many other enhancements to our data with only a list of dates.  Now transform the list into a table.  Click on the Transform ribbon and select To Table.  Notice now that we have a new column and a new applied step.

    New Column
    New Column

    The code for the new applied steps is as follows:

    Table.FromList(Source , Splitter.SplitByNothing() , null , null , ExtraValues.Error)

    I colored the first null in the equation.  This is actually a parameter that you can use to name the new column we just made.  Tricky, Tricky, PowerBI.  Modify the equation to the following:

    Table.FromList(Source , Splitter.SplitByNothing() , {"Date"} , null, ExtraValues.Error)

    Our table is updated and now has the name Date.  Nice work!

    Now lets make our date list useful.  Click on the ribbon labeled Add Column and then the button labeled Add Custom Column.  Add the following equation to the new column and name it Week #, then click OK, to continue.

    Number.RoundDown( Number.From(Date.AddDays( List.Max( Table.Column(#"Converted to Table", "Date" ) ) , -1 * Number.From( List.Max( Table.Column(#"Converted to Table", "Date" ) ) - Date.StartOfWeek( List.Max( Table.Column( #"Converted to Table", "Date" ) ) , Day.Saturday ) ) ) -[Date] ) / 7 + 1 , 0)

    This equation defines the start of the week highlighted in RED.  Since today is Tuesday 8/30/16, then the days 8/30 (Tues), 8/29 (Mon), 8/27 (Sunday) are considered week 0 or the current week.  All dates prior will start with weekly increment.

    Date List
    Date List

    Now we can add some logic to define week variables.  Click on the Add Column ribbon and select the Conditional Column button.  Using the drop downs in Column Name, Operator, Value and Output enter the following:

    Current Week Logic
    Current Week Logic

    Click OK to proceed.  We have now added an additional column with a text description of the week.

    Current Week Column
    Current Week Column

    Following the add column steps mentioned above we will now add more week descriptions.  Add the following conditional column for Last Week:

    Last Week Logic
    Last Week Logic

    From here you can make custom columns for how you want to describe your data.  In this example we will build last 2 weeks, 3 weeks and last 4 weeks.  See the add conditional column logic for each of those respective weeks.

    Conditional Column Logic for last 2 weeks:

    Last 2 Weeks Logic
    Last 2 Weeks Logic

    Note: When we added this conditional column we label week 0 as last 2 weeks.  See image below as an example:

    Last 2 Weeks Column
    Last 2 Weeks Column

    To fix this we modify the code that generated this column.  The code initially states the following:

    = Table.AddColumn(#"Added Conditional Column1", "Last 2 Weeks", each if [Week Number] < 3 then "Last 2 Weeks" else null )

    We modify this code to the following: (changes highlighted in bold)

    = Table.AddColumn(#"Added Conditional Column1", "Last 2 Weeks", each if [Week Number] < 3 and [Week Number] > 0 then "Last 2 Weeks" else null )

    This now removes the first three days from our Last 2 Weeks column reflecting a more accurate picture of our time ranges.

    Corrected Last 2 Weeks Column
    Corrected Last 2 Weeks Column

    Next we will add the Last 3 Weeks column and the Last 4 weeks column.  Each time we will modify the add column code to remove the first three dates of the current week.

    Last 3 Weeks Logic
    Last 3 Weeks Logic

    Last 3 Weeks auto generated code:

    = Table.AddColumn(#"Added Conditional Column2", "Last 3 Weeks", each if [Week Number] < 4 then "Last 3 Weeks" else null )

    We modify to the following to achieve the correct Last 3 Weeks data range: (changes highlighted in bold)

    = Table.AddColumn(#"Added Conditional Column2", "Last 3 Weeks", each if [Week Number] < 4 and [Week Number] > 0 then "Last 3 Weeks" else null )

    Add the Last 4 Weeks column:

    Last 4 Weeks Logic
    Last 4 Weeks Logic

    Last 4 Weeks auto generated code:

    = Table.AddColumn(#"Added Conditional Column3", "Last 4 Weeks", each if [Week Number] < 5 then "Last 4 Weeks" else null )

    Modify the code the following to correct the column: (changes highlighted in bold)

    = Table.AddColumn(#"Added Conditional Column3", "Last 4 Weeks", each if [Week Number] < 5 and [Week Number] > 0 then "Last 4 Weeks" else null )

    Nice job so far.  We are almost to the end now.  After all those additional columns you should have something that looks similar to the following:

    Date Table
    Date Table

    Next we will pivot all the data down to one column.  This will enable us to select a time period and automatically have our date table update to the specific range.

    First, shift select the following columns, Current Week, Last Week, Last 2 Weeks, Last 3 Weeks, and Last 4 Weeks.  Then on the Transform ribbon click the Unpivot Columns button.

    Unpivot Columns Command
    Unpivot Columns Command

    Next delete the Attribute column using a right click on the Attribute column and selecting Remove Columns.

    Remove Attribute Column
    Remove Attribute Column

    Rename the Value column to Selector by right clicking on the Value column.

    Rename the Value Column
    Rename the Value Column

    Modify each column to have the correct Data Type on the Home ribbon.

    Date column data type should be Date

    Week Number column data type should be Whole Number

    Selector column data type should be Text

    Note: It is important to always check your data types for each column before you leave the Query Editor.  If you don’t you’ll find that the visuals that your trying to build later on on the page view will not work as expected.

    Next, click the Home ribbon and select Close & Apply.  You can now build the following visuals:

    A slicer for the Selector column:

    Selector Column as a Slicer
    Selector Column as a Slicer

    Table visual for the Date column:

    Note: When you use the Date Column as the data source for the Table Visual the data will automatically be added as a Date Hierachy.  This does not work well with our data so you will need to change the date from a Date Hierarchy to a standard Date.  To do this click the little triangle next to the Date in the Values box.  Then select Date.

    Date Table
    Date Table

    Now you can finally play around with your data and by selecting different items in the Selector slicer you can filter down to different date ranges.  Below I selected the Last Week item, which filters down my dates to only the 7 days from last week.

    Last Week Slicer Selected
    Last Week Slicer Selected

    Nice job making a custom date table in PowerBI.  The nice part about this table is that it will always refresh with the latest dates whenever the queries are refreshed for this PowerBI file.

    Bonus:  For those of you who want to cheat and just have the M code to generate this custom date table it can be used from here:

    let
     Source = List.Dates(DateTime.Date(DateTime.FixedLocalNow()), 90, #duration(-1,0,0,0)),
     #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), {"Date"}, null, ExtraValues.Error),
     #"Added Custom1" = Table.AddColumn(#"Converted to Table", "Week Number", each Number.RoundDown( Number.From(Date.AddDays( List.Max( Table.Column(#"Converted to Table", "Date" ) ) , -1 * Number.From( List.Max( Table.Column(#"Converted to Table", "Date" ) ) - Date.StartOfWeek( List.Max( Table.Column( #"Converted to Table", "Date" ) ) , Day.Saturday ) ) ) -[Date] ) / 7 + 1 , 0)),
     #"Added Conditional Column" = Table.AddColumn(#"Added Custom1", "Current Week ", each if [Week Number] = 0 then "Current Week" else null ),
     #"Added Conditional Column1" = Table.AddColumn(#"Added Conditional Column", "Last Week", each if [Week Number] = 1 then "Last Week" else null ),
     #"Added Conditional Column2" = Table.AddColumn(#"Added Conditional Column1", "Last 2 Weeks", each if [Week Number] < 3 and [Week Number] > 0 then "Last 2 Weeks" else null ),
     #"Added Conditional Column3" = Table.AddColumn(#"Added Conditional Column2", "Last 3 Weeks", each if [Week Number] < 4 and [Week Number] >0 then "Last 3 Weeks" else null ),
     #"Added Conditional Column4" = Table.AddColumn(#"Added Conditional Column3", "Last 4 Weeks", each if [Week Number] < 5 and [Week Number] > 0 then "Last 4 Weeks" else null ),
     #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Added Conditional Column4", {"Date", "Week Number"}, "Attribute", "Value"),
     #"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"}),
     #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Value", "Selector"}}),
     #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Date", type date}, {"Week Number", Int64.Type}, {"Selector", type text}})
    in
     #"Changed Type"
  • Loading Data From Folder

    Loading Data From Folder

    Let me setup a scenario for you.  You get a data file from an automated system, it has the same number of columns but the data changes for every new file.  Being the data savvy person that you are you’ve spent some time working in excel to make a template where you can copy your new data into and then automatically all your equations and graphs magically work.  You pat your self on the back and happily send out your fantastic report to everyone you know.  Then tomorrow when the data comes to you again you repeat the same process over again.  Still enamored by your awesome report, you send it out again knowing you have saved your self so much time not having to do the analysis or creation of your reports over and over again.  Now, fast forward 3 months.  That stupid report shows up again, and now you have to lug all that data from file to file and begrudgingly you sent out your report.  Thus, is the store of the analyst.  You love data, but you hate it as well.  Well in this tutorial I’ll show you how to remove some of the pain of that continual data loading process by loading new data from a folder.

    My previous post (found here) talks about loading data from a folder.  In this tutorial we will add some logic to this method that will look at a folder but only load the most recently added item from that folder.

    Data for this tutorial is located this link Monthly Data Zip File.  This data in the ZIP file is a monthly data sample from Feb 2016 to April of 2016.

    Download the zip file mentioned above and extract the Monthly Data folder down to your desktop.  Open up PowerBI Desktop and click on the Get Data button and select All on the left side.  Click on the item labeled Folder and click Connect to continue.

    Get Data from Folder
    Get Data from Folder

     

    Select the newly unzipped Monthly Data folder that should be on your desktop.  Click OK to continue. Upon opening that folder location you will be presented with the multiple files.  Click Edit to edit the query.

    Edit Query for Folder Load
    Edit Query for Folder Load

    Now you are in the Query Editor.  This is where the fancy query editing will work to our advantage.  We could load all the data into one large query.  However, depending on the size of your data sets or how you want to report your data this may not always be desirable.  Instead you may only want data from April, then May when the new data is sent next month.

    Thus, our first step to start pairing down the data will be to first filter the files in sequential order.  In this case because I have named the files with a Year-Month-Day format I can sort the files according to their names.

    Note:  When using PowerBI desktop it is a good practice to name the files  beginning with a YYYY-MM-DD file name.  This makes it really easy when sorting and ingesting information into PowerBI.  I have used other columns of information such as Date Accessed or Date Created before but have gotten inconsistent results as these dates can change depending on when a file was moved or copied from one place to another.

    Click the drop down next to Name and sort the files in Sort Descending.

    Name in Descending Sort
    Name in Sort Descending

    This places the files with the most recent file at the top of the list.

    File List in Descending Order
    File List in Descending Order

    Next click on the Keep Rows button on the Home ribbon, select Keep Top Rows.

    Keep Top Rows
    Keep Top Rows

    Enter the number when the popup appears.  Click OK to continue.

    Keep Top Rows Menu
    Keep Top Rows Menu

    Now you’ll notice you have only one file selected which is our latest file from April.  Click the Load File button found in the Content column.

    Load File Button
    Load File Button

    We have completed the activities in the Query Editor and can now load the data.  Click Close & Apply found on the Home ribbon.  All our April data has loaded.  by making a simple table we can now see all the data that was just loaded.

    Loaded Data from April
    Loaded Data from April

    Now we will remove some data from our desktop folder labeled monthly data.  Open the folder on the desktop labeled Monthly Data and delete the filed labeled 2016-04-01 April.  You should now have a folder labeled Monthly Data with only two files in it, one for Feb and one for March.

    Two Files Left
    Two Files Left

    Return back to Power BI Desktop and click the Refresh button on the Home ribbon.  Notice now how all our data has changed.  We are now looking at the March data because it is the most recent file in our folder based on the file name.

    March Data Load
    March Data Load

    To verify this we open the query editor (Click the Edit Queries on the Home ribbon).  Click Refresh Preview on the Home ribbon and finally select the Applied Step called Kept First Rows.  This will reveal the month of March as our data source.

    Month of March Loaded
    Month of March Loaded

    Now, every time you add a new file to our folder and refresh PowerBI the latest file (based on the naming convention we talked about earlier) will always be loaded.

    Note: This method works great when your data source is coming from an automated system.  The file format must always be the same for this to work reliability.  If the file naming convention changes, or the number of columns or location of those columns changes then the query will most likely fail.

    Good luck and thanks for following along.

  • Manually Enter Data

    Manually Enter Data

    There are often times when you need a small data set in order to make a visual behave exactly how you want it to.  This may mean you need a small table to represent a range of numbers or text values.

    Here are the Resources for this tutorial:

    • Power BI Desktop (I’m using the March 2016 version, 2.33.4337.281) download the latest version from Microsoft Here.

    To enter your own data Click the Enter Data button on the Home ribbon.

    Enter Data
    Enter Data Button

    Next you are prompted with the Create Table window.  In this window you are given the layout of a unfilled table.  To begin entering data you can click in the first cell in Column one and start entering data.  By pressing enter a new cell will populate below.  You can Rename the column by double clicking the column name.  To add a second column you Click on the symbol next to your existing column.  Finally to edit the table name you can type in the desired table name in the Name input box in the bottom left hand portion of the window.

    Create Table
    Create Table Window

    Finally, you can either to choose to Load the data as is or Edit the data to make additional changes (this can be useful to edit the data types of each column or to populate equations in subsequent columns).  For the sake of this tutorial we will simply load the data.  Click Load to load the data into the data model.

    Now drag over the columns into the page view to begin generating visuals.  By default PowerBI makes a table of data to show you the values you just entered.

    Sales Table
    Visual of Sales Table

    Select the table visual (you know it is highlighted when it has the trim boarder as shown above) and Click the Doughnut Visual.  This transforms the data into a doughnut, and who doesn’t like a nice data doughnut?  Click anywhere in the page to de-select the new doughnut visual.  Add a second table by dragging over he Region and Sales columns.  We can now see the pretty graphic and the numbers supporting that visual.

    Visuals
    Visuals Made with our Custom Data

    I bet you didn’t notice that something changed here.   Look closely at the data we see now vs. what we entered earlier.  Go ahead, scroll up, I’ll wait…  Did you catch it?

    We now have 5 rows of data but we entered 6 before.  That is because the Sales column is a number column and can be aggregated.  Look in the fields column and you see there is a little sum symbol in front of the Sales column.  This means that this column has a default summarization associated.  To see what is the default summarization highlight Sales by clicking on the column name in the grey area.  Then Click the ribbon titled Modeling, and there it is in the properties section the Default Summarization is Sum.  Every time you use the Sales column it will be summarized in the tables and visuals views.  Our visual table shows Brazil with a total sales of 600, because we had two Regions labeled as Brazil 500 and 100.

    Now you can click on any of the data points in the doughnut.  Notice the table automatically filters down to only show the areas you selected.

    Brazil Data
    Data Filtered to only Brazil

    ProTip: you can select multiple selections by holding down CTRL and selecting multiple items in the visual.  You can only do this inside of one visual.  As soon as you click another visual all filtering will disappear.

    Again, I hope you enjoyed this quick tutorial. If you liked it make sure you share it below.