Tag: DAX Expressions

  • All Columns have Data Types in DAX, Or do they?

    All Columns have Data Types in DAX, Or do they?

    To say this another way: not all columns have data types in DAX. This matters specifically when using dynamic ranking with an “Other” category. Let me explain.

    The Tabular model is comprised of tables with typed columns. This is a truth learned in one’s Tabular infancy. Whether the table is imported, calculated, or Directly Queried does not matter; all tables have columns with explicitly defined types.

    DAX is the language of the Tabular model. It has less type discipline than tables in the Tabular model. Measures can have a Variant data type. Implicit type conversions abound. Nor are DAX expressions limited to returning scalar values. DAX is a query language. An expression may return a table. When defining a table in DAX, projected columns have no explicit type. Each row’s value is calculated independently and its type depends on its value. This is the phenomenon we seek to explore in this post. If this seems unsurprising, you can probably stop here.

    A practical scenario

    Here is the scenario where I discovered this. I was helping a colleague with a dynamic ranking measure and aggregation. The requirement read:

    Show the top four production plants as ranks 1 through 4. Group all others ranked 5 or lower together as “Other”.

    We have three tables:

    1. a fact table 'Fact'
    2. a dimension table 'Dim', related to 'Fact' with the relationship 'Fact'[DimKey] <-N:1- 'Dim'[DimKey]
    3. a disconnected table, 'Rank Selector', with the values: “1”, “2”, “3”, “4”, and “Other”.

    We aggregate 'Fact'[Value]. Here is a Power BI file with sample data and code.

    Our first try

    We started with a measure [Agg] as our base measure.

    Agg = SUM ( 'Fact'[Value] )

    [_Rank (variant)] is a helper ranking measure, returning a rank 1-4 or “Other”.

    _Rank (variant) =
    VAR rawRank = RANKX ( ALL ( 'Dim' ), [Agg],, DESC )
    // note this value is either a number 1-4 OR the string, "Other"
    VAR groupedRank = IF ( rawRank <= 4, rawRank, "Other" )
    RETURN
        groupedRank

    Below is a screengrab of these two measures evaluated against our 'Dim'[Plant]. As you can see, [Agg] returns a value for each plant. The top four plants have ranks of 1, 2, 3, and 4. The other plants all have the same rank, “Other”. This is all as expected.

    'Dim'[Plant] with [Agg] and [_Rank (variant)] measures, working largely as expected. Each dim value has a value for [Agg] and a rank of 1-4 or the string, “Other”.

    The issue reared its head when we attempted to use these ranks to define a new visual based on 'Rank Selector'[Rank]. We defined a measure to show [Agg] based on the plant’s rank (1-4 or “Other”), rather than the plant name. Note that 'Rank Selector'[Rank] is a column in a data model table, with Data type: Text. Our expectation is that we will see a table like the image below.

    A table visualization showing six rows with labels: “1”, “2”, “3”, “4”, “Other”, and “Total”; and showing an aggregate alongside each.

    Wherein our heroes encounter a challenge

    [Ranked Agg (variant)] is the display measure to show aggregations against 'Rank Selector'[Rank]. We wrote the variant version you see below first, not realizing the perils that lay ahead. (Correct implementations follow later.)

    Ranked Agg (variant) = 
    VAR rankContext = VALUES ( 'Rank Selector'[Rank] )
    VAR ranked =
        ADDCOLUMNS (
            ALL ( 'Dim' ),
            "@rank", [_Rank (variant)],
            "@val", [Agg]
        )
    VAR currentRank = FILTER ( ranked, [@rank] IN rankContext )
    VAR result = SUMX ( currentRank, [@val] )
    RETURN
        result

    When we visualize this measure against 'Rank Selector'[Rank], we get the result below.

    A table visualization showing only the 'Rank Selector'[Rank] value “Other” and “Total”. Specifically, we do not see any value for ranks “1”, “2”, “3”, or “4”.

    What is going on here? We see data only for the 'Rank Selector'[Rank] label “Other” but for none of the numbers. Let us walk through the measure step-by-step to understand this result.

    VAR rankContext = VALUES ( 'Rank Selector'[Rank] )

    rankContext captures whatever the current filter context is for 'Rank Selector'[Rank]. This is one or a collection of the values: “1”, “2”, “3”, “4”, and “Other”. For a detail row of a table visualization, the context contains exactly one of those values. For a total row (assuming no other slicers or filters), the context contains all of those values.

    VAR ranked =
        ADDCOLUMNS (
            ALL ( 'Dim' ),
            "@rank", [_Rank (variant)],
            "@val", [Agg]
        )

    ranked is a table with all rows and columns of the data model table 'Dim' and two projected columns, [@rank] and [@val]. For any row, i.e., for any specific plant, [@rank] is one of 1-4 or “Other”. Many rows may have [@rank] = "Other". We do not expect ties, so there are exactly four rows with numeric ranks, one each for the ranks 1-4. (This table looks like the first table visualization screenshot above.)

    The critical part of both the ranking logic and the phenomenon we are exploring is in currentRank. This is a subset of the table, ranked, holding only the row or rows which have a [@rank] value that exists in the filter context captured in rankContext (again, one or a collection of the values “1”, “2”, “3”, “4”, and “Other”).

    Note that we see data only for the 'Rank Selector'[Rank] label “Other”. As you recall, our [_Rank (variant)] seemed to work correctly above – it definitely returned ranks 1-4, not just “Other”. 'Rank Selector'[Rank] has the correct values. We checked that our logic in [Ranked Agg (variant)] was correct. We verified in DAX Studio that ranked held the correct table. We even got to the point that we checked whether VALUES was behaving as we expected in rankContext. (I will note that if you find yourself verifying the behavior of an expression as basic as VALUES ( 'Rank Selector'[Rank] ), then you, too, may find yourself questioning what has brought you to this point.)

    VAR currentRank = FILTER ( ranked, [@rank] IN rankContext )

    We continued checking and troubleshooting and identified that currentRank had a binary state: either it would be an empty table or it would be a table holding only rows with [@rank] values of “Other”. It would never hold a row with a rank value of 1, 2, 3, or 4. It seemed that the predicate in our FILTER would never find 1 to be equal to 1, or 2 equal to 2, and so on.

    How could basic logic be so broken just in this measure? There was much gnashing of teeth. Several head-shaped indents were beaten into the nearest wall.

    DAX, the language, has a more permissive type system than the Tabular model

    You may have observed some apparent inconsistency in my quoting or not-quoting the rank values above. In fact, I have been scrupulous to always quote values when referring to 'Rank Selector'[Rank], the model column with type Text, and to not-quote the rank values returned from [_Rank (variant)]. The column 'Rank Selector'[Rank] has exclusively string values. The measure [_Rank (variant)] sometimes returns a numerically typed value in the range 1-4 and sometimes returns the string “Other”.

    In DAX, 1 = 1 is an equality test that returns TRUE. Similarly, 1 IN {1, 2, 3, 4} returns TRUE, because 1 = 1 and 1 exists in the table there. In DAX, 1 = "1" is an equality test that throws a type error. Numbers cannot be tested for equality with strings. They are different types. Therefore, a number can never be equal to a string. Thus, 1 IN {"1", "2", "3", "4", "Other"} also throws a type error.
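    As a minimal sketch of the comparisons described above, the following query could be run in DAX Studio. It sticks to the two cases that evaluate cleanly: a number tested against numbers, and a number converted to Text with FORMAT before being tested against strings. The column names are made up for illustration.

    ```dax
    EVALUATE
    ROW (
        // a number tested against a set of numbers
        "number in numbers", 1 IN { 1, 2, 3, 4 },
        // the same number, converted to Text first, tested against a set of strings
        "converted in strings", FORMAT ( 1, "0" ) IN { "1", "2", "3", "4", "Other" }
    )
    ```

    Both columns should come back TRUE, because in each case the value and the set it is tested against share a type.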

    The lightbulb moment

    In [Ranked Agg (variant)], currentRank has a column [@rank] with type Variant. Sometimes, [@rank] has a value in the range 1-4, type Whole Number. Sometimes it has the value “Other”, type Text. When we evaluate the predicate [@rank] IN rankContext, there are exactly two possible results. Either we ask for an impossible membership test, “Is this numerically typed rank value in the set of Text values?”, or we ask whether the string “Other” is in that set. The first cannot succeed. The second only succeeds for the “Other” rank.

    The Fix

    The fix, then, is straightforward. We must always return a value of type Text in our ranking measure. Correct measures are below with comments highlighting the modifications.

    _Rank (typed) = 
    VAR rawRank = RANKX ( ALL ( 'Dim' ), [Agg],, DESC )
    VAR groupedRank =
        // with an explicit typecast, our measure now *always* returns a string-typed value
        IF (
            rawRank <= 4,
            FORMAT ( rawRank, "0" ), // force string type
            "Other"
        )
    RETURN
        groupedRank
    
    Ranked Agg (typed) = 
    VAR rankContext = VALUES ( 'Rank Selector'[Rank] )
    VAR ranked =
        ADDCOLUMNS (
            ALL ( 'Dim' ),
            "@rank", [_Rank (typed)], // the only difference here is using our typed rank helper measure
            "@val", [Agg]
        )
    VAR currentRank = FILTER ( ranked, [@rank] IN rankContext )
    VAR result = SUMX ( currentRank, [@val] )
    RETURN
        result

    The measures above give us the results we expected all along, as you can see in the table visualization below.

    A table visualization showing 6 rows with labels: “1”, “2”, “3”, “4”, “Other”, and “Total”; and showing the correct values of [Ranked Agg (typed)] alongside.
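    An alternative, if the helper measure has to stay Variant for some other reason, might be to coerce the value where it is consumed. The sketch below has not been run against the sample file; it assumes that concatenating with an empty string converts the numeric ranks to Text, and the measure name [Ranked Agg (coerced)] is made up for illustration.

    ```dax
    Ranked Agg (coerced) =
    VAR rankContext = VALUES ( 'Rank Selector'[Rank] )
    VAR ranked =
        ADDCOLUMNS (
            ALL ( 'Dim' ),
            // the & operator returns Text, so every row of [@rank] gets the same type
            "@rank", [_Rank (variant)] & "",
            "@val", [Agg]
        )
    VAR currentRank = FILTER ( ranked, [@rank] IN rankContext )
    RETURN
        SUMX ( currentRank, [@val] )
    ```

    Fixing the type inside the helper measure, as above, is probably the cleaner choice, since every consumer of the measure then gets a consistent type.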

    Below is a screengrab of all the measures in visualizations, so you can see the differences.

    A screenshot with all three of the table visualizations shown prior in this post: the 'Dim'[Plant] table with [Agg], [_Rank (variant)], and [_Rank (typed)]; the incorrect behavior with [Ranked Agg (variant)]; and the correct behavior with [Ranked Agg (typed)].

    Summary

    Tabular model tables have columns of explicit type. DAX expressions have implicit types. Columns defined in DAX have Variant type. Each row has an implicit type based on its value. If such a column is a Calculated Column added to a data model table, then an explicit type must be assigned and all values in that column will be cast to that type at load time. When such a column is used as an intermediate step in some calculation, each row may have a different type. If your operations depend on types, you must account for this explicitly. This is especially relevant when operating on data model and DAX tables together.

    Insights from the community

    When talking about this with some peers, Maxim Zelensky (blog, Twitter) observed that there is no type indication in Power BI Desktop. Nevertheless, you can identify this sort of thing in DAX Studio. The result of a query such as that below will right-align Variant-typed values and left-align Text-typed values. This query references the included Power BI file and its results are shown in a screenshot below. You can observe similar alignment in the table visual shown above. In general, when visualized, numeric data is presented with right-alignment and textual data left-aligned. This practice predates Power BI; it is a general rule that Power BI and DAX Studio wisely adopted.

    EVALUATE
    ADDCOLUMNS (
        VALUES ( 'Dim'[Plant] ),
        "@text", [_Rank (typed)],
        "@variant", [_Rank (variant)]
    )
    A screenshot showing the results of the above query, with [@text] left-aligned and [@variant] right-aligned.
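    Alignment is only a visual hint. A more explicit check, sketched here on the assumption that ISNUMBER and ISTEXT report the type of each row's value, is to add test columns to the same kind of query. The column names are made up for illustration.

    ```dax
    EVALUATE
    ADDCOLUMNS (
        VALUES ( 'Dim'[Plant] ),
        "@variant", [_Rank (variant)],
        // expected TRUE for plants ranked 1-4
        "@is number", ISNUMBER ( [_Rank (variant)] ),
        // expected TRUE for plants grouped as "Other"
        "@is text", ISTEXT ( [_Rank (variant)] )
    )
    ```

    If the two test columns differ from row to row, the measure is returning more than one type.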

    Using the sample code and PBIX

    Here is a link to download the PBIX that was used for examples above. This has four visuals to demonstrate the phenomenon discussed in this blog post. The first visual is a table with 'Dim'[Plant], [Agg], [_Rank (variant)], and [_Rank (typed)]. Of note in the first visual is left-vs-right alignment of the two [_Rank ...] measures. The second visual is a table with 'Rank Selector'[Rank] and [Ranked Agg (variant)], showing the incorrect results of the measure depending on a Variant-typed intermediate column, with a value only for rank “Other”. The third visual contrasts with the second, using [Ranked Agg (typed)] to show the correct behavior of the dynamic ranking and grouping logic, with a value for all ranks and a correct total. Lest one protest that the phenomenon above may be an artifact of using VARs, the last visual, on the bottom, shows an example that uses no VARs and references no other measures. The behavior of this last visual is identical to the incorrect [Ranked Agg (variant)] version.

  • Santa Loves Power BI and R

    Santa Loves Power BI and R

    This past week I was talking with the big guy up north, jolly old fella, and the discussion came up about his toy production levels.  Santa was complaining about how hard it was to measure the performance of all his elves.  Naturally I started babbling about how much I enjoy Power BI and that I use it on all kinds of data sources: Google Analytics, Excel sheets, SharePoint, and SQL data warehouses, just to name a few.  Now by this point most people would have wandered off looking for another conversation, but I must have struck a chord with Santa.  He jumped right into the conversation and told me how he had just moved all his local data centers into Azure, and more specifically SQL data warehouses.  It was saving him loads of money, and in addition it had freed up all his I.T. elves to move to more important tasks: building the NES Classic for Nintendo, since they are way behind in production.  To make a long story longer, I was able to convince Santa to give me a small sample of data so I could show him how to use R to visualize his data in PowerBI.  Here is what I came up with:

    Santa Production Levels

    Needless to say he was very pleased.  I explained the chart to Santa: each bar represents the average production volume for each elf.  The whiskers at the end of the bar represent one standard deviation above and below that mean.  It essentially tells you how consistently each elf is able to produce products and what the average production rate is.  For example, Buddy the Elf can produce an average of 148 items in a day, with a daily standard deviation of 10 items.  Charlie can produce more items on average but has a wider daily spread.  Snowflake has the lowest average production level but is one of the more consistent producers.  Santa gave me a big smile and said “nice job.”

    Let’s walk through how I did this.

    Open up PowerBI Desktop, Click the Get Data button on the Home ribbon and select Blank Query.  Click Connect to open the Query Editor.  Click Advanced Editor on the View ribbon.  While in the Advanced Editor paste the following code into the editor window, click Done to complete the data load.

    Note: If you need some more help loading the data follow this tutorial about loading data using the Advanced Query Editor.  This tutorial teaches you how to copy and paste M code into the Advanced Editor.

    let
        Source = Excel.Workbook(Web.Contents("https://powerbitips03.blob.core.windows.net/blobpowerbitips03/wp-content/uploads/2016/12/Santa-Production.xlsx"), null, true),
        Production_Table = Source{[Item="Production",Kind="Table"]}[Data],
        #"Changed Type" = Table.TransformColumnTypes(Production_Table,{{"Elf", type text}, {"Toy", type text}, {"Prodution Volume", Int64.Type}})
    in
        #"Changed Type"

    Before you exit the query editor Rename the query to Production. It should look similar to the following:

    Production Query

    Click Close & Apply on the home ribbon.

    Add the following measures by clicking on the New Measure button on the Home ribbon.

    Avg = AVERAGE(Production[Prodution Volume])

    The Avg measure will determine the height of each bar in the bar chart.

    StdDev = STDEV.P('Production'[Prodution Volume])

    The StdDev will calculate the standard deviation for each elf.
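    STDEV.P treats the rows as the entire population.  If the rows are better thought of as a sample of each elf's output, a variant using STDEV.S (which divides by n - 1) may be preferable.  This is only a sketch, and the measure name is made up for illustration.

    ```dax
    StdDev (sample) =
    // sample standard deviation; the column name matches the spelling in the source workbook
    STDEV.S ( 'Production'[Prodution Volume] )
    ```

    With many rows per elf the two versions give nearly the same whisker length, so the choice mostly matters for small data sets.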

    Ymax = [Avg]+ [StdDev]

    The Ymax calculation adds the Avg measure value to the standard deviation for production.  This produces the upper arm of the whisker.

    Ymin = [Avg]-[StdDev]

    The Ymin calculation subtracts the standard deviation from the Avg measure value.  This produces the lower arm of the whisker.

    Once you have completed making all the measures you should have a Production table with the following fields:

    Added Measures

    Add the table visual by clicking on the Table visual in the Visualizations pane.  Add the Fields which are shown below.  Your table should look identical to this:

    Production Table

    Next, add the R Visual from the visualization Pane.  When you click on this you will get a message stating “Enable Script Visuals” click Enable to proceed.

    Note: If you have not installed R or enabled the preview features of R in Power BI you should follow this tutorial which helps you get everything set up.  For this particular visual we are using ggplot2.  This is a package for R and should be installed in the R environment.  You can follow this tutorial on how to install ggplot2.

    Add the following fields into the R visual:

    Add Fields to R Visual

    Next in the R Script Editor add the following code to generate the R Script.

    library(ggplot2) # Load ggplot2 to draw the visual

    # Set up graph
    ggplot(dataset, aes(x = Elf, y = Avg)) +

    # Insert the bar chart using actual values passed to the visual
    # stat = "identity" does not count items; it uses the actual values
    # Set bar opacity to 70% with alpha (a constant, so it sits outside aes)
    geom_bar(stat = "identity", aes(fill = Elf), alpha = 0.7) +
    # Draw the error bars using Ymin & Ymax passed from Power BI
    geom_errorbar(aes(colour = Elf, ymin = Ymin, ymax = Ymax), width = 0.5) +

    # Change the labels
    labs(x = "Elf Name", y = "Production Vol.") +

    # Make the theme simple and remove gridlines
    # Change the font size to 18
    theme_classic(base_size = 18) +

    # Remove the legend
    theme(legend.position = "none") +

    # Change elements of the axis: angle, horizontal & vertical position
    theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.3),
    axis.text = element_text(colour = "black"),
    axis.ticks = element_line(colour = "black"),
    axis.title = element_text(colour = "black"),
    plot.background = element_rect(colour = NA))

    Note: This code uses the R package ggplot2.  It will error out if you don’t have ggplot2 installed. 

    Click the run icon to execute the R script.

    Add R Script & Run Script

    When the script runs you will have a beautiful production chart.

    R Chart

    Thanks for following along.  As always, be sure to share if you liked this content.  I am always looking for feedback and possible topics, so make sure you leave a comment below.

    If you want to download a similar example already completed you can download this example from the R Script Showcase (don’t forget to give me a thumbs up).

    Merry Christmas!

  • Measures – Year Over Year Percent Change

    Measures – Year Over Year Percent Change

    This tutorial is a variation on the month to month percent change tutorial.  This specific exploration in year over year performance was born out of reviewing my Google Analytics information.  The specific analysis question I am trying to answer is, how did this current month of website visitors compare to the same month last year?  For example, I want to compare the number of visitors for November 2016 to November 2015.  Did I have more users this year in this month or last year?  What was my percent change between the two months?

    Here is a sample of the analysis:

    Let’s begin with loading our data and data transformations.  Open up PowerBI Desktop, Click the Get Data button on the Home ribbon and select Blank Query.  Click Connect to open the Query Editor.  Click Advanced Editor on the View ribbon.  While in the Advanced Editor paste the following code into the editor window, click Done to complete the data load.

    Note: If you need some more help loading the data follow this tutorial about loading data using the Advanced Query Editor.  This tutorial teaches you how to copy and paste M code into the Advanced Editor.

    let
     Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("VdDBDcQwCETRXnyOFMAYcC1W+m9jV8BhfH1ygJ9zBr/8CvEaz+DYNL7nDAFjnWkTTNsUbIqnLfyWa56BOXOagy2xtMB5Vjs2mPFOYwIkikIsWd6IKb7qxH5o+bBNwIwIk622OCanTd2YXPNUMNnqFwomp0XvDTAPw+Q2uZL7QL+SC1Wv5Dpx/lO+Hw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"Start of Month" = _t, Sales = _t]),
     #"Changed Type" = Table.TransformColumnTypes(Source,{{"Start of Month", type date}, {"Sales", Int64.Type}}),
     #"Inserted Month" = Table.AddColumn(#"Changed Type", "Month", each Date.Month([Start of Month]), type number),
     #"Inserted Year" = Table.AddColumn(#"Inserted Month", "Year", each Date.Year([Start of Month]), type number)
    in
     #"Inserted Year"

    While still in the Query Editor rename the query to Data.  Then click Close & Apply to complete the data load into the data model.

    Load Monthly Data

    Next, make four measures.  On the Home ribbon click the New Measure button.  Enter the following to establish a reference date for the subsequent equations:

    Date Reference = DATE(2016,12,31)

    Enter the following equation to calculate last year's monthly sales amount.

    LastYear = 
      VAR 
        CurrentDate = [Date Reference]
      RETURN
        CALCULATE( 
         SUM(Data[Sales]), 
         Data[Year] = YEAR(CurrentDate)-1
        )

    Note: The reference date above is fixed.  If you use the NOW() function instead, it returns the date and time when the query was last run.  Thus, if you refresh your data next month, NOW() wrapped in YEAR() will return the current year from the date-time observed by PowerBI.
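    If a moving reference date is preferred over the fixed DATE(2016,12,31), a sketch along the following lines may work.  The measure name is made up for illustration, and TODAY() is used here rather than NOW() because only the date part is needed.

    ```dax
    Date Reference (today) =
    // returns the current date each time the measure is evaluated
    TODAY ()
    ```

    With this tutorial's sample data, which is anchored to 2016 by the fixed reference date, a moving date will likely return blanks, so the fixed version is the safer choice for following along.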

    Following the same process enter the following additional measures.  The ThisYear measure calculates the sales for the current year; on the chart, each month shows its sales for that year.

    ThisYear = 
      VAR 
       CurrentDate = [Date Reference] 
      RETURN
       CALCULATE(SUM(Data[Sales]),Data[Year] = YEAR(CurrentDate))

    Finally, add the calculation for the Year to Year comparison.

    YoY Percent Change = DIVIDE([ThisYear], [LastYear], 0)-1
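    One thing to be aware of: when [LastYear] is blank or zero, DIVIDE returns the alternate result 0, and subtracting 1 then reports -100%.  If a blank is preferred in that case, a variant like the sketch below might be used.  The measure name and variable names are made up for illustration.

    ```dax
    YoY Percent Change (blank-safe) =
    VAR currentSales = [ThisYear]
    VAR priorSales = [LastYear]
    RETURN
        // (current - prior) / prior equals current / prior - 1;
        // with no alternate result, DIVIDE returns BLANK when priorSales is 0 or blank
        DIVIDE ( currentSales - priorSales, priorSales )
    ```

    For months that have sales in both years, this should give the same percentage as the measure above.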

    Since the YoY Percent Change is a real percentage we need to change the formatting to a percent.  Click on the YoY Percent Change measure, then on the Modeling ribbon click the % symbol in the formatting section of the ribbon.

    Change Measure Format

    Next, add a Stacked Column Chart with the following columns selected.

    Add Stacked Column Chart

    OK, we have a chart, but it is kinda awkward looking right now.  The x-axis is the month number but we don’t have a month 0.  That simply does not make sense.  Let’s change some of the chart properties.  While having the Stacked Column Chart selected click on the Paint Roller in the Visualizations pane.  First, click on the X-Axis and change the Type to Categorical.

    Change X-Axis

    Then click on the Data Colors and turn on Diverging.  Change the Minimum color to Red and the Maximum color to Green.  Set the Center to a value of 0.

    Change Colors

    Click on the Title, change it to something meaningful, center the text, and increase the font size.

    Change Title

    Our bar chart looks much better.  However, the month numbers do not look quite right.  Visually the month indicators would be cleaner if we didn’t have any decimals.  Click on the Month field and then on the Modeling ribbon change the Data Type to Whole Number.  There will be a warning letting you know that you are changing the Data Type of the Month column.  Click OK to proceed with the change.

    Change Month to Whole Number

    Another successful percent change tutorial completed.  I hope you enjoyed this year over year month comparison example.  Make sure you share if you like what you see.

  • Pareto Charting in PowerBI

    Pareto Charting in PowerBI

    The Pareto chart is a handy visual, but is not so easy to build in either Excel or PowerBI.  In a Pareto chart, information is provided about an individual product or category as a bar, and a cumulative scale as a line which compares all bars.  This type of visual can be extremely helpful when conducting failure mode analysis, investigating causes of a problem, or even balancing a product portfolio.  For some more information on Pareto charts you can learn more here or here.  If you’re interested in building a Pareto chart in Excel, I have found this post from Excel Easy to be helpful.

    To give you a little teaser of what we will be building today, below you will see an image of the final Pareto chart.  On the left side we have sales of units, and on the right is the cumulative percent of all sales.  Using the Pareto chart a user has the ability to see which products comprise the majority of your sales.  For example, the first 4 bars total approximately 50% of all sales.

    Pareto Final Product

    Alright, let’s get started.

    Open up PowerBI Desktop, Click the Get Data button on the Home ribbon and select Blank Query.  Click Connect to open the Query Editor.  Click Advanced Editor on the View ribbon.  While in the Advanced Editor paste the following code into the editor window.

    Note: If you need some more help loading the data follow this tutorial about loading data using the Advanced Query Editor.  This tutorial teaches you how to copy and paste M code into the Advanced Editor.

    let
     Source = Excel.Workbook(Web.Contents("https://powerbitips03.blob.core.windows.net/blobpowerbitips03/wp-content/uploads/2016/10/Sample-Data.xlsx"), null, true),
     Table1_Table = Source{[Item="Table1",Kind="Table"]}[Data],
     #"Changed Type" = Table.TransformColumnTypes(Table1_Table,{{"Item", type text}, {"Sales", Int64.Type}, {"Segment", type text}})
    in
     #"Changed Type"

    Rename the Query to Data.  Once you’ve completed the data load your data should look like the following.

    Load Data to Query Editor

    On the Home ribbon click Close & Apply to complete the data load.

    Close and Apply

    Let’s begin with a little exploration of our data.

    Pro Tip: When I am building reports I often load the data and then immediately start building a couple of tables and slicers.  It helps me understand how my data reacts to the slicers and helps me determine how to shape the data so that the visuals will work properly.  For this example, we only have one table, but when loading data things can get rather complex due to loading multiple tables with multiple relationships.

    Add a Slicer for the Segment.  Enhance the look of the slicer by changing it from a vertical to a horizontal slicer.  While the slicer is highlighted, click the Paint Roller, expand the General section, and change the orientation from Vertical to Horizontal.

    Segment Slicer

    Repeat the same process to add a Slicer for the item field.

    Item Slicer

    Next, add a table view of all the fields.  Start with Segment, then Item and finally add Sales to the Table Visual.

    Data Table

    Notice, now that we added all the Fields, there are a number of repeating values.  We have Category 1 and Item 1 repeated 9 times.  In some cases, it will be necessary to have this level of data brought into the data model within PowerBI.  A common reason is that this level of granularity is required for other report pages, or visuals.  It is OK to bring large amounts of data, but as a best practice it is recommended that you bring in only the data required to support the visuals.

    Now, to address these multiple items that we see in our data.  In the sample Pareto image provided at the beginning of this Tutorial we only had one bar for Category 2 Item 3.  Thus, we need to summarize each grouping of every Category and Item combination.  To do this we will construct a summary table.

    First, we will create a unique Key that will be used to summarize each combination of Category and Item pair.  Click the bottom half of the New Measure button located on the Home ribbon and select New Column.

    Calculated Column

    Enter the following DAX expression.  This new column titled Blend will be the unique Key that is utilized to summarize the data.

    Blend = Data[Segment]  &  "-"  &  Data[Item]

    Select the Modeling ribbon and then click on the New Table button.  Enter the following DAX expression.

    Summary = SUMMARIZE('Data', Data[Blend], "Sum Sales", SUM(Data[Sales]) )

    For more information on the SUMMARIZE function you can visit the Microsoft SUMMARIZE documentation page.  In this equation we first select the table, in this case ‘Data’.  Then the column we want to summarize or group by is the Blend column, noted as Data[Blend].  The next field is the title of the summarized column, noted as “Sum Sales”.  Last is the DAX expression that calculates the sum of the Data[Sales] column, noted as SUM(Data[Sales]).  It is relevant to point out here that the SUMMARIZE function returns a table, so it works for building a new table but not on its own as a calculated column or measure.

    Add a new Table visual to the report and include the two newly created fields from the Summary table.

    Summary Table Visual

    We have a field titled Blend which is our Key for all the summarized groupings.  Next, we will want to parse out the Segments and Items from this blend column.  We will want to use Category 1 & 2 in a slicer and the same for Items 1 to 5.  Highlight the summary table by clicking the grey space next to the word Summary.  Click the New Column button on the Modeling ribbon and enter the following DAX expression.

    Segment = PATHITEM(
       SUBSTITUTE(Summary[Blend], "-" , "|" ),
       1 )

    In this expression the SUBSTITUTE function replaces the dash “-” with a “|” character, which is the delimiter PATHITEM expects.  The PATHITEM function can then parse the text into segments.  By entering a 1 we select the first item in the sequence.  For our example we only have two items, but when you’re working with file paths you can have multiple items in the path such as “\users\mike\my documents\my folder\”, which would equate to users = position 1, mike = position 2, my documents = position 3, etc.

    Add another new column with the following DAX expression for the item column.

    Item = PATHITEM( 
      SUBSTITUTE(Summary[Blend], "-" , "|" ),
      2 )

    Note: We changed the PATHITEM position from 1 to 2.
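    As an aside, SUMMARIZE accepts more than one group-by column, so a possible shortcut is to group by Segment and Item directly and skip the Blend key and the PATHITEM parsing.  This is a sketch of that alternative, not the approach this tutorial follows; the table name is made up for illustration, and it assumes Blend is nothing more than Segment and Item joined by a dash.

    ```dax
    Summary (direct) =
    SUMMARIZE (
        'Data',
        Data[Segment],   // first group-by column
        Data[Item],      // second group-by column
        "Sum Sales", SUM ( Data[Sales] )
    )
    ```

    The resulting table would already contain Segment, Item, and Sum Sales columns.  The rest of this tutorial continues with the Blend-based Summary table.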

    Next add the newly created Segment and Item columns to our summary table visual that we created earlier.

    Add New Fields

    Nice job so far.  Now we have to modify our slicers to point to the new Item and Segment fields we created in the Summary table.  Select the Segment Slicer Visual and add the Segment Field from the Summary table.

    Update Segment Slicer

    Update Item Slicer

    Now that we have updated the slicers, we can control the table visual made from the Summary table.

    Select Category 1 and Items 1 to 3

    Pro Tip: To select multiple items in a slicer you can hold down the Ctrl button on the keyboard and click multiple slicer items.  This is how I was able to select Items 1 to 3.

    Now we are ready to build the measures that will support the Pareto chart.  Click on the bottom half of the New Measure button on the Home ribbon and select New Column.  Add the following DAX expression to rank all the items in the Summary table.

    Ranking = RANKX(  'Summary',   'Summary'[Sum Sales])

    Add a measure for the Cumulative total according to the new ranking column we created.  Click the top half of the New Measure button on the Home ribbon.  Add the following DAX expression.

    Cumulative Total = CALCULATE(
        SUM( Summary[Sum Sales] ),
        FILTER( ALLSELECTED( Summary ),
            Summary[Ranking] <= MAX( Summary[Ranking] )
        ))

    Repeat the add measure process and add a Total measure which will total only the items from the summary table that have been selected in the report view.  Add the following DAX expression.

    Total Sales = CALCULATE(
     SUM( Summary[Sum Sales] ) ,
     ALLSELECTED( Summary )
     )

    For the last measure, repeat the process to add another measure.  Enter the following DAX expression as a measure.

    Cumulative Percent = [Cumulative Total] / [Total Sales]
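
    If the interplay between the Ranking column and the three measures is hard to picture, the following Python sketch walks through the same arithmetic outside of Power BI.  The rows are hypothetical stand-ins for the Summary table, and the sketch assumes no two items tie on Sum Sales (RANKX would give tied items the same rank).

```python
# Hypothetical (item, sales) rows standing in for the Summary table.
rows = [("Item 1", 50), ("Item 2", 120), ("Item 3", 30)]

# Ranking: 1 for the largest Sum Sales, which is RANKX's default (descending) order.
by_sales = sorted(rows, key=lambda r: r[1], reverse=True)

# Total Sales: the sum over every selected row.
total_sales = sum(sales for _, sales in by_sales)

pareto = []
running = 0
for rank, (name, sales) in enumerate(by_sales, start=1):
    running += sales  # Cumulative Total: all rows with Ranking <= this rank
    pareto.append((rank, name, sales, running, running / total_sales))

for rank, name, sales, cumulative, percent in pareto:
    print(rank, name, sales, cumulative, f"{percent:.0%}")
```

    The last row always reaches 100%, which is what makes the line on a Pareto chart climb to the top of the secondary axis.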

    The Cumulative Percent measure returns a fraction, so we need to change this measure’s formatting to a percentage.  Click the measure labeled Cumulative Percent then change the Format to Percentage, which is found on the Modeling ribbon.

    Change Formatting

    Your Summary table should now look like the following.

    Updated Fields List

    To see all the calculations that we just created add all the fields from the Summary table to the Summary table visual we created earlier.

    Full Summary Table Visual

    At last, we are ready to add the Pareto chart.  Add the following fields to the line and stacked column chart.

    Add Line and Stacked Bar Chart

    Order the data in descending order by the number of sales by clicking the visual’s ellipsis and selecting Sort By Sum Sales.

    Sort by Sales

    This changes the order of the items to make a Pareto chart.

    Final Pareto Chart

    Thanks for following along.  Share if you enjoyed this tutorial.

  • Map with Data Labels in R

    Map with Data Labels in R

    Mapping is one of the better features of PowerBI.  It is one of the more distinguishing differences between Excel and PowerBI.  You can produce a map inside an Excel document using Bing maps; however, the experience has always felt a little like an afterthought.  Mapping within PowerBI has a planned and thoughtful integration.  While the mapping functionality within PowerBI Desktop is far improved when compared to Excel, there are still some limitations to the mapping visuals.  This past week I encountered such an example.  We wanted to draw a map of the United States, add state name labels and some dimensional property like year over year percent change.

    I started with the standard map visual, but this didn’t work because there is no ability to shade each state individually.  This just looked like a bubbled mess.

    Globe Map Visual

    Next, I tried the Filled Map visual.  While this mapping visual provides the colored states it lacks the ability to add data labels onto the map.  Clicking on the map would filter down to the selected state, which could show a numerical value.  Alternatively, you can place your mouse over a state and the resulting tag will show the details of the state (hovering example provided below).

    Filled Map Visual

    Still this did not quite meet my visual requirements.  I finally decided to build the visual in R which provided the correct amount of flexibility. See below for final result.  You can download the pbix file from the Microsoft R Script Showcase.

    R Map Visual

    In this visual, each state is shaded with a gradient color scale.  The states with the lowest sales are grey and the states with higher sales numbers transition to dark blue.  The darker the blue the more sales the state saw.  Each state has an applied label.  The color of the label denotes the percent change in sales.  If the color is green then the sales this year were higher than last year, red means that the state sales were lower this year.  The state name is listed in the label as well as the calculation for the year over year percent change.

    Alright, let’s start the tutorial.

    First, before we open PowerBI we need to install the appropriate packages for R.  For this visual you will need to install both the maps and the ggplot2 packages from Microsoft R Open.

    Open the R console and use the following code to install maps.

    install.packages('maps')

    Install Maps Package

    Repeat this process for installing ggplot2.

    install.packages('ggplot2')

    After installing the R packages we are ready to work in PowerBI Desktop.  First, we need to load our sample data.  Open up PowerBI Desktop and start a blank query.  On the View ribbon in the query editor open the Advanced Editor and enter the following M code.

    Note: If you need some more help loading the data follow this tutorial about loading data using the Advanced Query Editor.  This tutorial teaches you how to copy and paste M code into the Advanced Editor.

    let
      Source = Excel.Workbook(Web.Contents("https://powerbitips03.blob.core.windows.net/blobpowerbitips03/wp-content/uploads/2016/10/State-Data.xlsx"), null, true),
      StateData_Table = Source{[Item="StateData",Kind="Table"]}[Data],
      #"Changed Type" = Table.TransformColumnTypes(StateData_Table,{{"StateName", type text}, {"Abb", type text}, {"TY Sales", Int64.Type}, {"state", type text}, {"Latitude", type number}, {"Longitude", type number}, {"LY Sales", Int64.Type}, {"Chng", type number}}),
      #"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"TY Sales", "Sales"}})
    in
      #"Renamed Columns"

    After pasting the code into the Advanced Editor click Done to load the data.  While in the Query Editor, rename the query to be StateData, then click Close & Apply on the Home ribbon.

    Load Mapping Data

    We still need to prepare the data further by adding two calculated columns.  Click the bottom half of the New Measure button on the Home ribbon and select New Column.

    Add New Column

    Enter the following code into the formula bar that appears after clicking New Column.

    Change = StateData[Abb] & " " & ROUND(100*StateData[Chng],0) & "%"

    Change Column Measure

    Again, click on the New Column button found on the Home ribbon and add the code for a color column.

    Color = if(StateData[Chng] > 0 , "Dark Green", "Dark Red")

    Color Column Measure
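
    For anyone who wants to sanity check the two calculated columns before wiring them into the R visual, here is a small Python sketch of the same logic.  The function names and sample rows are hypothetical.  It assumes Chng is stored as a fraction (0.05 for 5%) and it rounds half away from zero, as I understand DAX ROUND to do.

```python
from decimal import Decimal, ROUND_HALF_UP


def change_label(abb, chng):
    """Build the label text, e.g. 'TX 5%', from a fractional change such as 0.054."""
    # Python's built-in round() rounds halves to even, so use Decimal instead.
    pct = (Decimal(str(chng)) * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    if pct == 0:
        pct = Decimal(0)  # avoid printing a negative zero
    return f"{abb} {pct}%"


def label_color(chng):
    """Green only for a strictly positive change; zero and negative changes are red."""
    return "Dark Green" if chng > 0 else "Dark Red"


# Hypothetical rows: (state abbreviation, year over year change as a fraction).
for abb, chng in [("TX", 0.054), ("OH", -0.125), ("UT", 0.0)]:
    print(change_label(abb, chng), label_color(chng))
```

    Note that a state with exactly zero change falls into the red bucket, because the Color column tests for greater than zero.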

    The Fields list should now look like the following.

    Fields List

    Add the R visual with the following fields.

    R Visual Fields

    Add the following R script into the R Script Editor.

    # Load the ggplot2 and maps packages
     library(ggplot2)
     library(maps)
    
    # Load the mapping data into a dataframe called states_map
     states_map <- map_data("state")
    
    # Start ggplot2 by sending it the dataset and setting the map_id variable to state
     ggplot(dataset, aes(map_id = state)) +
    
    # Add the map layer, define the map as our data frame defined earlier
     # as states_map, and define the fill for those states as the Sales data
     geom_map(map = states_map, aes(fill=Sales)) +
    
    # Add the data for the labels
     # the aes defines the x and y coordinates for longitude and latitude
     # colour = white defines the text color of the labels
     # fill = dataset$Color defines the label color according to the column labeled Color
     # label = dataset$Change defines the text wording of the label
     # size = 3 defines the size of the label text
     geom_label( aes(x=Longitude, y=Latitude), 
      colour="white", 
      fill=dataset$Color, 
      label=dataset$Change, size=3
      ) +
    
    # define the x and y limits for the map
     expand_limits(x = states_map$long, y = states_map$lat) +
    
    # define the color gradient for the state images
     scale_fill_gradient( low = "dark grey", high = "#115a9e") +
    
    # remove all x and y axis labels
     labs(x=NULL, y=NULL) +
    
    # remove all grid lines
     theme_classic() +
    
    # remove other elements of the graph
     theme(
      panel.border = element_blank(),
      panel.background = element_blank(),
      axis.ticks = element_blank(),
      axis.text = element_blank()
      )

    After adding the R script press the execute button to reveal the map.

    Paste R Script

    Final Map Product

    Notice how we have data included for Alaska and Hawaii but those states are not drawn.  We want to remove the Alaska and Hawaii data points.  Add the StateName field to the Page Level Filters and then click Select All.  Now, un-check the boxes next to Alaska and Hawaii.  The data is now clean and the map correctly displays only the continental United States.

    Page Level Filters

    Here is the filtered final map product.

    Filtered Final Map

    Thanks for following along.  I hope you enjoyed this tutorial.  Please share if you liked this content.  See you next week.

     

  • Fixing Measure Madness

    Fixing Measure Madness

    Oftentimes when you’re working with large data models you will have multiple tables with many relationships.  It can get complex; maybe you’ve seen something like the following:

    Large Data Model – Photo Credit ( www.biinsight.com )

    Once all the tables have been loaded the manic measure building begins to support all the different visuals.  A couple of sums here, a number of calculates over there, and boom, a beautiful report.  You stand back and survey the work and realize you’ve built measures all over the place, in different tables, maybe even stuck a couple of measures in the wrong place.  Whoops.

    Tons of Measures

    Maybe we should think about cleaning things up a bit, if only there was a way to group the measures.  How do I group my measures?  I’m glad you asked.  With a little trickery we can make a measure table.  Let’s begin.

    First we will load a little data.  For this tutorial we will simply copy and paste in some data.

    Note: For the full tutorial on manually entering in data visit this page.

    On the Home ribbon click the Enter Data button.  Copy in the table below into the Create Table window.  Rename the table Sales Data and click Load to exit.

    Salesman Item Unit Sales Revenue
    Salesman 3 Item 4 405 1357
    Salesman 1 Item 3 339 1649
    Salesman 1 Item 3 315 1332
    Salesman 3 Item 3 418 1531
    Salesman 1 Item 3 482 1633
    Salesman 2 Item 4 448 1676
    Salesman 1 Item 4 391 1432
    Salesman 2 Item 1 341 1539
    Salesman 3 Item 1 419 1482
    Salesman 2 Item 4 414 1610
    Salesman 1 Item 4 351 1670
    Salesman 3 Item 3 449 1795

    Manually Enter Data

    Upon loading our data table we now have the following fields.

    Fields of Data

    Now, let’s make a measure that calculates the revenue per unit.  On the Home ribbon click the New Measure button and enter the following DAX measure.

    Revenue Per Unit = SUM('Sales Data'[Revenue]) / SUM('Sales Data'[Unit Sales])
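
    A measure like this is re-evaluated for whichever rows are in the current filter, which is why one formula can serve both a grand total and a per-salesman table.  The Python sketch below mimics that idea with the Sales Data rows entered above.  The function and variable names are my own, not anything Power BI defines.

```python
from collections import defaultdict

# (salesman, item, unit sales, revenue) rows copied from the Sales Data table.
sales_data = [
    ("Salesman 3", "Item 4", 405, 1357),
    ("Salesman 1", "Item 3", 339, 1649),
    ("Salesman 1", "Item 3", 315, 1332),
    ("Salesman 3", "Item 3", 418, 1531),
    ("Salesman 1", "Item 3", 482, 1633),
    ("Salesman 2", "Item 4", 448, 1676),
    ("Salesman 1", "Item 4", 391, 1432),
    ("Salesman 2", "Item 1", 341, 1539),
    ("Salesman 3", "Item 1", 419, 1482),
    ("Salesman 2", "Item 4", 414, 1610),
    ("Salesman 1", "Item 4", 351, 1670),
    ("Salesman 3", "Item 3", 449, 1795),
]


def revenue_per_unit(rows):
    """SUM(Revenue) / SUM(Unit Sales) over whichever rows are passed in."""
    return sum(r[3] for r in rows) / sum(r[2] for r in rows)


print(f"All rows: {revenue_per_unit(sales_data):.2f}")

# Group the rows by salesman, the way a table visual would filter them.
by_salesman = defaultdict(list)
for row in sales_data:
    by_salesman[row[0]].append(row)

for salesman in sorted(by_salesman):
    print(f"{salesman}: {revenue_per_unit(by_salesman[salesman]):.2f}")
```

    The ratio is taken after summing, so the grand total is a ratio of sums rather than an average of the per-salesman ratios.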

    Next, make a table with the following fields.

    Salesman Table

    Great!  But, as we all know, this is how the measure madness begins.  From here we refine and finesse the data to craft the data story, and end up with tons of additional tables and measures.

    Pro Tip: You can use the search window at the top of the Fields window to help you find buried measures or fields of data. 

    Using Search in Fields

    Let’s make the measure table.  Start by clicking Enter Data on the Home ribbon.  Rename the new table to My Calcs, and rename Column1 to Calcs.   You don’t have to re-name column1, but since I’m OCD about my data I like to rename the column to the same name as the table.  Then click Load to exit the screen.

    Measure Table Load

    We now have a new table labeled My Calcs with one column labeled Calcs.  Next highlight the measure we created Revenue Per Unit.  Then on the Modeling ribbon change the home table from Sales Data to My Calcs.  This will move the measure.

    Home Table for Measure

    Right click on the Calcs column in the My Calcs table and then select Hide.

    Hide Calcs Column

    Next Save and then reopen the document (it’s a Microsoft thing I guess).  After the document has reopened the My Calcs table has changed its icon from a table to a Measure icon.

    Completed Measures Table

    For kicks and giggles add the following measure to the My Calcs table.

    Total Revenue = SUM('Sales Data'[Revenue])

    Ok, one more.

    Total Unit Sales = SUM('Sales Data'[Unit Sales])

    There you go.  A very straightforward approach to cleaning up all the random measures in your data model.  I have found that when other team members are working with your data model this helps them understand which fields have been calculated and which ones were imported via a query.  This also helps you group logical calculations, further creating clarity within your data model.

    If you want to read up more on making measure tables check out this great site (also linked below).  In addition to walking you through creating a measure table it also explains how to make a measure table when using direct query mode.  As the article explains, while you are in direct query mode you are unable to manually enter data.  Nice job, Soheil Bakhshi, well done.

    How to Define A Measure Table in Power BI Desktop

    If you want to take your DAX skills to the next level, try jumping into this book by Rob Collie and Avichal Singh.  It’s an easy read but very insightful.

    If you liked this tutorial make sure you share.  See you next week!

     

  • Digging Deeper with R Visuals for PowerBI

    Digging Deeper with R Visuals for PowerBI

    Back by popular demand, we have another great tutorial on using R visuals.  There are a number of amazing visuals that have been supplied with the PowerBI desktop tool.  However, there are some limitations.  For example you can’t merge a scatter plot with a bar chart or with an area chart.  In some cases it may be applicable to display one graph with multiple plot types.  Now, to be fair Power BI desktop does supply you with a bar chart and line chart, Kudos Microsoft, #Winning…. but, I want more.

    This brings me to the need to learn R Visuals in PowerBI.  I’ve been interested in learning R and working on understanding how to leverage the drawing capabilities of R inside PowerBI.  Microsoft recently deployed the R Script Showcase, which has excellent examples of R scripts.  I took it upon myself to start learning.  Here is what I came up with.

    R Plot in PowerBI Desktop

    This is an area plot in the background, a bar chart as a middle layer and dots for each bar.  The use case for this type of plot would be to plot sales by item number: sales are in the dark blue bars, and the price is shown as the light blue dots.  The area behind the bars represents a running total of all sales for all items.  Thus, when you reach item number 10, the area represents 100% of all sales for all items listed.

    If you want to download my R visual script included in the sample pbix file you can do so here.

    Great, let’s start the tutorial.

    First you will need to make sure you have installed R on your computer.  To see how to do this you can follow my earlier post about installing R from Microsoft Open R project.  Once you’ve installed R open up the R console and enter the following code to install the ggplot2 package.

    install.packages("ggplot2")

    Install ggplot2 Code

    Once complete you can close the R console and enter PowerBI Desktop.  First, we will acquire some data to work with.  Click on the Home ribbon and then  select Enter Data.  You will be presented with the Create Table dialog box.  Copy and paste the following table of information into the dialog box.

    Item Sales Price Customer
    1 100 20 Customer A
    2 75 25 Customer A
    3 20 30 Customer A
    4 18 15 Customer A
    5 34 26 Customer A
    6 12 23 Customer A
    7 20 22 Customer A
    8 15 19 Customer A
    9 10 17 Customer A
    10 8 26 Customer A
    1 120 21 Customer B
    2 80 24 Customer B
    3 62 33 Customer B
    4 10 15 Customer B
    5 12 26 Customer B
    6 60 24 Customer B
    7 20 23 Customer B
    8 10 20 Customer B
    9 8 16 Customer B
    10 7 20 Customer B

    Rename your table to be titled Data Sample.

    Data Sample Table

    Click Load to bring in the data into PowerBI.

    Next, we will need to create a cumulative measure using DAX.  On the Home ribbon click the New Measure button and enter the following DAX expression.

    Cumulative = CALCULATE(
        SUM( 'Data Sample'[Sales] ),
        FILTERS( 'Data Sample'[Customer] ),
        FILTER( ALL( 'Data Sample' ),
            'Data Sample'[Item] <= MAX( 'Data Sample'[Item] )
        ))

    This creates a measure that adds up the sales of every item at or below the current item number.  For example, if I’m calculating the cumulative total for item three, the SUM() will add the sales of items one, two, and three.
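
    Here is the same running total as a short Python sketch, using the Sales numbers from the table above.  The cumulative function is a hypothetical stand-in for the measure, and it assumes a single customer is selected, as it would be with a slicer.

```python
# Sales by item number (items 1 to 10) for each customer, from the Data Sample table.
sales = {
    "Customer A": [100, 75, 20, 18, 34, 12, 20, 15, 10, 8],
    "Customer B": [120, 80, 62, 10, 12, 60, 20, 10, 8, 7],
}


def cumulative(customer, item):
    """Sum the sales of every item numbered `item` or lower for one customer."""
    return sum(sales[customer][:item])


for customer in sales:
    print(customer, [cumulative(customer, item) for item in range(1, 11)])
```

    At item 10 the running total equals that customer’s total sales, which is why the shaded area in the plot ends at 100% of sales.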

    Now, add the R visual by clicking on the R icon in the Visualizations window.

    Note: There will be an approval window that will require you to enable the R script visuals.  Click Enable to proceed.

    Enable R Visuals

    While selecting the R visual add the following columns to the Values field in the Visualization window.

    Add Column Data

    Note: After you add the columns to the Values the R visual renders a blank image.  Additionally, there are automatic comments entered into the R Script Editor (in R, the # sign marks the rest of the line as a comment).

    Next, enter the following R code into the script editor.

    library(ggplot2)   # include this package to use Graphing functions below
    
    ggplot(dataset, aes(xmin=1, x=Item)) +    # Initialize ggplot function, define the x axis with Item data
     geom_ribbon(fill=c("#D7DDE2"),           # Set the color of the Area Plot,
     aes( ymin=0, ymax=Cumulative )) +        # Define the Y-Axis data
     geom_bar(fill=c("#21406D") ,             # Define the color of the Bars
     stat = "identity" ,      # Define the statistics property of the bars - This is a required field
     width=.6 ,               # Change the bar width to 60% - 1 would be full bar width
     aes( x=Item, y=Sales )) +          # Define the X and Y axis for bars
     geom_point( color=c("#809FFF"),    # Define the color of the dots
     size=4,                  # Define the dot size
     aes( x=Item, y=Price )) +          # Define the X and Y axis values
     theme_classic(base_size=18) +      # Remove unwanted items from plot area such as grid lines and X and Y axis lines, Change font size to 18
     theme( axis.title.x = element_text(colour = "dark grey"),     # Define the X axis text color
     axis.title.y = element_text(colour = "dark grey")) +          # Define the Y axis text color
     labs( x="Item Number", y="Sales")                             # Define the labels of the X and Y Axis

    Press the execute R Script button which is located on the right side of the R Script Editor bar.

    Execute R Script Editor Button

    The R Script will execute and the plot will be generated.

    R Plot Generation

     

    Great, we have completed an R visual.  So what, why is this such a big deal?  Well, it is because the R Script will execute every time a filter is applied or changed.  Let’s see it in action.

    Add a slicer with the Customer column.

    Add Customer Slicer

    Notice when you select the different customers, either A or B, the R script visual will change to reflect the selected customer.

    Customer B Selected

    Now you can write the R script code once and use the filtering that is native in PowerBI to quickly change the data frame supporting the R Visuals.

    As always, thanks for following along.  Don’t forget to share if you liked this tutorial.

    Want to learn more about PowerBI and using DAX?  Check out this great book from Rob Collie about the power of DAX.  The book covers topics applicable for both PowerBI and Power Pivot inside Excel.  I’ve personally read it and Rob has a great way of interjecting some fun humor while teaching you the essentials of DAX.

  • Measures – Intelligent Card Visual – Using DAX

    Measures – Intelligent Card Visual – Using DAX

    As I have been exploring PowerBI and building dashboards I have noticed that often the visuals can obscure your data.  As you click on different visuals there is a need to highlight different pieces of data.  Take for example the following dashboard:

    Sample Visual Example

    Notice the different car types in the bar chart.  As you click on each vehicle type, Diesel, Hatchback, etc.. you expect the data to change accordingly.  In some cases it is helpful to present a card visual to show the user what you selected and any relevant data points you want to highlight.  For example if I select the Diesel vehicle type I may want to know the average sales amount, total sales in dollars, or number of units sold.  This is where we can build specific measures that will intelligently highlight selected data within your PowerBI visual.

    Here is a sample of what we will be building today:

    Let’s begin with some data.  In honor of your news feed being bombarded with Pokemon Go articles, let’s enter some data on Pokemon characters.

    We will enter our data manually.  For a full tutorial on manually entering in data visit here.

    Click the Enter Data button on the Home ribbon and enter the following information into the displayed table.

    Pokemon XP
    Pikachu 1200
    Weedle 650
    Pidgey 800
    Golbat 300

    Rename the table to Characters.  Once you are finished entering in the data it should look like the following:

    Create Table of Characters

    Click Load to continue.

    Start to examine your data by building a table visual.

    Table Visual

    Next add a Bar chart.

    Bar Chart

    Note: I added the XP column twice, once to the Value attribute and once to the Color Saturation.  This enhances the look of your visual by coloring the bars with a gradient.  The largest bar will have the darkest color, and the smallest bar will have the lightest color.

    Next, we will begin building some measures.  The first measure will be a total of all the experience points (XP) for each character. Click the New Measure button on the Home ribbon and enter the following DAX expression:

    Total XP = Sum(Characters[XP])

    Now, add a Card visual and add the new measure we created Total XP.

    Total XP Card Visual

    This measure totals all the experience points for all the selected characters within the visual.  Since all characters are now selected the total XP for all characters is 2,950.

    The next, and final, measure will be the intelligent card.  For this measure we want to display the character’s name when we select it in the bar chart.  Click the New Measure button on the Home ribbon and enter the following DAX expression:

    Character(s) = IF( DISTINCTCOUNT(Characters[Pokemon]) = 1 , FIRSTNONBLANK('Characters'[Pokemon],'Characters'[Pokemon]) , DISTINCTCOUNT('Characters'[Pokemon]) & " Selected")

    Update: As of mid 2017 Microsoft introduced a new DAX function called SELECTEDVALUE which greatly simplifies this expression.  Below is an example of how you would change the DAX expression to use SELECTEDVALUE.

    Selected = SELECTEDVALUE(  Characters[Pokemon],  DISTINCTCOUNT(  Characters[Pokemon]  )  &  " Selected" )

    Explanation of this measure:

    This measure first checks to see how many distinct items are in the column Pokemon of our dataset.  If there is only one selected character then we will display the FIRSTNONBLANK character, which will be the name of our selected character.  If more than one character is selected, the measure will count the number of characters selected and return a text string with the count and the word Selected, thus showing us how many items have been selected.
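
    The branching is easy to lose inside a one-line DAX expression, so here is the same decision written as a small Python function.  The name characters_card is hypothetical, and the list passed in plays the role of whatever is currently selected in the visual.

```python
def characters_card(selected):
    """Return the single selected name, or '<n> Selected' when several are selected."""
    distinct = sorted(set(selected))  # the DISTINCTCOUNT step
    if len(distinct) == 1:
        return distinct[0]  # only one name, so show it
    return f"{len(distinct)} Selected"


all_characters = ["Pikachu", "Weedle", "Pidgey", "Golbat"]

print(characters_card(all_characters))
print(characters_card(["Pikachu"]))
print(characters_card(["Pikachu", "Pidgey"]))
```

    With nothing clicked, all four characters count as selected, so the card reads 4 Selected until you pick a single bar.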

    Add the measure titled Character(s) to a card visual.

    Add Character Card Visual

    We can now see that there are 4 characters selected.  Clicking on Pikachu in the bar chart results in the character’s name being displayed and the XP of Pikachu being displayed in the Total XP card visual.

    Selecting Pikachu

    You can select multiple items by holding down Ctrl and clicking multiple items in the bar chart.

    Selecting Pikachu and Pidgey

    Well, that is it.  I hope you enjoyed this Pokemon themed tutorial.  Thanks for visiting.


  • Measures – Month to Month Percent Change

    Measures – Month to Month Percent Change

    I had an interesting comment come up in conversation about how to calculate a percent change within a time series data set.  For this instance we have data of employee badges that have been scanned into a building by date.  Thus, there is a list of Badge IDs and date fields.  See Example of data below:

    Employee ID and Dates

    Looking at this data I may want to understand which employees scan into a building, and when, over time.  Breaking this down further I may want to compare Q1 of 2014 to Q1 of 2015 to see if the employee’s attendance increased or decreased.

    Here is the raw data we will be working with, Employee IDs Raw Data.  Our first step is to Load this data into PowerBI.  I have already generated the Advanced Editor query to load this file.  You can use the following code to load the Employee ID data:

    let
     Source = Csv.Document(File.Contents("C:\Users\Mike\Desktop\Employee IDs.csv"),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
     #"Promoted Headers" = Table.PromoteHeaders(Source),
     #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Employee ID", Int64.Type}, {"Date", type date}}),
     #"Sorted Rows1" = Table.Sort(#"Changed Type",{{"Date", Order.Ascending}}),
     #"Calculated Start of Month" = Table.TransformColumns(#"Sorted Rows1",{{"Date", Date.StartOfMonth, type date}}),
     // Count the rows (badge scans) in each month; summing the Employee ID values would not give a scan count
     #"Grouped Rows" = Table.Group(#"Calculated Start of Month", {"Date"}, {{"Scans", each Table.RowCount(_), Int64.Type}})
    in
     #"Grouped Rows"
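
    The last two steps of the query snap every date to the start of its month and then group by month.  As a rough illustration of that idea, here is a Python sketch that assumes each row is one badge scan and that the goal is a count of scans per month.  The scan rows are hypothetical, not the contents of the CSV file.

```python
from collections import Counter
from datetime import date

# Hypothetical badge scans as (employee id, scan date) pairs.
scans = [
    (101, date(2014, 1, 3)),
    (102, date(2014, 1, 17)),
    (101, date(2014, 2, 5)),
]

# Date.StartOfMonth: snap each date to the first day of its month.
# Table.Group: count the rows that share a month.
scans_per_month = Counter(d.replace(day=1) for _, d in scans)

for month in sorted(scans_per_month):
    print(month.isoformat(), scans_per_month[month])
```

    Keeping the month as a real date, rather than as text, is what lets the time intelligence functions used later work against the Date column.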

    Note: The name Mike in the file path is specific to my computer, so when you’re using this code you will want to change the file location to match your computer.  For this example I extracted the Employee IDs.csv file to my desktop.  For more help on using the Advanced Editor, reference this tutorial on how to open the Advanced Editor and change the code, located here.

    Next name the query Employee IDs, then Close & Apply on the Home ribbon to load the data.

    Close and Apply

    Next we will build a series of measures that will calculate our time ranges which we will use to calculate our Percent Change (% Change) from month to month.

     

    Now build the following measures:

    Total Scans, sums up the total numbers of badge scans.

    Total Scans = SUM('Employee IDs'[Scans])

    Prior Month Scans, calculates the sum of all scans from the prior month.  Note we use the PREVIOUSMONTH() DAX function.

    Prior Month Scans = CALCULATE([Total Scans], PREVIOUSMONTH('Employee IDs'[Date]))

    Finally we calculate the % change between the actual month, and the previous month with the % Change measure.

    % Change = DIVIDE([Total Scans], [Prior Month Scans], blank())-1
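
    To check the arithmetic by hand, here is a Python sketch of the same month over month calculation.  The monthly totals are hypothetical, and returning None for a month with no prior month is a simplification of how DAX treats blanks.

```python
from typing import Optional

# Hypothetical monthly scan totals keyed by (year, month); not the tutorial's data.
total_scans = {(2014, 1): 200, (2014, 2): 250, (2014, 3): 225}


def prior_month(year, month):
    """Return the (year, month) key of the previous calendar month."""
    return (year - 1, 12) if month == 1 else (year, month - 1)


def percent_change(year, month) -> Optional[float]:
    """(this month / prior month) - 1, or None when there is no usable prior month."""
    prior = total_scans.get(prior_month(year, month))
    if not prior:  # missing or zero prior month, so there is no meaningful ratio
        return None
    return total_scans[(year, month)] / prior - 1


for key in sorted(total_scans):
    print(key, percent_change(*key))
```

    One difference worth checking in your own report: if I recall the DAX blank rules correctly, subtracting 1 from a blank gives -1, so the very first month in the table may show -100% rather than an empty cell.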

    Completing the new measures your Fields list should look like the following:

    New Measures Created

    Now we are ready to build some visuals.  First we will build a table like the following to show you how the data is being calculated in our measures.

    Table of Dates

    When we first add the Date field to the chart we have a list of dates by Year, Quarter, Month, and Day.  This is not what we want.  Rather we would like to just see the actual date values.  To change this click the down arrow next to the field labeled Date and then select from the drop down the Date field.  This will change the date field to be viewed as an actual date and not a date hierarchy.

    Change from Date Hierarchy

    Now add the Total Scans, Prior Month Scans, and % Change measures.  Your table should now look like the following:

    Date Table

    The column that has % Change does not look right, so highlight the measure called % Change and on the Modeling ribbon change the Format to Percentage.

    Change Percentage Format

    Finally, note what is happening in the table: each month’s total now sits next to the prior month’s total.

    Final Table

    Now add a Bar chart and add the proper fields to the visual.  When you’re done your chart should look like the following:

    Add Bar Chart

    To add a bit of flair to the chart you can select the Properties button on the Visualizations pane.  Open the Data Colors section change the minimum color to red, the maximum color to green and then type the numbers in the Min, Center and Max.

    Changing Bar Chart Colors

    Well, that is it.  Thanks for stopping by.  Make sure to share if you like what you see.  Till next week.


  • Measures – Dynamic Percent Change – Using DAX

    Measures – Dynamic Percent Change – Using DAX

    This tutorial produces a measure that dynamically recalculates a percent change every time an item is selected in a visual.  The previous tutorial can be found here.  In the previous tutorial we calculated the percent change between two time periods, 2014 and 2013.  In practice, it is not always desirable to force your measure to look at only two time periods.  Rather, it would be nice if your measure calculations changed along with your selections on visuals.  Thus, for this tutorial we will add some dynamic intelligence to the measures. Below is an example of what we will be building:

    First, here is the data we will be using.  It is the same data source used in the previous % change tutorial.  To make things easy, I’ll give you the M code used to generate this query.  Name this query Auto Production.

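    let
        // Load the Wikipedia page and parse the HTML tables it contains
        Source = Web.Page(Web.Contents("https://en.wikipedia.org/wiki/Automotive_industry")),
        // Take the data of the first entry that Web.Page returns
        Data0 = Source{0}[Data],
        // Drop the columns we do not need
        #"Removed Columns" = Table.RemoveColumns(Data0,{"Change", "Source"}),
        // Set Production and Year to whole numbers
        #"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Production", Int64.Type}, {"Year", Int64.Type}})
    in
        #"Changed Type"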

    Note: the code shown above should be added as a blank query in the Query Editor.  Add the code using the Advanced Editor.  Another tutorial showing you how to add Advanced Editor code is here.

    Once you’ve loaded the query called Auto Production, the Fields list should look like the following:

    Production
    Auto Production

    Next, add a Table with Production and Year.  This will allow us to see the data we are working with.  When you initially make the table, the Year and Production columns are automatically summed, which is why there is only one number under Year and one under Production.

    Table of Data

    Instead, we want to see every year and the production value for each of those years.  To change this view, click on the triangle next to the field in the Values section of the Visualizations pane.  This reveals a list showing that our numbers are aggregated by Sum.  Change this to Don’t Summarize.

    Change to Don’t Summarize

    Now we have a nice list of yearly production levels with a total production at the bottom of our table.

    Table of Production Values by Year

    Next, we will build our measure using DAX to calculate the percent change between years.  Our calculation for % change is the following:

    % Change  =  ( New Value / Old Value ) - 1
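
    As a quick sanity check of the formula, here is a small DAX query using made-up numbers (an Old Value of 200 and a New Value of 250; these are hypothetical and not taken from the Auto Production data).  It should be runnable in a DAX query tool such as DAX Studio or the DAX query view in Power BI Desktop.

    ```dax
    -- Hypothetical values, not taken from the Auto Production data
    EVALUATE
    ROW (
        "New Value", 250,
        "Old Value", 200,
        "% Change", DIVIDE ( 250, 200 ) - 1 -- 1.25 - 1 = 0.25, i.e. 25%
    )
    ```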

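    Below is the DAX statement we use as our measure.  Copy the statement below into a new measure.

    % Change =
    DIVIDE (
        -- New Value: production in the latest year in the current selection
        CALCULATE (
            SUM ( 'Auto Production'[Production] ),
            FILTER (
                'Auto Production',
                'Auto Production'[Year] = MAX ( 'Auto Production'[Year] )
            )
        ),
        -- Old Value: production in the earliest year in the current selection
        CALCULATE (
            SUM ( 'Auto Production'[Production] ),
            FILTER (
                'Auto Production',
                'Auto Production'[Year] = MIN ( 'Auto Production'[Year] )
            )
        ),
        0 -- alternate result when the denominator is 0 or blank
    ) - 1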

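    I color coded the DAX expression and the equation above to show which parts correspond.  Note that we are using the DIVIDE function for division.  This is important because plain division with the / operator does not return a usable number when the denominator is 0.  DIVIDE lets us supply an alternate result (0 here) for that case instead.  Keep in mind that because 1 is subtracted after the division, the measure then evaluates to -1, which displays as -100% rather than 0%.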
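
    For comparison, here is a minimal sketch of the same calculation written with variables.  This is one possible restructuring, not the version used in the rest of this tutorial, and the measure name % Change (VAR sketch) is made up for illustration.  It assumes a DAX version that supports VAR, and it uses a simple column filter in CALCULATE in place of FILTER over the whole table, which should give the same result for this single-table model.

    ```dax
    % Change (VAR sketch) =
    -- Earliest and latest year in the current selection
    VAR FirstYear = MIN ( 'Auto Production'[Year] )
    VAR LastYear = MAX ( 'Auto Production'[Year] )
    -- Production in each of those two years
    VAR OldValue =
        CALCULATE (
            SUM ( 'Auto Production'[Production] ),
            'Auto Production'[Year] = FirstYear
        )
    VAR NewValue =
        CALCULATE (
            SUM ( 'Auto Production'[Production] ),
            'Auto Production'[Year] = LastYear
        )
    RETURN
        -- ( New - Old ) / Old equals ( New / Old ) - 1;
        -- DIVIDE with no third argument returns BLANK when OldValue is 0 or blank
        DIVIDE ( NewValue - OldValue, OldValue )
    ```

    The design choice here is to move the subtraction inside DIVIDE.  With no alternate result supplied, DIVIDE returns a blank when the Old Value is 0 or blank, so a Card shows a blank instead of -100% in that case.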

    Next add our newly created measure as a Card.

    Add Card

    Change the % Change measure format from General to Percentage.  Do this on the Modeling ribbon under Formatting.

    Change Measure Formatting
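
    If you ever need the percentage as text, for example inside a dynamic title, the formatting can also be applied in DAX with the FORMAT function.  The sketch below assumes the % Change measure from above already exists; the measure name % Change (text) is made up for illustration.

    ```dax
    % Change (text) =
    -- FORMAT returns a text value, e.g. 0.1915 becomes "19.15%"
    FORMAT ( [% Change], "0.00%" )
    ```

    Because the result is text rather than a number, the Modeling ribbon format remains the better choice for Cards and charts.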

    Next add a slicer for Year. 

    Slicer for Year

    Now you can select different years and the % change will automatically update based on the selection.  The measure always uses the production of the smallest selected year and the largest selected year to calculate the % Change.  With the years 2007 and 2013 selected, the percent change is 19.15%: the smallest year is 2007 and the largest is 2013.

    Selecting Two Years

    If we also select a year between 2007 and 2013, the measure does not change, because the smallest and largest selected years are still the same.

    Multiple Years Selected

    The measure only changes when the starting or ending year changes.  Once we also select the year 2014, the measure finally changes.

    Selecting Additional Year
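
    To make it visible which two years the measure is comparing, a small helper measure can be placed on a second Card.  This is only a sketch, and the measure name Years Compared is made up for illustration.

    ```dax
    Years Compared =
    -- Smallest and largest year in the current selection, joined as text
    MIN ( 'Auto Production'[Year] ) & " to " & MAX ( 'Auto Production'[Year] )
    ```

    With 2007 and 2013 selected in the slicer, this returns "2007 to 2013", and it stays the same when a year in between is added to the selection.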

    Pretty cool, wouldn’t you say?  Thanks for taking the time to walk through another tutorial with me.
