Pareto Charting in PowerBI

Pareto Charts in PowerBI
Pareto Charts in PowerBI

The Pareto chart is a handy visual, but is not so easy to build in either excel or PowerBI.  In a Pareto chart, information is provided about an individual product or category as a bar, and a cumulative scale as a line which compairs all bars.  This type of visual can be extremely helpful when conducting failure mode analysis, causes of a problem, or even product portfolio balances.  For some more information on Pareto charts you can learn more here or here.  If you’re interested in building a Pareto chart in excel, I have found this post from Excel Easy to be helpful.

To give you a little teaser of what we will be building today, below you will see an image of the final Pareto chart.  On the left side we have sales of units, and on the right is the cumulative percent of all sales.  Using the Pareto chart a user has the ability to see which products comprise the majority of your sales.  For example, the first 4 bars total approximately 50% of all sales.

Pareto Final Product
Pareto Final Product

Alright, let’s get started.

Open up PowerBI Desktop, Click the Get Data button on the Home ribbon and select Blank Query.  Click Connect to open the Query Editor.  Click Advanced Editor on the View ribbon.  While in the Advanced Editor paste the following code into the editor window.

Note: If you need some more help loading the data follow this tutorial about loading data using the Advanced Query Editor.  This tutorial teaches you how to copy and paste M code into the Advanced Editor.

let
 Source = Excel.Workbook(Web.Contents("http://powerbi.tips/wp-content/uploads/2016/10/Sample-Data.xlsx"), null, true),
 Table1_Table = Source{[Item="Table1",Kind="Table"]}[Data],
 #"Changed Type" = Table.TransformColumnTypes(Table1_Table,{{"Item", type text}, {"Sales", Int64.Type}, {"Segment", type text}})
in
 #"Changed Type"

Rename the Query to Data.  Once you’ve completed the data load your data should look like the following.

Load Data to Query Editor
Load Data to Query Editor

On the Home ribbon click Close & Apply to complete the data load.

Close and Apply
Close and Apply

Let’s begin with a little exploration of our data.

Pro Tip: When I am building reports I often load the data and then immediately start building a couple of tables and slicers.  It helps me understand how my data reacts to the slicers and helps me determine how to shape the data so that the visuals will work properly.  For this example, we only have one table, but when loading data things can get rather complex due to loading multiple tables with multiple relationships.

Add a Slicer for the Segment.  Enhance the look of the slicer by changing it from a vertical to a horizontal slicer.  While the slicer is highlighted, click the Paint Roller expand the General section and change the orientation from vertical to Horizontal.

Segment Slicer
Segment Slicer

Repeat the same process to add a Slicer for the item field.

Item Slicer
Item Slicer

Next, add a table view of all the fields.  Start with Segment, then Item and finally add Sales to the Table Visual.

Data Table
Data Table

Notice, now that we added all the Fields, there are a number of repeating values.  We have Category 1 and Item 1 repeated 9 times.  In some cases, it will be necessary to have this level of data brought into the data model within PowerBI.  A common reason is that this level of granularity is required for other report pages, or visuals.  It is OK to bring large amounts of data, but as a method of best practice it is recommended that you bring in the data required to support the visuals.

Now, to address these multiple items that we see in our data.  In the sample Pareto image provided at the beginning of this Tutorial we only had one bar for Category 2 Item 3.  Thus, we need to summarize each grouping of every Category and Item combination.  To do this we will construct a summary table.

First, we will create a unique Key that will be used to summarize each combination of Category and Item pair.  Click the bottom half of the New Measure button located on the Home ribbon.

Calculated Column
Calculated Column

Enter the following DAX expression.  This new column titled Blend will be the unique Key that is utilized to summarize the data.

Blend = Data[Segment]  &  "-"  &  Data[Item]

Select the Modeling ribbon and then click on the New Table button.  Enter the following DAX expression.

Summary = SUMMARIZE('Data', Data[Blend], "Sum Sales", SUM(Data[Sales]) )

For more information on the SUMMARIZE function you can visit the Mircosoft Summarize documentation page.  In this equation we first select the table and in this case it is ‘Data’.  Then the column we want to summarize or group by is the Segment column noted as Data[Blend].  The next field is the title of the summarized field column, noted as “Sum Sales”.  Then DAX function that calculates the Sum of the column labeled Data[Sales], noted as SUM(Data[Sales]).  It is relevant to point out here that the SUMMARIZE function will only work with building a new table and not as a calculated column or measure.

Add a new Table visual to the report and include the two newly created fields from the Summary table.

Summary Table Visual
Summary Table Visual

We have a field titled Blend which is our Key for all the summarized groupings.  Next, we will want to parse out the Segments and Items from this blend column.  We will want to use Category 1 & 2 in a slicer and the same for Items 1 to 5.  Highlight the summary table by clicking the grey space next to the word Summary.  Click the New Column button on the Modeling ribbon and enter the following DAX expression.

Segment = PATHITEM(
   SUBSTITUTE(Summary[Blend], "-" , "|" ),
   1 )

In this expression the Substitute function replaced the dash “-” with a “|” character.  Then the PATHITEM function can then parse the text into segments.  By entering a 1 we select the first item in the sequence.  For our example we only have two items, but when you’re working with file paths you can have multiple items in the path such as “\users\mike\my documents\my folder\”, which would equate to users = position 1, mike = position 2, my documents = position 3, etc..

Add another new column with the following DAX expression for the item column.

Item = PATHITEM( 
  SUBSTITUTE(Summary[Blend], "-" , "|" ),
  2 )

Note: We changed the PATHITEM position from 1 to 2.

Next add the newly created Segment and Item columns to our summary table visual that we created earlier.

Add New Fields
Add New Fields

Nice job so far.  Now we have to modify our slicers to point to the new Item and Segment fields we created in the Summary table.  Select the Segment Slicer Visual and add the Segment Field from the Summary table.

Update Segment Slicer
Update Segment Slicer
Update Item Slicer
Update Item Slicer

Now that we have updated the slicers, we can now can control the table visual made from the Summary table.

Select Category 1 and Items 1 to 3
Select Category 1 and Items 1 to 3

Pro Tip: To select multiple items in a slicer you can hold down the Ctrl button on the key board and click multiple slicer items.  This is how I was able to select Items 1 to 3.

Now we are ready to build the measures that will support the Pareto chart.  Click on the bottom half of the New Measure button on the Home ribbon and select New Column.  Add the following DAX expression to rank all the items in the Summary table.

Ranking = RANKX(  'Summary',   'Summary'[Sum Sales])

Add a measure for the Cumulative total according to the new ranking column we created.  Click the top half of the New Measure button on the Home ribbon.  Add the following DAX expression.

Cumulative Total = CALCULATE(
    SUM( Summary[Sum Sales] ),
    FILTER( ALLSELECTED( Summary ),
        Summary[Ranking] <= MAX( Summary[Ranking] )
    ))

Repeat the add measure process and add a Total measure which will total only the items from the summary table that have been selected in the report view.  Add the following DAX expression.

Total Sales = CALCULATE(
 SUM( Summary[Sum Sales] ) ,
 ALLSELECTED( Summary )
 )

For the last measure, repeat the process to add another measure.  Enter the following DAX expression as a measure.

Cumulative Percent = [Cumulative Total] / [Total Sales]

The Cumulative Percent measure is a calculated as a percentage, thus we need to change this measure’s formatting to percentage.  Click the measure labeled Cumulative Percent then change the Format to Percentage which is found on the Modeling ribbon.

Change Formatting
Change Formatting

Your Summary table should now look like the following.

Updated Fields List
Updated Fields List

To see all the calculations that we just created add all the fields from the Summary table to the Summary table visual we created earlier.

Full Summary Table Visual
Full Summary Table Visual

At last, we are ready to add the Pareto chart.  Add the following fields to the line and stacked column chart.

Add Line and Stacked Bar Chart
Add Line and Stacked Bar Chart

Order the data in descending order by the number of sales by click the visual’s Ellipsis and selecting Sort By Sum Sales.

Sort by Sales
Sort by Sales

This changes the order of the items to make a Pareto chart.

Final Pareto Chart
Final Pareto Chart

Thanks for following along.  Share if you enjoyed this tutorial.

20 Comments

  1. First of all, I should mention that I am new to Power BI so still learning. I am not a data analyst or programmer by any means and I have great respect for what they are able to do. Your tutorial was very good and easy to follow even for a noob like me.

    I guess my comment is really that I can produce a Pareto chart in Excel within a few minutes with no “coding” needed. What I see here, and tried myself in PBI was a fairly lengthy process to produce the same type of chart.

    Is PBI designed to be this way? I thought this would make my data presentations faster to build and more dynamic. Maybe I have a lot more to learn about my data and how to manipulate it properly in the tool to be effective but was really hoping for more intuitive solutions for the “non-analyst” type that are just trying to present data in an interesting and dynamic way.

    • Thanks for your question. You have discovered what I went through when I was beginning to learn PowerBI. At first when your a “Noob” you will find that certain things are difficult. However, as you become more familiar you will feel more at home. PowerBI will become your natural choice for BI reporting. What I can tell you is hang in there and keep learning. It will get easier. Here are my top two items why I enjoy developing reports more in Power BI rather than Excel.
      1. Data refreshing and ETL (Export, Transform, and Loading) can be automated in Power BI and in Power Query for excel. This is huge!! and an must learn. While it may take a little more time on the front end to load the data, but every time you refresh the data or get new data you will be thankful that you have done the hard work upfront.
      2. Interactivity & beauty. Power BI allows you to click on bar charts and filter the data down. This is something that just isn’t possible in Excel. If your really good you can add slicers, filters and other fancy things in excel, but those are a lot of work in order to make your excel reports function as fluidly as Power BI. And, lastly, PowerBI in my opinion just looks better than reports in Excel.

      Good Luck and keep up the good work. Fight the urge to leave PowerBI and keep pushing on. Here are some great materials to get you started. I would highly recommend the book on DAX Learn to Write DAX: A practical guide to learning Power Pivot for Excel and Power BI. This really helped me.

  2. I have a concern here. By making a Calculated Column and not a Measure, the ranking is not dynamic. If I’m doing this by Month/Year over a long period of time, I want my top performers(the 20%) to change every month, because they’re different every month.

    Theoretically, how would you go about this?

  3. Hi,

    While I was following the steps, I encountered following error for Ranking column. Was wondering if there is any missing code in the query:

    “A single value for column ‘Sum Sales’ in table ‘Summary’ cannot be determined. This can happen when a measure formula refers to a column that contains many values without specifying an aggregation such as min, max, count, or sum to get a single result.”

    • I think you might have missed a ) at the end of your statement. I just checked and it seems like it is working fine.

      try using this equation: Summary = SUMMARIZE(‘Data’, Data[Blend], “Sum Sales”, SUM(Data[Sales]) )

    • in the formula the table is inside quotes. remove the quotes it will work.
      formula given:
      Ranking = RANKX(‘Summary’,’Summary'[Sum Sales])
      change needed:
      Ranking = RANKX(‘Summary’,Summary[Sum Sales])

  4. I am trying to implement the above based on the store selected. It works fine when a single store is selected, but if all the stores are selected, the resulting the chart looks different.

  5. Power BI did not let me create the unique column using measure feature… I had to create a new column. When dropping Blend = Data[Segment] & “-” & Data[Item] into a new measure the error message displays: A single value for column ‘Segment’ in table ‘Data’ cannot be determined. This can happen when a measure formula refers to a column that contains many values without specifying an aggregation such as min, max, count or sum to get a single result.

  6. Thank you for the help but it is completely nuts that creating a pareto chart requires all this. It’s like giving a user Visual Studio and telling them they can create any solution they need with it. It’s true but not in the least bit practical. This is major hole Microsoft needs to fill.

  7. Hi, Thank you for the tips. It is a good idea. but I was trying to plot a pareto chart where every bar represent the total of item1, item2, item3 etc. and “category” will show as different colour within one horizontal bar, instead of seperate to few bars. but i am not able to figure out the correct cumulative percentage. i guess there is problem on cumulative total? can you please give me some ideas how to complete this? thanks..

    • Essentially what your asking for would be a new chart type, a stacked column chart with a line. This is technically not supported, however if your willing to venture into R, you can build a visual that will do this. Here is an example of a visual that was created using three different chart types, Line, Bar and points. This type of visual is much more technical and you would need a strong knowledge of ggplot, or some other visualization packages for R. Here is the post addressing this topic: https://powerbi.tips/2016/09/digging-deeper-with-r-visuals-for-powerbi/

  8. Thanks for the post. I would like to calculate the amount of Items that make up to 50% of the sales. Which measure can you use for that?

    • I have recently done this calculation and it is difficult. In order to do this, you would need to first order your items Highest to lowest. Then calculate the % of total sales for each line item. Finally, you would need to create a cumulative total across each row of data. This produces a cumulative percent of business in order of each item, ranked from the highest volume to the lowest volume.

      These steps are complicated and calculation intensive. Thus, I found this is hard to do as a measure and easier to accomplish within the Query Editor. I would recommend that you pre-calculate this in the query editor before creating a measure. This way you can filter a column based on the cumulative % of business column, < .50 or < 50%

  9. In My case model is inside SSAS Tabular cubes. Do I still needs to create Summary Table.
    can I achieve same without Summary table.

    • This is how I was able to create a Pareto chart in this example. I am sure there are other ways of making the same visual using DAX exclusively. Working with the SSAS does have some limitations. Making a Pareto chart is a little more difficult and I like to have that data set up in the ETL process, or by making a summary table.

  10. I replicated all DAX but it doesn´t work as explained. The calculation for every selection of item or segment is to be re-calculated as table show the subtotal of sales. Is not correct to show the values on displayed ranking, cumulative sales and total sales need to be dynamically calculated for any slicer selections. Cumulative percentage is to be matching with selected data on slicers. The “MAX” filter is getting a flat line on cumulative percentage. I used “Earlier” and it shows the cumulative line on chart.

1 Trackback / Pingback

  1. How to Create a Better RMA Dashboard For Mammoth Saving – My quest in SQL, BI & Dynamics GP

Comments are closed.