Using R Visuals in Power BI

R Feature Image

For those of you who have been hanging around PowerBI for a while you have likely heard about integration with R visuals.   No, this isn’t a twisted dream where Power BI now ships with Pirates… Rather, this has been a highly untapped feature.

In a brief summary R or as it is known on its site R Project for Statistical Computing, is a statistical open source software package that enables mathematicians, statisticians, or data scientists to quickly calculate complex analysis.  It is the tool of us super nerds.  Now R by it’s self isn’t super powerful, it’s the numerous packages that have been developed by people way smarter than me that can do very amazing functions.  Packages include functions for forecasting, math functions, statistic functions and best of all charting functions.  Well, this may be fine and dandy so what?  Well here is the best part.  Microsoft has chosen to integrate and support various releases of R into it’s tools.  For example R can now be leveraged within SQL server 2016, and now visuals built in R can be leveraged in Power BI Desktop and PowerBI.com.  R can also be used to transform and prepare data during a date set load.

The important note here is that Microsoft has released it’s own open version of R.  This distribution is called MRAN, and can be found at this site.  The MRAN has been slightly tweaked from the R Project.  In the Microsoft version of R, (which I will refer to as MRAN) there has been stability fixes and the improved performance (added Multi threaded Performance).

So enough back ground lets fire this thing up.

First you will need to install the latest version of MRAN.

Navigate to the following address https://mran.microsoft.com/ Click the Download button found  at the top middle of the page.

mran-download-page
mran download page

Note: At the time of this Tutorial the current version of MRAN is 3.3.1, it is likely that this will change since Microsoft is constantly updating this site and releasing new stabilized & enhanced performance versions of R.

Select the platform that you will be using to install MRAN on.  I’m using windows, thus I’ll be downloading and installing the top installation version.

 

windows-platform-of-mran
Windows Platform of MRAN

Note: If you need additional installation help you can follow / read the documentation provided by Microsoft.  It can be found here.

In order to keep this tutorial brief I will assume you know how to install software and have made it through the MRAN installation successfully.  Once installed you should have the following program installed in your start menu.

Installation of R
Installation of R

Run the new installation of R.  The R installation will open up a console window.

R Console
R Console

At the bottom of the console window is a red line where you enter commands.  Enter the following code and press enter.

install.packages(“corrplot”)

This will install the proper R package that we will use later in PowerBI.  After running this line of code the console will download the correct package and install it on your computer.

Install corrplot Function
Install corrplot Function

At this time you can close the R console program.

Now, open up PowerBI Desktop.  Once in PowerBI desktop click on the File Button at the top left hand part of the screen.  Next, Click Options and Settings.

Powerbi Options and Settings
PowerBI Options and Settings

Then click on the Options button.

Options Button
Options Button

Under the Global options menu on the left verify that your new installation of MRAN is listed.  PowerBI should automatically detect the installation and show the installation with the current version number in the home directory:

R Home Directory
R Home Directory

Seeing the listed installation in the Home Directory verifies that R has been properly installed on your computer. Clicking OK will close the window.

Data Time!!  Below is the M Language that can be used in your Query Editor.  Copy the code below and enter it into the Advanced Editor found in the Query Editor.

let
 Source = Excel.Workbook(Web.Contents("https://powerbi.tips/wp-content/uploads/2016/09/CarDetails.xlsx"), null, true),
 CarData_Table = Source{[Item="CarData",Kind="Table"]}[Data],
 #"Changed Type" = Table.TransformColumnTypes(CarData_Table,{{"Year", Int64.Type}, {"Make", type text}, {"Model", type text}, {"Liters", type number}, {"Hp", Int64.Type}, {"Cylinders", Int64.Type}, {"MPG City", Int64.Type}, {"MPG Hwy", Int64.Type}})
in
 #"Changed Type"

Note: If you want to learn how to enter M language code into the Query Editor follow this Tutorial.

Once you have pasted the code above into the Query Editor it should look like the following:

Advanced Editor
Advanced Editor

Clicking Done will close the Advanced Editor and you will have data loaded into the Query Editor.  You must have an internet connection to connect to this data.  Rename your query to Car Data.  Then on the Home ribbon click Close & Apply to load the data into the data model.

Car Data in Query Editor
Car Data in Query Editor

Generate a simple table visual to see our data in table form:

Table Visual
Table Visual

Add an R visual by clicking the R inside the Visualizations bar.  When you click on the R visual you will see a pop-up, click Enable to proceed.

Enable R Visuals
Enable R Visuals

Doing this will open up a visual pane on the page and reveal an R script editor at the bottom of the page window.

R Script Editor
R Script Editor

While keeping the R visual selected add the following fields to the visual under the Values field:

Add Columns to R Visual
Add Columns to R Visual

After adding these fields the R Script Editor will update and reveal code which informs you that your data from the selected columns will be added to a dataset.

R Code Script Editor
R Code Script Editor

Next add the following code into the white area below the #dataset <- unique(dataset) statement.

require(“corrplot”)
library(corrplot)

M <- cor(dataset)

corrplot(M, method = “circle”, tl.cex=0.6, tl.srt = 45, tl.col = “black”, type= “upper”, order=”hclust”)

This loads a package called corrplot which allows you to apply a graph that has a correlation plot between metrics.  The M <- cor(dataset), takes your data runs a function called cor and then saves the results into a new variable called M.

Next click the Play button icon found on the right of the grey bar on the R Script Editor.

Running the R Script
Running the R Script

Success! You have completed a correlation plot using R within PowerBI.  Nice job.

Final Plot
Final Plot

Bonus:

If you want to get fancy with this correlation plot you can change the circles to the actual correlation values.  Change the last line of the R Script Editor code to the following and press the run script button:

corrplot(M, method = “number”, tl.cex=0.6, tl.srt = 45, tl.col = “black”, type= “upper”, order=”hclust”)

This removes the circles and then populates the correlation plot with numerical values representing the correlation between the various data features.

Correlation Numbers
Correlation Numbers

The blue numbers represent values that have a positive correlation, while the red numbers represent a negative correlation.  In practical terms the higher the Horsepower  (HP) of the vehicle the lower the Miles per Gallon (MPG) that are realized.

 

7 Comments

  1. Thank you so much! A few tips/trips I came across.
    1. When I copied some of your code the parenthesis ” were not being recognized in the R script so I had to delete them and reinput them. Maybe it had something to do with the font used on the webpage?
    2. When I tried to run the correlation plot on my own data it was erroring out. Your code work, but it was due to my data having some BLANK values. So all I did was filter the blanks out in PowerBI Visual filter and it worked!

    Thanks again!

    • Thanks so much for the input. I will check into the copy and pasting the R code. I was able repeat the problem in an older version of internet explorer, however, when I use the newest version of chrome I was able to successfully copy over the code. Maybe try doing a copy in Chrome.

  2. Hi Mike, Thanks so much for your site content and the post here. I’m finding my correlation tables are providing blank figures for my pairs? I tried scrubbing my data for blanks, making sure types were proper, using the “number” method + tested in excel. Each without any luck. Any other suggestions? Thanks again and Regards! John

    • John, thanks for the comment. I was able to replicate your issue. There is data that is being brought in from your excel workbook that has nulls. when the data is passed to the R Script, the visual is unable to handle null values and therefore will error out. To fix this you will need to modify the query for bringing in data and do a replace function. Replace all nulls with a 0. This fixes this type of issue. For reference the replace function can be found on the Transform ribbon in the Any Column section.

  3. Thank you Mike for your Post. I’m tryigng to get de data but it seems that your data source is local “C:\Users\Mike\AppData\Local\Microsoft\Windows\INetCache\IE\5EX6J9L0\CarDetails[1].xlsx”
    Can you share your file or the internet url.

    • Great catch, I did make a mistake. I have updated the post to reflect the proper changes. You should be able to use the following query to get the date that is talked about above.

      let
      Source = Excel.Workbook(Web.Contents(“https://powerbi.tips/wp-content/uploads/2016/09/CarDetails.xlsx”), null, true),
      CarData_Table = Source{[Item=”CarData”,Kind=”Table”]}[Data],
      #”Changed Type” = Table.TransformColumnTypes(CarData_Table,{{“Year”, Int64.Type}, {“Make”, type text}, {“Model”, type text}, {“Liters”, type number}, {“Hp”, Int64.Type}, {“Cylinders”, Int64.Type}, {“MPG City”, Int64.Type}, {“MPG Hwy”, Int64.Type}})
      in
      #”Changed Type”

  4. Thank you for this,worked flawlessly for be but I would prefer one of the axis to be used for a summarizing variable, so that each row would show the correlation coefficients for a specific value of another variable. Currently I have to use a drill down filter to create multiple versions of the visual, but since I don’t need each row of the visual (usually you only try to explain one variable through correlation of multiple variables), I would prefer the rows to be used as a summarizing category.

    I have no experience in R or Power BI, but if anyone finds this useful and could provide guidance on how to achieve this I would be very grateful!

Comments are closed.