Tag: Tutorials

  • From Wikipedia to a Colorful Map

    For this tutorial we are going to get some real data from the web.  One of the easiest sources to acquire information from is Wikipedia.  One caveat: it is easy to get data from Wikipedia, but I can’t vouch for its reliability.  That being said, we are going to acquire the U.S. population and growth rate from 2010 to 2015 from the Wikipedia web page.

    Materials:

    • Power BI Desktop (I’m using the March 2016 version, 2.33.4337.281) download the latest version from Microsoft Here.
    • Link to the data from Wikipedia, Here.  ( https://en.wikipedia.org/wiki/List_of_U.S._states_by_population_growth_rate )

    Let’s begin.

    Open up Power BI Desktop.  Click on the Get Data button.  On the left of the Get Data menu click Other then select the first item titled Web.  Click Connect to continue.

    Get Data from Web

    In the From Web window enter in the following web address.  You can copy and paste it from below.

    https://en.wikipedia.org/wiki/List_of_U.S._states_by_population_growth_rate

    Click OK to move to the next menu.  After a bit of thinking, Power BI will present the Navigator window.  This is what Power BI has found at that specific web address.  On the left side of the screen there is a folder.  This is the web page location that we entered earlier.  Power BI then looks through the page’s code for tables it can distinguish.  By clicking on each table you can see a preview of the data returned on the right side of the window.

    Try clicking on the various tables such as Document, External links, or Table 0.  For our example let’s click on Table 0.  Click on the button in the right-hand corner labeled Edit.  We are going to slightly modify this data before we load it to the data model.
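
    For the curious, here is a rough sketch of the M code that sits behind these clicks (you can view it later in the Advanced Editor).  The step names are my own, and the index 0 for Table 0 is an assumption: it can shift whenever the Wikipedia page is edited.

    ```powerquery
    let
        // The address entered in the From Web window
        Source = Web.Page(Web.Contents("https://en.wikipedia.org/wiki/List_of_U.S._states_by_population_growth_rate")),
        // Assumption: "Table 0" is the first table Power BI found on the page
        Table0 = Source{0}[Data]
    in
        Table0
    ```

    Web.Page returns one row per table it finds, and the Data column of each row holds the table you preview in the Navigator.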

    Navigator Window

    You’ll notice once we load the data there are some items we’d like to remove.  In row #2 the label is District of Columbia, which technically isn’t a state.  Also further down we see in row #25, the entire U.S. population is shown.  Again, we don’t want these values to show, we only want the 50 states.  To remove this data we will use a text filter to remove any item in the Rank column that has a “–” (which is called an em-dash, see note below for more details on how to select this text character).

    Note: There are two kinds of dashes that matter here.  One is the short dash typed from the keyboard (-), which is strictly a hyphen although this tutorial calls it the en-dash; the other is the longer em-dash (–).  It is very hard to distinguish between the two on screen.  The image below shows a better contrast when used in Microsoft Word.

    Em-Dash vs. En-Dash

    The short dash is shorter than the em-dash.  Its key is next to the number 0 on your keyboard.  To type the em-dash you need to use a bit of Microsoft trickery.  The em-dash will be entered when you hold the Alt key and type 0151 on the numeric keypad.  This enters the Windows character code for the em-dash (code 151 in the Windows-1252 code page; the character is not part of ASCII).  For more information on typing the em-dash visit here.

    Click the drop-down button in the column labeled Rank.  Select the item labeled Text Filters, and then click Does Not Contain…

    Text Filter on Rank Column

    Type the em-dash into the Filter Rows dialog box by using Alt+0151.  Click OK to proceed.

    Enter EM-Dash in Filter Rows Dialog

    If we entered the correct em-dash we will now be presented with a cleaned list of U.S. states with only numbered items in the Rank column.
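
    If typing the dash gives you trouble, the same filter can be sketched directly in M.  This is only an illustration: the three sample rows and their rank values are made up, and I use the M escape #(2014) to write the em-dash (U+2014) so nothing depends on the keyboard.

    ```powerquery
    let
        // Made-up sample rows standing in for the Wikipedia table
        Source = #table(
            {"Rank", "State"},
            {{"1", "State A"}, {"#(2014)", "District of Columbia"}, {"2", "State B"}}
        ),
        // Keep only the rows whose Rank does not contain an em-dash
        Filtered = Table.SelectRows(Source, each not Text.Contains([Rank], "#(2014)"))
    in
        Filtered
    ```

    The result keeps State A and State B and drops the District of Columbia row, which is what the Does Not Contain filter does for us.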

    Next we will clean up the query slightly to make it easier to deal with.  Delete the columns labeled Rank and Change.  Rename the query to something a little more meaningful such as US Census.

    Remove Columns
    Remove Columns & Rename Query

    Note:  You can delete a column by pressing the Remove Columns button on the Home ribbon.  A second method is to right-click the column you want to remove and select Remove.

    Next we will add our own calculated column which will calculate the 2010 to 2015 percent change.  Click the ribbon labeled Add Column and select the first icon on the far left labeled Add Custom Column.  The Add Custom Column dialog box will open.  Enter the name for the new column (we will call it % Change), then build an equation by clicking on columns in the Available columns list on the right.  For this example we are using the percent change calculation, which is the following:

    Percent Change = (New Value / Old Value) - 1

    Using the columns we imported from Wikipedia we will have the following equation:

    = [2015 estimate] / [2010 Census] - 1
    
    Update: this formula has now changed to 2016 estimate as time has progressed since this first tutorial was posted.
    The new column should have this following formula: = [2016 estimate] / [2010 Census] - 1
    
    

    This inserts a new column with the calculated percent change between the 2010 census and the 2015 estimate.  Click OK to proceed.
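
    Behind the dialog, Power BI writes a Table.AddColumn step.  Here is a small self-contained sketch of it; the two states and their figures are made up so the arithmetic is easy to check, and the fourth argument (type number) is my addition to declare the column as a decimal number up front.

    ```powerquery
    let
        // Made-up figures for two hypothetical states, not real census data
        Source = #table(
            {"State", "2010 Census", "2015 estimate"},
            {{"State A", 1000, 1250}, {"State B", 2000, 1500}}
        ),
        // New value divided by old value, minus 1
        AddedChange = Table.AddColumn(Source, "% Change", each [2015 estimate] / [2010 Census] - 1, type number)
    in
        AddedChange
    ```

    State A grows from 1000 to 1250, so its % Change is 0.25; State B shrinks from 2000 to 1500, so its % Change is -0.25.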

    Add Custom Column

    Finally we want to change the type of data in the % Change column so our data model will operate as expected when producing visuals.  Click the Home ribbon, then click the % Change column.  Change the Data Type: from Any to Decimal Number.  This informs the data model how to treat the data held in the % Change column.  We are finished with the data modeling, so click Close & Apply on the Home ribbon.

    Now we have all our data loaded into the data model ready to build a map.

    Click the Column labeled State and then click % Change.  This yields a map with circles on it.  Change the visual to a filled map by selecting a different visual, the Filled Map icon (circled in red below).  Doing so produces a shaded map of the US, where each state is colored according to the % Change.

    Filled Map Selection

    Finally let’s add some color to the data.  Click the visual’s Format properties (the little paint brush in the visuals window).  Expand the Data colors section by clicking on the title Data colors.  Diverging is set to Off.  Change it to On.  Change the Minimum color to Green, the Center color to Yellow, and the Maximum color to Red.

    Colored Map

    The states with the largest population change are in Red, while the states with the smallest population change are in Green.

    Please share if you liked this tutorial.  Thanks.

  • Map it, Map it Real Good

    This tutorial is a real simple mapping exercise.  I was talking with a colleague today about Power BI and I was challenged to map something using latitude and longitude.  I had played with mapping before but not using latitude or longitude.

    I’d have to say if you want to impress someone with your Power BI skills, adding a map is a good way to do so.  Typically this is functionality that you can’t add in Excel, or at least not without some serious effort.

    Alright, here we go..

    Resources for this project are:

    • Power BI Desktop (I’m using the March 2016 version, 2.33.4337.281) download the latest version from Microsoft Here.
    • Excel file with a table in it with our location information that can be downloaded here: Locations Data Set

    After downloading the Locations Data Set, Open up PowerBI and load the Excel file into Power BI.  If you need to learn how to load Excel files you can follow the loading excel tutorial.

    Click Get Data on the Home ribbon.  Select the first option, Excel, and click Connect at the bottom of the Get Data window.

    Navigate to the downloaded file called Locations.xlsx and open the file by clicking Open in the bottom right hand corner.

    Next, the navigator window will open.  Select the table (denoted with the grid with a blue top header) called Locations.  Then Click Load to load the data into the data model.

    Navigator Window
    Navigator Window Selection

    Note: There are two different icons in the Navigator window.  One is called Locations, which is a Table within the Excel document.  The other is called Sheet1, which is simply the first sheet in the Excel workbook.  For future reference, it is much easier to make tables in Excel and load those into Power BI than to load a plain worksheet.  So whenever possible try to form your data in Excel into Tables.  When loading a table, the headers of the table automatically become the column names in the Power BI data model.
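
    The difference between the two icons is easier to see in M.  The sketch below is an approximation of what the Navigator generates; the file path is hypothetical and the step names are my own.

    ```powerquery
    let
        // Hypothetical path; point it at wherever Locations.xlsx was saved
        Source = Excel.Workbook(File.Contents("C:\Users\you\Downloads\Locations.xlsx"), null, true),
        // The Excel table: its header row is already used for the column names
        LocationsTable = Source{[Item = "Locations", Kind = "Table"]}[Data],
        // The raw worksheet: the headers arrive as the first data row and must be promoted
        Sheet1 = Source{[Item = "Sheet1", Kind = "Sheet"]}[Data],
        Sheet1WithHeaders = Table.PromoteHeaders(Sheet1)
    in
        LocationsTable
    ```

    Loading the worksheet needs the extra Table.PromoteHeaders step, which is one reason I prefer Excel tables.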

    We now have loaded the data into a new Table in PowerBI called Locations.

    To make the map, check the boxes for Latitude and Longitude.  Power BI understands that Latitude and Longitude are mapping fields, and we are now presented with a map with tiny blue dots.

    Map from our Data

    Let’s add some more data to enhance the map.  We can change the size of the circles at each location by dragging the column called Attenders over to the Values field for this visual.

    Change Bubble Size

    We have now changed the size of the circles relative to each other to show the number of people that we saw at each location.  To add color to the map drag the column called Event to the Legend option of the visual.  This yields a map that now has each circle with a different color according to the event name.

    Colored Bubble Map
    Colored Bubbles on a Map

    To enhance our visual further we will add a bar chart with the total count of attenders per event.  To do this click anywhere on the blank area of the page (this will de-select the map visual).  Now click the Event column and then the Attenders column.  This will present you with a table listing the events and the corresponding attenders.  Leaving the table visual highlighted, click the Stacked Bar Chart, which is in the upper left-hand corner of the Visualizations window.

    Adding a Bar Chart

    I circled the triple dots on the bar chart.  Click the triple dots and a menu will appear. First click Sort By, then click Attenders.  This will sort the attenders in descending order from the largest amount at Kohl’s Corp. down to Harley Davidson.  Drag the column labeled Event to the visualization option called Legend.  This colors the bar chart.

    Colored Bar Chart

    Note: The colors in the bar chart match the colors in the map we made earlier.  This builds uniformity into your reports, so when you’re filtering items the colors across visuals make sense.

    Take some time to click on each of the bars on the bar chart.  Notice how the map re-draws with only the data for that selected item.  To select multiple bars on the bar chart hold the CTRL button and click on the multiple bars.

    Nice job.  We have finished the mapping tutorial.  Share if you liked it below.

  • Manually Enter Data

    There are often times when you need a small data set in order to make a visual behave exactly how you want it to.  This may mean you need a small table to represent a range of numbers or text values.

    Here are the Resources for this tutorial:

    • Power BI Desktop (I’m using the March 2016 version, 2.33.4337.281) download the latest version from Microsoft Here.

    To enter your own data Click the Enter Data button on the Home ribbon.

    Enter Data
    Enter Data Button

    Next you are prompted with the Create Table window.  In this window you are given the layout of an empty table.  To begin entering data you can click in the first cell in Column one and start typing.  Pressing Enter adds a new cell below.  You can rename the column by double-clicking the column name.  To add a second column you click on the symbol next to your existing column.  Finally, to edit the table name you can type the desired name in the Name input box in the bottom left-hand portion of the window.

    Create Table
    Create Table Window

    Finally, you can choose either to Load the data as is or to Edit the data to make additional changes (this can be useful to edit the data types of each column or to populate equations in subsequent columns).  For the sake of this tutorial we will simply load the data.  Click Load to load the data into the data model.

    Now drag over the columns into the page view to begin generating visuals.  By default PowerBI makes a table of data to show you the values you just entered.

    Sales Table
    Visual of Sales Table

    Select the table visual (you know it is highlighted when it has the trim border as shown above) and click the Doughnut visual.  This transforms the data into a doughnut, and who doesn’t like a nice data doughnut?  Click anywhere in the page to de-select the new doughnut visual.  Add a second table by dragging over the Region and Sales columns.  We can now see the pretty graphic and the numbers supporting that visual.

    Visuals
    Visuals Made with our Custom Data

    I bet you didn’t notice that something changed here.   Look closely at the data we see now vs. what we entered earlier.  Go ahead, scroll up, I’ll wait…  Did you catch it?

    We now have 5 rows of data but we entered 6 before.  That is because the Sales column is a number column and can be aggregated.  Look in the Fields pane and you will see a little sum symbol in front of the Sales column.  This means that the column has a default summarization associated with it.  To see what the default summarization is, highlight Sales by clicking on the column name in the grey area.  Then click the ribbon titled Modeling, and there it is in the Properties section: the Default Summarization is Sum.  Every time you use the Sales column it will be summarized in the table and visual views.  Our visual table shows Brazil with total sales of 600, because we entered two rows with the Region Brazil, one with 500 and one with 100.
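
    The summarization itself happens in the data model, not in the query, but the effect can be imitated in M with Table.Group.  Treat this as an analogy only.  The two Brazil rows come from the example above; the third row is a made-up placeholder.

    ```powerquery
    let
        // Brazil 500 and Brazil 100 are from the tutorial; "Region X" is a placeholder
        Source = #table(
            {"Region", "Sales"},
            {{"Brazil", 500}, {"Brazil", 100}, {"Region X", 300}}
        ),
        // One output row per Region, with Sales added up
        Summed = Table.Group(Source, {"Region"}, {{"Sales", each List.Sum([Sales]), type number}})
    in
        Summed
    ```

    Three input rows collapse to two: Brazil with 600 and Region X with 300, which is the same kind of row reduction we saw in the table visual.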

    Now you can click on any of the data points in the doughnut.  Notice the table automatically filters down to only show the areas you selected.

    Brazil Data
    Data Filtered to only Brazil

    ProTip: you can select multiple selections by holding down CTRL and selecting multiple items in the visual.  You can only do this inside of one visual.  As soon as you click another visual all filtering will disappear.

    Again, I hope you enjoyed this quick tutorial. If you liked it make sure you share it below.

  • Query Settings – Fixing a Missing File

    One of the most important concepts to learn within Power BI Desktop is how to build a Data Model.

    Note: In simple terms the Data Model is the data that is collected by the Get Data function.  In your data model you can build multiple queries.  This data is stored in the file.  The data storage is very efficient, as the data is compressed down at approximately a 4:1 ratio: a 1000 KB file will compact down to approximately 250 KB when loaded into Power BI.  From my current understanding all data is loaded into the memory of the computer.  Thus, if you are having performance issues it could be in part due to the RAM of your computer.

    As you begin to craft more data models you will learn little tips and tricks along the way to make an efficient Data Model for your visualizations.  I have found that the most challenging part of building the data model is structuring the data in a way that will make your selected visual make sense.  This may mean you need to add a measure, a calculated column, or a ranking to a data set.  Alright, let’s get started.

    Here are the Resources for this tutorial:

    • Power BI Desktop (I’m using the March 2016 version, 2.33.4337.281) download the latest version from Microsoft Here.
    • We are going to work through the Power BI Desktop file that we built in the Loading Excel Files Into Power BI Tutorial.  You can follow the link to create the Power BI Desktop (pbix) file in the tutorial.  For convenience, the completed file can be downloaded here: Import Excel Tutorial.

    I’m going to start off by extracting the Import Excel Tutorial.zip file to my desktop.  Once the file has been extracted we can open the containing folder.  In this folder there are two files: the source data in the Excel file and the Power BI Desktop file.

    Note: A Power BI Desktop file has a .pbix file format ending.

    Open the Import Excel.pbix file.  First click the Home ribbon and then click the Refresh button.  Most likely there will be an error similar to the following message.

    Can't find file
    Message box when file can’t be found

    This type of message occurs when you refresh a query and the file is missing or can’t be found.  This is because when I originally built the Power BI Desktop tutorial, the Excel file that supplies the information was located on my desktop.  This is a common problem when you build connections to local files stored on your computer.  If you move a file into a different folder then the connection will break.

    To resolve this, close the message window by clicking Close.  On the Home ribbon click the Edit Queries button.  The Query Editor window will be presented.  In a large yellow bar in the data view portion of the window (circled in red) is the error message.

    Note: Circled in blue is the Query Settings window.  This window is the window for all the applied steps to transform the data.  You can change the name of the query in the name box.  From the view we have selected we can see that the step entitled Changed Type is currently selected (seen circled in blue).

    Click the grey button labeled Go To Error which is found in the yellow error box.

    Go To Error
    Error seen inside Query Editor

    Upon clicking the Go To Error button, the selection in the Query Settings window moves to the Source step.  This is where the query has failed.  More information about the failure is shown in another yellow error box.  This time click Edit Settings in the error box.

    Edit Settings
    Edit Settings in Error window

    Now we have the Load Excel file window open.  In this window click Browse, navigate to where you extracted the files downloaded earlier in the tutorial, and select the Excel document entitled Book1.  Click Open and the new file location will be loaded into the Load Excel window.  Click OK to complete the settings change.

    New File Location

    Now the data is correctly loaded into the data model.  Notice we are still on the step called Source.  Take some time to click through each step, Source – Navigation – Promoted Headers – Changed Type.  As you click on each step you can see how the data is transforming.

    To see the code that is being used to make each step click the View ribbon and check the little box entitled Formula Bar.  This will make a formula bar appear.  When you click on a step the formula bar will reveal the code needed to complete the selected step.

    Toggle the Formula Bar
    Toggling on the Formula Bar

    We can now see the equation, which is similar to how you would write an equation in excel.  The code in the Changed Type step is here:

    = Table.TransformColumnTypes(#"Promoted Headers",{{"ID", Int64.Type}, {"Sales", Int64.Type}, {"Category", type text}})

    The equation is using the M language to transform the data.  More information on the usage of the M language can be found here.
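
    To put the Changed Type formula in context, here is a sketch of how the four applied steps could look as one query.  Only the last step is copied from the formula bar above; the file path, the sheet name Sheet1, and the exact name of the navigation step are my assumptions and may differ in your file.

    ```powerquery
    let
        // Hypothetical path; use the folder where Book1.xlsx was extracted
        Source = Excel.Workbook(File.Contents("C:\Users\you\Desktop\Book1.xlsx"), null, true),
        // Assumption: the data sits on a worksheet named Sheet1
        Navigation = Source{[Item = "Sheet1", Kind = "Sheet"]}[Data],
        #"Promoted Headers" = Table.PromoteHeaders(Navigation),
        #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers", {{"ID", Int64.Type}, {"Sales", Int64.Type}, {"Category", type text}})
    in
        #"Changed Type"
    ```

    Each step takes the previous step’s name as its input, which is why fixing the file path in Source is enough to repair every step after it.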

    Note: A couple of pointers about the formula.  The function is called Table.TransformColumnTypes.  The source of the data is a variable called #"Promoted Headers".  The pound sign followed by a name in quotation marks is how the M language refers to a variable whose name contains a space.  Since the prior step is named “Promoted (space) Headers”, the program has to add the pound sign and the quotation marks.  If there is no space in the name, such as PromotedHeaders, then only PromotedHeaders appears in the code, without the pound sign and quotes.  See the modified code after I remove the space from the Promoted Headers applied step.

    = Table.TransformColumnTypes(PromotedHeaders,{{"ID", Int64.Type}, {"Sales", Int64.Type}, {"Category", type text}})

    Notice that the pound sign and quotation marks are gone.

    The second part of the formula is a list (the M equivalent of an array), which has been written out in curly brackets:

    {
    {"ID", Int64.Type}, 
    {"Sales", Int64.Type}, 
    {"Category", type text}
    }

    I changed the code by adding line returns to make it easier to read.  The list has a beginning bracket and an ending bracket.  Each pair of values is contained in its own curly brackets and separated from the next pair with a comma.  The structure is a 3 x 2 array: it has 3 rows and two data points on each row, just like a matrix.  The first data point is the column name.  In the first row the column being addressed is called ID.  The second data point is the type to apply, here Int64.Type, which means the data is a 64-bit integer.  This repeats for each row until all columns have been addressed.
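
    If you want to poke at this structure yourself, a blank query along these lines should do it.  It is just a sketch; the names ColumnTypes, PairCount, FirstPair, and FirstColumnName are mine.

    ```powerquery
    let
        // The same list of {column name, type} pairs used by Table.TransformColumnTypes
        ColumnTypes = {
            {"ID", Int64.Type},
            {"Sales", Int64.Type},
            {"Category", type text}
        },
        // Number of pairs in the outer list
        PairCount = List.Count(ColumnTypes),
        // Positions are zero-based, so {0} is the first pair
        FirstPair = ColumnTypes{0},
        // The first item of that pair is the column name
        FirstColumnName = FirstPair{0}
    in
        [Pairs = PairCount, FirstName = FirstColumnName]
    ```

    The query returns a record with Pairs equal to 3 and FirstName equal to "ID".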

    So there you go, we have opened up a query repaired it and learned a little about the formula bar.

    As a side note, as you build queries, each button press that you make on the various ribbons in the Query Editor will create at least one step in the query.

    Hope you enjoyed this short tutorial about the Query Editor.  Make sure you share below if you liked it.

  • Folder of Files Loaded to Power BI Desktop

    Ok, I’ve got to be honest the first two tutorials (Loading Excel Files, Loading CSV Files) were only there to get things kicked off. Now we are getting to some of the good stuff.

    When I first saw this feature in Power Query for excel I nearly had a conniption.  My first thought is this is going to CHANGE EVERYTHING, and to be perfectly honest it has. My entire view of Excel and Power BI has been shaped by this simple but powerful idea; Automated Data Loading.

    In all my years as an engineer I would have to constantly copy and paste data from one Excel file to another, then perform some transformations just to produce a bar chart or a line graph, uggh.  This is slow and boring.  I was really good at being boring, and I felt like I was able to become quite ingenious by writing macros and automating parts of my data transformations.  Now I have seen the light.  The simple ability to load a group of files from a folder is AWESOME!  Had I had this feature in my engineering days I could have saved so much time.  So in true homage to my engineering roots this post is for you, the almighty data-hungry engineer.

    Alright, enough of me babbling.  Let’s get to it.

    Materials for this Tutorial:

    • Zip file with (3) three excel files download Data Set.
    • Power BI Desktop (I’m using the March 2016 version, 2.33.4337.281) download the latest version from Microsoft Here.

    Let’s start off by downloading the Data Set and unzipping the file to a folder called DataSet.  For this demo I unzipped the files to my desktop folder.

    UnZipped Files
    Location for UnZipped Excel Files

    Next we will open up Power BI Desktop.  On the Home ribbon select the Get Data button.  The Get Data window will be presented and this time we will select the Folder icon in the menu.

    Get Data Folder
    Get Data Folder Icon Selection

    Click the Connect button at the bottom right of the screen.  A folder window will display.  This is where we will select the location of our data in the folder we unzipped earlier.  Click OK once you’ve selected the location of the folder.

    Folder Path
    Folder Path Location

    The next window to open shows the files that Power BI Desktop is able to see in the folder location.  Normally we would press Load and move forward but in this case we want to further manipulate our query to load the data.  Therefore, Click the Edit button to modify the query to load data.

    Folder Location
    View of Files in Selected Folder

    We are now in the Query Editor.  This is where we can manipulate the incoming data before we visualize it.

    Note:  The Query Editor is a graphical representation of the M language which is used to load data.  Each button press in the Query Editor performs a transformation on your data.  Each step writes a little line of code that handles the transformation.  To see the code, click the View ribbon, then click the button labeled Advanced Editor.  For more documentation on the M language look at the Microsoft documentation located here.

    Here is an image of the files we loaded in from our folder location in the Query Editor.

    Query Editor
    View of Query Editor

    The next step is to combine all the files into one combined data model.  To do this click the Double Down Arrows that are circled in red on the left side in the column called content.  

    Note: I also circled the Query Settings in Red on the right.  The Query Settings window will become very useful, especially when trouble shooting a query.  You will notice as we make additional data transformations more steps will accumulate in the query settings.

    We now have a final view of all the data from each of the three CSV files.

    Data Model
    Loaded Files into the Data model

    The file needs a little clean up to remove some unwanted data rows.  Notice now that we have loaded all three files.  In each file we had a header row.  Now in our data model we have three rows with headers.  We want to use the first row as column names.  To do this, Click the Use First Row as Headers button on the Home ribbon.

    Header Row
    Use First Row as Headers

    Also, notice there are rows of data that contain the initial header rows from the other two files.

    Other Headers
    Header Rows from Other Two Files

    Now we will apply a filter to remove these rows.  Click the arrow in the ID column header.  This will present a menu.  There are various transformations on this screen: you can sort the column in Ascending or Descending order, filter out text items, etc.

    Filter ID Row
    Filter for ID Row

    Click Text Filters and select Does Not Equal and enter ID into the filter.  Click OK to proceed. This will add a step to remove any row that had ID listed in the ID column.

    Filter Rows
    Filter out the Text “ID” from column
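
    Put together, the steps so far can be sketched as a single query.  This is my approximation for the version I’m using, assuming the files are comma-separated text; the folder path, the encoding, and the step names are assumptions, and newer versions of Power BI Desktop may generate different steps for combining files.

    ```powerquery
    let
        // Hypothetical path to the unzipped DataSet folder
        Source = Folder.Files("C:\Users\you\Desktop\DataSet"),
        // Join the contents of every file end to end (the double-arrow button)
        Combined = Binary.Combine(Source[Content]),
        // Read the combined bytes as comma-separated text; the encoding is an assumption
        Imported = Csv.Document(Combined, [Delimiter = ",", Encoding = 1252]),
        // The first file's header row becomes the column names
        Promoted = Table.PromoteHeaders(Imported),
        // Drop the header rows that came along from the other files
        Filtered = Table.SelectRows(Promoted, each [ID] <> "ID")
    in
        Filtered
    ```

    Because the query points at the folder rather than at individual files, dropping a new file with the same layout into the folder and refreshing should pick it up.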

    We have now transformed and cleaned our data, and it’s ready for use.  Click Close and Apply to load the data to the data model.  Now the data is ready for visualizations.  Thanks for following along.

    Make sure you take some time to share if you enjoyed this tutorial.

  • Import CSV file to Power BI

    This post is going to be similar to my previous post about Getting Data.  I figure we better cover some of the basics before going crazy with deeper topics.

    Materials for this tutorial:

    • CSV file with some random data, linked here: SampleData in CSV format
    • Power BI Desktop (I’m using the March 2016 version, 2.33.4337.281)

    After I read the previous version I thought it would be helpful to put the materials up at the top and what version I was using.  If you didn’t know Microsoft has been very active in the development of PowerBI.com and Power BI Desktop.  Right now there are weekly updates to PowerBI.com and monthly updates to Power BI Desktop.

    Starting off like before here is a sample of the data from the csv file.  I’m showing the data in notepad to prove it is a comma separated value file (hence the CSV name).

    csvfile
    CSV File opened in Note Pad

    Alright, let’s go get some data.  Open up Power BI Desktop.  Click on the Home ribbon.  Select the Get Data icon.

    Get Data Button
    Button for Get Data

    Now the Get Data window will open.  Next, select the second item labeled CSV from the top of the list on the right.

    Get CSV selection
    CSV selection in the Get Data screen

    Click the Connect button at the bottom right hand of the Get Data screen to proceed to the next screen.  Now the open window will let you navigate to the CSV file you would like to import.  Click the Open button at the right of Open window to load the CSV file.  Finally you’ll be presented with the data view of the contents contained inside your CSV file.

    View Of CSV Data
    View of CSV Data file
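
    For reference, a CSV import like this one can be sketched in M roughly as follows.  The file path and the encoding are assumptions on my part, and the code Power BI generates for you may include extra options or steps.

    ```powerquery
    let
        // Hypothetical path; use wherever you saved the sample CSV file
        Source = Csv.Document(File.Contents("C:\Users\you\Desktop\SampleData.csv"), [Delimiter = ",", Encoding = 1252]),
        // Use the first line of the file as the column names
        Promoted = Table.PromoteHeaders(Source)
    in
        Promoted
    ```

    The Delimiter option is what makes this a comma-separated import; a file separated by tabs or semicolons would only need that one value changed.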

    Once loaded we now have our view of all the columns of data in the Fields viewing pane on the right.  From here we can build our visuals.

    Loaded CSV Columns
    Loaded Columns from CSV file load

    Now, let’s throw together a quick visual of the data.

    Start by clicking the check box next to the label titled Category and then click the box next to the label titled Sales.  This will automatically populate a table with the categories in the first column and the sales for each category in the second column.

    Table Visual
    Table of Data

    To open up the Visualizations bar click on the word Visualizations.  This will present all the information relating to the visuals. Upon opening up the visualizations pane there is a small yellow square showing you which visual is selected.

    Selected Visual
    Showing the Selected Visual

    Note: The blue pen highlighting shows the selected visual on the page.  As you build more complex reports there will be multiple visualizations on your page.  When you select a specific visual, the Visualizations bar shows all the properties for that visual.  The Table visual is highlighted by the red highlight circle.

    To change our selected visual to a new visual we will simply select a new icon in the Visualizations bar. Click the icon that looks like a pie chart.

    Pie Chart
    Pie Chart Visualization

    Cool, but what if I want more awesomeness on my page.  No problem.  Let’s copy our visual.  You can do this by selecting the visual.  To know it is selected look for the slight grey bar at the top of the visual.

    Gray Bar on Visual
    Gray Bar denoting that visual is selected

    Copy the visual by using Ctrl + C.  Click anywhere on the white space on the page.  This will deselect the current visual.  Then paste an identical version of the visual by using Ctrl + V.

    Two Visuals
    Copy and Paste of new Visual

    Ta-da! Now we are really getting somewhere.  Two amazing visuals, well, not quite.  Two identical visuals aren’t very compelling.  Let’s change one of the visuals to a different visual.

    Select the top visual by clicking on it.  Then select the Stacked Column Chart which is the second icon from the left in the top row.  Selecting this icon will change the visual.

    Bar Chart
    Bar Chart Visual

    And there you have it.  You’ve imported a CSV file and generated two visuals.  Nice job.

    Hope you enjoyed this tutorial.  Leave comments if you have questions or if you want to see something else in a tutorial.  If you like what you see please share this post on your social network of choice below.