Tuesday, June 27, 2017

Tutorial: Create a Power View report with Azure Marketplace data

Tutorial: Create a Power View report with Azure Marketplace data

In this Power View tutorial, you download free data from Microsoft Azure Marketplace, add more data, and create relationships between tables. You create a series of visualizations in Power View that illustrate which visualization is right for what kind of data: When to use a map, bar chart, or line chart; how to sort and filter data in visualizations; how drill-down works. You also create some simple calculated fields in Power Pivot so Power View can better consume the data in the workbook. First you need the data.

Power View sheet using Windows Azure Marketplace data with map, bar, and line charts

Make a Power View sheet with a map of US airports, a bar chart of airline average delays, and a line chart showing delays by month.

In this article

Download data from Microsoft Azure Marketplace

Create a Power View sheet with a clustered chart

Change the sort order

Change the aggregate from a sum to an average

Get airline data

Relate tables in Power View

Filter the chart

Change the chart layout

Create another chart

Change the sort order of the chart

Get airport code data

Relate the Airports table to the other tables

Create a map

Filter the map

Cross-filter visualizations

Add a time visualization

Create a Month Name table

Add the table to the model

Create a MonthName Year calculated field

Use the new MonthName Year calculated field in Power View

Sort one field by another field

Create a line chart by months

Add drill-down to a line chart

More fun with airline delay data

Themes and formatting

Related information

Download data from Microsoft Azure Marketplace

  1. In Excel, on the Power Pivot tab > Manage Data Model.

Don't see a Power Pivot tab? Start Power Pivot in Microsoft Excel 2013 add-in.

  1. Click Get External Data > From Data Service > From Microsoft Azure Marketplace.

  2. In the Table Import Wizard, under Type > Data, and under Price > Free.

  3. Search for "air carriers".

  4. Next to US Air Carrier Delays, click Subscribe.

  5. Sign in with a Microsoft account. If you don't have one, follow the steps to create one.

This window shows some sample data.

  1. Scroll to the end of the sample data and click Select Query.

  2. You can type a friendly name for the connection, and then click Next.

  3. Click Preview and Filter.

  4. In this window you can pick which columns you want to import, by unchecking the check boxes. We want all of them, so click OK, and then click Finish.

The download process may take some time. The dataset has more than 2.4 million rows.

  1. Click OK.

You have just created a Data Model in Excel.

Create a Power View sheet with a clustered chart

  1. In Excel on the Insert tab > Power View Reports.

There in the Field List is the On_Time_Performance table.

  1. Click the arrow next to the On_Time_Performance table.

Those are the fields (columns) in the table.

  1. Check the Carrier box.

This shows a list of the abbreviations for carriers.

  1. Check the ArrDelayMinutes box.

You get an aggregate number per carrier.

  1. In the Values box, click the arrow next to ArrDelayMinutes.

You see that you're getting a sum of minute delays per carrier.

  1. On the Design tab, click the arrow under Bar Chart and click Clustered Bar.

  2. Resize the chart to make it taller: Drag the handle on the lower edge when the cursor changes to a pointing hand. Pointing hand cursor in Power View

  3. In the Field List, drag DepDelayMinutes to the Values box under ArrDelayMinutes.

Now we see some carriers are better at on-time arrivals, while others are better at on-time departures.

Change the sort order

  1. Hover over the chart to see sort by in the upper-left corner. Click Carrier.

Now it's sorted by ArrDelayMinutes.

  1. Again in the upper-left corner, click asc. Now it's sorted descending instead.

Clearly WN has the largest number.

Change the aggregate from a sum to an average

But maybe summing the delays isn't right: An airline with more flights might be on time more often, but still have a larger total.

  1. In the Values box, click the arrow next to ArrDelayMinutes and change it from Sum to Average.

  2. Do the same for DepDelayMinutes.

  3. Resort by ArrDelayMinutes.

That changes the picture completely: WN has nowhere near the worst on-time delay average. Now we see on average XE is the worst.

Read more about changing sums to averages or other aggregates in Power View.

Get airline data

There's just one problem: Who knows the carrier abbreviations? Which airline is XE? But of course, everything is on the web.

  1. Go to this site with a list of the two-letter airline codes: http://www.airfarewatchdog.com/pages/3799702/airline-letter-codes/.

  2. Select the two columns of data and copy it.

  3. In your Excel workbook, click a blank worksheet and type AirlineCode in cell A1 and AirlineName in cell B1.

  4. Click in cell A2 and paste the data.

  5. Format the data as a table (Ctrl + T) and name it Airlines.

  6. Rename the tab Airlines, too.

Relate tables in Power View

  1. Go back to the Power View sheet in your Excel workbook.

  2. You see the Airlines table is already in the Field List.

If you don't see it, click All in the Field List.

  1. Make sure the bar chart is selected and remove Carrier from the Axis box.

This shows the overall average arrival and departure delay for all carriers.

  1. Expand the Airlines table and check the AirlineName box.

Now the chart has airline names. But there's a problem: All the values are the same. Power View displays a message in the Field List: "Relationships between tables may be needed."

  1. Click Create.

  2. In Table, click On_Time_Performance.

  3. In Column (Foreign), click Carrier.

  4. In Related Table, click Airlines.

  5. In Related Column (Primary), click. AirlineCode.

This creates a relationship between the two tables on the AirlineCode and Carrier fields.

  1. Click OK.

  2. Sort the table again so the ArrDelayMinutes field is sorted descending.

Look: The numbers in the table aren't all the same anymore, and we can see the names of the airlines instead of the carrier codes. So XE is the code for ExpressJet.

Filter the chart

Notice the (Blank) value in the list of airlines? Let's get rid of it by filtering the data.

  1. With the airlines bar chart selected, in the Filters Area click Chart.

    Tip:  Can't see the Filters Area? On the Power View tab > View > Filters Area.

    Note the Filters Area is already populated with the fields in the chart. The filters in the Chart filter will filter only this chart, and not any other visualization on the sheet.

  2. Click AirlineName, check the (All) box, and then uncheck the (Blank) box.
    (Blank) is gone from the chart.

Change the chart layout

  1. Make sure the column chart is selected.

  2. On the Layout tab > Legend > Show Legend at Top.

This layout makes better use of space.

  1. With the chart selected, click the sizing handle on the right side and make the chart narrower.

  2. Now you have room for another visualization.

Create another chart

  1. Click the blank sheet to start another visualization.

Tip:  This is key: As long as a visualization is selected, any fields you select are added to that visualization, instead of starting a new one.

  1. In the Field List, in the On_Time_Performance table, check the Origin box.

Origin is another list of codes – these ones are for airports.

  1. In the On_Time_Performance table, check the DepDelayMinutes box.

  2. In the Fields box, click the arrow next to DepDelayMinutes and click Average.

The number has a lot of decimal places.

  1. Click in the Average of DepDelayMinutes column and on the Design tab > Decrease Decimal.

Now the number has two decimal places.

  1. On the Design tab, click the arrow under Bar Chart > Stacked Bar.

  2. Drag the top and bottom sizing handles to show as many bars in the chart as possible.

Change the sort order of the chart

  1. Hover over the chart to see sort by in the upper-left corner. Click Origin.

Now it's sorted by DepDelayMinutes.

  1. Again in the upper-left corner, click asc. Now it's sorted descending instead.

Clearly JLN has the worst on-time arrival record. Where's that? Once again we have interesting data, but we can't make sense of it because we don't know most of the airport codes.

Get airport code data

  1. Go to http://www.airportcodes.us/us-airports.htm.

  2. On the web site, select and copy the four columns of data—Code, Name, City, and State—without the table heading, "US Airports, Sorted by Airport Code".

  3. In Excel, click the plus (+) sign to add a blank worksheet.

  4. Click in cell A1 and paste the data.

  5. Rename the columns:

    • Code = AirportCode

    • Name = AirportName

    • City = AirportCity

    • State = AirportState

  6. Format the data as a table (Ctrl + T) and name it Airports.

  7. Rename the tab Airports, too.

Relate the Airports table to the other tables

  1. Go back to the Power View sheet in your Excel workbook.

  2. The Airports table is already in the Field List.

    If you don't see it, click All in the Field List.

  3. Make sure the column chart is selected and remove Origin from the Axis box.

  4. This shows the overall average departure delay for all airports.

  5. Expand the Airports table and check the AirportName box.

    Now the chart has airline names. But again, all the values are the same and you see the message in the Field List: "Relationships between tables may be needed."

    Let's try a different way to create relationships.

  6. On the Power Pivot tab > Manage Data Model.

  7. In the Power Pivot window, on the Home tab > Diagram View.

    Now you see the three tables in your model. There's a line from On_Time_Performance to Airlines, but not to Airports.

  8. Find the Origin field in the On_Time_Performance table, and drag from it to the AirportCode field in the Airports table.

  9. Go back to the Power View sheet in Excel.

  10. Click OK for the message about the changed Data Model.

    The numbers in the table aren't all the same anymore, and we can see the names of the airports instead of the carrier codes.

  11. Sort the chart descending by Average of DepDelayMinutes.

So JLN is the code for Joplin Regional Airport.

Read more about creating relationships between tables.

Create a map

Wouldn't this data be more interesting if we saw it on a map? Now that we have the city and state names in the Airports table, we can map the data. First, let's move the airline chart so we have more room.

  1. Drag the moving handles for the airlines chart to put it along the left side of the chart, and drag the sizing handles to make it tall and narrow.

  2. Select the Airports chart and on the Design tab > Map.

The first time you create a map in Power View, you see a privacy warning that your data needs to be geocoded by sending it to Bing through a secure web connection.

  1. Click OK.

  2. Power View might put AirportName in the Color box. If so, drag it to the Locations box.

  3. Drag the sizing handles to make the map fit the whole remaining width of the sheet.

That's too many dots! Let's focus on airports where delays are longest. And you notice the dot north of Australia? That's Guam.

Filter the map

  1. With the map selected, in the Filters Area click Map.

  2. Click Average of DepDelayMinutes.

You see the scroll bar from 0 to 25.37 minutes.

  1. Drag the left side of the scroll bar so you're only showing delays greater than 10 minutes.

Tip:  For a more precise number, you can also do this in Advanced filter mode. Click the icon to the right of the filter name (Average of DepDelayMinutes), and Show items for which the value > is greater than or equal to > type 10 in the box.

  1. Now let's filter out Guam. In the Filters Area, click AirportName. Select the All box, in the Search box, type Guam, and then click the Search magnifying glass icon.

Guam International Airport is the only result.

  1. Clear the check box next to Guam.

  2. Click the plus sign in the upper-right corner of the map to zoom in and use the cursor to reposition so you're only showing the continental US and Alaska.

  3. Hover over the large dot in the Bering Sea: Adak Airport, almost 24 minutes' delay.

Cross-filter visualizations

Now comes the fun stuff – playing with the interaction between visualizations.

  1. Click the Adak Airport bubble.

  2. Look what happened to the Airlines bar chart! Apparently Alaska is the only airline that flies to Adak Airport. Notice how you can still see the Alaska total average and all the other airlines? If the average of the individual value is more than the total, Power View makes the individual average bar narrower than the total bar, so you can still see the total.

  3. Zoom in and reposition further to just the continental US.

  4. In the bar chart, click ExpressJet.

Mostly eastern and central US.

  1. Click the bar for Frontier.

Just three: San Francisco, Palm Springs, and Greater Rockford airports. San Francisco might be the biggest airport on this map.

  1. On the map, click the bubble for San Francisco.

Oh, my! Almost every airline that flies out of San Francisco has a worse average there than elsewhere.

Read more about filtering and highlighting in Power View.

Add a time visualization

Let's see how the airlines did in different months. We can guess which months were worst; let's see if we're right. The On_Time_Performance table contains several date fields, including one, FlightDate, formatted M/D/YYYY HH:MM:SS AM – for example, 1/1/2012 12:00:00 AM.

There's also a Month field. It's just numeric – no month names, but let's start with that.

  1. Start a new Power View sheet: in Excel > Insert > Power View Report.

  2. In the Field List, expand the On_Time_Performance table and check Month.

Because it's a numeric field (marked with a Sigma Σ), Power View add the values.

  1. In the Fields box, click the arrow next to Month > Do Not Summarize.

Power View creates a table with the months in order. 1, 2, 10, 11, and 12. There's only one problem: They're not actually in order.

  1. With the Month table selected on the sheet, in the On_Time_Performance table, drag Year and put it above Month in the Fields box.

  2. In the Fields box, click the arrow next to Year > Do Not Summarize.

  3. Now you can see months 10, 11, and 12 are actually before months 1 and 2, because they're in 2011.

Create a Month Name table

  1. In Excel on a new sheet, start a table in cell A1:

MonthNumber

MonthName

1

January

2

February

  1. Select cells A2 to B3 and then drag the fill handle in the lower-right corner of cell B3 [ZA010278334] down to cell B13.

fill handle

This fills rows 4 to 13 with the rest of the month numbers and names.

  1. Select rows 1 to 13, columns A and B, and create a table (Ctrl + T).

  2. Make sure My table has headers is selected.

  3. On the Excel Design tab > Table Name, type MonthNames.

  4. Rename the worksheet Month Names.

Add the table to the model

  1. With the table selected, in Excel on the Power Pivot tab > Add to Data Model.

    This opens the Power Pivot window on the MonthNames sheet.

  2. In the Power Pivot window on the Home tab > View > Diagram View.

  3. Create a relationship between the tables by dragging from the Month field in the On_Time_Performance table to the MonthNumber field in the MonthNames table.

Create a MonthName Year calculated field

  1. On the Power PivotHome tab > View > Data View.

  2. Go to the On_Time_Performance table and click in the Add Column column.

    You're going to create a basic calculated column.

  3. Type

    =RELATED

  4. Click the MonthNames sheet, and click the MonthName column.

    This brings the MonthName field from the MonthNames table and adds (MonthNames[MonthName] to your formula in the On_Time_Performance table.

  5. Type

    ) & " " &

    Make sure they're straight quotation marks, not curly.

  6. Click the Year column in the On_Time_Performance table.

    This adds [Year] to your formula. The whole formula looks like this:

    =RELATED(MonthNames[MonthName]) & " " & [Year]

  7. Press Enter.

    Power Pivot populates a field with the month name from your MonthNames table, a space, and the year from the On_Time_Performance table – for example, October 2011.

  8. Select the column, right-click, and rename it MonthName Year.

Use the new MonthName Year calculated field in Power View

  1. Go back to Excel and the Power View sheet.

  2. Drag your new MonthName Year field to the sheet.

Hmmm, something's wrong: The months are in alphabetical, not chronological, order!

Sort one field by another field

  1. Go back to the Power Pivot window.

  2. Go to the On_Time_Performance table and click in the Add Column column to create another basic calculated column.

  3. Type

    =YEAR([

    This brings up the list of columns in the table.

  4. Click the FlightDate column and then press the Tab key.

  5. Type

    ) & FORMAT(MONTH([

  6. Click the FlightDate column again in the list of columns in the table and then press the Tab key.

  7. Type

    ), "00"

    The whole formula looks like this:

    =YEAR([FlightDate]) & FORMAT(MONTH([FlightDate]),"00″)

  8. Press Enter.

    Power Pivot creates a column with just the year from the FlightDate field and just the month number from the same field. Because of the "00", the month number is displayed with two digits. For example, 201201 is January 2012.

  9. Select the column, right-click, and rename it MonthNumber Year.

  10. Select the MonthName Year column and on the Power PivotHome tab > Sort by Column.

    MonthName Year is in the Sort Column field.

  11. In the By Column field, click MonthNumber Year > OK.

Create a line chart by months

  1. Go back to Excel and on the Power View sheet, click OK for the message that the Data Model has changed.

    Indeed it has changed: the months are now ordered chronologically, October 2011 through February 2012. Much better!

  2. With the MonthName Year table selected, in the Field List > On_Time_Performance table, check the DepDelayMinutes field.

    The field is added as a sum.

  3. In the Fields box, click DepDelayMinutes > Average.

  4. On the Design tab > Other Chart > Line.

    Aha! As expected, December was the worst month, closely followed by January.

  5. Now add a second line: With the MonthName Year table selected, in the Field List > On_Time_Performance table, drag the ArrDelayMinutes field to the Values box.

    That field is added as a sum, too.

  6. In the Fields box, click ArrDelayMinutes > Average.

Notice that DepDelayMinutes – departures – were worst in December, but ArrDelayMinutes – arrivals – were worst in January.

Let's see which days of the month were worst.

Add drill-down to a line chart

  1. With the line chart selected, from the Field List > On_Time_Performance table, drag Day of Month to the Axis box, below MonthName Year.

The line chart looks unchanged.

  1. Double-click the point for December 2011.

Now we're seeing the day-to-day delays for the month of December 2011. Notice the two spikes for DepDelayMinutes?

  1. Hover over the spikes for DepDelayMinutes: December 22nd and 27th – before and after Christmas.

What about November?

  1. Click the Drill Up arrow in the upper-right corner to go back to the chart by months.

Drill up arrow in Power View chart visualizations

  1. Double-click November 2011.

Ah, no surprise there – November 29th was the worst day for arrivals and departures.

Read more about adding drill-down to a Power View chart or matrix.

More fun with airline delay data

In this tutorial, you've:

  • Gotten data from sources on the web.

  • Created relationships.

  • Created a variety of different charts, plus a map.

  • Filtered and cross-filtered data.

  • Created calculated fields in Power Pivot.

  • Worked with time data in Power Pivot and Power View.

  • Added drill-down to a chart.

You can continue playing with this data. For example, here's something else to try: Copy the MonthName Year line chart from this Power View sheet and paste it on the sheet with the map. Then, as you click airports on the map, you can see how they compare month to month.

Themes and formatting

Try playing with the themes on the Power View tab, too. The Power View sheet in the image at the beginning uses the Composite theme with a Light2 Solid background. See what you come up with! Read more about formatting Power View reports.

Related information

Charts and other visualizations in Power View

Filtering, highlighting, and slicers in Power View

Maps in Power View

What's new in Power View in Excel 2013 and in SharePoint Server

Power View: Explore, visualize, and present your data

2 comments: