Saturday, May 8, 2021

Tutorial: Import Data into Excel, and Create a Data Model

Abstract: This is the first tutorial in a series designed to get you acquainted and comfortable using Excel and its built-in data mash-up and analysis features. These tutorials build and refine an Excel workbook from scratch, build a data model, then create amazing interactive reports using Power View. The tutorials are designed to demonstrate Microsoft Business Intelligence features and capabilities in Excel, PivotTables, Power Pivot, and Power View.

Note: This article describes data models in Excel 2013. However, the same data modeling and Power Pivot features introduced in Excel 2013 also apply to Excel 2016.

In these tutorials you learn how to import and explore data in Excel, build and refine a data model using Power Pivot, and create interactive reports with Power View that you can publish, protect, and share.

The tutorials in this series are the following:

  1. Import Data into Excel 2013, and Create a Data Model

  2. Extend Data Model relationships using Excel, Power Pivot, and DAX

  3. Create Map-based Power View Reports

  4. Incorporate Internet Data, and Set Power View Report Defaults

  5. Power Pivot Help

  6. Create Amazing Power View Reports - Part 2

In this tutorial, you start with a blank Excel workbook.

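The sections in this tutorial are the following:

  • Import data from a database

  • Explore data using a PivotTable

  • Import data from a spreadsheet

  • Import data using copy and paste

  • Create a relationship between imported data

  • Checkpoint and Quiz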

At the end of this tutorial is a quiz you can take to test your learning.

This tutorial series uses data describing Olympic Medals, hosting countries, and various Olympic sporting events. We suggest you go through each tutorial in order. Also, tutorials use Excel 2013 with Power Pivot enabled. For more information on Excel 2013, click here. For guidance on enabling Power Pivot, click here.

Import data from a database

We start this tutorial with a blank workbook. The goal in this section is to connect to an external data source, and import that data into Excel for further analysis.

Let's start by downloading some data from the Internet. The data describes Olympic Medals, and is a Microsoft Access database.

  1. Click the following links to download files we use during this tutorial series. Download each of the four files to a location that's easily accessible, such as Downloads or My Documents, or to a new folder you create:
    > OlympicMedals.accdb Access database
    > OlympicSports.xlsx Excel workbook
    > Population.xlsx Excel workbook
    > DiscImage_table.xlsx Excel workbook

  2. In Excel 2013, open a blank workbook.

  3. Click DATA > Get External Data > From Access. The ribbon adjusts dynamically based on the width of your workbook, so the commands on your ribbon may look slightly different from the following screens. The first screen shows the ribbon when a workbook is wide; the second shows a workbook that has been resized to take up only a portion of the screen.

    Import data from Access

    Import data from Access with small ribbon

     

  4. Select the OlympicMedals.accdb file you downloaded and click Open. The following Select Table window appears, displaying the tables found in the database. Tables in a database are similar to worksheets or tables in Excel. Check the Enable selection of multiple tables box, and select all the tables. Then click OK.

    Select table window

  5. The Import Data window appears.

    Note: Notice the checkbox at the bottom of the window that allows you to Add this data to the Data Model, shown in the following screen. A Data Model is created automatically when you import or work with two or more tables simultaneously. A Data Model integrates the tables, enabling extensive analysis using PivotTables, Power Pivot, and Power View. When you import tables from a database, the existing database relationships between those tables are used to create the Data Model in Excel. The Data Model is transparent in Excel, but you can view and modify it directly using the Power Pivot add-in. The Data Model is discussed in more detail later in this tutorial.


    Select the PivotTable Report option, which imports the tables into Excel and prepares a PivotTable for analyzing the imported tables, and click OK.

    Import Data window

  6. Once the data is imported, a PivotTable is created using the imported tables.

    Blank Pivot Table

With the data imported into Excel, and the Data Model automatically created, you're ready to explore the data.
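If it helps to see the idea in code, here is a minimal Python sketch of the same import: list the tables in a database, load all of them, and read the relationships the database already declares. It is an illustration only, not what Excel does internally. An in-memory SQLite database stands in for OlympicMedals.accdb (the Python standard library cannot open Access files), and the Athlete and DisciplineID columns and all sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for OlympicMedals.accdb (assumption:
# the real file is an Access database with more tables, columns, and rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Disciplines (DisciplineID TEXT PRIMARY KEY, Discipline TEXT);
    CREATE TABLE Medals (
        Athlete TEXT,
        Medal TEXT,
        DisciplineID TEXT REFERENCES Disciplines(DisciplineID)
    );
    INSERT INTO Disciplines VALUES ('D1', 'Archery'), ('D2', 'Diving');
    INSERT INTO Medals VALUES
        ('Athlete A', 'Gold', 'D1'),
        ('Athlete B', 'Silver', 'D1'),
        ('Athlete C', 'Gold', 'D2');
""")

def list_tables(connection):
    """Return the table names in the database, like the Select Table window."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]

def import_all(connection):
    """Load every table as a list of dicts keyed by column name."""
    model = {}
    for table in list_tables(connection):
        cursor = connection.execute(f'SELECT * FROM "{table}"')
        columns = [col[0] for col in cursor.description]
        model[table] = [dict(zip(columns, row)) for row in cursor]
    return model

def list_relationships(connection):
    """Read declared foreign keys, which play the role of database relationships."""
    found = []
    for table in list_tables(connection):
        for fk in connection.execute(f'PRAGMA foreign_key_list("{table}")'):
            # fk layout: id, seq, referenced table, from column, to column, ...
            found.append((table, fk[3], fk[2], fk[4]))
    return found

model = import_all(conn)
relationships = list_relationships(conn)
print(sorted(model))
print(relationships)
```

Because all the tables are read in a single pass, the declared relationships can be carried along with them, which mirrors why importing all the tables in one operation matters in the steps above.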

Explore data using a PivotTable

Exploring imported data is easy using a PivotTable. In a PivotTable, you drag fields (similar to columns in Excel) from tables (like the tables you just imported from the Access database) into different areas of the PivotTable to adjust how it presents your data. A PivotTable has four areas: FILTERS, COLUMNS, ROWS, and VALUES.

The four PivotTable Fields areas

It might take some experimenting to determine which area a field should be dragged to. You can drag as many or few fields from your tables as you like, until the PivotTable presents your data how you want to see it. Feel free to explore by dragging fields into different areas of the PivotTable; the underlying data is not affected when you arrange fields in a PivotTable.
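To make the four areas concrete, here is a small Python sketch of what a PivotTable computes. It is a simplified model under stated assumptions, not Excel's implementation: ROWS and COLUMNS pick a cell, VALUES is a plain count, and FILTERS restricts which records are counted. The sample rows are hypothetical; the field names are the ones used in the steps of this section.

```python
from collections import Counter

# Hypothetical sample rows; the real Medals data has many more rows and columns.
rows = [
    {"Discipline": "Archery", "NOC_CountryRegion": "KOR", "Medal": "Gold"},
    {"Discipline": "Archery", "NOC_CountryRegion": "USA", "Medal": "Silver"},
    {"Discipline": "Diving", "NOC_CountryRegion": "USA", "Medal": "Gold"},
    {"Discipline": "Diving", "NOC_CountryRegion": "USA", "Medal": "Bronze"},
    {"Discipline": "Fencing", "NOC_CountryRegion": "FRA", "Medal": "Gold"},
]

def pivot_count(records, row_field, column_field, filters=None):
    """Count records per (row, column) cell after applying FILTERS-style criteria."""
    filters = filters or {}
    counts = Counter()
    for record in records:
        # FILTERS: skip records whose value is not among the allowed values.
        if any(record[field] not in allowed for field, allowed in filters.items()):
            continue
        # ROWS and COLUMNS choose the cell; VALUES is a simple count here.
        counts[(record[row_field], record[column_field])] += 1
    return counts

by_discipline = pivot_count(rows, "Discipline", "NOC_CountryRegion")
gold_only = pivot_count(rows, "Discipline", "NOC_CountryRegion", {"Medal": {"Gold"}})
print(by_discipline[("Diving", "USA")], gold_only[("Diving", "USA")])
```

The function only reads the records and builds a new summary, which is the same reason rearranging fields in a PivotTable leaves the underlying data untouched.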

Let's explore the Olympic Medals data in the PivotTable, starting with Olympic medalists organized by discipline, medal type, and the athlete's country or region.

  1. In PivotTable Fields, expand the Medals table by clicking the arrow beside it. Find the NOC_CountryRegion field in the expanded Medals table, and drag it to the COLUMNS area. NOC stands for National Olympic Committees, which is the organizational unit for a country or region.

  2. Next, from the Disciplines table, drag Discipline to the ROWS area.

  3. Let's filter Disciplines to display only five sports: Archery, Diving, Fencing, Figure Skating, and Speed Skating. You can do this from within the PivotTable Fields area, or from the Row Labels filter in the PivotTable itself.

    1. Click anywhere in the PivotTable to ensure the Excel PivotTable is selected. In the PivotTable Fields list, where the Disciplines table is expanded, hover over its Discipline field and a dropdown arrow appears to the right of the field. Click the dropdown, click (Select All) to remove all selections, then scroll down and select Archery, Diving, Fencing, Figure Skating, and Speed Skating. Click OK.

    2. Or, in the PivotTable itself, click the dropdown next to Row Labels, click (Select All) to remove all selections, then scroll down and select Archery, Diving, Fencing, Figure Skating, and Speed Skating. Click OK.

  4. In PivotTable Fields, from the Medals table, drag Medal to the VALUES area. Since Values must be numeric, Excel automatically changes Medal to Count of Medal.

  5. From the Medals table, select Medal again and drag it into the FILTERS area.

  6. Let's filter the PivotTable to display only those countries or regions with more than 90 total medals. Here's how.

    1. In the PivotTable, click the dropdown to the right of Column Labels.

    2. Select Value Filters and select Greater Than….

    3. Type 90 in the last field (on the right). Click OK.
      Value Filter window

Your PivotTable looks like the following screen.

Updated PivotTable
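The Greater Than value filter from the last step can be sketched in Python as follows. This assumes the filter compares each column's total (the sum of its cells) against the threshold and hides every column that does not exceed it. The cell counts and the function name are hypothetical.

```python
from collections import Counter

# Hypothetical (discipline, country or region) -> medal count cells.
cells = {
    ("Archery", "KOR"): 60, ("Diving", "KOR"): 5,
    ("Archery", "USA"): 40, ("Diving", "USA"): 120,
    ("Fencing", "FRA"): 110, ("Archery", "FRA"): 20,
    ("Fencing", "ITA"): 85,
}

def filter_columns_by_total(cells, threshold):
    """Keep only the cells whose column total is strictly greater than threshold."""
    totals = Counter()
    for (_, column), count in cells.items():
        totals[column] += count
    keep = {column for column, total in totals.items() if total > threshold}
    return {key: count for key, count in cells.items() if key[1] in keep}

filtered = filter_columns_by_total(cells, 90)
print(sorted({column for _, column in filtered}))
```

In this sample, KOR (65 in total) and ITA (85) drop out, while USA and FRA stay because their totals exceed 90.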

With little effort, you now have a basic PivotTable that includes fields from three different tables. What made this task so simple were the pre-existing relationships among the tables. Because table relationships existed in the source database, and because you imported all the tables in a single operation, Excel could recreate those table relationships in its Data Model.

But what if your data originates from different sources, or is imported at a later time? Typically, you can create relationships with new data based on matching columns. In the next step, you import additional tables, and learn how to create new relationships.

Import data from a spreadsheet

Now let's import data from another source, this time from an existing workbook, then specify the relationships between our existing data and the new data. Relationships let you analyze collections of data in Excel, and create interesting and immersive visualizations from the data you import.

Let's start by creating a blank worksheet, then import data from an Excel workbook.

  1. Insert a new Excel worksheet, and name it Sports.

  2. Browse to the folder that contains the downloaded sample data files, and open OlympicSports.xlsx.

  3. Select and copy the data in Sheet1. If you select a cell with data, such as cell A1, you can press Ctrl + A to select all adjacent data. Close the OlympicSports.xlsx workbook.

  4. On the Sports worksheet, place your cursor in cell A1 and paste the data.

  5. With the data still highlighted, press Ctrl + T to format the data as a table. You can also format the data as a table from the ribbon by selecting HOME > Format as Table. Since the data has headers, select My table has headers in the Create Table window that appears, as shown here.

    Create Table window

    Formatting the data as a table has many advantages. You can assign a name to a table, which makes it easy to identify. You can also establish relationships between tables, enabling exploration and analysis in PivotTables, Power Pivot, and Power View.

  6. Name the table. In TABLE TOOLS > DESIGN > Properties, locate the Table Name field and type Sports. The workbook looks like the following screen.
    Name a table in Excel

  7. Save the workbook.

Import data using copy and paste

Now that we've imported data from an Excel workbook, let's import data from a table we find on a web page, or any other source from which we can copy and paste into Excel. In the following steps, you add the Olympic host cities from a table.

  1. Insert a new Excel worksheet, and name it Hosts.

  2. Select and copy the following table, including the table headers.

City                    NOC_CountryRegion  Alpha-2 Code  Edition  Season
Melbourne / Stockholm   AUS                AS            1956     Summer
Sydney                  AUS                AS            2000     Summer
Innsbruck               AUT                AT            1964     Winter
Innsbruck               AUT                AT            1976     Winter
Antwerp                 BEL                BE            1920     Summer
Antwerp                 BEL                BE            1920     Winter
Montreal                CAN                CA            1976     Summer
Lake Placid             CAN                CA            1980     Winter
Calgary                 CAN                CA            1988     Winter
St. Moritz              SUI                SZ            1928     Winter
St. Moritz              SUI                SZ            1948     Winter
Beijing                 CHN                CH            2008     Summer
Berlin                  GER                GM            1936     Summer
Garmisch-Partenkirchen  GER                GM            1936     Winter
Barcelona               ESP                SP            1992     Summer
Helsinki                FIN                FI            1952     Summer
Paris                   FRA                FR            1900     Summer
Paris                   FRA                FR            1924     Summer
Chamonix                FRA                FR            1924     Winter
Grenoble                FRA                FR            1968     Winter
Albertville             FRA                FR            1992     Winter
London                  GBR                UK            1908     Summer
London                  GBR                UK            1908     Winter
London                  GBR                UK            1948     Summer
Munich                  GER                DE            1972     Summer
Athens                  GRC                GR            2004     Summer
Cortina d'Ampezzo       ITA                IT            1956     Winter
Rome                    ITA                IT            1960     Summer
Turin                   ITA                IT            2006     Winter
Tokyo                   JPN                JA            1964     Summer
Sapporo                 JPN                JA            1972     Winter
Nagano                  JPN                JA            1998     Winter
Seoul                   KOR                KS            1988     Summer
Mexico                  MEX                MX            1968     Summer
Amsterdam               NED                NL            1928     Summer
Oslo                    NOR                NO            1952     Winter
Lillehammer             NOR                NO            1994     Winter
Stockholm               SWE                SW            1912     Summer
St Louis                USA                US            1904     Summer
Los Angeles             USA                US            1932     Summer
Lake Placid             USA                US            1932     Winter
Squaw Valley            USA                US            1960     Winter
Moscow                  URS                RU            1980     Summer
Los Angeles             USA                US            1984     Summer
Atlanta                 USA                US            1996     Summer
Salt Lake City          USA                US            2002     Winter
Sarajevo                YUG                YU            1984     Winter

  3. In Excel, place your cursor in cell A1 of the Hosts worksheet and paste the data.

  4. Format the data as a table. As described earlier in this tutorial, press Ctrl + T to format the data as a table, or select HOME > Format as Table from the ribbon. Since the data has headers, select My table has headers in the Create Table window that appears.

  5. Name the table. In TABLE TOOLS > DESIGN > Properties, locate the Table Name field and type Hosts.

  6. Select the Edition column, and from the HOME tab, format it as Number with 0 decimal places.

  7. Save the workbook. Your workbook looks like the following screen.

Host Table
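As a rough analogy for the paste-and-format steps, the following Python sketch turns pasted text into named records and converts Edition to a whole number. It assumes the copied cells arrive separated by tabs, which is typical for tables copied from a web page but is not guaranteed. The three rows are taken from the host-city table above; the function name is hypothetical.

```python
import csv
import io

# A few rows from the host-city table, tab-separated as they might arrive
# from the clipboard (assumption: cells are separated by tab characters).
pasted = (
    "City\tNOC_CountryRegion\tAlpha-2 Code\tEdition\tSeason\n"
    "Sydney\tAUS\tAS\t2000\tSummer\n"
    "Innsbruck\tAUT\tAT\t1964\tWinter\n"
    "Turin\tITA\tIT\t2006\tWinter\n"
)

def parse_hosts(text):
    """Use the first line as headers and convert Edition to an integer."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    hosts = []
    for row in reader:
        # Comparable to formatting the Edition column as Number, 0 decimals.
        row["Edition"] = int(row["Edition"])
        hosts.append(row)
    return hosts

hosts = parse_hosts(pasted)
winter_years = [host["Edition"] for host in hosts if host["Season"] == "Winter"]
print(winter_years)
```

Treating the first line as headers corresponds to selecting My table has headers in the Create Table window.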

Now that you have an Excel workbook with tables, you can create relationships between them. Creating relationships between tables lets you mash up the data from the two tables.

Create a relationship between imported data

You can immediately begin using fields in your PivotTable from the imported tables. If Excel can't determine how to incorporate a field into the PivotTable, a relationship must be established with the existing Data Model. In the following steps, you learn how to create a relationship between data you imported from different sources.

  1. On Sheet1, at the top of PivotTable Fields, click All to view the complete list of available tables, as shown in the following screen.
    Click All in PivotTable Fields to show all available tables

  2. Scroll through the list to see the new tables you just added.

  3. Expand Sports and select Sport to add it to the PivotTable. Notice that Excel prompts you to create a relationship, as seen in the following screen.
    The CREATE... relationship prompt in PivotTable Fields
     

    This notification occurs because you used fields from a table that's not part of the underlying Data Model. One way to add a table to the Data Model is to create a relationship to a table that's already in the Data Model. To create the relationship, one of the tables must have a column of unique, non-repeated values. In the sample data, the Disciplines table imported from the database contains a field with sports codes, called SportID. Those same sports codes are present as a field in the Excel data we imported. Let's create the relationship.

  4. Click CREATE... in the highlighted PivotTable Fields area to open the Create Relationship dialog, as shown in the following screen.

    Create Relationship window

  5. In Table, choose Disciplines from the drop down list.

  6. In Column (Foreign), choose SportID.

  7. In Related Table, choose Sports.

  8. In Related Column (Primary), choose SportID.

  9. Click OK.

The PivotTable changes to reflect the new relationship. But the PivotTable doesn't look right quite yet, because of the ordering of fields in the ROWS area. Discipline is a subcategory of a given sport, but since we arranged Discipline above Sport in the ROWS area, it's not organized properly. The following screen shows this unwanted ordering.
PivotTable with unwanted ordering

  10. In the ROWS area, move Sport above Discipline. That's much better, and the PivotTable displays the data how you want to see it, as shown in the following screen.

    PivotTable with corrected ordering

Behind the scenes, Excel is building a Data Model that can be used throughout the workbook, in any PivotTable, PivotChart, in Power Pivot, or any Power View report. Table relationships are the basis of a Data Model, and they determine navigation and calculation paths.
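The relationship you just created can be pictured with a short Python sketch. It is a simplified model, not how Excel stores the Data Model: the Sports table is the primary side, so its SportID values must be unique, and each Disciplines row looks up its sport through the matching SportID. The SportID values and sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows; the SportID values are invented for illustration.
sports = [
    {"SportID": "S1", "Sport": "Aquatics"},
    {"SportID": "S2", "Sport": "Skating"},
]
disciplines = [
    {"Discipline": "Diving", "SportID": "S1"},
    {"Discipline": "Swimming", "SportID": "S1"},
    {"Discipline": "Figure Skating", "SportID": "S2"},
    {"Discipline": "Speed Skating", "SportID": "S2"},
]

def build_lookup(table, key):
    """Index the primary side; fail if the key column has repeated values."""
    lookup = {}
    for row in table:
        if row[key] in lookup:
            raise ValueError(f"{key} value {row[key]!r} is repeated")
        lookup[row[key]] = row
    return lookup

def group_by_related(foreign_rows, primary_rows, key, parent_field, child_field):
    """Follow the relationship and nest child values under their parent."""
    lookup = build_lookup(primary_rows, key)
    grouped = defaultdict(list)
    for row in foreign_rows:
        parent = lookup[row[key]][parent_field]
        grouped[parent].append(row[child_field])
    return dict(grouped)

hierarchy = group_by_related(disciplines, sports, "SportID", "Sport", "Discipline")
print(hierarchy)
```

Grouping by Sport first and Discipline second corresponds to placing Sport above Discipline in the ROWS area.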

In the next tutorial, Extend Data Model relationships using Excel 2013, Power Pivot, and DAX, you build on what you learned here, and step through extending the Data Model using a powerful and visual Excel add-in called Power Pivot. You also learn how to calculate columns in a table, and use that calculated column so that an otherwise unrelated table can be added to your Data Model.

Checkpoint and Quiz

Review What You've Learned

You now have an Excel workbook that includes a PivotTable accessing data in multiple tables, several of which you imported separately. You learned to import from a database, from another Excel workbook, and from copying data and pasting it into Excel.

To make the data work together, you had to create a table relationship that Excel used to correlate the rows. You also learned that having columns in one table that correlate to data in another table is essential for creating relationships, and for looking up related rows.

You're ready for the next tutorial in this series. Here's a link:

Extend Data Model relationships using Excel 2013, Power Pivot, and DAX

QUIZ

Want to see how well you remember what you learned? Here's your chance. The following quiz highlights features, capabilities, or requirements you learned about in this tutorial. At the bottom of the page, you'll find the answers. Good luck!

Question 1: Why is it important to convert imported data into tables?

A: You don't have to convert them into tables, because all imported data is automatically turned into tables.

B: If you convert imported data into tables, they will be excluded from the Data Model. Only when they're excluded from the Data Model are they available in PivotTables, Power Pivot, and Power View.

C: If you convert imported data into tables, they can be included in the Data Model, and be made available to PivotTables, Power Pivot, and Power View.

D: You cannot convert imported data into tables.

Question 2: Which of the following data sources can you import into Excel, and include in the Data Model?

A: Access Databases, and many other databases as well.

B: Existing Excel files.

C: Anything you can copy and paste into Excel and format as a table, including data tables in websites, documents, or anything else that can be pasted into Excel.

D: All of the above

Question 3: In a PivotTable, what happens when you reorder fields in the four PivotTable Fields areas?

A: Nothing – you cannot reorder fields once you place them in the PivotTable Fields areas.

B: The PivotTable format is changed to reflect the layout, but underlying data is unaffected.

C: The PivotTable format is changed to reflect the layout, and all underlying data is permanently changed.

D: The underlying data is changed, resulting in new data sets.

Question 4: When creating a relationship between tables, what is required?

A: Neither table can have any column that contains unique, non-repeated values.

B: One table must not be part of the Excel workbook.

C: The columns must not be converted to tables.

D: None of the above is correct.

Quiz Answers

  1. Correct answer: C

  2. Correct answer: D

  3. Correct answer: B

  4. Correct answer: D

Notes: Data and images in this tutorial series are based on the following:

  • Olympics Dataset from Guardian News & Media Ltd.

  • Flag images from CIA Factbook (cia.gov)

  • Population data from The World Bank (worldbank.org)

  • Olympic Sport Pictograms by Thadius856 and Parutakupiu
