Wednesday, April 25, 2018

Move data from Excel to Access

This article shows you how to move your data from Excel to Access and convert your data to relational tables so that you can use Microsoft Office Excel and Access together. To summarize, Access is best for capturing, storing, querying, and sharing data, and Excel is best for calculating, analyzing, and visualizing data.

Two articles, Using Access or Excel to manage your data and Top 10 reasons to use Access with Excel, discuss which program is best suited for a particular task and how to use Excel and Access together to create a practical solution.

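When you move data from Excel to Access, the process has three basic steps: import the data from Excel into Access, normalize the data by using the Table Analyzer Wizard, and connect to the Access data from Excel.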

Note: For information on data modeling and relationships in Access, see Database design basics.

Step 1: Import data from Excel to Access

Importing data goes much more smoothly if you take some time to prepare and clean your data first. Importing data is like moving to a new home. If you clean out and organize your possessions before you move, settling into your new home is much easier.

Clean your data before you import

Before you import data into Access, it's a good idea to do the following in Excel:

  • Convert cells that contain non-atomic data (that is, multiple values in one cell) to multiple columns. For example, a cell in a "Skills" column that contains multiple skill values, such as "C# programming," "VBA programming," and "Web design" should be broken out to separate columns that each contain only one skill value.

  • Use the TRIM function to remove leading and trailing spaces and to reduce multiple embedded spaces to single spaces.

  • Remove non-printing characters.

  • Find and fix spelling and punctuation errors.

  • Remove duplicate rows or duplicate fields.

  • Ensure that columns of data do not contain mixed formats, especially numbers formatted as text or dates formatted as numbers.
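As a rough illustration of the cleanup steps above, the following Python sketch trims and collapses spaces, strips non-printing characters, drops duplicate rows, and splits a multi-value cell. It is not part of Excel or Access; the function names (clean_cell, clean_rows, split_multi_value) are hypothetical, and the sketch assumes the worksheet has already been read into lists of strings.

```python
import re

# ASCII control characters (codes 0-31 and 127) are treated as non-printing.
_NON_PRINTING = re.compile(r"[\x00-\x1f\x7f]")
_MULTI_SPACE = re.compile(r" {2,}")


def clean_cell(value: str) -> str:
    """Remove non-printing characters, then trim and collapse spaces."""
    value = _NON_PRINTING.sub("", value)
    value = _MULTI_SPACE.sub(" ", value)
    return value.strip(" ")


def clean_rows(rows):
    """Clean every cell and drop rows that duplicate an earlier cleaned row."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(clean_cell(cell) for cell in row)
        if key not in seen:
            seen.add(key)
            cleaned.append(list(key))
    return cleaned


def split_multi_value(cell: str, separator: str = ","):
    """Split a non-atomic cell into one value per future column."""
    return [clean_cell(part) for part in cell.split(separator) if part.strip()]


# Hypothetical sample: the second row duplicates the first once it is cleaned.
sample = [["  Li,  Yale ", "425-555-0222\x07"], ["Li, Yale", "425-555-0222"]]
cleaned_sample = clean_rows(sample)
```

Checking for mixed formats and spelling errors still needs a human eye; the sketch only covers the mechanical steps.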

For more information, see the following Excel help topics:

Note: If your data cleaning needs are complex, or you don't have the time or resources to automate the process on your own, you might consider using a third-party vendor. For more information, search for "data cleansing software" or "data quality" using your favorite search engine.

Choose the best data type when you import

During the import operation in Access, you want to make good choices so that you receive few (if any) conversion errors that will require manual intervention. The following table summarizes how Excel number formats and Access data types are converted when you import data from Excel to Access, and offers some tips on the best data types to choose in the Import Spreadsheet Wizard.

| Excel number format | Access data type | Comments | Best practice |
| --- | --- | --- | --- |
| Text | Text, Memo | The Access Text data type stores alphanumeric data up to 255 characters. The Access Memo data type stores alphanumeric data up to 65,535 characters. | Choose Memo to avoid truncating any data. |
| Number, Percentage, Fraction, Scientific | Number | Access has one Number data type that varies based on a Field Size property (Byte, Integer, Long Integer, Single, Double, Decimal). | Choose Double to avoid any data conversion errors. |
| Date | Date | Access and Excel both use the same serial date number to store dates. In Access, the date range is larger: from -657,434 (January 1, 100 A.D.) to 2,958,465 (December 31, 9999 A.D.). Because Access does not recognize the 1904 date system (used in Excel for the Macintosh), you need to convert the dates either in Excel or Access to avoid confusion. For more information, see Change the date system, format, or two-digit year interpretation and Import or link to data in an Excel workbook. | Choose Date. |
| Time | Time | Access and Excel both store time values by using the same data type. | Choose Time, which is usually the default. |
| Currency, Accounting | Currency | In Access, the Currency data type stores data as 8-byte numbers with precision to four decimal places, and is used to store financial data and prevent rounding of values. | Choose Currency, which is usually the default. |
| Boolean | Yes/No | Access uses -1 for all Yes values and 0 for all No values, whereas Excel uses 1 for all TRUE values and 0 for all FALSE values. | Choose Yes/No, which automatically converts underlying values. |
| Hyperlink | Hyperlink | A hyperlink in Excel and Access contains a URL or Web address that you can click and follow. | Choose Hyperlink, otherwise Access may use the Text data type by default. |
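To make the Date and Boolean rows more concrete, here is a small Python sketch of the two conversions. It assumes that serial day 0 falls on December 30, 1899, which is consistent with the range limits quoted in the table, and that the 1904 date system is shifted by 1,462 days. The function names are hypothetical and are not Access or Excel APIs.

```python
from datetime import date, timedelta

# Assumption: serial day 0 is December 30, 1899.
SERIAL_EPOCH = date(1899, 12, 30)
# Assumption: serials in the 1904 date system are 1,462 days behind.
DATE_1904_OFFSET = 1462


def serial_to_date(serial: int, uses_1904_system: bool = False) -> date:
    """Convert a serial date number to a calendar date.

    Serials below 61 from Excel's 1900 date system are off by one day here,
    because Excel treats 1900 as a leap year.
    """
    if uses_1904_system:
        serial += DATE_1904_OFFSET
    return SERIAL_EPOCH + timedelta(days=serial)


def excel_bool_to_access(value: bool) -> int:
    """Map Excel TRUE/FALSE (1/0) to the Access Yes/No values (-1/0)."""
    return -1 if value else 0


upper_limit = serial_to_date(2958465)   # upper end of the Access date range
lower_limit = serial_to_date(-657434)   # lower end of the Access date range
```

A workbook saved with the 1904 date system would pass uses_1904_system=True so that its dates line up with the values Access expects.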

Once the data is in Access, you can delete the Excel data. Be sure to back up the original Excel workbook before you delete it.

For more information, see the Access help topic, Import or link to data in an Excel workbook.

Automatically append data the easy way

A common problem Excel users have is appending data with the same columns into one large worksheet. For example, you may have an asset tracking solution that started out in Excel but now has grown to include files from many workgroups and departments. This data may be in different worksheets and workbooks, or in text files that are data feeds from other systems. There is no user interface command or easy way to append similar data in Excel.

The best solution is to use Access, where you can easily import and append data into one table by using the Import Spreadsheet Wizard. Furthermore, you can append a lot of data into one table. You can save the import operations, add them as scheduled Microsoft Office Outlook tasks, and even use macros to automate the process.
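The idea of appending similar data into one table can be sketched outside Access as well. The following Python example is only an illustration of the concept, not of the Import Spreadsheet Wizard: it assumes each source is CSV text with an identical header row, and the function name and the asset-tracking sample data are hypothetical.

```python
import csv
import io


def append_sources(sources):
    """Append rows from several CSV sources that share the same header.

    `sources` maps a source name to its CSV text. A source whose header
    differs from the first one is rejected instead of being appended.
    """
    header = None
    combined = []
    for name, text in sources.items():
        reader = csv.reader(io.StringIO(text))
        source_header = next(reader)
        if header is None:
            header = source_header
        elif source_header != header:
            raise ValueError(f"{name}: columns do not match {header}")
        combined.extend(reader)
    return header, combined


# Hypothetical feeds from two workgroups.
feeds = {
    "workgroup_a.csv": "Asset ID,Location\nA-1,Bellevue\n",
    "workgroup_b.csv": "Asset ID,Location\nA-2,Redmond\nA-3,Kirkland\n",
}
header, all_rows = append_sources(feeds)
```

Rejecting a mismatched header up front is a deliberate choice here: it surfaces the kind of column drift that otherwise shows up later as conversion errors.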

Step 2: Normalize data by using the Table Analyzer Wizard

At first glance, stepping through the process of normalizing your data may seem a daunting task. Fortunately, the Table Analyzer Wizard makes normalizing tables in Access much easier.

Figure: The Table Analyzer Wizard

1. Drag selected columns to a new table and automatically create relationships.

2. Use button commands to rename a table, add a primary key, make an existing column a primary key, and undo the last action.

You can use this wizard to do the following:

  • Convert a table into a set of smaller tables and automatically create a primary and foreign key relationship between the tables.

  • Designate an existing field that contains unique values as the primary key, or create a new ID field that uses the AutoNumber data type.

  • Automatically create relationships to enforce referential integrity with cascading updates. Cascading deletes are not automatically added to prevent accidentally deleting data, but you can easily add cascading deletes later.

  • Search new tables for redundant or duplicate data (such as the same customer with two different phone numbers) and update this as desired.

  • Back up the original table and rename it by appending "_OLD" to its name. Then, create a query that reconstructs the original table under the original table name, so that any existing forms or reports based on the original table work with the new table structure.
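The relationship behavior described in the list above (referential integrity with cascading updates but without cascading deletes) can be tried out in a small sketch. The example uses SQLite through Python's sqlite3 module as a stand-in for Access, so the SQL dialect and the table and column names are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Customers (
        CustomerID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL
    );
    CREATE TABLE Orders (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL
            REFERENCES Customers (CustomerID) ON UPDATE CASCADE
    );
    INSERT INTO Customers VALUES (1001, 'Contoso, Ltd.');
    INSERT INTO Orders VALUES (2348, 1001);
""")

# Cascading update: changing the key in Customers also changes it in Orders.
conn.execute("UPDATE Customers SET CustomerID = 2001 WHERE CustomerID = 1001")
updated = conn.execute(
    "SELECT CustomerID FROM Orders WHERE OrderID = 2348"
).fetchone()[0]

# No cascading delete: removing a customer that still has orders is refused.
try:
    conn.execute("DELETE FROM Customers WHERE CustomerID = 2001")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True
```

Leaving out the cascading delete mirrors the cautious default described above: the related order keeps its customer until someone removes the order deliberately.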

Step 3: Connect to Access data from Excel

After the data has been normalized in Access and a query or table has been created that reconstructs the original data, it's a simple matter of connecting to the Access data from Excel. Your data is now in Access as an external data source, and so can be connected to the workbook through a data connection, which is a container of information that is used to locate, log on to, and access the external data source. Connection information is stored in the workbook and can also be stored in a connection file, such as an Office Data Connection (ODC) file (.odc file name extension) or a Data Source Name file (.dsn extension). After you connect to external data, you can also automatically refresh (or update) your Excel workbook from Access whenever the data is updated in Access.

For more information, see Overview of connecting (importing) data and Exchange (copy, import, export) data between Excel and Access.

Get your data into Access

This section walks you through the following phases of normalizing your data: breaking values in the Salesperson and Address columns into their most atomic pieces, separating related subjects into their own tables, copying and pasting those tables from Excel into Access, creating key relationships between the newly created Access tables, and creating and running a simple query in Access to return information.

Example data in non-normalized form

The following worksheet contains non-atomic values in the Salesperson column and the Address column. Both columns should be split into two or more separate columns. This worksheet also contains information about salespersons, products, customers, and orders. This information should also be split further, by subject, into separate tables.

| Salesperson | Order ID | Order Date | Product ID | Qty | Price | Customer Name | Address | Phone |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Li, Yale | 2348 | 3/2/09 | J-558 | 4 | $8.50 | Contoso, Ltd. | 2302 Harvard Ave Bellevue, WA 98227 | 425-555-0222 |
| Li, Yale | 2348 | 3/2/09 | B-205 | 2 | $4.50 | Contoso, Ltd. | 2302 Harvard Ave Bellevue, WA 98227 | 425-555-0222 |
| Li, Yale | 2348 | 3/2/09 | D-4420 | 5 | $7.25 | Contoso, Ltd. | 2302 Harvard Ave Bellevue, WA 98227 | 425-555-0222 |
| Li, Yale | 2349 | 3/4/09 | C-789 | 3 | $7.00 | Fourth Coffee | 7007 Cornell St Redmond, WA 98199 | 425-555-0201 |
| Li, Yale | 2349 | 3/4/09 | C-795 | 6 | $9.75 | Fourth Coffee | 7007 Cornell St Redmond, WA 98199 | 425-555-0201 |
| Adams, Ellen | 2350 | 3/4/09 | A-2275 | 2 | $16.75 | Adventure Works | 1025 Columbia Circle Kirkland, WA 98234 | 425-555-0185 |
| Adams, Ellen | 2350 | 3/4/09 | F-198 | 6 | $5.25 | Adventure Works | 1025 Columbia Circle Kirkland, WA 98234 | 425-555-0185 |
| Adams, Ellen | 2350 | 3/4/09 | B-205 | 1 | $4.50 | Adventure Works | 1025 Columbia Circle Kirkland, WA 98234 | 425-555-0185 |
| Hance, Jim | 2351 | 3/4/09 | C-795 | 6 | $9.75 | Contoso, Ltd. | 2302 Harvard Ave Bellevue, WA 98227 | 425-555-0222 |
| Hance, Jim | 2352 | 3/5/09 | A-2275 | 2 | $16.75 | Adventure Works | 1025 Columbia Circle Kirkland, WA 98234 | 425-555-0185 |
| Hance, Jim | 2352 | 3/5/09 | D-4420 | 3 | $7.25 | Adventure Works | 1025 Columbia Circle Kirkland, WA 98234 | 425-555-0185 |
| Koch, Reed | 2353 | 3/7/09 | A-2275 | 6 | $16.75 | Fourth Coffee | 7007 Cornell St Redmond, WA 98199 | 425-555-0201 |
| Koch, Reed | 2353 | 3/7/09 | C-789 | 5 | $7.00 | Fourth Coffee | 7007 Cornell St Redmond, WA 98199 | 425-555-0201 |
| Sousa, Luis | 2354 | 3/7/09 | A-2275 | 3 | $16.75 | Contoso, Ltd. | 2302 Harvard Ave Bellevue, WA 98227 | 425-555-0222 |
| Adams, Ellen | 2355 | 3/8/09 | D-4420 | 4 | $7.25 | Adventure Works | 1025 Columbia Circle Kirkland, WA 98234 | 425-555-0185 |
| Adams, Ellen | 2355 | 3/8/09 | C-795 | 3 | $9.75 | Adventure Works | 1025 Columbia Circle Kirkland, WA 98234 | 425-555-0185 |
| Li, Yale | 2356 | 3/10/09 | C-789 | 5 | $7.00 | Contoso, Ltd. | 2302 Harvard Ave Bellevue, WA 98227 | 425-555-0222 |

Information in its smallest parts: atomic data

Working with the data in this example, you can use the Text to Columns command in Excel to separate the "atomic" parts of a cell (such as street address, city, state, and postal code) into discrete columns.

The following table shows the new columns in the same worksheet after they have been split to make all values atomic. Note that the information in the Salesperson column has been split into Last Name and First Name columns and that the information in the Address column has been split into Street Address, City, State, and ZIP Code columns. This data is in "first normal form."

| Last Name | First Name | Street Address | City | State | ZIP Code |
| --- | --- | --- | --- | --- | --- |
| Li | Yale | 2302 Harvard Ave | Bellevue | WA | 98227 |
| Adams | Ellen | 1025 Columbia Circle | Kirkland | WA | 98234 |
| Hance | Jim | 2302 Harvard Ave | Bellevue | WA | 98227 |
| Koch | Reed | 7007 Cornell St | Redmond | WA | 98199 |
| Sousa | Luis | 2302 Harvard Ave | Bellevue | WA | 98227 |
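The same split can be sketched in Python for readers who prefer a script to the Excel command. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the address pattern only recognizes the street suffixes that occur in the example data (Ave, St, Circle), because the sample addresses have no delimiter between street and city.

```python
import re

# Assumption: a street ends with one of the suffixes seen in the sample data.
_ADDRESS = re.compile(
    r"^(?P<street>.+\b(?:Ave|St|Circle))\s+"
    r"(?P<city>[^,]+),\s*(?P<state>[A-Z]{2})\s+(?P<zip>\d{5})$"
)


def split_salesperson(value: str) -> dict:
    """Split 'Last, First' into two atomic values."""
    last, first = (part.strip() for part in value.split(",", 1))
    return {"Last Name": last, "First Name": first}


def split_address(value: str) -> dict:
    """Split 'Street City, ST 12345' into four atomic values."""
    match = _ADDRESS.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognized address format: {value!r}")
    return {
        "Street Address": match["street"],
        "City": match["city"],
        "State": match["state"],
        # Kept as text so that any leading zeros would survive.
        "ZIP Code": match["zip"],
    }
```

Real address data is far messier than this sample, so a pattern like this would need review against the actual values before it is trusted.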

Breaking data out into organized subjects in Excel

The several tables of example data that follow show the same information from the Excel worksheet after it has been split into tables for salespersons, products, customers, and orders. The table design isn't final, but it's on the right track.

The Salespersons table contains only information about sales personnel. Note that each record has a unique ID (SalesPerson ID). The SalesPerson ID value will be used in the Orders table to connect orders to salespeople.

Salespersons

| SalesPerson ID | Last Name | First Name |
| --- | --- | --- |
| 101 | Li | Yale |
| 103 | Adams | Ellen |
| 105 | Hance | Jim |
| 107 | Koch | Reed |
| 109 | Sousa | Luis |

The Products table contains only information about products. Note that each record has a unique ID (Product ID). The Product ID value will be used to connect product information to the Order Details table.

Products

| Product ID | Price |
| --- | --- |
| A-2275 | 16.75 |
| B-205 | 4.50 |
| C-789 | 7.00 |
| C-795 | 9.75 |
| D-4420 | 7.25 |
| F-198 | 5.25 |
| J-558 | 8.50 |

The Customers table contains only information about customers. Note that each record has a unique ID (Customer ID). The Customer ID value will be used to connect customer information to the Orders table.

Customers

| Customer ID | Name | Street Address | City | State | ZIP Code | Phone |
| --- | --- | --- | --- | --- | --- | --- |
| 1001 | Contoso, Ltd. | 2302 Harvard Ave | Bellevue | WA | 98227 | 425-555-0222 |
| 1003 | Adventure Works | 1025 Columbia Circle | Kirkland | WA | 98234 | 425-555-0185 |
| 1005 | Fourth Coffee | 7007 Cornell St | Redmond | WA | 98199 | 425-555-0201 |

The Orders table contains information about orders, salespersons, customers, and products. Note that each record has a unique ID (Order ID). Some of the information in this table needs to be split into an additional table that contains order details so that the Orders table contains only four columns — the unique order ID, the order date, the salesperson ID, and the customer ID. The table shown here has not yet been split into the Order Details table.

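Orders

| Order ID | Order Date | SalesPerson ID | Customer ID | Product ID | Qty |
| --- | --- | --- | --- | --- | --- |
| 2348 | 3/2/09 | 101 | 1001 | J-558 | 4 |
| 2348 | 3/2/09 | 101 | 1001 | B-205 | 2 |
| 2348 | 3/2/09 | 101 | 1001 | D-4420 | 5 |
| 2349 | 3/4/09 | 101 | 1005 | C-789 | 3 |
| 2349 | 3/4/09 | 101 | 1005 | C-795 | 6 |
| 2350 | 3/4/09 | 103 | 1003 | A-2275 | 2 |
| 2350 | 3/4/09 | 103 | 1003 | F-198 | 6 |
| 2350 | 3/4/09 | 103 | 1003 | B-205 | 1 |
| 2351 | 3/4/09 | 105 | 1001 | C-795 | 6 |
| 2352 | 3/5/09 | 105 | 1003 | A-2275 | 2 |
| 2352 | 3/5/09 | 105 | 1003 | D-4420 | 3 |
| 2353 | 3/7/09 | 107 | 1005 | A-2275 | 6 |
| 2353 | 3/7/09 | 107 | 1005 | C-789 | 5 |
| 2354 | 3/7/09 | 109 | 1001 | A-2275 | 3 |
| 2355 | 3/8/09 | 103 | 1003 | D-4420 | 4 |
| 2355 | 3/8/09 | 103 | 1003 | C-795 | 3 |
| 2356 | 3/10/09 | 101 | 1001 | C-789 | 5 |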

Order details, such as the product ID and quantity, are moved out of the Orders table and stored in a table named Order Details. Keep in mind that there are 9 orders, so it makes sense that the Orders table contains 9 records. Note that the Orders table has a unique ID (Order ID), which will be referred to from the Order Details table.

The final design of the Orders table should look like the following:

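Orders

| Order ID | Order Date | SalesPerson ID | Customer ID |
| --- | --- | --- | --- |
| 2348 | 3/2/09 | 101 | 1001 |
| 2349 | 3/4/09 | 101 | 1005 |
| 2350 | 3/4/09 | 103 | 1003 |
| 2351 | 3/4/09 | 105 | 1001 |
| 2352 | 3/5/09 | 105 | 1003 |
| 2353 | 3/7/09 | 107 | 1005 |
| 2354 | 3/7/09 | 109 | 1001 |
| 2355 | 3/8/09 | 103 | 1003 |
| 2356 | 3/10/09 | 101 | 1001 |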

The Order Details table contains no columns that require unique values (that is, there is no primary key), so it is okay for any or all columns to contain "redundant" data. However, no two records in this table should be completely identical (this rule applies to any table in a database). In this table, there should be 17 records, each corresponding to a product in an individual order. For example, in order 2349, three C-789 products make up one of the two parts of the entire order.

The Order Details table should, therefore, look like the following:

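Order Details

| Order ID | Product ID | Qty |
| --- | --- | --- |
| 2348 | J-558 | 4 |
| 2348 | B-205 | 2 |
| 2348 | D-4420 | 5 |
| 2349 | C-789 | 3 |
| 2349 | C-795 | 6 |
| 2350 | A-2275 | 2 |
| 2350 | F-198 | 6 |
| 2350 | B-205 | 1 |
| 2351 | C-795 | 6 |
| 2352 | A-2275 | 2 |
| 2352 | D-4420 | 3 |
| 2353 | A-2275 | 6 |
| 2353 | C-789 | 5 |
| 2354 | A-2275 | 3 |
| 2355 | D-4420 | 4 |
| 2355 | C-795 | 3 |
| 2356 | C-789 | 5 |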
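The split of the combined Orders data into a final Orders table and an Order Details table can be sketched as follows. The rows are taken from the combined Orders table in this article; the function name split_orders and the tuple layout are assumptions made for the illustration.

```python
# (Order ID, Order Date, SalesPerson ID, Customer ID, Product ID, Qty)
FLAT_ROWS = [
    (2348, "3/2/09", 101, 1001, "J-558", 4),
    (2348, "3/2/09", 101, 1001, "B-205", 2),
    (2348, "3/2/09", 101, 1001, "D-4420", 5),
    (2349, "3/4/09", 101, 1005, "C-789", 3),
    (2349, "3/4/09", 101, 1005, "C-795", 6),
    (2350, "3/4/09", 103, 1003, "A-2275", 2),
    (2350, "3/4/09", 103, 1003, "F-198", 6),
    (2350, "3/4/09", 103, 1003, "B-205", 1),
    (2351, "3/4/09", 105, 1001, "C-795", 6),
    (2352, "3/5/09", 105, 1003, "A-2275", 2),
    (2352, "3/5/09", 105, 1003, "D-4420", 3),
    (2353, "3/7/09", 107, 1005, "A-2275", 6),
    (2353, "3/7/09", 107, 1005, "C-789", 5),
    (2354, "3/7/09", 109, 1001, "A-2275", 3),
    (2355, "3/8/09", 103, 1003, "D-4420", 4),
    (2355, "3/8/09", 103, 1003, "C-795", 3),
    (2356, "3/10/09", 101, 1001, "C-789", 5),
]


def split_orders(flat_rows):
    """Split combined rows into Orders (one per order) and Order Details."""
    orders = {}    # Order ID -> (Order Date, SalesPerson ID, Customer ID)
    details = []   # (Order ID, Product ID, Qty), one per product in an order
    for order_id, order_date, salesperson_id, customer_id, product_id, qty in flat_rows:
        # The first row seen for an order supplies its header fields.
        orders.setdefault(order_id, (order_date, salesperson_id, customer_id))
        details.append((order_id, product_id, qty))
    return orders, details


orders, details = split_orders(FLAT_ROWS)
```

Keying the orders by Order ID is what removes the repetition: each order's date, salesperson, and customer are stored once, while the detail rows keep only the reference back to the order.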

Copying and pasting data from Excel into Access

Now that the information about salespersons, customers, products, orders, and order details has been broken out into separate subjects in Excel, you can copy that data directly into Access, where it will become tables.

Creating relationships between the Access tables and running a query

After you have moved your data to Access, you can create relationships between tables and then create queries to return information about various subjects. For example, you can create a query that returns the Order ID and the names of the salespersons for orders entered between 3/05/09 and 3/08/09.
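A query like the one just described can be sketched with SQLite standing in for Access. The table and column names, the ISO date strings, and the SQL dialect are assumptions for this illustration; in Access you would typically build the equivalent query in the query designer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Salespersons (
        SalespersonID INTEGER PRIMARY KEY,
        LastName      TEXT,
        FirstName     TEXT
    );
    CREATE TABLE Orders (
        OrderID       INTEGER PRIMARY KEY,
        OrderDate     TEXT,  -- ISO dates so that text comparison sorts correctly
        SalespersonID INTEGER REFERENCES Salespersons (SalespersonID),
        CustomerID    INTEGER
    );
""")
conn.executemany(
    "INSERT INTO Salespersons VALUES (?, ?, ?)",
    [(101, "Li", "Yale"), (103, "Adams", "Ellen"), (105, "Hance", "Jim"),
     (107, "Koch", "Reed"), (109, "Sousa", "Luis")],
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [(2348, "2009-03-02", 101, 1001), (2349, "2009-03-04", 101, 1005),
     (2350, "2009-03-04", 103, 1003), (2351, "2009-03-04", 105, 1001),
     (2352, "2009-03-05", 105, 1003), (2353, "2009-03-07", 107, 1005),
     (2354, "2009-03-07", 109, 1001), (2355, "2009-03-08", 103, 1003),
     (2356, "2009-03-10", 101, 1001)],
)

# Order ID and salesperson name for orders entered from 3/05/09 to 3/08/09.
rows = conn.execute("""
    SELECT o.OrderID, s.LastName, s.FirstName
    FROM Orders AS o
    JOIN Salespersons AS s ON s.SalespersonID = o.SalespersonID
    WHERE o.OrderDate BETWEEN '2009-03-05' AND '2009-03-08'
    ORDER BY o.OrderID
""").fetchall()
```

With the example data, the query returns orders 2352 through 2355, matched to Hance, Koch, Sousa, and Adams respectively.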

In addition, you can create forms and reports to make data entry and sales analysis easier.
