Move data from Excel to Access
This article shows you how to move your data from Excel to Access and convert your data to relational tables so that you can use Microsoft Excel and Access together. To summarize, Access is best for capturing, storing, querying, and sharing data, and Excel is best for calculating, analyzing, and visualizing data.
Two articles, Using Access or Excel to manage your data and Top 10 reasons to use Access with Excel, discuss which program is best suited for a particular task and how to use Excel and Access together to create a practical solution.
When you move data from Excel to Access, there are three basic steps to the process.
Note: For information on data modeling and relationships in Access, see Database design basics.
Step 1: Import data from Excel to Access
Importing data is an operation that can go a lot more smoothly if you take some time to prepare and clean your data. Importing data is like moving to a new home. If you clean out and organize your possessions before you move, settling into your new home is much easier.
Clean your data before you import
Before you import data into Access, in Excel it's a good idea to:
-
Convert cells that contain non-atomic data (that is, multiple values in one cell) to multiple columns. For example, a cell in a "Skills" column that contains multiple skill values, such as "C# programming," "VBA programming," and "Web design" should be broken out to separate columns that each contain only one skill value.
-
Use the TRIM command to remove leading, trailing, and multiple embedded spaces.
-
Remove non-printing characters.
-
Find and fix spelling and punctuation errors.
-
Remove duplicate rows or duplicate fields.
-
Ensure that columns of data do not contain mixed formats, especially numbers formatted as text or dates formatted as numbers.
For more information, see the following Excel help topics:
Note: If your data cleaning needs are complex, or you don't have the time or resources to automate the process on your own, you might consider using a third-party vendor. For more information, search for "data cleansing software" or "data quality" by your favorite search engine in your Web browser.
Choose the best data type when you import
During the import operation in Access, you want to make good choices so that you receive few (if any) conversion errors that will require manual intervention. The following table summarizes how Excel number formats and Access data types are converted when you import data from Excel to Access, and offers some tips on the best data types to choose in the Import Spreadsheet Wizard.
Excel number format | Access data type | Comments | Best practice |
Text | Text, Memo | The Access Text data type stores alphanumeric data up to 255 characters. The Access Memo data type stores alphanumeric data up to 65,535 characters. | Choose Memo to avoid truncating any data. |
Number, Percentage, Fraction, Scientific | Number | Access has one Number data type that varies based on a Field Size property (Byte, Integer, Long Integer, Single, Double, Decimal). | Choose Double to avoid any data conversion errors. |
Date | Date | Access and Excel both use the same serial date number to store dates. In Access, the date range is larger: from -657,434 (January 1, 100 A.D.) to 2,958,465 (December 31, 9999 A.D.). Because Access does not recognize the 1904 date system (used in Excel for the Macintosh), you need to convert the dates either in Excel or Access to avoid confusion. For more information, see Change the date system, format, or two-digit year interpretation and Import or link to data in an Excel workbook. | Choose Date. |
Time | Time | Access and Excel both store time values by using the same data type. | Choose Time, which is usually the default. |
Currency, Accounting | Currency | In Access, the Currency data type stores data as 8-byte numbers with precision to four decimal places, and is used to store financial data and prevent rounding of values. | Choose Currency, which is usually the default. |
Boolean | Yes/No | Access uses -1 for all Yes values and 0 for all No values, whereas Excel uses 1 for all TRUE values and 0 for all FALSE values. | Choose Yes/No, which automatically converts underlying values. |
Hyperlink | Hyperlink | A hyperlink in Excel and Access contains a URL or Web address that you can click and follow. | Choose Hyperlink, otherwise Access may use the Text data type by default. |
Once the data is in Access, you can delete the Excel data. Don't forget to backup the original Excel workbook first before deleting it.
For more information, see the Access help topic Import or link to data in an Excel workbook.
Automatically append data the easy way
A common problem Excel users have is appending data with the same columns into one large worksheet. For example, you may have an asset tracking solution that started out in Excel but now has grown to include files from many workgroups and departments. This data may be in different worksheets and workbooks, or in text files that are data feeds from other systems. There is no user interface command or easy way to append similar data in Excel.
The best solution is to use Access, where you can easily import and append data into one table by using the Import Spreadsheet Wizard. Furthermore, you can append a lot of data into one table. You can save the import operations, add them as scheduled Microsoft Outlook tasks, and even use macros to automate the process.
Step 2: Normalize data by using the Table Analyzer Wizard
At first glance, stepping through the process of normalizing your data may seem a daunting task. Fortunately, normalizing tables in Access is a process that is much easier, thanks to the Table Analyzer Wizard.
1. Drag selected columns to a new table and automatically create relationships
2. Use button commands to rename a table, add a primary key, make an existing column a primary key, and undo the last action
You can use this wizard to do the following:
-
Convert a table into a set of smaller tables and automatically create a primary and foreign key relationship between the tables.
-
Add a primary key to an existing field that contains unique values, or create a new ID field that uses the AutoNumber data type.
-
Automatically create relationships to enforce referential integrity with cascading updates. Cascading deletes are not automatically added to prevent accidentally deleting data, but you can easily add cascading deletes later.
-
Search new tables for redundant or duplicate data (such as the same customer with two different phone numbers) and update this as desired.
-
Back up the original table and rename it by appending "_OLD" to its name. Then, you create a query that reconstructs the original table, with the original table name so that any existing forms or reports based on the original table will work with the new table structure.
For more information, see Normalize your data using the Table Analyzer.
Step 3: Connect to Access data from Excel
After the data has been normalized in Access and a query or table has been created that reconstructs the original data, it's a simple matter of connecting to the Access data from Excel. Your data is now in Access as an external data source, and so can be connected to the workbook through a data connection, which is a container of information that is used to locate, log on to, and access the external data source. Connection information is stored in the workbook and can also be stored in a connection file, such as an Office Data Connection (ODC) file (.odc file name extension) or a Data Source Name file (.dsn extension). After you connect to external data, you can also automatically refresh (or update) your Excel workbook from Access whenever the data is updated in Access.
For more information, see Overview of connecting (importing) data.
Get your data into Access
This section walks you through the following phases of normalizing your data: Breaking values in the Salesperson and Address columns into their most atomic pieces, separating related subjects into their own tables, copying and pasting those tables from Excel into Access, creating key relationships between the newly created Access tables, and creating and running a simple query in Access to return information.
Example data in non-normalized form
The following worksheet contains non-atomic values in the Salesperson column and the Address column. Both columns should be split into two or more separate columns. This worksheet also contains information about salespersons, products, customers, and orders. This information should also be split further, by subject, into separate tables.
Salesperson | Order ID | Order Date | Product ID | Qty | Price | Customer Name | Address | Phone |
Li, Yale | 2349 | 3/4/09 | C-789 | 3 | $7.00 | Fourth Coffee | 7007 Cornell St Redmond, WA 98199 | 425-555-0201 |
Li, Yale | 2349 | 3/4/09 | C-795 | 6 | $9.75 | Fourth Coffee | 7007 Cornell St Redmond, WA 98199 | 425-555-0201 |
Adams, Ellen | 2350 | 3/4/09 | A-2275 | 2 | $16.75 | Adventure Works | 1025 Columbia Circle Kirkland, WA 98234 | 425-555-0185 |
Adams, Ellen | 2350 | 3/4/09 | F-198 | 6 | $5.25 | Adventure Works | 1025 Columbia Circle Kirkland, WA 98234 | 425-555-0185 |
Adams, Ellen | 2350 | 3/4/09 | B-205 | 1 | $4.50 | Adventure Works | 1025 Columbia Circle Kirkland, WA 98234 | 425-555-0185 |
Hance, Jim | 2351 | 3/4/09 | C-795 | 6 | $9.75 | Contoso, Ltd. | 2302 Harvard Ave Bellevue, WA 98227 | 425-555-0222 |
Hance, Jim | 2352 | 3/5/09 | A-2275 | 2 | $16.75 | Adventure Works | 1025 Columbia Circle Kirkland, WA 98234 | 425-555-0185 |
Hance, Jim | 2352 | 3/5/09 | D-4420 | 3 | $7.25 | Adventure Works | 1025 Columbia Circle Kirkland, WA 98234 | 425-555-0185 |
Koch, Reed | 2353 | 3/7/09 | A-2275 | 6 | $16.75 | Fourth Coffee | 7007 Cornell St Redmond, WA 98199 | 425-555-0201 |
Koch, Reed | 2353 | 3/7/09 | C-789 | 5 | $7.00 | Fourth Coffee | 7007 Cornell St Redmond, WA 98199 | 425-555-0201 |
Information in its smallest parts: atomic data
Working with the data in this example, you can use the Text to Column command in Excel to separate the "atomic" parts of a cell (such as street address, city, state, and postal code) into discrete columns.
The following table shows the new columns in the same worksheet after they have been split to make all values atomic. Note that the information in the Salesperson column has been split into Last Name and the First Name columns and that the information in the Address column has been split into Street Address, City, State, and ZIP Code columns. This data is in "first normal form."
Last Name | First Name |
| Street Address | City | State | ZIP Code |
Li | Yale | 2302 Harvard Ave | Bellevue | WA | 98227 | |
Adams | Ellen | 1025 Columbia Circle | Kirkland | WA | 98234 | |
Hance | Jim | 2302 Harvard Ave | Bellevue | WA | 98227 | |
Koch | Reed | 7007 Cornell St Redmond | Redmond | WA | 98199 |
Breaking data out into organized subjects in Excel
The several tables of example data that follow show the same information from the Excel worksheet after it has been split into tables for salespersons, products, customers, and orders. The table design isn't final, but it's on the right track.
The Salespersons table contains only information about sales personnel. Note that each record has a unique ID (SalesPerson ID). The SalesPerson ID value will be used in the Orders table to connect orders to salespeople.
Salespersons | ||
Salesperson ID | Last Name | First Name |
101 | Li | Yale |
103 | Adams | Ellen |
105 | Hance | Jim |
107 | Koch | Reed |
The Products table contains only information about products. Note that each record has a unique ID (Product ID). The Product ID value will be used to connect product information to the Order Details table.
Products | |
Product ID | Price |
A-2275 | 16.75 |
B-205 | 4.50 |
C-789 | 7.00 |
C-795 | 9.75 |
D-4420 | 7.25 |
F-198 | 5.25 |
The Customers table contains only information about customers. Note that each record has a unique ID (Customer ID). The Customer ID value will be used to connect customer information to the Orders table.
Customers | ||||||
Customer ID | Name | Street Address | City | State | ZIP Code | Phone |
1001 | Contoso, Ltd. | 2302 Harvard Ave | Bellevue | WA | 98227 | 425-555-0222 |
1003 | Adventure Works | 1025 Columbia Circle | Kirkland | WA | 98234 | 425-555-0185 |
1005 | Fourth Coffee | 7007 Cornell St | Redmond | WA | 98199 | 425-555-0201 |
The Orders table contains information about orders, salespersons, customers, and products. Note that each record has a unique ID (Order ID). Some of the information in this table needs to be split into an additional table that contains order details so that the Orders table contains only four columns — the unique order ID, the order date, the salesperson ID, and the customer ID. The table shown here has not yet been split into the Order Details table.
Orders | |||||
Order ID | Order Date | SalesPerson ID | Customer ID | Product ID | Qty |
2349 | 3/4/09 | 101 | 1005 | C-789 | 3 |
2349 | 3/4/09 | 101 | 1005 | C-795 | 6 |
2350 | 3/4/09 | 103 | 1003 | A-2275 | 2 |
2350 | 3/4/09 | 103 | 1003 | F-198 | 6 |
2350 | 3/4/09 | 103 | 1003 | B-205 | 1 |
2351 | 3/4/09 | 105 | 1001 | C-795 | 6 |
2352 | 3/5/09 | 105 | 1003 | A-2275 | 2 |
2352 | 3/5/09 | 105 | 1003 | D-4420 | 3 |
2353 | 3/7/09 | 107 | 1005 | A-2275 | 6 |
2353 | 3/7/09 | 107 | 1005 | C-789 | 5 |
Order details, such as the product ID and quantity are moved out of the Orders table and stored in a table named Order Details. Keep in mind that there are 9 orders, so it makes sense that there are 9 records in this table. Note that the Orders table has a unique ID (Order ID), which will referred to from the Order Details table.
The final design of the Orders table should look like the following:
Orders | |||
Order ID | Order Date | SalesPerson ID | Customer ID |
2349 | 3/4/09 | 101 | 1005 |
2350 | 3/4/09 | 103 | 1003 |
2351 | 3/4/09 | 105 | 1001 |
2352 | 3/5/09 | 105 | 1003 |
2353 | 3/7/09 | 107 | 1005 |
The Order Details table contains no columns that require unique values (that is, there is no primary key), so it is okay for any or all columns to contain "redundant" data. However, no two records in this table should be completely identical (this rule applies to any table in a database). In this table, there should be 17 records — each corresponding to a product in an individual order. For example, in order 2349, three C-789 products comprise one of the two parts of the entire order.
The Order Details table should, therefore, look like the following:
Order Details | ||
Order ID | Product ID | Qty |
2349 | C-789 | 3 |
2349 | C-795 | 6 |
2350 | A-2275 | 2 |
2350 | F-198 | 6 |
2350 | B-205 | 1 |
2351 | C-795 | 6 |
2352 | A-2275 | 2 |
2352 | D-4420 | 3 |
2353 | A-2275 | 6 |
2353 | C-789 | 5 |
Copying and pasting data from Excel into Access
Now that the information about salespersons, customers, products, orders, and order details has been broken out into separate subjects in Excel, you can copy that data directly into Access, where it will become tables.
Creating relationships between the Access tables and running a query
After you have moved your data to Access, you can create relationships between tables and then create queries to return information about various subjects. For example, you can create a query that returns the Order ID and the names of the salespersons for orders entered between 3/05/09 and 3/08/09.
In addition, you can create forms and reports to make data entry and sales analysis easier.
Need more help?
You can always ask an expert in the Excel Tech Community, get support in the Answers community, or suggest a new feature or improvement on Excel User Voice.
No comments:
Post a Comment