Monday, October 18, 2021

Combine files in a folder with combine binaries power query

Use Power Query to combine multiple files with the same schema stored in a single folder into one table. For example, each month you want to combine budget workbooks from multiple departments, where the columns are the same, but the number of rows and values differ in each workbook. Once you set it up, you can apply additional transformations as you would with any single imported data source and then refresh the data to see results for each month.  

A conceptual overview of Combining folder files

Note    This topic shows how to combine files from a folder. You can also combine files stored in SharePoint, Azure Blob Storage, and Azure Data Lake Storage. The process is similar.

Keep it simple:

  • Ensure that all the files you want to combine are contained in a dedicated folder without extraneous files. Otherwise, all files in the folder and any subfolders you select are included in the data to be combined.

  • Each file should have the same schema with consistent column headers, data types, and number of columns. The columns do not have to be in the same order as the matching is done by column names.

  • If possible, avoid unrelated data objects for data sources that can have more than one data object, such as a JSON file, an Excel workbook, or Access database.

Each of these files follow a simple pattern, only one table of data in each file.

  1. Select Data > Get Data > From File > From Folder. The Browse dialog box appears.

  2. Locate the folder containing the files you want to combine.

  3. A list of the files in the folder appears in the <Folder path> dialog box. Verify that all the files you want are listed.

    An example text import dialog box

  4. Select one of the commands at the bottom of the dialog box, for example  Combine > Combine & Load. There are additional commands discussed in the section About all those commands.

  5. If you select any Combine command, The Combine Files dialog box appears. To change file settings, select each file from the Sample File box, set the File Origin, Delimiter, and Data Type Detection as desired. You can also select or clear the Skip files with errors checkbox at the bottom of the dialog box.

  6. Select OK.

Result

Power Query automatically creates queries to consolidate the data from each file into a worksheet. The query steps and columns created depend on which command you choose. For more information, see the section, About all those queries.

  1. Select Data > Get Data > From File > From Folder. The Browse dialog box appears.

  2. Locate the folder containing the files you want to combine.

  3. A list of the files in the folder appears in the <Folder path> dialog box. Verify that all the files you want are listed.

  4. Select one of the commands at the bottom of the dialog box, for example  Combine > Combine & Transform. There are additional commands discussed in the section About all those commands.

    The Power Query Editor appears.

  5. The Value column is a structured List column. Select the Expand  Expand column icon icon, and then select Expand to New rows. 

    Expanding a JSON List

  6. The Value column is now a structured Record column. Select the Expand  Expand column icon icon. A drop-down dialog box appears.

    Expanding a JSON Record

  7. Keep all the columns selected. You may want to clear the Use original column name as a prefix check box. Select OK.

  8. Select all the columns that contain data values. Select Home, the arrow next to Remove Columns, and then select Remove Other Columns.

  9. Select Home > Close & Load.

Result

Power Query automatically creates queries to consolidate the data from each file into a worksheet. The query steps and columns created depend on which command you choose. For more information, see the section, About all those queries.

Each of these data sources can have more than one object to import. An Excel workbook can have multiple worksheets, Excel tables, or named ranges. An Access database can have multiple tables and queries. 

  1. Select Data > Get Data > From File > From Folder. The Browse dialog box appears.

  2. Locate the folder containing the files you want to combine.

  3. A list of the files in the folder appears in the <Folder path> dialog box. Verify that all the files you want are listed.

  4. Select one of the commands at the bottom of the dialog box, for example  Combine > Combine & Load. There are additional commands discussed in the section About all those commands.

  5. In the Combine Files dialog box:

    • In the Sample File box, select a file to use as sample data used to create the queries. You can either not select an object or select just one object. But, you can't select more than one.

    • If you have many objects, use the Search box to locate an object or the Display Options along with the Refresh button to filter the list.

    • Select or clear the Skip files with errors checkbox at the bottom of the dialog box.

  6. Select OK.

Result

Power Query automatically creates a query to consolidate the data from each file into a worksheet. The query steps and columns created depend on which command you choose. For more information, see the section, About all those queries.

For more flexibility, you can explicitly combine files in the Power Query Editor by using the Combine Files command. Let's say the source folder has a mixture of file types and subfolders, and you want to target specific files with the same file type and schema but not others. This can improve performance and help simplify your transformations.

  1. Select Data > Get Data > From File > From Folder. The Browse dialog box appears.

  2. Locate the folder containing the files you want to combine, and then select Open.

  3. A list of all the files in the folder and subfolders appears in the <Folder path> dialog box. Verify that all the files you want are listed.

  4. Select Transform Data at the bottom. The Power Query Editor opens and displays all the files in the folder and any subfolders.

  5. To select the files you want, filter columns, such as Extension or Folder Path.

  6. To combine the files into single table, select the Content column that contains each Binary (usually the first column), and then select Home > Combine Files. The Combine Files dialog box appears.

  7. Power Query analyzes an example file, by default the first file in the list, to use the correct connector and identify matching columns.

    To use a different file for the example file, select it from the Sample File drop-down list.

  8. Optionally, at the bottom, select Skip files with errors to exclude those files from the result.

  9. Select OK.

Result

Power Query automatically creates a queries to consolidate the data from each file into a worksheet. The query steps and columns created depend on which command you choose. For more information, see the section, About all those queries.

There are several commands you can select and each one has a different purpose.

  • Combine and Transform Data    To combine all the files with a query, and then launch the Power Query Editor, select Combine > Combine and Transform Data.

  • Combine and Load    To display the Sample file dialog box, create a query, and then load to worksheet, select Combine > Combine and Load.

  • Combine and Load To    To display the Sample file dialog box, create a query, and then display Import dialog box, select Combine > Combine and Load To.

  • Load    To create a query with one step, and then load to a worksheet, select Load > Load.

  • Load To    To create a query with one step, and then display the Import dialog box, select Load > Load To.

  • Transform Data To create a query with one step, and then launch the Power Query Editor, select Transform Data.

However you combine files, several supporting queries are created in the Queries pane under the "Helper Queries" group.

A list of the queries created in the Queries pane

  • Power Query creates a "Sample File" query based on the example query.

  • A "Transform File" function query uses the "Parameter1" query to specify each file (or binary) as input to the "Sample File" query. This query also creates the Content column containing the file contents and automatically expands the structured Record column to add the column data to the results. The "Transform File" and "Sample File" queries are linked, so that changes to the "Sample File" query are reflected in the "Transform File" query.

  • The query containing the final results is in the "Other queries" group. By default, it is named after the folder you imported the files from.

For further investigation, right click each query and select Edit to examine each query step and to see how the queries work in concert.

To start the process of combining multiple files, first ensure that all the files you want to combine are contained in a single file folder.

Note: Excel, Text, CSV, JSON, XML and Access files are supported.

  1. Click on Power Query tab, then From File > From Folder.

    Power Query > From File > From Folder options
  2. Click the Browse button to locate your folder.

    A list of the files in the folder will appear. Select the Edit option to open the Query Editor. In this case, we've loaded 4 Excel workbooks.

    Combine Binary file dialog with files to be combined
  3. Once you've verified that all the files you want are listed, click the Combine Binaries button from the Content column.

    Press the Combine Binaries button from the query Content column or from the Combine section in the Power Query Ribbon
  4. Each file is analyzed, and the correct file format is determined, such as text, Excel or JSON. In this case, a list of worksheets from the first Excel workbook is displayed. Select the worksheet you want to use, and then and click OK.

    Combine Binaries dialog displaying available Excel worksheets to select the primary consolidation target
  5. The combine binaries transform process will then automatically create a query to consolidate the data from each file and display a preview. If you're satisfied with the outcome, then click Close & Load.

    Combine Binaries Preview dialog. Press Close & Load to accept the results and import them to Excel.

    The combine binaries process will then consolidate the listed worksheets into a single worksheet.

  6. If the original data files change, you can always refresh the imported data. Click anywhere within the data range, then go to Query Tools > Refresh. In addition, you can easily apply additional transformation or extraction steps by modifying the automatically created exemplar query, without having to worry about modifying or creating additional function query steps; any changes to the exemplar query are automatically generated in the linked function query.

Known Issues

When loading the combine binaries procedure to Excel you may see the following error message:

Combine Binaries error message. This is a known bug that is being addressed.

If you dismiss the error(s) by clicking OK, the combine binaries process will complete and properly load your data despite the error. We are working to correct the error.

See Also

Power Query for Excel Help

Append queries

Combine files overview (docs.com)

Combine CSV files in Power Query (docs.com)

No comments:

Post a Comment