Context in DAX Formulas
Context enables you to perform dynamic analysis, in which the results of a formula can change to reflect the current row or cell selection and also any related data. Understanding context and using context effectively are very important for building high-performing formulas, dynamic analyses, and for troubleshooting problems in formulas.
This section defines the different types of context: row context, query context, and filter context. It explains how context is evaluated for formulas in calculated columns and in PivotTables.
The last part of this article provides links to detailed examples that illustrate how the results of formulas change according to context.
Understanding Context
Formulas in Power Pivot can be affected by the filters applied in a PivotTable, by relationships between tables, and by filters used in formulas. Context is what makes it possible to perform dynamic analysis. Understanding context is important for building and for troubleshooting formulas.
There are different types of context: row context, query context, and filter context.
Row context can be thought of as "the current row." If you have created a calculated column, the row context consists of the values in each individual row and values in columns that are related to the current row. There are also some functions (EARLIER Function and EARLIEST Function) that get a value from the current row and then use that value while performing an operation over an entire table.
Query context refers to the subset of data that is implicitly created for each cell in a PivotTable, depending on the row and column headers.
Filter context is the set of values allowed in each column, based on filter constraints that were applied to the row or that are defined by filter expressions within the formula.
Row Context
If you create a formula in a calculated column, the row context for that formula includes the values from all columns in the current row. If the table is related to another table, the content also includes all the values from that other table that are related to the current row.
For example, suppose you create a calculated column, =[Freight] + [Tax],
, that adds together two columns from the same table. This formula behaves like formulas in an Excel table, which automatically reference values from the same row. Note that tables are different from ranges: you cannot reference a value from the row before the current row by using range notation, and you cannot reference any arbitrary single value in a table or cell. You must always work with tables and columns.
Row context automatically follows the relationships between tables to determine which rows in related tables are associated with the current row.
For example, the following formula uses the RELATED function to fetch a tax value from a related table, based on the region that the order was shipped to. The tax value is determined by using the value for region in the current table, looking up the region in the related table, and then getting the tax rate for that region from the related table.
= [Freight] + RELATED('Region'[TaxRate])
This formula simply gets the tax rate for the current region, from the Region table. You do not need to know or specify the key that connects the tables.
Multiple Row Context
Additionally, DAX includes functions that iterate calculations over a table. These functions can have multiple current rows and current row contexts. In programming terms, you can create formulas that recurse over an inner and outer loop.
For example, suppose your workbook contains a Products table and a Sales table. You might want to go through the entire sales table, which is full of transactions involving multiple products, and find the largest quantity ordered for each product in any one transaction.
In Excel, this calculation requires a series of intermediate summaries, which would have to be rebuilt if the data changed. If you are a power user of Excel, you might be able to build array formulas that would do the job. Alternatively, in a relational database you could write nested subselects.
However, with DAX you can build a single formula that returns the correct value, and the results are automatically updated any time you add data to the tables.
=MAXX(FILTER(Sales,[ProdKey]=EARLIER([ProdKey])),Sales[OrderQty])
For a detailed walkthrough of this formula, see the EARLIER Function.
In short, the EARLIER function stores the row context from the operation that preceded the current operation. At all times, the function stores in memory two sets of context: one set of context represents the current row for the inner loop of the formula, and another set of context represents the current row for the outer loop of the formula. DAX automatically feeds values between the two loops so that you can create complex aggregates.
Query Context
Query context refers to the subset of data that is implicitly retrieved for a formula. When you drop a measure or other value field into a cell in a PivotTable, the Power Pivot engine examines the row and column headers, Slicers, and report filters to determine the context. Then, Power Pivot makes the necessary calculations to populate each cell in the PivotTable. The set of data that is retrieved is the query context for each cell.
Because the context can change depending on where you place the formula, the results of the formula also change depending on whether you use the formula in a PivotTable with many groupings and filters, or in a calculated column with no filters and minimal context.
For example, suppose you create this simple formula that sums the values in the Profit column of the Sales table:=SUM('Sales'[Profit]).
If you use this formula in a calculated column within the Sales table, the results for the formula will be the same for the entire table, because the query context for the formula is always the entire data set of the Sales table. Your results will have profit for all regions, all products, all years, and so on.
However, typically you don't want to see the same result hundreds of times, but instead you want to get the profit for a particular year, a particular country or region, a particular product, or some combination of these, and then get a grand total.
In a PivotTable, it is easy to change context by adding or removing column and row headers and by adding or removing Slicers. You can create a formula like the one above, in a measure, and then drop it into a PivotTable. Whenever you add column or row headings to the PivotTable, you change the query context in which the measure is evaluated. Slicing and filtering operations also affect context. Therefore, the same formula, used in a PivotTable, is evaluated in a different query context for each cell.
Filter Context
Filter context is added when you specify filter constraints on the set of values allowed in a column or table, by using arguments to a formula. Filter context applies on top of other contexts, such as row context or query context.
For example, a PivotTable calculates its values for each cell based on the row and column headings, as described in the preceding section on query context. However, within the measures or calculated columns that you add to the PivotTable, you can specify filter expressions to control the values that are used by the formula. You can also selectively clear the filters on particular columns.
For more information about how to create filters within formulas, see the Filter Functions (DAX).
For an example of how filters can be cleared to create grand totals, see the ALL Function.
For examples of how to selectively clear and apply filters within formulas, see the ALLEXCEPT Function.
Therefore, you must review the definition of measures or formulas used in a PivotTable so that you are aware of filter context when interpreting the results of formulas.
Determining Context in Formulas
When you create a formula, Power Pivot for Excel first checks for general syntax, and then it checks the names of columns and tables that you provide against possible columns and tables in the current context. If Power Pivot cannot find the columns and tables specified by the formula, you will get an error.
Context is determined as described in the preceding sections, by using the available tables in the workbook, any relationships between the tables, and any filters that have been applied.
For example, if you have just imported some data into a new table and have not applied any filters, the entire set of columns in the table is part of the current context. If you have multiple tables that are linked by relationships and you are working in a PivotTable that has been filtered by adding column headings and using Slicers, the context includes the related tables and any filters on the data.
Context is a powerful concept that can also make it difficult to troubleshoot formulas. We recommend that you begin with simple formulas and relationships to see how context works, and then begin experimenting with simple formulas in PivotTables. The following section also provides some examples of how formulas use different types of context to dynamically return results.
Examples of Context in Formulas
-
The RELATED function expands the context of the current row to include values in a related column. This lets you perform lookups. The example in this topic illustrates the interaction of filtering and row context.
-
The FILTER function lets you specify the rows to include in the current context. The examples in this topic also illustrate how to embed filters within other functions that perform aggregates.
-
The ALL function sets context within a formula. You can use it to override filters that are applied as result of query context.
-
The ALLEXCEPT function lets you remove all filters except one that you specify. Both topics include examples that walk you through building formulas and understanding complex contexts.
-
The EARLIER and EARLIEST functions let you loop through tables by performing calculations, while referencing a value from an inner loop. If you are familiar with the concept of recursion and with inner and outer loops, you will appreciate the power that the EARLIER and EARLIEST functions provide. If you are new to these concepts, you should follow the steps in the example carefully to see how the inner and outer contexts are used in calculations.
Referential Integrity
This section discusses some advanced concepts related to missing values in Power Pivot tables that are connected by relationships. This section might be useful to you if you have workbooks with multiple tables and complex formulas and want help in understanding the results.
If you are new to relational data concepts, we recommend that you first read the introductory topic, Relationships Overview.
Referential Integrity and Power Pivot Relationships
Power Pivot does not require that referential integrity be enforced between two tables in order to define a valid relationship. Instead, a blank row is created on the "one" end of each one-to-many relationship and is used to handle all non-matching rows from the related table. It effectively behaves as a SQL outer join.
In PivotTables, if you group data by the one side of the relationship, any unmatched data on the many side of the relationship is grouped together and will be included in totals with a blank row heading. The blank heading is roughly equivalent to the "unknown member."
Understanding the Unknown Member
The concept of the unknown member is probably familiar to you if you have worked with multidimensional database systems, such as SQL Server Analysis Services. If the term is new to you, the following example explains what the unknown member is and how it affects calculations.
Suppose you are creating a calculation that sums monthly sales for each store, but a column in the Sales table is missing a value for the store name. Given that the tables for Store and Sales are connected by the store name, what would you expect to happen in the formula? How should the PivotTable group or display the sales figures that are not related to an existing store?
This problem is a common one in data warehouses, where large tables of fact data must be logically related to dimension tables that contain information about stores, regions, and other attributes that are used for categorizing and calculating facts. To resolve the problem, any new facts that are unrelated to an existing entity are temporarily assigned to the unknown member. That is why unrelated facts will appear grouped in a PivotTable under a blank heading.
Treatment of Blank Values vs. the Blank Row
Blank values are different from the blank rows that are added to accommodate the unknown member. The blank value is a special value that is used to represent nulls, empty strings, and other missing values. For more information about the blank value, as well as other DAX data types, see Data types in Data Models.
No comments:
Post a Comment