Select Page

Let’s continue with the 25 Days of DAX Fridays! Challenge from Curbal. In this post I’ll go through how I solved the questions for days 6 to 10.

For more background on the overall challenge and the NorthWind data model, refer to the first post in this series:

## DAX solutions

I’ll explain how I used DAX to derive each day’s answer below. To see the expected answers, go to the Advent Calendar on the challenge page and click on the card for the respective day.

Note that it is usually possible to get to the answer by dropping relevant fields into a table or matrix visual and reading off a particular cell where the desired information is summarised. However, the challenge requires writing a measure for each day’s question that would result in the expected answer.

### Day 6: What is the average order size for all years and all customers?

The ‘order size’ here refers to the number of distinct products included in an order. This means the granularity we need to work at is each order. Since the grain of the Orders table is each unique combination of Orders[OrderID] and Orders[ProductID] (refer to the post on days 1 to 5), we use `VALUES( Orders(OrderID] )` to obtain the list of unique orders. Then we iterate on these orders using `AVERAGEX` to count the number of distinct products per order and calculate the average across all orders. I will demonstrate a number of ways to go about this, including some incorrect solutions.

#### Using a relationship

To count distinct products for each order, an initial idea may be to use the relationship between the Orders and Products tables and use `DISTINCTCOUNT` on the Products[ProductID] field.

Average order size counting `Products[ProductID]` incorrectly and correctly

Look at the matrix above. For every order the measure `[06a. Average order size for all years and all customers (Products table)]` returns 77, the total number of products in the Products table. In the data model, Products is on the one-side of the relationship while Orders is on the many-side. Because of this, the filter on Orders[ProductID] (cross-filtered by Orders[OrderID] in each row of the matrix) does not flow to the Products table, and no products are filtered.

Relationship between Products and Orders

To overcome this, we can use `CROSSFILTER` to change the filter direction temporarily to `Both`, so that Orders[ProductID] also filters Products[ProductID]. We can see the correct results for `[06b. Average order size for all years and all customers (Products table with CROSSFILTER)]` in the matrix above.

#### Using context transition

We could also have counted Orders[ProductID] directly:

Average order size counting Orders[ProductID] incorrectly and correctly

This measure is almost correct, except at the total level where the result is 77. Each row of the matrix shows the correct numbers for `[06c. Average order size for all years and all customers (Orders table)]`, because within each row the filter context includes Orders[OrderID]. In fact, in each row `UniqueOrders` returns one order ID only, and the distinct count of product IDs is calculated for that order.

On the other hand, in the total row there are no filters on Orders[OrderID]. Remember that the row context iterates but does not filter. When we iterate on `UniqueOrders`, although a row context exists for each order ID, this is not used to filter the Orders table. For each order ID, the same distinct count of product IDs across all orders is returned, namely 77. (As it happens, every product available for sale has been ordered at least once in this dataset, so it is the same value as the number of distinct products.) The average of many 77s is still 77.

The correct solution is to wrap the `DISTINCTCOUNT` within `CALCULATE` to ensure context transition occurs when we iterate on `UniqueOrders`. This means the row context with the current value of Orders[OrderID] is converted into an equivalent filter context. Now, a different distinct count of product IDs is returned for each order. In turn, the measure `[06d. Average order size for all years and all customers]` is correctly calculated at all levels of detail.

### Day 7: What is the order value in \$ of open orders (not shipped yet)?

The ‘order value’ here means sales value. First, we create a measure to calculate total sales for a given set of orders:

Now, we change the filter context to only include orders where the shipped date is blank (i.e. orders that haven’t been shipped yet). Then we calculate the `[07a. Total sales]` measure within this updated filter context:

### Day 8: How many orders are “single item” (only one product ordered)?

This is very similar to day 6, but instead we need to count only orders where the order size (number of distinct products ordered) is 1.

Once again, we use `VALUES( Orders[OrderID] )` to derive the list of unique orders. Then, using `HASONEVALUE`, we filter `UniqueOrders` to work out which of these orders only include a single product. Since the `FILTER` iterator only provides a row context and does not introduce a filter context, we use `CALCULATE` to force context transition and obtain the result of `HASONEVALUE` per order rather than across all orders. In the end, we count the number of rows in this filtered `SingleItemOrders` table.

### Day 9: Average sales per transaction (OrderID) for “Romero y tomillo”

We will solve this in two steps. First, we create a measure for average sales per order using `AVERAGEX` on the list of unique orders and total sales, both of which we’ve covered before.

Then, we modify the filter context, keeping only orders for the ship name “Romero y tomillo”, and calculate the average sales for these orders.

### Day 10: Days since “North/South” last purchase

Looking through the data we see that “North/South” is a value in the Customers[CompanyName] column. We work out the last date of purchase for “North/South” first, by modifying the filter context and using `MAX` on Orders[OrderDate] to get the most recent date. Then we use `DATEDIFF` to return the number of `DAY`s between this order date and today.

IMPORTANT!

There is a bug in the dataset that affects the result of the measure above.

In the starter PBIX file from the Curbal challenge page, values in the Orders[OrderDate] column are dynamically generated so that the latest date is always the current date (i.e. whenever you open the file). However, there is a footnote on every Curbal challenge answer page that says the last order date should be 10/4/2022 (i.e. 4 October 2022 when rendered in US format).

Footnote on Curbal challenge answer pages

To make sure this is the case, open up Power Query, navigate to the Orders table and adjust the M code with the Advanced Editor. Comment out (or remove) line 6 and add the following line right below it (see highlighted lines in the screenshot below):

`NoDays = Number.From(Date.From("2022-10-04")-Date.From(List.Max(#"Removed Columns1"[OrderDate]))),`

Fixing Orders[OrderDate] in Power Query

After doing this, you should see the measure `[10. Days since "North South" last purchase]` calculated correctly.

And that’s a wrap on the solutions for days 6 to 10. We’ll go through the solutions for days 11 to 15 next time. Feel free to reach out on LinkedIn if you have any questions or suggestions for improving these solutions.

##### Author: Vincent Ging Ho Yim

Vincent has always enjoyed learning new things as well as finding elegant and efficient solutions to problems since childhood. He studied linguistics at university and has subsequently worked in theatre lighting and broadcast captioning. In his previous job he found his passion working with data and decided to pursue a change in career. In his spare time he likes reading, learning languages (both human and programming ones) and playing Pic-a-Pix and sudoku. He loves laksa, sushi and burritos.