I've been tasked with altering existing ETL processes to reflect changes in the data, and to also alter an existing cube which will be using this data.
My question is, can you filter the drillthrough results based on whether a particular dimension has been used or not, and if it hasn't, return only distinct rows?
Perhaps I should explain the situation more.
Our current processes regularly pull data from our source system, perform some transformations and validation etc using staging tables, and eventually store the data in SQL Server tables. The data is then used to populate our cubes, which users access via a custom built application and build reports on them.
One area of the source system is changing due to business requirements. The table, PersonRelease, stores information about employees when they leave the company, one row for every time a person leaves. The table is keyed on the EmployeeID and ReleaseDate.
At the moment, this table has a column, Reason, which records why a person left. This data is used in the PersonRelease cube, and there is a ReleaseReason dimension which joins to the fact table on the Reason column.
The business now needs to be able to record up to three reasons of why a person leaves.
My manager does not want extra columns to be added to this table. Instead, he wants a new table created (PersonReleaseReason) with a 1:M relationship to the PersonRelease table, joined by EmployeeID.
That way, if the business requirements change (highly likely) and they want more than three reasons, we will not have to alter the schema. It's also good this way because when the data flows into the PersonRelease cube, the dimension will still work correctly without making any changes.
My approach so far is to:
Change how the fact table is loaded with data. The load will now join PersonRelease and PersonReleaseReason tables to get all reasons for a person leaving.
This will increase the rows loaded into the fact table.
As such, I will need a Distinct Count, and a normal Count.
Distinct Count for number of people leaving.
Count for number of records.
Everything so far is easy to implement.
The problem is when it comes to changing the cube's drillthrough.
Because of these extra rows, my manager wants to be able to drillthrough on the cube as follows:
If the report does not use the Release reason dimension to filter on, bring back only distinct PersonRelease data. Distinct is based on the original key columns (Person and ReleaseDate), and doesn't include the Reason column.
i.e. if there are 100 Release records all with 3 reasons (300 rows in fact table) bring back the 100 distinct rows.
(The reasoning behind this is so that when we alter the cube, existing users won't see a difference in numbers etc...)
If the report does use the Release reason dimension, bring back all rows that match the values selected in the dimensions.
And, I know there is drillthrough filter functionality, but notice it's more to filter on a particular value (i.e. where statement), rather than using it to return distinct values, like using a group by.
I guess what I need is a more dynamic drillthrough filter functionality, which checks to see if the ReleaseReason dimension has been used in the report, and if it hasn’t, to return less rows.
Is this possible??????
Even if that’s possible, I can foresee some problems when it comes to the drillthrough, especially since the Reason column is shown in the drillthrough. If I am able to return fewer rows when the ReleaseReason dimension is not used, what value should appear in the Release column?
Anyways, thanks in advance for any thoughts, ideas, and workarounds. I would r