Hi all,
we're facing frequent problems with calculating count distincts, and I think they could be of greater value to be discussed here (sorry if this seems too enthusiastic to some of you, but believe me if you're not aware of this you'll run into problems)
First, consider a DWH where you have millions of customers making purchases, and we're lucky because we can identify them all (well, probably it would have been better not being able to, as then the data volume would be smaller
, and there's a table in which the unique customer ID is stored.
However, when you want the full flexibility over time you may face some issues with the data volume and also query speed, so you think of an alternative approach where you precalculate predefined levels, as aggregates won't work fine (how to aggregate efficiently on so many distinct customers ...?! not possible at all...). So you put a LEVEL indicator in the table and try to get this rewritten in MSTR for aggregates, which seems to be virtually impossible.
Another more practical problem is ocurring with consolidations: When a COUNT DISTINCT metric is defined, it seems that MSTR is not applying consolidations correctly, such as on product groups (how many distinct customers have bought products from a product group...?). At least the SQL it creates doesn't count on groups (only consolidation, no attribute on this level) but still on the more detailed level of products (on which attribute the consolidation was defined).
Finally slightly out of topic, is there a way to flexibly define a 'Rest' element in a consolidation?
Thanks for your thoughts!
Best regards
we're facing frequent problems with calculating count distincts, and I think they could be of greater value to be discussed here (sorry if this seems too enthusiastic to some of you, but believe me if you're not aware of this you'll run into problems)
First, consider a DWH where you have millions of customers making purchases, and we're lucky because we can identify them all (well, probably it would have been better not being able to, as then the data volume would be smaller
However, when you want the full flexibility over time you may face some issues with the data volume and also query speed, so you think of an alternative approach where you precalculate predefined levels, as aggregates won't work fine (how to aggregate efficiently on so many distinct customers ...?! not possible at all...). So you put a LEVEL indicator in the table and try to get this rewritten in MSTR for aggregates, which seems to be virtually impossible.
Another more practical problem is ocurring with consolidations: When a COUNT DISTINCT metric is defined, it seems that MSTR is not applying consolidations correctly, such as on product groups (how many distinct customers have bought products from a product group...?). At least the SQL it creates doesn't count on groups (only consolidation, no attribute on this level) but still on the more detailed level of products (on which attribute the consolidation was defined).
Finally slightly out of topic, is there a way to flexibly define a 'Rest' element in a consolidation?
Thanks for your thoughts!
Best regards