Avoid Non-numeric facts!

Yes, I’ve done this mistake more than once, usually unwilfully. Using varchar columns in fact table.

Everyone knows that using anything but numeric facts is a bad idea. However, there’re situations when it might be deemed necessary, especially when dealing with reporting at the detailed level. For example, client once wanted Condition Grade for previous year using AGO function in Fact Table. There was a need for previous year reporting using AGO and TODATE functions. Also, there was a complex function evaluating variance in condition. Since there’re limited aggregation functions available for varchars in OBIEE  (FIRST and LAST..I don’t countCOUNT – since it doesn’t really do it for that report). And you need to set some aggergation, because otherwise, you can’t use those column in time-series functions since those require aggergated metrics.

Well, what do you know. Query performance for the whole dataset was very slow (it’s supposed to be, since nobody should get inital report of more than a page) . About 20 minutes. Setting up bitmap indexes reduced it to 4 minutes. And setting textual facts’ aggregation setting to “None” reduced that time further acceptable 20 seconds. However, time-series functionality was lost.

So, please don’t use textual facts – they won’t make you any good:

a) You can’t really get any meaning out of them except on the most detailed level. How do you measure A vs B ?

b) You run into performance issues.

What to do:

I guess it reallty depends on the situation and your gut feeling.  Ideally, you want to convert letter grades to numeric alternative.  Such as A=4, B=3, etc. then you can make all kinds of fun analysis with it.  If that’s not an option, then you should try  pushing that data to your dimension (using an aliased combo table if you don’t have that dimension). I understand there might be more viable solutions. I’m going to find my post and discussion on OTN.

4 thoughts on “Avoid Non-numeric facts!”

  1. I too am currently implementing a solution for HR data. Like it or not there are lots of textual facts and that is just life! I think OBIEE could better support stuff like this. As you say the basic solution is to put an aggregate function so that OBIEE treats them as measures and this works fine for me. What I found was that I only needed to do this when I was combining 2 facts together and needed to enter filters against one fact – without them being defined as measures OBIEE errored. An example would be: I have an Employee dimension, fact 1 is Salary history and fact 2 is Salary Payment Method history and the sort of report I have to produce is Employee Information, Salary, Payment Method for a user entered Date. Next level of complexity would be the same info but just for Employees whose Salary payment method is ‘Bank Account’. What caught my eye in your discussion is where you say “setting textual facts’ aggregation setting to “None”. Do you mean in the Logical model?

  2. Steve,

    It’s true – you can’t avoid those completely – I just wish there was a better support for those as well. There’re many valid cases, when metrics involve words. And you can’t push everything to dimensions.

    Yes, I meant setting their aggregation to None in BMM. It solves some problems, however, it opens another can of worms. For example, you can’t use time-series functions on those.

    By the way, have you been able to solve your problem with multiple facts?

  3. Can we not map the textual facts under an attribute dimension?. We do not need to create a seperate dimension table, but in the BMM layer just map it under a logical table using the fact as the LTS.

  4. Well I’ve now spent a considerable amount of time trying to solve this and am still left with a problem of multi ‘fact’ queries and filters on these pseudo measures (which is what I am calling textual facts with an aggregate function). Since they have an agg function the SQL generated uses having, rather than a where clause which is obviously not the most efficient. I’ve become more interested in the general question of How suitable is OBIEE fof reporting against OLTP/3NF schemas? This is particularly relevant as OBIEE is Oracle’s strategic direction for all ad-hic reporting and right now it looks like that you can do more with Discoverer so customers are in danger of blindly migrating but will then find they have to invest in building Data Marts in order to provide the same reporting. So with that in mind I think what the community needs is a library of 3NF/OLTP data model patterns and how they should be implemented in OBIEE. I am more than happy to be the custodian of this library and am slowly working through some as we speak. Back to my HR problem, I’ve now got a cut down RPD and blog that I can share with anyone that is interested in helping me solve the problem described above. If you are interested, drop me an Email at sdevine@hitachiconsulting.com

Leave a Reply