Pentaho Analysis - Mondrian

Allow rollup of measures based on AVG aggregate function

Details

  • Type: New Feature New Feature
  • Status: Closed Closed
  • Priority: Severe Severe
  • Resolution: Fixed
  • Affects Version/s: 3.1.5 GA
  • Component/s: None
  • Labels:
    None
  • Customer Case:
  • CI Build:
    bi-server-ee-MON33.zip, Sep 16, 2011 2:21:10 PM(#26)

Description

Allow rollup of measures based on AVG aggregate function.

In this thread I described how it's difficult: http://forums.pentaho.org/showthread.php?t=56603 But it can be done if there are sum and count measures based on the same column and mondrian is smart enough to rewrite in terms of those measures.

I've added a test case to CompoundSlicerTest. Search for this bug number.
  1. mon791.xanalyzer
    19/Sep/11 4:49 AM
    6 kB
    Suhas Gururaja
  2. pentaho1.log
    19/Sep/11 4:54 AM
    27 kB
    Suhas Gururaja
  3. steelwheelsA.mondrian (1).xml
    19/Sep/11 4:49 AM
    6 kB
    Suhas Gururaja
  1. 3.8_report.png
    203 kB
    19/Sep/11 4:49 AM
  2. 4.0_report.png
    281 kB
    19/Sep/11 4:54 AM

Issue Links

Activity

Hide
Julian Hyde added a comment - 01/Jan/10 9:35 PM
Here is the stack trace/query for the test case:

java.lang.RuntimeException: mondrian.olap.fun.MondrianEvaluationException: Don't know how to rollup aggregator 'avg'

        at com.pentaho.analyzer.service.impl.AnalysisAreaManager.executeQuery(SourceFile:275)

        at com.pentaho.analyzer.report.ReportManagerImpl.parseAndExecuteMDX(SourceFile:293)

        at com.pentaho.analyzer.report.ReportManagerImpl.getCVResultSet(SourceFile:352)

        at com.pentaho.analyzer.report.ReportManagerImpl.runReport(SourceFile:110)

        at com.pentaho.analyzer.report.ReportManagerImpl.runReport(SourceFile:81)

        at com.pentaho.analyzer.report.ReportRequestService$a.ca(SourceFile:250)

        at com.pentaho.analyzer.report.ReportRequestService$a.call(SourceFile:241)

        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)

        at java.util.concurrent.FutureTask.run(FutureTask.java:123)

        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)

        at java.lang.Thread.run(Thread.java:595)

Caused by: mondrian.olap.fun.MondrianEvaluationException: Don't know how to rollup aggregator 'avg'

        at mondrian.olap.fun.FunUtil.newEvalException(FunUtil.java:78)

        at mondrian.olap.fun.AggregateFunDef$AggregateCalc.aggregate(AggregateFunDef.java:100)

        at mondrian.olap.fun.AggregateFunDef$AggregateCalc.evaluate(AggregateFunDef.java:72)

        at mondrian.rolap.RolapEvaluator.evaluateCurrent(RolapEvaluator.java:461)

        at mondrian.calc.impl.MemberValueCalc.evaluate(MemberValueCalc.java:73)

        at mondrian.rolap.RolapEvaluator.evaluateCurrent(RolapEvaluator.java:461)

        at mondrian.calc.impl.ValueCalc.evaluate(ValueCalc.java:36)

        at mondrian.olap.fun.FunUtil.evaluateMembers(FunUtil.java:386)

        at mondrian.olap.fun.FunUtil.sortMembers(FunUtil.java:476)

        at mondrian.olap.fun.OrderFunDef$MemberCalcImpl.evaluateDual(OrderFunDef.java:189)

        at mondrian.olap.fun.OrderFunDef$ContextCalc.evaluate(OrderFunDef.java:452)

        at mondrian.rolap.RolapResult.evaluateExp(RolapResult.java:847)

        at mondrian.rolap.RolapNamedSetEvaluator.ensureList(RolapNamedSetEvaluator.java:140)

        at mondrian.rolap.RolapNamedSetEvaluator.evaluateMemberIterable(RolapNamedSetEvaluator.java:68)

        at mondrian.mdx.NamedSetExpr$2.evaluateMemberIterable(NamedSetExpr.java:122)

        at mondrian.calc.impl.AbstractMemberIterCalc.evaluate(AbstractMemberIterCalc.java:52)

        at mondrian.rolap.RolapResult.executeAxis(RolapResult.java:727)

        at mondrian.rolap.RolapResult.evalLoad(RolapResult.java:570)

        at mondrian.rolap.RolapResult.loadMembers(RolapResult.java:536)

        at mondrian.rolap.RolapResult.(RolapResult.java:269)

        at mondrian.rolap.RolapConnection.execute(RolapConnection.java:563)

        at com.pentaho.analyzer.service.impl.AnalysisAreaManager.executeQuery(SourceFile:267)

        ... 11 more

 

With
Set [*NATIVE_CJ_SET] as 'NonEmptyCrossJoin([*BASE_MEMBERS_Student],[*BASE_MEMBERS_Time])'
Set [*SORTED_ROW_AXIS] as 'Order([*CJ_ROW_AXIS],[Measures].[*FORMATTED_MEASURE_0],BDESC)'
Set [*BASE_MEMBERS_Student] as '[Student].[Student Name].Members'
Set [*NATIVE_MEMBERS_Time] as 'Generate([*NATIVE_CJ_SET], {[Time].CurrentMember})'
Set [*BASE_MEMBERS_Time] as '{[Time].[All Time].[1999],[Time].[All Time].[2000],[Time].[All Time].[2001],[Time].[All Time].[2002],[Time].[All Time].[2003]}'
Set [*NATIVE_MEMBERS_Student] as 'Generate([*NATIVE_CJ_SET], {[Student].CurrentMember})'
Set [*BASE_MEMBERS_Measures] as '{[Measures].[*FORMATTED_MEASURE_0]}'
Set [*CJ_ROW_AXIS] as 'Generate([*NATIVE_CJ_SET], {([Student].currentMember)})'
Set [*CJ_COL_AXIS] as '[*NATIVE_CJ_SET]'
Member [Measures].[*FORMATTED_MEASURE_0] as '[Measures].[Numeracy Benchmark]', FORMAT_STRING = '', SOLVE_ORDER=400
Member [Time].[*SLICER_MEMBER] as 'Aggregate ([*NATIVE_MEMBERS_Time])', SOLVE_ORDER=-400
Select
[*BASE_MEMBERS_Measures] on columns,
Non Empty [*SORTED_ROW_AXIS] on rows
From [Sales Data Cube]
Where ([Time].[*SLICER_MEMBER])

SqlTupleReader.readTuples [[Student].[Student Name], [Time].[Year]]: done executing sql [select `sales_rep`.`SALESREP` as `c0`, Year(cord.ORDER_DATE) as `c1` from `sales_rep` as `sales_rep`, `cord` as `cord` where `cord`.`SALESREP_ID` = `sales_rep`.`SALESREP_ID` and (Year(cord.ORDER_DATE) in ('1999', '2000', '2001', '2002', '2003')) group by `sales_rep`.`SALESREP`, Year(cord.ORDER_DATE) order by ISNULL(`sales_rep`.`SALESREP`), `sales_rep`.`SALESREP` ASC, ISNULL(Year(cord.ORDER_DATE)), Year(cord.ORDER_DATE) ASC], exec+fetch 172 ms, 558 rows
Show
Julian Hyde added a comment - 01/Jan/10 9:35 PM Here is the stack trace/query for the test case: java.lang.RuntimeException: mondrian.olap.fun.MondrianEvaluationException: Don't know how to rollup aggregator 'avg'         at com.pentaho.analyzer.service.impl.AnalysisAreaManager.executeQuery(SourceFile:275)         at com.pentaho.analyzer.report.ReportManagerImpl.parseAndExecuteMDX(SourceFile:293)         at com.pentaho.analyzer.report.ReportManagerImpl.getCVResultSet(SourceFile:352)         at com.pentaho.analyzer.report.ReportManagerImpl.runReport(SourceFile:110)         at com.pentaho.analyzer.report.ReportManagerImpl.runReport(SourceFile:81)         at com.pentaho.analyzer.report.ReportRequestService$a.ca(SourceFile:250)         at com.pentaho.analyzer.report.ReportRequestService$a.call(SourceFile:241)         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)         at java.util.concurrent.FutureTask.run(FutureTask.java:123)         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)         at java.lang.Thread.run(Thread.java:595) Caused by: mondrian.olap.fun.MondrianEvaluationException: Don't know how to rollup aggregator 'avg'         at mondrian.olap.fun.FunUtil.newEvalException(FunUtil.java:78)         at mondrian.olap.fun.AggregateFunDef$AggregateCalc.aggregate(AggregateFunDef.java:100)         at mondrian.olap.fun.AggregateFunDef$AggregateCalc.evaluate(AggregateFunDef.java:72)         at mondrian.rolap.RolapEvaluator.evaluateCurrent(RolapEvaluator.java:461)         at mondrian.calc.impl.MemberValueCalc.evaluate(MemberValueCalc.java:73)         at mondrian.rolap.RolapEvaluator.evaluateCurrent(RolapEvaluator.java:461)         at mondrian.calc.impl.ValueCalc.evaluate(ValueCalc.java:36)         at mondrian.olap.fun.FunUtil.evaluateMembers(FunUtil.java:386)         at mondrian.olap.fun.FunUtil.sortMembers(FunUtil.java:476)         at mondrian.olap.fun.OrderFunDef$MemberCalcImpl.evaluateDual(OrderFunDef.java:189)         at mondrian.olap.fun.OrderFunDef$ContextCalc.evaluate(OrderFunDef.java:452)         at mondrian.rolap.RolapResult.evaluateExp(RolapResult.java:847)         at mondrian.rolap.RolapNamedSetEvaluator.ensureList(RolapNamedSetEvaluator.java:140)         at mondrian.rolap.RolapNamedSetEvaluator.evaluateMemberIterable(RolapNamedSetEvaluator.java:68)         at mondrian.mdx.NamedSetExpr$2.evaluateMemberIterable(NamedSetExpr.java:122)         at mondrian.calc.impl.AbstractMemberIterCalc.evaluate(AbstractMemberIterCalc.java:52)         at mondrian.rolap.RolapResult.executeAxis(RolapResult.java:727)         at mondrian.rolap.RolapResult.evalLoad(RolapResult.java:570)         at mondrian.rolap.RolapResult.loadMembers(RolapResult.java:536)         at mondrian.rolap.RolapResult.(RolapResult.java:269)         at mondrian.rolap.RolapConnection.execute(RolapConnection.java:563)         at com.pentaho.analyzer.service.impl.AnalysisAreaManager.executeQuery(SourceFile:267)         ... 11 more   With Set [*NATIVE_CJ_SET] as 'NonEmptyCrossJoin([*BASE_MEMBERS_Student],[*BASE_MEMBERS_Time])' Set [*SORTED_ROW_AXIS] as 'Order([*CJ_ROW_AXIS],[Measures].[*FORMATTED_MEASURE_0],BDESC)' Set [*BASE_MEMBERS_Student] as '[Student].[Student Name].Members' Set [*NATIVE_MEMBERS_Time] as 'Generate([*NATIVE_CJ_SET], {[Time].CurrentMember})' Set [*BASE_MEMBERS_Time] as '{[Time].[All Time].[1999],[Time].[All Time].[2000],[Time].[All Time].[2001],[Time].[All Time].[2002],[Time].[All Time].[2003]}' Set [*NATIVE_MEMBERS_Student] as 'Generate([*NATIVE_CJ_SET], {[Student].CurrentMember})' Set [*BASE_MEMBERS_Measures] as '{[Measures].[*FORMATTED_MEASURE_0]}' Set [*CJ_ROW_AXIS] as 'Generate([*NATIVE_CJ_SET], {([Student].currentMember)})' Set [*CJ_COL_AXIS] as '[*NATIVE_CJ_SET]' Member [Measures].[*FORMATTED_MEASURE_0] as '[Measures].[Numeracy Benchmark]', FORMAT_STRING = '', SOLVE_ORDER=400 Member [Time].[*SLICER_MEMBER] as 'Aggregate ([*NATIVE_MEMBERS_Time])', SOLVE_ORDER=-400 Select [*BASE_MEMBERS_Measures] on columns, Non Empty [*SORTED_ROW_AXIS] on rows From [Sales Data Cube] Where ([Time].[*SLICER_MEMBER]) SqlTupleReader.readTuples [[Student].[Student Name], [Time].[Year]]: done executing sql [select `sales_rep`.`SALESREP` as `c0`, Year(cord.ORDER_DATE) as `c1` from `sales_rep` as `sales_rep`, `cord` as `cord` where `cord`.`SALESREP_ID` = `sales_rep`.`SALESREP_ID` and (Year(cord.ORDER_DATE) in ('1999', '2000', '2001', '2002', '2003')) group by `sales_rep`.`SALESREP`, Year(cord.ORDER_DATE) order by ISNULL(`sales_rep`.`SALESREP`), `sales_rep`.`SALESREP` ASC, ISNULL(Year(cord.ORDER_DATE)), Year(cord.ORDER_DATE) ASC], exec+fetch 172 ms, 558 rows
Hide
Julian Hyde added a comment - 20/May/10 3:30 PM
(My comments on an email thread, and a suggested workaround.)

Mondrian is being truthful when it says "Don't know how to rollup aggregator 'avg' " -- it can't know to roll up averages. If I tell you the average age of Male customers is 32 and the average age of Female customers is 28, what's the average age of all customers? There's simply not enough information.

However, if you replace your 'avg' measure with two stored measures 'sum' and 'count', then create a calculated measure from these, you will be able to roll up.
Show
Julian Hyde added a comment - 20/May/10 3:30 PM (My comments on an email thread, and a suggested workaround.) Mondrian is being truthful when it says "Don't know how to rollup aggregator 'avg' " -- it can't know to roll up averages. If I tell you the average age of Male customers is 32 and the average age of Female customers is 28, what's the average age of all customers? There's simply not enough information. However, if you replace your 'avg' measure with two stored measures 'sum' and 'count', then create a calculated measure from these, you will be able to roll up.
Hide
Sergey Mazin added a comment - 20/Jul/10 4:35 AM
Hi Julian,

I think I might have a possible solution to AVG problem. There are basically 2 issues:

1) "Don't know how to rollup aggregator 'avg'"
2) Mondrian is not able to roll up calculated measure [Measures].[AVG] = [Measures].[Sum] / [Measures].[Count]

Issue #1:
"If I tell you the average age of Male customers is 32 and the average age of Female customers is 28, what's the average age of all customers? There's simply not enough information."

For "Avg" aggregator the only way to get correct result for "all customers" is to ask it separately from database. We need to create separate aggregation batch for each level/combination of levels in the result set. (currently it is done so for "distinct-count" measures). What we get is that database will produce all results we need in two separate queries. 1) Male + Female 2) All Customers


Issue #2:
Mondrian is not able to roll up any calculated measures because only base measures have aggregators.
I wrote a method aggregateCalcMember(Calc calc, Evaluator evaluator, List list, Exp parentExp) which looks into calculated measure expression and according to expression/children expression type and FundDef name decides what to do next.

in our example:
[Measures].[AVG] = [Measures].[Sum] / [Measures].[Count]

We have 2 base measures with aggregators sum and count. We can aggregate them separately (one by one) and then calculate [Measures].[AVG] measure (divide results from SUM measure with results from COUNT measure).

I attach source code and couple of test cases. That method definitely needs some additional work to be done.
Show
Sergey Mazin added a comment - 20/Jul/10 4:35 AM Hi Julian, I think I might have a possible solution to AVG problem. There are basically 2 issues: 1) "Don't know how to rollup aggregator 'avg'" 2) Mondrian is not able to roll up calculated measure [Measures].[AVG] = [Measures].[Sum] / [Measures].[Count] Issue #1: "If I tell you the average age of Male customers is 32 and the average age of Female customers is 28, what's the average age of all customers? There's simply not enough information." For "Avg" aggregator the only way to get correct result for "all customers" is to ask it separately from database. We need to create separate aggregation batch for each level/combination of levels in the result set. (currently it is done so for "distinct-count" measures). What we get is that database will produce all results we need in two separate queries. 1) Male + Female 2) All Customers Issue #2: Mondrian is not able to roll up any calculated measures because only base measures have aggregators. I wrote a method aggregateCalcMember(Calc calc, Evaluator evaluator, List list, Exp parentExp) which looks into calculated measure expression and according to expression/children expression type and FundDef name decides what to do next. in our example: [Measures].[AVG] = [Measures].[Sum] / [Measures].[Count] We have 2 base measures with aggregators sum and count. We can aggregate them separately (one by one) and then calculate [Measures].[AVG] measure (divide results from SUM measure with results from COUNT measure). I attach source code and couple of test cases. That method definitely needs some additional work to be done.
Hide
Brian Hagan added a comment - 19/Oct/10 10:26 AM
added customer case
Show
Brian Hagan added a comment - 19/Oct/10 10:26 AM added customer case
Hide
Man Shing Yau added a comment - 19/Nov/10 5:48 AM
New cased added.
Show
Man Shing Yau added a comment - 19/Nov/10 5:48 AM New cased added.
Hide
Kurtis Cruzada added a comment - 23/Nov/10 8:08 AM
leaving in triage for Jared's comments.
Show
Kurtis Cruzada added a comment - 23/Nov/10 8:08 AM leaving in triage for Jared's comments.
Hide
Man Shing Yau added a comment - 15/Dec/10 10:39 AM
New case added
Show
Man Shing Yau added a comment - 15/Dec/10 10:39 AM New case added
Hide
Jake Cornelius added a comment - 28/Dec/10 8:13 AM
Jared, can you investigate if we can accomplish the same thing using a calculated member
- have a sum and count in your schema
- create a calculated member based on those that calc's the average
- validate that the numbers are correct when used in Analyzer
Show
Jake Cornelius added a comment - 28/Dec/10 8:13 AM Jared, can you investigate if we can accomplish the same thing using a calculated member - have a sum and count in your schema - create a calculated member based on those that calc's the average - validate that the numbers are correct when used in Analyzer
Hide
Jared Cornelius added a comment - 04/Jan/11 12:59 PM
The workaround is valid - you can accomplish this using a calculated measure.
Show
Jared Cornelius added a comment - 04/Jan/11 12:59 PM The workaround is valid - you can accomplish this using a calculated measure.
Hide
Anthony Carter added a comment - 05/Jan/11 4:47 AM
Customer Case added
Show
Anthony Carter added a comment - 05/Jan/11 4:47 AM Customer Case added
Hide
Man Shing Yau added a comment - 03/Mar/11 7:30 AM
New customer added
Show
Man Shing Yau added a comment - 03/Mar/11 7:30 AM New customer added
Hide
Julian Hyde added a comment - 07/Mar/11 2:20 AM
Partially fixed in change 14149 (will be in mondrian-3.3).
Show
Julian Hyde added a comment - 07/Mar/11 2:20 AM Partially fixed in change 14149 (will be in mondrian-3.3).
Hide
Julian Hyde added a comment - 11/Jul/11 5:46 PM
It would help everyone if this issue were boiled down to an MDX statement that fails against FoodMart. (I burned quite a lot of time on this issue solving what turned out to be the wrong problem.) Can someone please provide that?
Show
Julian Hyde added a comment - 11/Jul/11 5:46 PM It would help everyone if this issue were boiled down to an MDX statement that fails against FoodMart. (I burned quite a lot of time on this issue solving what turned out to be the wrong problem.) Can someone please provide that?
Hide
Luc Boudreau added a comment - 03/Aug/11 3:36 PM
Fixed and tested in revision 14516. I have implemented a basic algorithm to attempt an AVG rollup which scans the cube's measures and tries to find a COUNT and a SUM measures to perform the avg rollup. If the cube has both a COUNT and a SUM measure defined on the same column, the AVG rollup can take place. If not, it throws an exception telling it couldn't perform an AVG rollup.
Show
Luc Boudreau added a comment - 03/Aug/11 3:36 PM Fixed and tested in revision 14516. I have implemented a basic algorithm to attempt an AVG rollup which scans the cube's measures and tries to find a COUNT and a SUM measures to perform the avg rollup. If the cube has both a COUNT and a SUM measure defined on the same column, the AVG rollup can take place. If not, it throws an exception telling it couldn't perform an AVG rollup.
Hide
Suhas Gururaja added a comment - 19/Sep/11 4:20 AM
Tested this issue on CI build dated Sep 16, 2011 2:21:10 PM (#26).

Repro:

1) Please use the attached schema file.
2) Login to PUC.
3)Click on New Analyzer report.
4)Select Asteelwheels cube.
5) Drag and drop Territory, Sales, Quantity and Average sales.
6) Note that there will be no error reported in the log.
Please find the attached screen shot and the log file. This issue is fixed.
Show
Suhas Gururaja added a comment - 19/Sep/11 4:20 AM Tested this issue on CI build dated Sep 16, 2011 2:21:10 PM (#26). Repro: 1) Please use the attached schema file. 2) Login to PUC. 3)Click on New Analyzer report. 4)Select Asteelwheels cube. 5) Drag and drop Territory, Sales, Quantity and Average sales. 6) Note that there will be no error reported in the log. Please find the attached screen shot and the log file. This issue is fixed.

People

Vote (9)
Watch (11)

Dates

  • Created:
    01/Jan/10 4:45 PM
    Updated:
    19/Sep/11 4:56 AM
    Resolved:
    19/Sep/11 4:55 AM