Pentaho Analysis - Mondrian
MONDRIAN-608

Performance issue with large number of measures in Mondrian 3.0.4

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Severe
    • Resolution: Fixed
    • Affects Version/s: 3.1.2 GA
    • Component/s: None
    • Labels:
      None
    • Operating System/s:
      Windows XP

      Description

      Mondrian 3.0.4 and 3.1.2 perform poorly when a calculated measure uses a large number of base measures.

      [Test Case]

      1. Add the following calculated measures to the Sales cube. M0 is the sum of 7 base-measure terms. M1 is the sum of M0 and 17 base-measure terms. M2 is the sum of M1 and 17 base-measure terms.

        <CalculatedMember
            name="M0"
            dimension="Measures"
            formula="[Measures].[Unit Sales] + [Measures].[Store Cost] + [Measures].[Store Sales] + [Measures].[Customer Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count]" />

        <CalculatedMember
            name="M1"
            dimension="Measures"
            formula="[Measures].[M0] + [Measures].[Unit Sales] + [Measures].[Store Cost] + [Measures].[Store Sales] + [Measures].[Customer Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count]" />

        <CalculatedMember
            name="M2"
            dimension="Measures"
            formula="[Measures].[M1] + [Measures].[Unit Sales] + [Measures].[Store Cost] + [Measures].[Store Sales] + [Measures].[Customer Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count] + [Measures].[Sales Count]" />

      2. Select M1 in the query. It took 3.5 seconds to return.

        WITH
        SET [#DataSet#] as 'NonEmptyCrossjoin(
            {[Product].[Food]},
            {Descendants([Store].[USA], 1)}
        )'
        SELECT {[Measures].[M1]} on columns,
            NON EMPTY Hierarchize({[#DataSet#]}) on rows
        FROM [Sales]

      3. Select M2. The query ran for 5 minutes without returning.

        WITH
        SET [#DataSet#] as 'NonEmptyCrossjoin(
            {[Product].[Food]},
            {Descendants([Store].[USA], 1)}
        )'
        SELECT {[Measures].[M2]} on columns,
            NON EMPTY Hierarchize({[#DataSet#]}) on rows
        FROM [Sales]

      [Proposed fix]
      The problem appears to be in the code that performs non-empty optimization for CrossJoin; more specifically, in the nonEmptyList() method of the CrossJoinFunDef class. It seems we can fix it by modifying the visit(ResolvedFunCall) method of CrossJoinFunDef's MeasureVisitor.
      Current code:
              public Object visit(ResolvedFunCall funcall) {
                  Exp[] exps = funcall.getArgs();
                  if (exps != null) {
                      for (Exp exp: exps) {
                          exp.accept(this);
                      }
                  }
                  return null;
              }
      Proposed fix:
              public Object visit(ResolvedFunCall funcall) {
                  return null;
              }
      The visitor probably does not need to accept the arguments of a resolved function call here, since the accept() method of the ResolvedFunCall class already does so. In the test case, MeasureVisitor.visit() and ResolvedFunCall.accept() called each other, so every argument was traversed twice at each level of nesting, which led to heavy recursive computation.

      This change passed Foodmart unit tests.
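
The effect of the double traversal can be reproduced outside Mondrian. The following is a minimal sketch, not Mondrian code: Exp, Leaf, Call and CountingVisitor are hypothetical stand-ins for the expression and visitor types, and the left-deep chain built by sumOf() assumes that a sum such as a + b + c is represented as nested binary calls. Call.accept() visits the node and then its arguments, as the report says ResolvedFunCall.accept() does; the alsoTraverse flag switches between the current visitor behavior (traverse the arguments again) and the proposed one (do nothing).

```java
import java.util.List;

class VisitorDoubleTraversal {
    interface Exp { void accept(Visitor v); }

    interface Visitor {
        void visit(Leaf leaf);
        void visit(Call call);
    }

    /** Stand-in for a reference to a base measure. */
    static final class Leaf implements Exp {
        public void accept(Visitor v) { v.visit(this); }
    }

    /** Stand-in for a resolved function call such as "+". */
    static final class Call implements Exp {
        final List<Exp> args;
        Call(Exp... args) { this.args = List.of(args); }

        // The visitee visits itself and then traverses its own arguments.
        public void accept(Visitor v) {
            v.visit(this);
            for (Exp arg : args) {
                arg.accept(v);
            }
        }
    }

    /** Counts how many times leaves are visited. */
    static final class CountingVisitor implements Visitor {
        final boolean alsoTraverse;
        long leafVisits;

        CountingVisitor(boolean alsoTraverse) { this.alsoTraverse = alsoTraverse; }

        public void visit(Leaf leaf) { leafVisits++; }

        public void visit(Call call) {
            if (alsoTraverse) {
                // Mirrors the "current code": the visitor traverses the
                // arguments even though Call.accept() will do so as well.
                for (Exp arg : call.args) {
                    arg.accept(this);
                }
            }
            // Mirrors the "proposed fix": leave traversal to accept().
        }
    }

    /** Builds ((leaf + leaf) + leaf) + ... with the given number of leaves. */
    static Exp sumOf(int leaves) {
        Exp e = new Leaf();
        for (int i = 1; i < leaves; i++) {
            e = new Call(e, new Leaf());
        }
        return e;
    }

    static long countLeafVisits(int leaves, boolean alsoTraverse) {
        CountingVisitor v = new CountingVisitor(alsoTraverse);
        sumOf(leaves).accept(v);
        return v.leafVisits;
    }

    public static void main(String[] args) {
        for (int n : new int[] {8, 16, 24}) {
            System.out.println(n + " leaves: double traversal = "
                + countLeafVisits(n, true)
                + ", single traversal = " + countLeafVisits(n, false));
        }
    }
}
```

In this sketch a chain of n leaves costs n leaf visits with single traversal but 3·2^(n−1) − 2 with double traversal, because each level of nesting visits its left subtree twice. If Mondrian's visitor also descends into the formulas of referenced calculated members (an assumption, not something the report states), M2 would add 17 levels of nesting on top of M1, which would be consistent with the gap between 3.5 seconds and a query that does not return.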


        Activity

        Huei-Ju Chen added a comment -
        proposed fix.
        Huei-Ju Chen added a comment -
        email discussion attached.
        Kurtis Cruzada added a comment -
        Please review for possible inclusion.
        Julian Hyde added a comment -
        Fixed in change 13019 (on mondrian-3.1 branch). Will be in 3.1.4 and 4.0.

        The problem was a visitor pattern where visitor and visitee were both traversing all arguments of a function, hence exponential running time. Also fixed a similar problem in the visitor that decides whether a cell based on a calculated member is drillable.

          People

          • Assignee:
            Julian Hyde
            Reporter:
            Huei-Ju Chen
          • Votes:
            0
            Watchers:
            2
