Pentaho Analysis - Mondrian
  MONDRIAN-608

Performance issue with large number of measures in Mondrian 3.0.4

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Severe
    • Resolution: Fixed
    • Affects Version/s: 3.1.2 GA
    • Component/s: None
    • Labels:
      None
    • Operating System/s:
      Windows XP

      Description

      Performance issue with Mondrian 3.0.4 and 3.1.2 when a calculated measure uses a large number of base measures.

      [Test Case]

      1. Add the following calculated measures to the Sales cube. M0 is the sum of 7 base measure terms. M1 is the sum of M0 and 17 base measure terms. M2 is the sum of M1 and 17 base measure terms.

      <CalculatedMember
          name="M0"
          dimension="Measures"
          formula="[Measures].[Unit Sales] + [Measures].[Store Cost] + [Measures].[Store Sales] + [Measures].[Customer Count] + [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]" />

      <CalculatedMember
          name="M1"
          dimension="Measures"
          formula="[Measures].[M0] + [Measures].[Unit Sales] + [Measures].[Store Cost] + [Measures].[Store Sales] + [Measures].[Customer Count] + [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]" />

      <CalculatedMember
          name="M2"
          dimension="Measures"
          formula="[Measures].[M1] + [Measures].[Unit Sales] + [Measures].[Store Cost] + [Measures].[Store Sales] + [Measures].[Customer Count] + [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]+ [Measures].[Sales Count]" />
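
To vary the number of terms while reproducing the issue, the calculated members can be generated rather than typed by hand. The following is only a minimal sketch: the class and method names (FormulaBuilder, sumOf, calculatedMember) are hypothetical helpers and not part of Mondrian, and the output is meant to be pasted into the schema manually.

```java
class FormulaBuilder {
    /** Joins measure names into an MDX sum, e.g. "[Measures].[A] + [Measures].[B]". */
    static String sumOf(String... measureNames) {
        StringBuilder sb = new StringBuilder();
        for (String name : measureNames) {
            if (sb.length() > 0) {
                sb.append(" + ");
            }
            sb.append("[Measures].[").append(name).append("]");
        }
        return sb.toString();
    }

    /** Renders a CalculatedMember element; no XML escaping is performed. */
    static String calculatedMember(String name, String formula) {
        return "<CalculatedMember name=\"" + name
            + "\" dimension=\"Measures\" formula=\"" + formula + "\" />";
    }

    public static void main(String[] args) {
        // Same seven terms as M0 in the test case.
        String m0 = sumOf("Unit Sales", "Store Cost", "Store Sales",
            "Customer Count", "Sales Count", "Sales Count", "Sales Count");
        System.out.println(calculatedMember("M0", m0));
    }
}
```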

      2. Select M1 in the query. It took 3.5 seconds to return.

      WITH
      SET [#DataSet#] as 'NonEmptyCrossjoin(
          {[Product].[Food]},
          {Descendants([Store].[USA], 1)}
      )'
      SELECT {[Measures].[M1]} on columns,
          NON EMPTY Hierarchize({[#DataSet#]}) on rows
      FROM [Sales]

      3. Select M2. After 5 minutes the query still had not returned.

      WITH
      SET [#DataSet#] as 'NonEmptyCrossjoin(
          {[Product].[Food]},
          {Descendants([Store].[USA], 1)}
      )'
      SELECT {[Measures].[M2]} on columns,
          NON EMPTY Hierarchize({[#DataSet#]}) on rows
      FROM [Sales]

      [Proposed fix]
      The problem appears to be in the code that performs non-empty optimization for CrossJoin, specifically the nonEmptyList() method of the CrossJoinFunDef class. It seems we can fix it by modifying the visit(ResolvedFunCall) method of the MeasureVisitor class inside CrossJoinFunDef.
      Current code:
      public Object visit(ResolvedFunCall funcall) {
          Exp[] exps = funcall.getArgs();
          if (exps != null) {
              for (Exp exp : exps) {
                  exp.accept(this);
              }
          }
          return null;
      }
      Proposed fix:
      public Object visit(ResolvedFunCall funcall) {
          return null;
      }

      We might not need to visit the arguments of the resolved function call here, since the accept() method of the ResolvedFunCall class already does so. In the test case, MeasureVisitor::visit() and ResolvedFunCall::accept() each traversed the arguments and called each other recursively, which led to heavy recursive computation.
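
The cost of this double traversal can be illustrated with a toy model. The sketch below does not use Mondrian's classes: Exp, Call, Leaf and CountingVisitor are hypothetical stand-ins, and it assumes that a formula such as a + b + c is parsed as a left-nested chain of binary calls. It counts how many visit calls occur when only accept() walks the arguments, and when visit() walks them as well.

```java
class DoubleTraversalDemo {
    interface Exp { void accept(Visitor v); }

    interface Visitor {
        void visitCall(Call call);
        void visitLeaf(Leaf leaf);
    }

    /** Stand-in for a base measure reference. */
    static final class Leaf implements Exp {
        public void accept(Visitor v) { v.visitLeaf(this); }
    }

    /** Stand-in for a resolved function call; the visitee walks its own arguments. */
    static final class Call implements Exp {
        final Exp[] args;
        Call(Exp... args) { this.args = args; }
        public void accept(Visitor v) {
            v.visitCall(this);
            for (Exp arg : args) {
                arg.accept(v);
            }
        }
    }

    static final class CountingVisitor implements Visitor {
        final boolean visitorAlsoTraverses;
        long visits;
        CountingVisitor(boolean visitorAlsoTraverses) {
            this.visitorAlsoTraverses = visitorAlsoTraverses;
        }
        public void visitCall(Call call) {
            visits++;
            if (visitorAlsoTraverses) {
                // Mirrors the "current code": the visitor walks the arguments too.
                for (Exp arg : call.args) {
                    arg.accept(this);
                }
            }
        }
        public void visitLeaf(Leaf leaf) { visits++; }
    }

    /** Builds ((t1 + t2) + t3) + ... as nested binary calls. */
    static Exp leftDeepSum(int terms) {
        Exp e = new Leaf();
        for (int i = 1; i < terms; i++) {
            e = new Call(e, new Leaf());
        }
        return e;
    }

    static long countVisits(int terms, boolean visitorAlsoTraverses) {
        CountingVisitor v = new CountingVisitor(visitorAlsoTraverses);
        leftDeepSum(terms).accept(v);
        return v.visits;
    }

    public static void main(String[] args) {
        for (int terms : new int[] {3, 8, 18}) {
            System.out.println(terms + " terms: " + countVisits(terms, false)
                + " visits (visitee only), " + countVisits(terms, true)
                + " visits (visitor and visitee)");
        }
    }
}
```

In this model a sum of n terms needs 2n - 1 visits when only the visitee traverses, but 2^(n+1) - 3 when both sides do: 35 versus 524,285 for 18 terms. Whether the real expression trees in the test case have exactly this shape is an assumption, but the doubling at every nesting level fits the jump from seconds for M1 to minutes for M2.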

      This change passed Foodmart unit tests.

        Activity

        Huei-Ju Chen added a comment -

        proposed fix.

        Huei-Ju Chen added a comment -

        email discussion attached.

        Kurtis Cruzada added a comment -

        Please review for possible inclusion.

        Julian Hyde added a comment -

        Fixed in change 13019 (on mondrian-3.1 branch). Will be in 3.1.4 and 4.0.

        The problem was a visitor pattern where visitor and visitee were both traversing all arguments of a function, hence exponential running time. Also fixed a similar problem in the visitor that decides whether a cell based on a calculated member is drillable.


          People

          • Assignee:
            Julian Hyde
            Reporter:
            Huei-Ju Chen
          • Votes:
            0
            Watchers:
            2