Uploaded image for project: 'Pentaho Analysis - Mondrian'
  1. Pentaho Analysis - Mondrian
  2. MONDRIAN-1133

Issue with native filter evaluation always joining fact table

    Details

    • Notice:
      When an issue is open, the "Fix Version/s" field conveys a target, not necessarily a commitment. When an issue is closed, the "Fix Version/s" field conveys the version that the issue was fixed in.

      Description

      When the property mondrian.native.filter.enable is set to true, Mondrian will evaluate some Filter MDX functions in SQL. The function may or may not have references to measures from the fact table. However, Mondrian is always including a join to the fact table regardless if it is needed or not.

      Ex.

      With
      Set [*NATIVE_CJ_SET] as '[*BASE_MEMBERS_Product]'
      Set [*SORTED_ROW_AXIS] as 'Order([*CJ_ROW_AXIS],[Measures].[*FORMATTED_MEASURE_2],BASC)'
      Set [*BASE_MEMBERS_Product] as 'Filter([Product].[Product Name].Members,[Product].CurrentMember.Caption Matches ("(?i).cdr."))'
      Set [*BASE_MEMBERS_Measures] as '

      {[Measures].[*FORMATTED_MEASURE_2],[Measures].[*FORMATTED_MEASURE_0],[Measures].[*FORMATTED_MEASURE_1]}

      '
      Set [*CJ_ROW_AXIS] as 'Generate([*NATIVE_CJ_SET],

      {([Product].currentMember)}

      )'
      Set [*CJ_COL_AXIS] as '[*NATIVE_CJ_SET]'
      Member [Measures].[*FORMATTED_MEASURE_2] as '[Measures].[Unit Sales]', FORMAT_STRING = 'Standard', SOLVE_ORDER=400
      Member [Measures].[*FORMATTED_MEASURE_1] as '[Measures].[Sales Count]', FORMAT_STRING = '#,###', SOLVE_ORDER=400
      Member [Measures].[*FORMATTED_MEASURE_0] as '[Measures].[Store Sales]', FORMAT_STRING = '#,###', SOLVE_ORDER=400
      Select
      [*BASE_MEMBERS_Measures] on columns,
      [*SORTED_ROW_AXIS] on rows
      From [Sales]

      Result in this SQL:

      HighCardSqlTupleReader.readTuples [[Product].[Product Name]]: done executing sql [select `product_class`.`product_family` as `c0`, `product_class`.`product_department` as `c1`, `product_class`.`product_category` as `c2`, `product_class`.`product_subcategory` as `c3`, `product`.`brand_name` as `c4`, `product`.`product_name` as `c5` from `FOODMART`.`product` as `product`, `FOODMART`.`product_class` as `product_class`, `FOODMART`.`sales_fact_1997` as `sales_fact_1997` where `product`.`product_class_id` = `product_class`.`product_class_id` and `sales_fact_1997`.`product_id` = `product`.`product_id` group by `product_class`.`product_family`, `product_class`.`product_department`, `product_class`.`product_category`, `product_class`.`product_subcategory`, `product`.`brand_name`, `product`.`product_name` having UPPER(c5) REGEXP '.*cdr.*' order by ISNULL(`product_class`.`product_family`) ASC, `product_class`.`product_family` ASC, ISNULL(`product_class`.`product_department`) ASC, `product_class`.`product_department` ASC, ISNULL(`product_class`.`product_category`) ASC, `product_class`.`product_category` ASC, ISNULL(`product_class`.`product_subcategory`) ASC, `product_class`.`product_subcategory` ASC, ISNULL(`product`.`brand_name`) ASC, `product`.`brand_name` ASC, ISNULL(`product`.`product_name`) ASC, `product`.`product_name` ASC], exec+fetch 284 ms, 28 rows

      The fact table join is incorrect and actually results in the wrong result.

      If we set mondrian.native.filter.enable to false, then we get the correct result:

      SqlTupleReader.readTuples [[Product].[Product Name]]: executing sql [select `product_class`.`product_family` as `c0`, `product_class`.`product_department` as `c1`, `product_class`.`product_category` as `c2`, `product_class`.`product_subcategory` as `c3`, `product`.`brand_name` as `c4`, `product`.`product_name` as `c5` from `FOODMART`.`product` as `product`, `FOODMART`.`product_class` as `product_class` where `product`.`product_class_id` = `product_class`.`product_class_id` group by `product_class`.`product_family`, `product_class`.`product_department`, `product_class`.`product_category`, `product_class`.`product_subcategory`, `product`.`brand_name`, `product`.`product_name` order by ISNULL(`product_class`.`product_family`) ASC, `product_class`.`product_family` ASC, ISNULL(`product_class`.`product_department`) ASC, `product_class`.`product_department` ASC, ISNULL(`product_class`.`product_category`) ASC, `product_class`.`product_category` ASC, ISNULL(`product_class`.`product_subcategory`) ASC, `product_class`.`product_subcategory` ASC, ISNULL(`product`.`brand_name`) ASC, `product`.`brand_name` ASC, ISNULL(`product`.`product_name`) ASC, `product`.`product_name` ASC], exec 23 ms

      However, now we don't get the benefit of native evaluation of the contains/regexp filter which would kill performance on large dimensions.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                ldeng Li Deng
                Reporter:
                bchow Benny Chow
              • Votes:
                0 Vote for this issue
                Watchers:
                1 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: