Pentaho Analysis - Mondrian
  1. Pentaho Analysis - Mondrian
  2. MONDRIAN-1057

Match a specific string filtering in Analyzer generates the wrong regular expression in Oracle 10G

    Details

    • Customer Case:
    • Notice:
      When an issue is open, the "Fix Version/s" field conveys a target, not necessarily a commitment. When an issue is closed, the "Fix Version/s" field conveys the version that the issue was fixed in.
    • QA Validation Status:
      Validated by QA
    • CI Build:
      4.5.0-RC1

      Description

      Using sample data installed on an Oracle database; set a new existing datasource for Steel Wheels; once created add product code and price each to the canvas. Add a new filter using: Match a specific sttring: add the following strings: 10_19, 10_16, 12_33, 12_11 (for product codes: S10_1949, S10_1678, S12_3380, S12_1108) See Screenshot-1
      The results from the filters only returns 1 row for product code S12_1108 which was the last string added to the filter. See screenshot2.
      The attached log shows the following generated SQL:
      select "PRODUCTS"."PRODUCTCODE" as "c0" from "PRODUCTS" "PRODUCTS", "ORDERFACT" "ORDERFACT" where "ORDERFACT"."PRODUCTCODE" = "PRODUCTS"."PRODUCTCODE" group by "PRODUCTS"."PRODUCTCODE" having REGEXP_LIKE("PRODUCTS"."PRODUCTCODE", '.*\Q10_19\E.*|.*\Q10_16\E.*|.*\Q12_33\E.*|.*12_11.*', 'i') order by "PRODUCTS"."PRODUCTCODE" ASC NULLS LAST

      Running this SQL against the DB returns only 1 product code S12_1108; if I remove \Q and \E from the remaining 3 strings the SQL returns all 4 product codes.

      select "PRODUCTS"."PRODUCTCODE" as "c0" from "PRODUCTS" "PRODUCTS", "ORDERFACT" "ORDERFACT" where "ORDERFACT"."PRODUCTCODE" = "PRODUCTS"."PRODUCTCODE" group by "PRODUCTS"."PRODUCTCODE" having REGEXP_LIKE("PRODUCTS"."PRODUCTCODE", '.*10_19.*|.*10_16.*|.*12_33.*|.*12_11.*', 'i') order by "PRODUCTS"."PRODUCTCODE" ASC NULLS LAST




      1. Oracle_Last Analyzer Query Log.html
        3 kB
        Carlos Lopez
      1. screenshot-1.jpg
        74 kB
      2. screenshot-2.jpg
        115 kB

        Activity

        Hide
        Benny Chow added a comment -
        The \Q and \E escape characters are for:

        Escaping a single metacharacter with a backslash works in all regular expression flavors. Many flavors also support the \Q...\E escape sequence. All the characters between the \Q and the \E are interpreted as literal characters. E.g. \Q*\d+*\E matches the literal text *\d+*. The \E may be omitted at the end of the regex, so \Q*\d+* is the same as \Q*\d+*\E.This syntax is supported by the JGsoft engine, Perl, PCRE and Java, both inside and outside character classes. However, in Java, this feature does not work correctly in JDK 1.4 and 1.5 when used in a character class or followed by a quantifier.

        http://www.regular-expressions.info/characters.html

        I don't think Oracle supports those escape sequences. Mondrian should not generate Oracle SQL with those escape sequences.
        Show
        Benny Chow added a comment - The \Q and \E escape characters are for: Escaping a single metacharacter with a backslash works in all regular expression flavors. Many flavors also support the \Q...\E escape sequence. All the characters between the \Q and the \E are interpreted as literal characters. E.g. \Q*\d+*\E matches the literal text *\d+*. The \E may be omitted at the end of the regex, so \Q*\d+* is the same as \Q*\d+*\E.This syntax is supported by the JGsoft engine, Perl, PCRE and Java, both inside and outside character classes. However, in Java, this feature does not work correctly in JDK 1.4 and 1.5 when used in a character class or followed by a quantifier. http://www.regular-expressions.info/characters.html I don't think Oracle supports those escape sequences. Mondrian should not generate Oracle SQL with those escape sequences.
        Hide
        Luc Boudreau added a comment -
        Fixed in revision 14842. I've added the test DialectTest.testComplexRegularExpression() to keep a track of that bug.

        To reproduce, use Analyzer and filter on multiple names, as shown in the attached screenshot.
        Show
        Luc Boudreau added a comment - Fixed in revision 14842. I've added the test DialectTest.testComplexRegularExpression() to keep a track of that bug. To reproduce, use Analyzer and filter on multiple names, as shown in the attached screenshot.
        Hide
        Carter Everett added a comment -
        validating
        Show
        Carter Everett added a comment - validating
        Hide
        Carter Everett added a comment -
        Validated in Windows 7 (64bit) and Ubuntu 10 (32bit)
        Show
        Carter Everett added a comment - Validated in Windows 7 (64bit) and Ubuntu 10 (32bit)
        Hide
        Carter Everett added a comment -
        Validated in Ubuntu 10 (64bit)
        Show
        Carter Everett added a comment - Validated in Ubuntu 10 (64bit)
        Hide
        Carter Everett added a comment -
        To summarize validations:

        Tested in Windows 7 (64bit) - FF10, IE 9, Chrome 17
        Ubuntu 10 (32 and 64bit) - FF10, Chrome 17
        Show
        Carter Everett added a comment - To summarize validations: Tested in Windows 7 (64bit) - FF10, IE 9, Chrome 17 Ubuntu 10 (32 and 64bit) - FF10, Chrome 17

          People

          • Assignee:
            Carter Everett
            Reporter:
            Carlos Lopez
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: