  1. Pentaho Data Integration - Kettle
  2. PDI-12515

Regular expressions and escape characters in "Replace in string" step


    Details

    • Type: Improvement
    • Status: Open
    • Severity: Medium
    • Resolution: Unresolved
    • Affects Version/s: 5.0.6 GA
    • Fix Version/s: Backlog
    • Component/s: Step
    • Environment:
      Pentaho Data Integration 5.0.6

      Description

      Pentaho Product: Data Integration
      Version: 5.0.6
      Operating system: Any

      The attached transformations display an issue when trying to replace backslash
      and double-quote characters using the "replace in string" step.

      The data grid step has two fields defined - "value" and "shouldbe".
      The value field simulates a string returned from a database that contains backslashes
      and double quotes. The "shouldbe" field is how the field should look after the
      backslashes and double quotes have been correctly escaped.

      For example, the value returned as b"\ when escaped correctly should look like b\"

      The "Search" field in the "replace in string" step uses a regexp to match the
      backslashes and double quotes, and the replacement should precede each match with
      the escape character (another backslash). However, nothing is changed. This can be
      verified by previewing the "filter rows" step.
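
      For comparison, here is a minimal sketch of the intended replacement in plain Java. It assumes the step follows `java.util.regex` semantics, which this report does not confirm; the class name `EscapeSketch`, the method `escape`, and the sample value are hypothetical and not taken from the attached transformations. One point that may be relevant: in Java, a backslash in the replacement string is itself an escape character, so a literal backslash has to be doubled there, which is not how all online regexp testers treat replacements.

      ```java
      import java.util.regex.Pattern;

      public class EscapeSketch {
          // Regex ([\\"]) : capture one backslash or one double quote.
          // Every backslash is doubled again inside the Java string literal.
          private static final Pattern SPECIAL = Pattern.compile("([\\\\\"])");

          // Hypothetical helper: prefix each backslash or double quote with a backslash.
          static String escape(String value) {
              // Replacement \\$1 : a literal backslash followed by capture group 1.
              // Backslash and '$' are special in Java replacement strings.
              return SPECIAL.matcher(value).replaceAll("\\\\$1");
          }

          public static void main(String[] args) {
              String value = "a\\b\"c";           // the characters: a \ b " c
              System.out.println(escape(value));  // prints: a\\b\"c
          }
      }
      ```

      If the step behaves the same way, a replacement typed as a single backslash followed by `$1` would not insert a backslash; whether that explains the behaviour reported here would need to be checked against the step itself.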

      The regexps used have been checked on online regexp utilities (such as http://www.regexr.com) so I believe them to be correct.

      Is it possible to investigate this problem in order to:

      • Have consistent regexp handling
      • Avoid having to use scripting steps to replace characters
      • Improve performance

      Thanks

            People

            Assignee: Unassigned
            Reporter: Nick Midson (nmidson)
            Votes: 0
            Watchers: 5
