  1. Pentaho Data Integration - Kettle
  2. PDI-13265

Extend fuzzy match step: extend with exact match

    Details

    • Type: Improvement
    • Status: Open
    • Severity: Medium
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Not Planned
    • Component/s: Step
    • Labels:
      None
    • Story Points:
      0
    • PDI Sub-component:

      Description

      When performing a fuzzy match, the distance measure is calculated for every row combination. However, sometimes two (organisation) names refer to the same unit only if they also share another attribute, e.g. the same country. It would be useful to extend the fuzzy match step with extra fields for this kind of exact match.
      (At the moment this can be accomplished with a workaround: merge join the two data streams on the exact-match fields, calculate the distance measure, and select the best fit.)
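
      To illustrate the requested behaviour, here is a minimal sketch in plain Python (not PDI code) of the workaround described above: group the lookup rows by an exact-match key, compute a distance only within each group, and keep the closest candidate. The function names, the field names `name` and `country`, the case-insensitive comparison, and the use of Levenshtein distance are all assumptions made for this example; the actual step supports several algorithms.

```python
from collections import defaultdict


def levenshtein(a, b):
    """Edit distance between two strings (dynamic programming, two rows)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def best_matches(main_rows, lookup_rows, exact_key, fuzzy_field, max_distance):
    """For each main row, find the closest lookup row that shares exact_key.

    Hypothetical helper: returns (main_row, best_lookup_row_or_None, distance_or_None).
    """
    # Equivalent of the merge join: index lookup rows by the exact-match field.
    by_key = defaultdict(list)
    for row in lookup_rows:
        by_key[row[exact_key]].append(row)

    results = []
    for row in main_rows:
        best, best_distance = None, None
        # Only candidates with the same exact key are compared at all.
        for candidate in by_key.get(row[exact_key], []):
            d = levenshtein(row[fuzzy_field].lower(), candidate[fuzzy_field].lower())
            if d <= max_distance and (best_distance is None or d < best_distance):
                best, best_distance = candidate, d
        results.append((row, best, best_distance))
    return results


# Made-up sample data for illustration only.
main = [{"name": "Acme Corp", "country": "NL"},
        {"name": "Acme Corp", "country": "DE"}]
lookup = [{"name": "Acme Corp.", "country": "NL"},
          {"name": "Acme Gmbh", "country": "DE"},
          {"name": "Acme Corp", "country": "US"}]

matches = best_matches(main, lookup, "country", "name", max_distance=2)
```

      In this sample the identical name from country US is never considered for either main row, because the exact-match field restricts the candidates before any distance is calculated. This also avoids computing the distance for every row combination.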

            People

            • Assignee:
              Unassigned
            • Reporter:
              jaapandre Jaap-Andre de Hoop
            • Votes:
              1
            • Watchers:
              2

              Dates

              • Created:
                Updated: