  1. Pentaho Data Integration - Kettle
  2. PDI-9099

Sorting of steps in org.pentaho.di.trans.TransMeta.sortStepsNatural() fails, causing start of transformation to die silently

    Details

    • Type: Bug
    • Status: Closed
    • Severity: High
    • Resolution: Fixed
    • Affects Version/s: 4.2.1 (4.1.0 GA Suite Release), 4.4.0 GA (4.8.0 GA Suite Release)
    • Component/s: API
    • Labels:
      None
    • Operating System/s:
      Windows 7 (64-bit)

      Description

      Steps are sorted before each run of a transformation. The method that does the sorting, org.pentaho.di.trans.TransMeta.sortStepsNatural(), ignores splits and joins (per inspection and the doc comment in the code). In some cases, Java's sort algorithm fails and throws an IllegalArgumentException, which bubbles up the stack until it kills the thread. This in turn causes the transformation run to die silently, with no errors whatsoever in the logs. Here's one stack trace of this exception from a debugger:

      Thread [Thread-838] (Suspended (exception IllegalArgumentException))
      TimSort<T>.mergeLo(int, int, int, int) line: not available
      TimSort<T>.mergeAt(int) line: not available
      TimSort<T>.mergeCollapse() line: not available
      TimSort<T>.sort(T[], int, int, Comparator<? super T>) line: not available
      TimSort<T>.sort(T[], Comparator<? super T>) line: not available
      Arrays.sort(T[], Comparator<? super T>) line: not available
      Collections.sort(List<T>, Comparator<? super T>) line: not available
      TransMeta.sortStepsNatural() line: 3810
      Trans.prepareExecution(String[]) line: 438
      TransGraph$22.run() line: 3659
      Thread.run() line: not available
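
The silent failure described above can be made visible by attaching an uncaught-exception handler to the launching thread. The following is a minimal sketch, not the Kettle code: the class `LoggedThreadFailure`, the method `runAndCapture`, and the thread name are hypothetical, and it assumes plain `java.util.logging` rather than Kettle's own log channel.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of Kettle: shows how an exception that would
// otherwise end a worker thread unnoticed can be routed to a logger.
final class LoggedThreadFailure {
    private static final Logger LOG = Logger.getLogger(LoggedThreadFailure.class.getName());

    /** Runs the task on a new thread, logs any uncaught exception, and returns it (null if none). */
    static Throwable runAndCapture(Runnable task) {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(task, "transformation-launcher"); // name is illustrative
        worker.setUncaughtExceptionHandler((thread, error) -> {
            // Called on the dying thread, before it terminates.
            LOG.log(Level.SEVERE, "Uncaught exception in " + thread.getName(), error);
            failure.set(error);
        });
        worker.start();
        try {
            worker.join(); // wait so the caller can inspect the outcome
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // result may be incomplete if interrupted
        }
        return failure.get();
    }

    public static void main(String[] args) {
        Throwable t = runAndCapture(() -> {
            throw new IllegalArgumentException("simulated sort failure");
        });
        System.out.println("captured: " + t);
    }
}
```

With a handler like this in place, the exception from the sort would at least show up in the log instead of only in a debugger.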

      My guess is that because the splits and joins are ignored, the comparator gives inconsistent ordering results for certain node configurations. I have replicated the same structures with the same data in Python, and the sort completes, which leads me to believe that Python's sort implementation isn't as sensitive to these inconsistencies as Java's. I'm not particularly knowledgeable in the relevant math, so this is as much as I've figured out.
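
To illustrate the kind of inconsistency suspected here, the sketch below brute-force checks a comparator against the rules of the `Comparator` contract. The real comparator in `sortStepsNatural()` is not shown in this report, so `upstreamFirst` is a hypothetical stand-in that orders two steps only when one is upstream of the other and otherwise reports them as equal. The step names and hop maps are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not the Kettle implementation.
final class ComparatorContractCheck {

    /** True if 'to' can be reached from 'from' by following one or more hops. */
    static boolean reaches(Map<String, List<String>> hops, String from, String to) {
        Deque<String> pending =
            new ArrayDeque<>(hops.getOrDefault(from, Collections.<String>emptyList()));
        Set<String> seen = new HashSet<>();
        while (!pending.isEmpty()) {
            String step = pending.pop();
            if (step.equals(to)) return true;
            if (seen.add(step)) {
                pending.addAll(hops.getOrDefault(step, Collections.<String>emptyList()));
            }
        }
        return false;
    }

    /** Assumed comparator shape: upstream steps first, unrelated steps compare as equal. */
    static Comparator<String> upstreamFirst(Map<String, List<String>> hops) {
        return (a, b) -> reaches(hops, a, b) ? -1 : reaches(hops, b, a) ? 1 : 0;
    }

    /** Returns a description of the first contract violation found, or null if none. */
    static <T> String findViolation(List<T> items, Comparator<? super T> cmp) {
        for (T a : items) for (T b : items) {
            int ab = Integer.signum(cmp.compare(a, b));
            if (ab != -Integer.signum(cmp.compare(b, a))) return "asymmetric: " + a + ", " + b;
            for (T c : items) {
                int ac = Integer.signum(cmp.compare(a, c));
                int bc = Integer.signum(cmp.compare(b, c));
                if (ab > 0 && bc > 0 && ac <= 0) return "not transitive: " + a + ", " + b + ", " + c;
                // If a and b tie, they must compare the same way against every c.
                if (ab == 0 && ac != bc) return "inconsistent tie: " + a + ", " + b + ", " + c;
            }
        }
        return null;
    }

    /** A -> B, with C on an unrelated branch. */
    static Map<String, List<String>> forkExample() {
        Map<String, List<String>> hops = new HashMap<>();
        hops.put("A", Arrays.asList("B"));
        return hops;
    }

    /** A -> B -> C, a single chain with no unrelated steps. */
    static Map<String, List<String>> chainExample() {
        Map<String, List<String>> hops = new HashMap<>();
        hops.put("A", Arrays.asList("B"));
        hops.put("B", Arrays.asList("C"));
        return hops;
    }

    public static void main(String[] args) {
        List<String> steps = Arrays.asList("A", "B", "C");
        System.out.println("fork:  " + findViolation(steps, upstreamFirst(forkExample())));
        System.out.println("chain: " + findViolation(steps, upstreamFirst(chainExample())));
    }
}
```

In the fork case, A ties with C and C ties with B, yet A sorts before B, so "equal" is not transitive. Since Java 7, `Collections.sort` is documented as allowed to throw `IllegalArgumentException` when it detects such a violation, which would match the trace above. Whether it detects one depends on the input, which may be why only some transformations fail.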

      I've attached a copy of an ETL that causes the above exception. Commenting out the sort in the prepareExecution() method seems like a quick fix for now, as the sorting is only for visual purposes.
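
A less drastic variant of that quick fix would be to keep the sort but treat a failure as non-fatal. The sketch below assumes, as stated above, that the ordering is cosmetic. `GuardedSort` and `trySort` are hypothetical names, and this is not how the issue was actually resolved in Kettle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of Kettle.
final class GuardedSort {
    private static final Logger LOG = Logger.getLogger(GuardedSort.class.getName());

    /**
     * Sorts 'items' in place and returns true. If the sort rejects the comparator,
     * logs a warning, leaves 'items' in its original order, and returns false.
     * The list must support set().
     */
    static <T> boolean trySort(List<T> items, Comparator<? super T> cmp) {
        // Sort a copy so a failed sort cannot leave the caller's list half reordered.
        List<T> copy = new ArrayList<>(items);
        try {
            Collections.sort(copy, cmp);
        } catch (IllegalArgumentException e) {
            LOG.log(Level.WARNING, "Step sort failed; keeping original order", e);
            return false;
        }
        for (int i = 0; i < copy.size(); i++) {
            items.set(i, copy.get(i));
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> steps = new ArrayList<>(Arrays.asList(3, 1, 2));
        boolean sorted = trySort(steps, Comparator.<Integer>naturalOrder());
        System.out.println(sorted + " " + steps);
    }
}
```

This only hides the symptom. A comparator that satisfies the contract, or a proper topological ordering of the steps, would address the cause.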

            People

            • Assignee:
              sflatley Sean Flatley (Inactive)
              Reporter:
              oxplot Mansour Behabadi
            • Votes:
              0
              Watchers:
              4