  1. Pentaho Data Integration - Kettle
  2. PDI-8236

Repository-based sub-transformations failing in MapReduce job


    Details

    • Type: New Feature
    • Status: Open
    • Severity: High
    • Resolution: Unresolved
    • Affects Version/s: 4.3.0 GA (4.5.0 GA Suite Release), 6.1.0 GA
    • Fix Version/s: Backlog

      Description

      A MapReduce job works fine when the corresponding mapper and reducer transformations do not contain sub-transformations. But if the mapper transformation includes sub-transformations, the MapReduce job fails in Hadoop with the error message "Failed to initialize step". This happens because the Hadoop job is unable to reference the Pentaho repository to load the sub-transformation.

      The workaround we have is to use VFS to save the sub-transformation KTRs in HDFS and then refer to the HDFS location from the mapper transformation. This is cumbersome, as any change to a sub-transformation KTR now has to be saved in the repository first and a copy of it then transferred to HDFS.
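
The manual copy step in this workaround could be scripted. The following is a minimal sketch, not part of the original report: it assumes the sub-transformation has already been exported from the repository to a local KTR file, and that the standard `hdfs dfs -put -f` shell command is available on the machine. The function names, the NameNode address, and the paths are hypothetical placeholders.

```python
import posixpath
import subprocess
from pathlib import Path
from typing import List


def hdfs_vfs_url(namenode: str, hdfs_dir: str, ktr_name: str) -> str:
    """Build the hdfs:// URL that the mapper transformation would reference.

    `namenode` is a hypothetical "host:port" string; `hdfs_dir` must be absolute.
    """
    return f"hdfs://{namenode}{posixpath.join(hdfs_dir, ktr_name)}"


def build_put_command(local_ktr: str, hdfs_dir: str) -> List[str]:
    """Build an `hdfs dfs -put -f` command that overwrites the copy in HDFS."""
    target = posixpath.join(hdfs_dir, Path(local_ktr).name)
    return ["hdfs", "dfs", "-put", "-f", local_ktr, target]


def sync_sub_transformation(local_ktr: str, hdfs_dir: str) -> None:
    """Push the exported KTR to HDFS; raises CalledProcessError on failure."""
    subprocess.run(build_put_command(local_ktr, hdfs_dir), check=True)
```

For example, `sync_sub_transformation("exports/clean_rows.ktr", "/pentaho/subtrans")` would be run after each save to the repository, and `hdfs_vfs_url("namenode.example.com:8020", "/pentaho/subtrans", "clean_rows.ktr")` gives the location to enter in the mapper transformation. This only automates the duplication; it does not remove the need for two copies.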

      Please consider fixing this issue so that sub-transformations can be referenced directly from the repository.


                People

                Assignee:
                Unassigned
                Reporter:
                Vijay Ivaturi (vkivaturi)
                Votes:
                4
                Watchers:
                8

                  Dates

                  Created:
                  Updated: