  1. Pentaho Data Integration - Kettle
  2. PDI-19182

Text file input: Significant performance degradation from 8.1 to 8.3/9.1 when running the transformation directly on the server via PUC

    Details

    • Type: Bug
    • Status: Open
    • Severity: Urgent
    • Resolution: Unresolved
    • Affects Version/s: 8.3.0 GA, 9.1.0 GA, 8.3.0.21
    • Fix Version/s: Backlog
    • Component/s: Step
    • Labels:
    • Story Points:
      0
    • PDI Sub-component:
    • Steps to Reproduce:
      Steps to replicate the issue:

      1. Install Pentaho 8.1 on RHEL 7.
      2. Install Spoon on a client OS such as Windows 10 or Ubuntu.
      3. From Spoon on the client OS, connect to the remote repository running on RHEL 7.
      4. Unzip the attached text file WQ1000000.zip and place it on your server.
      5. Open the attached transformation READTXTSTRWQ.ktr in Spoon, change the text file path to match your environment, and save the transformation into your repository.
      6. Log in to the User Console and run the transformation on the server.
      7. Repeat the above steps with versions 8.3 and 9.1.

      Description

      We are seeing a significant performance degradation from version 8.1 to 8.3 when running a transformation with the Text file input step directly on the Pentaho Server via the User Console. To replicate the issue we use a very simple transformation consisting of a Text file input step and a Dummy step. The transformation reads a file with 1,000,000 rows.

      8.1 stats:

      Running this sample transformation, the server took almost 11 min (10 min 58 sec). The execution log is attached: 81executionlog.txt

      Transformation start time = 15:07:20
      Transformation finish time = 15:18:18

      8.3.0.21 stats:

      Running this sample transformation, the server took just over 34 min (34 min 11 sec). The execution log is attached: 83executionlog.txt

      Transformation start time = 16:02:19
      Transformation finish time = 16:36:30

      9.1.0.7 stats:

      The behavior is the same with Pentaho 9.1 (91executionlog.txt) and 9.1.0.7 (9107executionlog.txt).

      Transformation start time = 08:37:56
      Transformation finish time = 09:09:52

      NOTE: When running the same transformation via the Slave Server arrangement we are seeing a negligible difference in performance between 8.1 and 8.3

      8.1 took 12 min 55 sec
      8.3.0.21 took 12 min 32 sec
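
      The figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original report: it takes only the start/finish times and the slave-server durations quoted in this description, and derives elapsed time, approximate throughput, and slowdown relative to 8.1. The names `puc_runs`, `slave_runs`, and `elapsed_seconds` are illustrative, and the rows-per-second figure assumes all 1,000,000 rows were processed in each run.

```python
from datetime import datetime, timedelta

ROWS = 1_000_000  # row count stated in the description

# (start, finish) times reported for the runs made via the User Console
puc_runs = {
    "8.1": ("15:07:20", "15:18:18"),
    "8.3.0.21": ("16:02:19", "16:36:30"),
    "9.1.0.7": ("08:37:56", "09:09:52"),
}

# Durations in seconds reported for the Slave Server runs
slave_runs = {
    "8.1": 12 * 60 + 55,
    "8.3.0.21": 12 * 60 + 32,
}


def elapsed_seconds(start: str, finish: str) -> int:
    """Return the seconds between two HH:MM:SS times on the same day
    (or across a single midnight)."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(finish, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)  # finish time fell after midnight
    return int(delta.total_seconds())


durations = {version: elapsed_seconds(*times) for version, times in puc_runs.items()}
baseline = durations["8.1"]

for version, secs in durations.items():
    minutes, seconds = divmod(secs, 60)
    print(
        f"{version}: {minutes} min {seconds:02d} sec, "
        f"~{ROWS / secs:.0f} rows/s, {secs / baseline:.2f}x the 8.1 run time"
    )

# Relative change between the two Slave Server runs
slave_change = (slave_runs["8.3.0.21"] - slave_runs["8.1"]) / slave_runs["8.1"]
print(f"Slave Server, 8.3.0.21 vs 8.1: {slave_change:+.1%}")
```

      On these numbers the User Console runs on 8.3.0.21 and 9.1.0.7 take roughly three times as long as on 8.1, while the Slave Server runs differ by only a few percent.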

      A Word document with the screenshots is attached: TestResults.docx

        Attachments

        1. 81executionlog.txt
          3 kB
        2. 83executionlog.txt
          3 kB
        3. 9107executionlog.txt
          3 kB
        4. 91executionlog.txt
          3 kB
        5. call.png
          32 kB
        6. READTXTSTRWQ.ktr
          52 kB
        7. TestResults.docx
          486 kB
        8. WQ1000000.zip
          1.30 MB

          Activity

            People

            Assignee:
            Unassigned
            Reporter:
            gdev Gurudev
            Votes:
            0
            Watchers:
            3

              Dates

              Created:
              Updated: