  1. Pentaho Data Integration - Kettle
  2. PDI-17586

Excel Input Step can't use VFS


    Details

    • Story Points:
      0
    • Steps to Reproduce:
      Use VFS in the ExcelInput step to read an Excel file that is inside a zip file.

      Description

      While working on PDI-15482, several problems were found in the ExcelInput step.

      1. VFS usage has not worked since a change made under PDI-16942 (March 2018).

      • Although the change was made on the 8.1 suite, it was backported to the 8.0 suite (SP-4204) and the 7.1 suite (SP-4205); therefore, VFS also does not work for the ExcelInput step in these versions.
      • To fix this, the code changed under PDI-16942 must be reverted. Doing so, however, will make the unit test created for that issue fail, and the original bug will resurface.

      2. To handle Zip files with a high compression ratio, there should be a configuration variable that controls how the ZipSecureFile class detects Zip bombs.

      • For the file in the issue, it is enough to control the ratio between compressed and uncompressed bytes through setMinInflateRatio; however, it would be best to also make MaxEntrySize and MaxTextSize configurable.

      3. A current limitation is that only Zip files using the Deflate or Store compression methods are supported.

      • This problem was moved to a new issue (PDI-17648) to give more visibility to the fact that we now support new zip compression methods.
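
As an illustration of item 2, the following is a minimal sketch, using only the JDK, of how a configurable minimum inflate ratio could behave. The class name ZipBombSettings, the variable name EXAMPLE_ZIP_MIN_INFLATE_RATIO, and the fallback value 0.01 are assumptions made for this example and are not existing Kettle settings. The commented-out line marks where POI's ZipSecureFile.setMinInflateRatio would receive the value if POI were on the classpath.

```java
import java.util.Collections;
import java.util.Map;
import java.util.zip.Deflater;

public class ZipBombSettings {
    // Hypothetical variable name, chosen for this sketch only.
    public static final String MIN_INFLATE_RATIO_VAR = "EXAMPLE_ZIP_MIN_INFLATE_RATIO";
    // Assumed fallback when the variable is unset or invalid.
    public static final double DEFAULT_MIN_INFLATE_RATIO = 0.01;

    /** Reads the threshold from a variable map, falling back to the default. */
    public static double minInflateRatio(Map<String, String> vars) {
        String raw = vars.get(MIN_INFLATE_RATIO_VAR);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_MIN_INFLATE_RATIO;
        }
        try {
            double value = Double.parseDouble(raw.trim());
            // Accept only ratios in [0, 1]; NaN fails this check as well.
            return (value >= 0.0 && value <= 1.0) ? value : DEFAULT_MIN_INFLATE_RATIO;
        } catch (NumberFormatException e) {
            return DEFAULT_MIN_INFLATE_RATIO;
        }
    }

    /** Compressed size divided by uncompressed size for the given data. */
    public static double inflateRatio(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        byte[] buffer = new byte[4096];
        long compressed = 0;
        while (!deflater.finished()) {
            compressed += deflater.deflate(buffer);
        }
        deflater.end();
        return compressed / (double) data.length;
    }

    /** A payload is suspicious when it compresses below the minimum ratio. */
    public static boolean looksLikeZipBomb(byte[] data, double minRatio) {
        return inflateRatio(data) < minRatio;
    }

    public static void main(String[] args) {
        byte[] zeros = new byte[1_000_000]; // compresses extremely well
        double strict = minInflateRatio(Collections.<String, String>emptyMap());
        double relaxed = minInflateRatio(
                Collections.singletonMap(MIN_INFLATE_RATIO_VAR, "0.0001"));
        System.out.println("rejected with default threshold: " + looksLikeZipBomb(zeros, strict));
        System.out.println("rejected with relaxed threshold: " + looksLikeZipBomb(zeros, relaxed));
        // With POI available, the configured value would be applied like this:
        // org.apache.poi.openxml4j.util.ZipSecureFile.setMinInflateRatio(relaxed);
    }
}
```

Lowering the threshold lets a legitimately well-compressed workbook through, at the cost of weaker Zip-bomb protection, which is why a per-installation variable is preferable to a hard-coded change. MaxEntrySize and MaxTextSize could be read the same way and passed to the corresponding ZipSecureFile setters.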

      Implementation notes:
      When a FileInputStream is used, VFS files are not supported: one has to let WorkbookFactory get the filename and handle it internally.
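
The first half of this note can be illustrated with the JDK alone. The sketch below shows that a zip-style VFS URL is not a path FileInputStream can open, while resolving the archive and the entry separately does work. The class VfsStyleUrlDemo, the sample entry name book.txt, and the simplified URL parsing are assumptions for this example. It uses java.util.zip as a stand-in for a VFS layer and is not the step's actual code.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class VfsStyleUrlDemo {

    /** Returns true if the location can be opened as a plain local file. */
    public static boolean opensAsLocalFile(String location) {
        try (InputStream in = new FileInputStream(location)) {
            return true;
        } catch (IOException e) {
            return false; // a VFS URL is not a local file path
        }
    }

    /** Reads an entry addressed as zip:<file-uri>!/<entry> (simplified parsing). */
    public static byte[] readZipUrl(String url) throws IOException {
        int bang = url.indexOf("!/");
        if (!url.startsWith("zip:") || bang < 0) {
            throw new IllegalArgumentException("Not a zip: URL: " + url);
        }
        Path archive = Paths.get(URI.create(url.substring("zip:".length(), bang)));
        String entryName = url.substring(bang + 2);
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                throw new FileNotFoundException(url);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                return out.toByteArray();
            }
        }
    }

    /** Writes a small temporary archive and returns a zip: URL to its entry. */
    public static String createSampleUrl() throws IOException {
        Path tmp = Files.createTempFile("excel-demo", ".zip");
        tmp.toFile().deleteOnExit();
        try (ZipOutputStream zip = new ZipOutputStream(Files.newOutputStream(tmp))) {
            zip.putNextEntry(new ZipEntry("book.txt"));
            zip.write("cell data".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return "zip:" + tmp.toUri() + "!/book.txt";
    }

    public static void main(String[] args) throws IOException {
        String url = createSampleUrl();
        System.out.println("FileInputStream can open it: " + opensAsLocalFile(url));
        System.out.println("Entry content: "
                + new String(readZipUrl(url), StandardCharsets.UTF_8));
    }
}
```

Any code path that turns the user-supplied location into a FileInputStream or a java.io.File will therefore fail for files inside archives, regardless of what the downstream reader supports.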


              People

              • Assignee:
                ooliveira Octavio Oliveira
                Reporter:
                sergio.ribeiro Sérgio Ribeiro
              • Votes:
                1
                Watchers:
                7
