Uploaded image for project: 'Pentaho Data Integration - Kettle'
  1. Pentaho Data Integration - Kettle
  2. PDI-18977

Microsoft Excel Writer/Input mishandling empty strings and null values when using streaming

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed
    • Severity: Unknown
    • Resolution: Duplicate
    • Affects Version/s: 8.3.0.10 GA, 9.0.0.5 GA
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Story Points:
      0
    • Notice:
      When an issue is open, the "Fix Version/s" field conveys a target, not necessarily a commitment. When an issue is closed, the "Fix Version/s" field conveys the version that the issue was fixed in.
    • Steps to Reproduce:
      Hide

      SCENARIO 1

      1: Run write_excel_file_1.ktr write_excel_file_1.ktr
      2. Run read_excel_file.ktr read_excel_file.ktr

       SCENARIO 2

      1: Run write_excel_file_2.ktr write_excel_file_2.ktr
      2: Run read_excel_file.ktr read_excel_file.ktr

       

      Show
      SCENARIO 1 1: Run write_excel_file_1.ktr  write_excel_file_1.ktr 2. Run read_excel_file.ktr  read_excel_file.ktr   SCENARIO 2 1: Run write_excel_file_2.ktr  write_excel_file_2.ktr 2: Run read_excel_file.ktr  read_excel_file.ktr  

      Description

      This issue occurs both in the PDI Client and on the Pentaho Server.

      These Transformations that use the Microsoft Excel Writer and the Microsoft Excel Input steps.

      First we write an xlsx file with the Microsoft Excel Writer step using Streaming. This file is readable by Microsoft Excel



      Then we read the generated file with an Microsoft Excel Input step using Streaming.



      We have observed some issues related to cells that contain an:

      • Empty String
      • Null

      The observations are summarized below in the following two scenarios:

      SCENARIO 1

      1.1: A file is created containing an:

      • Empty String (BOXED in BLUE)
      • Null (BOXED in ORANGE)

      neither of these cells are the last cell in their row.



      1.2: After the file has been read the following can be seen:

      • The cell that contains the Empty String has its value changed to the value of the next cell in the row. (BOXED in RED)
      • The next cell in the row has its value changed to Null. (BOXED in ORANGE)
      • The Microsoft Excel Input step does not continue reading rows after it reaches the row containing the Null value. (BOXED in BLUE)


      SCENARIO 2

      2.1: A file is created containing an:

      • Empty String (BOXED in BLUE)
      • Null (BOXED in ORANGE)

      the cell with the Empty String is the last cell in its row.



      2.2: After the file has been read the following can be seen:

      • The cell that contains the Empty String has its value replaced by the value of the first cell of the next row. (BOXED in RED)
      • The next row values are replaced by the values of the row coming after. (BOXED in ORANGE)
      • The row coming after is repeated (BOXED in BLUE)
      • The Microsoft Excel Input step does not continue reading rows after it reaches the row containing the Null value. (BOXED in PURPLE)



      NOTE: These issues DO NOT occur if either of the following changes are made:

      1: Microsoft Excel Writer step NOT using Streaming. (BOXED in GREEN)



      2: Microsoft Excel Input step NOT using Streaming. (BOXED in GREEN)


        PractiTest Integration




          Attachments

          1. 01 Stream Ticked.png
            01 Stream Ticked.png
            24 kB
          2. 02 Spreadsheet Type Streaming.png
            02 Spreadsheet Type Streaming.png
            30 kB
          3. 03 Stream Not Ticked.png
            03 Stream Not Ticked.png
            36 kB
          4. 04 Spreadsheet Type Not Streaming.png
            04 Spreadsheet Type Not Streaming.png
            30 kB
          5. read_excel_file_no_stream.ktr
            15 kB
          6. read_excel_file.ktr
            15 kB
          7. S101 Spreadsheet Format.png
            S101 Spreadsheet Format.png
            19 kB
          8. S102 Results.png
            S102 Results.png
            20 kB
          9. S201 Spreadsheet Format.png
            S201 Spreadsheet Format.png
            24 kB
          10. S202 Results.png
            S202 Results.png
            29 kB
          11. write_excel_file_1_no_stream.ktr
            40 kB
          12. write_excel_file_1.ktr
            40 kB
          13. write_excel_file_2_no_stream.ktr
            40 kB
          14. write_excel_file_2.ktr
            40 kB

            Issue Links

              Activity

                People

                Assignee:
                singram Stuart Ingram
                Reporter:
                singram Stuart Ingram
                Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                  Dates

                  Created:
                  Updated:
                  Resolved: