Uploaded image for project: 'Pentaho Data Integration - Kettle'
  1. Pentaho Data Integration - Kettle
  2. PDI-4696

'Alternative Start Date' in dimensional lookup/update is not being evaluated correctly

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed
    • Severity: Medium
    • Resolution: Fixed
    • Affects Version/s: 3.2.4
    • Component/s: None
    • Labels:
      None
    • Notice:
      When an issue is open, the "Fix Version/s" field conveys a target, not necessarily a commitment. When an issue is closed, the "Fix Version/s" field conveys the version that the issue was fixed in.

      Description

      If you don't use the alternative start date, the value in "Stream Datefield" is used zu fill the updated row field date_to and the new row date_from.
      If you use the alternative start date, the value in "A column value" is used zu fill the updated row field date_to and the new row date_from.

      If you don't use the alternative start date, the value in "Stream Datefield" is used to find an actuell record by this date beeing between date_from and date_to.
      If you use the alternative start date, it is the same. The lookup is NOT using "A column value" for this!

      In my transformation 1. DimensionLookup_Update_with_booking_date_only_in_stream_20100930.ktr (the wrong one) the value in "Stream Datefield" is 2010-08-31, the alternative start date is 2010-09-28.
      So every new entry in the data base has a Date_from = 2010-09-28.
      And the lookup SQL could newer find this record again by using 2010-08-31 in the WHERE clause

      WHERE
      mitarbeiternummer = 3810
      AND ( date_from IS NULL OR date_from <= '2010-08-31 00:00:00.000' )
      AND date_from >= '2010-08-31 00:00:00.000'

      --> So in the documentation should be added:
      If you use an alternative starte date, make shure the you use the same date in the field "Stream Datefield", which is used for the lookup operation.

      But there is also a bug!

      Result of three runs:

      ID VERSION GUELTIG_VON GUELTIG_BIS MITARBEITERNUMMER NAME BOOKING_DATE
      – ------- --------------------- ----------------------- ----------------- --------- ---------------------
      0 1 NULL NULL NULL NULL NULL
      1 1 2010-09-28 00:00:00.0 9999-12-31 23:59:59.999 3810 Klaus5 2010-09-28 00:00:00.0
      2 1 2010-09-28 00:00:00.0 2010-09-28 00:00:00.0 2330 Bernhard 2010-09-28 00:00:00.0
      3 2 2010-09-28 00:00:00.0 9999-12-31 23:59:59.999 2330 Bernhard2 2010-09-28 00:00:00.0
      4 2 2010-09-28 00:00:00.0 9999-12-31 23:59:59.999 2330 Bernhard2 2010-09-28 00:00:00.0

      1. Creation of row 0 to 2
      2. Change a name --> New record 3

      3. No changes against 2. run:
      Where clause in the lookup SQL:

      WHERE mitarbeiternummer = ?
      AND ( gueltig_von IS NULL OR gueltig_von <= '2010-09-28 00:00:00.0' )
      AND gueltig_bis >= '2010-09-28 00:00:00.0'

      This lookup will find record no. 1 and will create no. 4 which is identical to no. 3

      The problem is, that the step uses the same timestamp for date_from and date_to.
      Solution: Change the WHERE clause to

      WHERE mitarbeiternummer = ?
      AND ( gueltig_von IS NULL OR gueltig_von < '2010-09-28 00:00:00.0' )
      AND gueltig_bis >= '2010-09-28 00:00:00.0'

      You see the difference? If you use the standard start date it is he same SQL, so the error will be there two. And this is essential for us. Our operting systems are doing a closure in the evening and so records with a booking-day (a date, no time-stamp will be created. We need to use this date as the start-date and the update-date.
      So if we have errors in the actuall version of this step it is impossible to rerun the transformation. It will run on an error by doing the insert (ID and verion are a unique key).

      So please, I provide the answers and a solution, can you fix it soon?

        Attachments

          Activity

            People

            Assignee:
            gdavid Golda Thomas
            Reporter:
            myau Man Shing Yau (Inactive)
            Votes:
            1 Vote for this issue
            Watchers:
            1 Start watching this issue

              Dates

              Created:
              Updated:
              Resolved: