Uploaded image for project: 'Pentaho Data Integration - Kettle'
  1. Pentaho Data Integration - Kettle
  2. PDI-893

Error related to database lookup and crossjoin.



    • Type: Bug
    • Status: Closed
    • Severity: Medium
    • Resolution: Incomplete
    • Affects Version/s: 3.0.2
    • Fix Version/s: None
    • Component/s: Transformation
    • Labels:
    • Environment:
      Ubuntu 7.10
    • Notice:
      When an issue is open, the "Fix Version/s" field conveys a target, not necessarily a commitment. When an issue is closed, the "Fix Version/s" field conveys the version that the issue was fixed in.


      The example I'm going to give is a simplified example. My original implementation was not as silly, but I've since re-arranged things and I don't remember exactly how things were setup.

      Create a table with 3 or more fields: a group key, and a group and parent group id.
      Create another table that serves as a closure table for the group table, populated with group and parent keys (not ids)

      Query once from the 1st table, and once from the closure table – get the group key and parent group id from the 1st table. On the group stream do a database lookup for the group parent key. Now do a crossjoin between the two streams based on the keys (not ids).

      The result is the comparison fails because of the group parent key. I think I originally had this design so I could write out the parent keys, then do the crossjoin with another table (not the closure table – i just can't remember which one it was) and write out another table, all in one transformation. I resolved the problem by creating a separate job step that executes some sql to update the groupdimension table to add parent keys, then did whatever work based on that later on.

      In short: query a table with one value, do a db lookup on a table using that value to get another value, do a query to a different table already using the value gained in the db lookup step as a key, and crossjoin the two streams on that key – the crossjoin fails to produce legitimate results.




            phantal Brian Vandenberg
            phantal Brian Vandenberg
            0 Vote for this issue
            0 Start watching this issue