Pentaho Data Integration - Kettle
PDI-11046

Cassandra: Input step with ReversedType Comparator

    Details

    • Type: Bug
    • Status: Closed
    • Severity: High
    • Resolution: Fixed
    • Affects Version/s: 5.0.0 GA (5.0.0 GA Suite Release)
    • Fix Version/s: 5.1.0 GA
    • Component/s: Big Data (Hadoop etc.)
    • Labels:
      None
    • Environment:
      DSE Cassandra
    • PDI Sub-component: Cassandra

      Description

      When using a "ReversedType" comparator for a Cassandra keyspace, the following error occurs on a Cassandra Input step:

      Cassandra Input.0 - ERROR (version 5.0.0.1, build 1 from 2013-09-11_16-51-19 by buildguy) : Unexpected error
      Cassandra Input.0 - ERROR (version 5.0.0.1, build 1 from 2013-09-11_16-51-19 by buildguy) : org.pentaho.di.core.exception.KettleException:
      2013/11/05 16:22:35 - Cassandra Input.0 -
      2013/11/05 16:22:35 - Cassandra Input.0 - Cant find a deserializer for type "{0}"

      Reproduction steps:

      Create table with CQL3:

      create table cf_topher(id int primary key, location text, created 'ReversedType(org.apache.cassandra.db.marshal.DateType)');
      insert into cf_topher(id, location, created) values (123, 'here', '2013-1-1');

      Then point the Cassandra Input step at this table. Without the 'ReversedType' wrapper it works fine. With the wrapper it does not.

      Attaching screen shots for reference.

      1. test1.PNG
        67 kB
      2. test2.PNG
        87 kB
      3. Unknown.png
        47 kB
      4. Unknown-1.png
        215 kB
      5. Unknown-2.png
        146 kB

          Activity

          Sandeep Kemparaju created issue -
          Sandeep Kemparaju made changes -
          Field Original Value New Value
          Sandeep Kemparaju made changes -
          Attachment Unknown-1.png [ 45593 ]
          Attachment Unknown-2.png [ 45594 ]
          Attachment Unknown.png [ 45595 ]
          Doug Moran made changes -
          Assignee Triage [ project admin ] Mark Hall [ mhall ]
          Doug Moran added a comment - - edited

          Mark,

          What would it take to do this?

          Doug Moran made changes -
          Severity Unknown [ 7 ] Critical [ 2 ]
          Mark Hall added a comment -

          Shouldn't be too much effort. An approach similar to that used for CompositeType should do the trick.

          Doug Moran added a comment - - edited

          Make it so - I'll get it into a sprint. I assume it will be detected automatically and just handled. Fix the error message while in there too.

          Are there any other common types we are missing? Can we look at the schema and dynamically load and use whatever classes are specified to deserialize as long as they are available on the class path?

          Mark Hall added a comment -

          I'll check for any other missing types. We might be able to do some of this dynamically. However, we have to explicitly check for some types in order to do something that is Kettle-compatible (e.g. CompositeType, DynamicCompositeType). The same will be true for ReversedType: to determine the outgoing field metadata before actually deserializing any data, the ReversedType schema string will have to be parsed to find the base type that it wraps.
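
The parsing step described in this comment can be illustrated with a minimal sketch. It is written in Python for brevity and is not the plugin's actual code (the fix itself lives in the linked pull requests). The function names and the FIELD_TYPES mapping are hypothetical, chosen only for illustration; the one thing taken from this issue is the schema string shape, ReversedType(<base type>).

```python
import re

# Package prefix that Cassandra uses for its marshal (comparator/validator) classes.
MARSHAL_PREFIX = "org.apache.cassandra.db.marshal."

# Matches "ReversedType(<inner>)", with or without the package prefix.
_REVERSED = re.compile(
    r"^(?:org\.apache\.cassandra\.db\.marshal\.)?ReversedType\((.+)\)$"
)


def unwrap_reversed_type(type_string: str) -> str:
    """Return the base type wrapped by ReversedType, or the input unchanged."""
    current = type_string.strip()
    while True:
        match = _REVERSED.match(current)
        if match is None:
            return current
        # Peel off one ReversedType(...) layer and check again.
        current = match.group(1).strip()


# Hypothetical mapping from marshal class to an outgoing field type name.
FIELD_TYPES = {
    MARSHAL_PREFIX + "DateType": "Date",
    MARSHAL_PREFIX + "UTF8Type": "String",
    MARSHAL_PREFIX + "Int32Type": "Integer",
}


def field_type_for(type_string: str) -> str:
    """Resolve the field type from a schema string before reading any data."""
    base = unwrap_reversed_type(type_string)
    try:
        return FIELD_TYPES[base]
    except KeyError:
        # Report the actual type name rather than an unfilled placeholder.
        raise ValueError(f'Cannot find a deserializer for type "{base}"') from None
```

With this sketch, field_type_for("ReversedType(org.apache.cassandra.db.marshal.DateType)") resolves to the same field type as the plain DateType column, because ReversedType only changes sort order and not the stored value format.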

          Mark Hall added a comment - master: https://github.com/pentaho/pentaho-cassandra-plugin/pull/26
          Daniel Bechtel made changes -
          Daniel Bechtel made changes -
          Assignee Mark Hall [ mhall ] Triage [ project admin ]
          Kurtis Cruzada made changes -
          Fix Version/s 5.1.0 GA [ 12126 ]
          Kurtis Cruzada made changes -
          Severity Critical [ 2 ] Blocker [ 1 ]
          Jira Service Acct made changes -
          Workflow Pentaho Engineering 9.0 Workflow [ 618595 ] Pentaho Bug 1.0 Workflow [ 632417 ]
          Jira Service Acct made changes -
          Workflow Pentaho Bug 1.0 Workflow [ 632417 ] Pentaho Engineering 9.0 Workflow [ 654293 ]
          Mark Hall added a comment -

          To validate - create a CQL 3 table as outlined in the description of this case. Previewing a "select *" in CassandraInput should show:

          id   created                  location
          123  2013/01/01 00:00:00.000  here

          Mark Hall made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          David Kincade made changes -
          Assignee Triage [ project admin ] Unassigned User [ unassigned ]
          Mark Hall added a comment - 5.0: https://github.com/pentaho/pentaho-cassandra-plugin/pull/27
          David Kincade made changes -
          Assignee Unassigned User [ unassigned ] Triage [ project admin ]
          Jens Bleuel made changes -
          PDI Sub-component Cassandra
          Daniel Bechtel made changes -
          Daniel Bechtel made changes -
          Status Resolved [ 5 ] Resolved [ 5 ]
          Daniel Bechtel made changes -
          David Kincade made changes -
          Sprint Team Tequila [ 10674 ]
          David Kincade made changes -
          Assignee Triage [ project admin ] Unassigned User [ unassigned ]
          David Kincade made changes -
          Sprint March 2014 SP [ 79 ]
          David Kincade made changes -
          Rank Ranked higher
          Doug Moran made changes -
          Rank Ranked higher
          Michelle Bradbury made changes -
          Epic Link PDI-9958 [ 103193 ]
          Michelle Bradbury made changes -
          Michelle Bradbury made changes -
          Rank Ranked higher
          Antonina Doudkina made changes -
          Sprint March 2014 SP [ 79 ] March 2014 SP, Chacha [ 79, 104 ]
          Antonina Doudkina added a comment -

          Tested on PDI 5.1 nightly build # 659 against DataStax Cassandra 2.0.7 Community Edition.

          Test table was created as described in description section:

          create table cf_topher(id int primary key, location text, created 'ReversedType(org.apache.cassandra.db.marshal.DateType)');
          insert into cf_topher(id, location, created) values (123, 'here', '2013-1-1');

          Running transformation with Cassandra Input step against test table was successful. Data was retrieved correctly. Previewing step data also worked fine.

          Closing.

          Antonina Doudkina made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          QA Validation Status Not Yet Validated [ 10179 ] Validated by QA [ 10177 ]
          Antonina Doudkina made changes -
          Attachment test1.PNG [ 49338 ]
          Antonina Doudkina made changes -
          Attachment test2.PNG [ 49339 ]
          Antonina Doudkina made changes -
          Assignee Unassigned User [ unassigned ] Antonina Doudkina [ adoudkina ]
          Jira Service Acct made changes -
          Workflow Pentaho Engineering 9.0 Workflow [ 654293 ] Pentaho Engineering 9.1 Workflow [ 756252 ]
          Transition            Time In Source Status  Execution Times  Last Executer      Last Execution Date
          Open -> Resolved      14d 3h 32m             1                Mark Hall          17/Dec/13 3:34 PM
          Resolved -> Resolved  63d 19h 33m            1                Daniel Bechtel     19/Feb/14 11:07 AM
          Resolved -> Closed    84d 3m                 1                Antonina Doudkina  14/May/14 12:10 PM

            People

            • Assignee:
              Antonina Doudkina
              Reporter:
              Sandeep Kemparaju
            • Votes:
              0
              Watchers:
              7
