Details
Type: Bug
Status: Closed
Severity: High
Resolution: Fixed
Affects Version/s: 5.1.0 GA
Fix Version/s: 5.1.0 GA
Component/s: Step
Labels:
Environment: Windows 7, psql 9.2.5, Greenplum Database 4.2.2 (on remote CentOS 5)
Story Points: 5
PDI Sub-component:
Notice:
Sprint Team: Pervach
Operating System/s: Windows 7
Description
The control file specified in the "Control file" field of the GP Bulk Loader step is generated in PDI 5.1 (CE build #822) as follows:
\COPY "public".bulk1 ( id, "name", firstname, zip, city, birthdate, street, housenr, stateCode, "state" ) FROM "D:\GP_data.dat" WITH CSV LOG ERRORS INTO "public".bulk1_errors SEGMENT REJECT LIMIT 50
The path to the data file is enclosed in double quotes, as shown above, and this is what causes the failure.
Loading data into Greenplum fails with the following log trace:
2013/11/01 00:12:56 - Greenplum Bulk Loader.0 - ERROR>psql:/GP_control.txt:1: "D:/GP_data.dat: Invalid argument
When I replace the double quotes around "D:\GP_data.dat" with single quotes ('D:\GP_data.dat') in the generated file and then run it manually with psql, it works.
This is a regression: in PDI 4.4 the filename was generated enclosed in single quotes.
It may have regressed after the fix for bug http://jira.pentaho.com/browse/PDI-2661.
So the binary, control and log file paths should be double-quoted; the data file path specified inside the control file, however, is not handled correctly when double-quoted and works fine when single-quoted.
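For illustration only, here is a minimal Python sketch of the expected control file line, with the data file path single-quoted. This is not the PDI source code: the function names quote_data_file and build_copy_line are hypothetical, and the sketch deliberately rejects paths that contain a single quote rather than guessing at an escaping rule.

```python
def quote_data_file(path: str) -> str:
    """Wrap the data file path in single quotes (hypothetical helper)."""
    if "'" in path:
        # Escaping embedded single quotes is out of scope for this sketch.
        raise ValueError("paths containing a single quote are not handled here")
    return "'" + path + "'"


def build_copy_line(table, columns, data_file, error_table, reject_limit):
    """Assemble a \\COPY line shaped like the one in this report (hypothetical helper)."""
    column_list = ", ".join(columns)
    quoted_file = quote_data_file(data_file)
    return (
        f"\\COPY {table} ( {column_list} ) FROM {quoted_file} WITH CSV "
        f"LOG ERRORS INTO {error_table} SEGMENT REJECT LIMIT {reject_limit}"
    )


# Values taken from the control file quoted in this report.
line = build_copy_line(
    '"public".bulk1',
    ["id", '"name"', "firstname", "zip", "city", "birthdate",
     "street", "housenr", "stateCode", '"state"'],
    r"D:\GP_data.dat",
    '"public".bulk1_errors',
    50,
)
print(line)
```

The table and column identifiers keep their double quotes; only the file path after FROM changes quoting style, which matches the manual edit that worked with psql.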