The control file specified in the "Control file" field of the GP Bulk Loader step is generated by PDI 5.1 (CE build #822) as follows:
\COPY "public".bulk1 ( id, "name", firstname, zip, city, birthdate, street, housenr, stateCode, "state" ) FROM "D:\GP_data.dat" WITH CSV LOG ERRORS INTO "public".bulk1_errors SEGMENT REJECT LIMIT 50
The path to the data file is enclosed in double quotes, as shown above, and this is what causes the failure.
Loading data into Greenplum fails with the following log trace:
2013/11/01 00:12:56 - Greenplum Bulk Loader.0 - ERROR>psql:/GP_control.txt:1: "D:/GP_data.dat: Invalid argument
When I replace the double quotes around "D:\GP_data.dat" with single quotes ('D:\GP_data.dat') in the generated file and then run it manually with psql, it works.
This is a regression: in PDI 4.4 the filename was generated enclosed in single quotes.
The regression may have been introduced by the fix for bug http://jira.pentaho.com/browse/PDI-2661.
So the binary, control, and log file paths should be double-quoted; the data file path specified in the control file, however, is not handled correctly when double-quoted and works fine when single-quoted.
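
To illustrate the expected output, here is a minimal Java sketch of how the \COPY line could be assembled with the data file path in single quotes. It is not the actual PDI code: the class name CopyCommandSketch and the methods quoteDataFile and buildCopyLine are hypothetical, and it assumes the path contains no single quote (no escaping is attempted).

```java
// Hypothetical sketch; names are illustrative and not taken from the PDI source.
class CopyCommandSketch {

    // Wrap the data file path in single quotes, the form reported to work with psql.
    // Assumption: the path itself contains no single quote, so nothing is escaped.
    static String quoteDataFile(String path) {
        return "'" + path + "'";
    }

    // Assemble the single-line \COPY command in the same layout as the generated control file.
    static String buildCopyLine(String table, String columns, String dataFile,
                                String errorTable, int rejectLimit) {
        return "\\COPY " + table + " ( " + columns + " ) FROM " + quoteDataFile(dataFile)
            + " WITH CSV LOG ERRORS INTO " + errorTable
            + " SEGMENT REJECT LIMIT " + rejectLimit;
    }

    public static void main(String[] args) {
        String line = buildCopyLine(
            "\"public\".bulk1",
            "id, \"name\", firstname, zip, city, birthdate, street, housenr, stateCode, \"state\"",
            "D:\\GP_data.dat",
            "\"public\".bulk1_errors",
            50);
        System.out.println(line);
    }
}
```

Running main prints the control file line from this report with 'D:\GP_data.dat' in place of "D:\GP_data.dat", which is the form that loads successfully when run manually with psql.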