  1. Pentaho Data Integration - Kettle
  2. PDI-19342

Python Executor step fails to initialize if the python script has imported "warnings" and "swifter" libraries.



    • Type: Bug
    • Status: Open
    • Severity: High
    • Resolution: Unresolved
    • Affects Version/s: 8.3.0 GA, 9.1.0 GA, 9.2.0 GA
    • Fix Version/s: None
    • Component/s: Step
    • Labels:
    • Story Points:
    • PDI Sub-component:


      The Python Executor step fails to initialize when the script imports the "warnings" and "swifter" libraries. The step logs the error below (see also the attachment pentaho (4).txt).

      2021/09/27 17:27:42 - Python Executor.0 - ERROR (version, build from 2020-09-07 05.09.05 by buildguy) : java.lang.IllegalStateException: py4j.Py4JException: An exception was raised by the Python Proxy. Return Message: Traceback (most recent call last):
        File "C:\Users\hdpuser1\AppData\Local\Programs\Python\Python39\lib\site-packages\py4j\java_gateway.py", line 2451, in _call_proxy
          return_value = getattr(self.pool[obj_id], method)(*params)
        File "C:\Users\hdpuser1\AppData\Local\Temp\1631888361773-0\pdiPyServer.py", line 74, in runScript
          return self.execute_script(script, output_vars, include_index, global_env, csv_output_file)
        File "C:\Users\hdpuser1\AppData\Local\Temp\1631888361773-0\pdiPyServer.py", line 222, in execute_script
          raise e
        File "C:\Users\hdpuser1\AppData\Local\Temp\1631888361773-0\pdiPyServer.py", line 202, in execute_script
          frame = self.get_script_result_frame(global_env, py_script, output_variables)
        File "C:\Users\hdpuser1\AppData\Local\Temp\1631888361773-0\pdiPyServer.py", line 242, in get_script_result_frame
          raise e
        File "C:\Users\hdpuser1\AppData\Local\Temp\1631888361773-0\pdiPyServer.py", line 238, in get_script_result_frame
          exec (py_script, global_env)
        File "<string>", line 13, in <module>
        File "C:\Users\hdpuser1\AppData\Local\Programs\Python\Python39\lib\site-packages\swifter\__init__.py", line 5, in <module>
          from .swifter import SeriesAccessor, DataFrameAccessor
        File "C:\Users\hdpuser1\AppData\Local\Programs\Python\Python39\lib\site-packages\swifter\swifter.py", line 9, in <module>
          from dask import dataframe as dd
        File "C:\Users\hdpuser1\AppData\Local\Programs\Python\Python39\lib\site-packages\dask\__init__.py", line 3, in <module>
          from .base import annotate, compute, is_dask_collection, optimize, persist, visualize
        File "C:\Users\hdpuser1\AppData\Local\Programs\Python\Python39\lib\site-packages\dask\base.py", line 20, in <module>
          from . import config, local, threaded
        File "C:\Users\hdpuser1\AppData\Local\Programs\Python\Python39\lib\site-packages\dask\threaded.py", line 11, in <module>
          from concurrent.futures import ThreadPoolExecutor
        File "<frozen importlib._bootstrap>", line 1055, in _handle_fromlist
        File "C:\Users\hdpuser1\AppData\Local\Programs\Python\Python39\lib\concurrent\futures\__init__.py", line 49, in __getattr__
          from .thread import ThreadPoolExecutor as te
        File "C:\Users\hdpuser1\AppData\Local\Programs\Python\Python39\lib\concurrent\futures\thread.py", line 37, in <module>
        File "C:\Users\hdpuser1\AppData\Local\Programs\Python\Python39\lib\threading.py", line 1374, in _register_atexit
          raise RuntimeError("can't register atexit after shutdown")
      RuntimeError: can't register atexit after shutdown

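      This seems to be an issue with py4j, the Java-to-Python bridge the step uses to execute Python code: importing swifter pulls in dask, which in turn imports concurrent.futures, and that import fails in this environment. As it stands, swifter does not appear to be supported in the Python Executor step.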
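
      For anyone triaging this, the last frames of the traceback can be looked at outside PDI. In CPython 3.9 and later, threading._register_atexit raises this RuntimeError once the main thread has begun shutting down, and concurrent.futures.thread calls it at import time. The sketch below is a minimal illustration of that behaviour only; it assumes (this is not confirmed from pdiPyServer.py) that the script runs on a py4j callback thread after the main thread has already finished. The names CHILD_SCRIPT and run_child are hypothetical.

```python
import subprocess
import sys

# Hypothetical child script: the main thread ends right after starting a
# non-daemon worker, so the worker's first import of concurrent.futures
# happens while the interpreter is already shutting down.
CHILD_SCRIPT = """
import threading
import time

def late_import():
    time.sleep(0.5)  # give the main thread time to finish
    try:
        from concurrent.futures import ThreadPoolExecutor
        print("import ok")
    except RuntimeError as exc:
        print("RuntimeError:", exc)

threading.Thread(target=late_import).start()
"""

def run_child():
    # Run the child in a fresh interpreter so nothing has imported
    # concurrent.futures.thread beforehand.
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    print(run_child())
```

      On interpreters with the shutdown check, the child is expected to report the same "can't register atexit after shutdown" message; on older versions the import may simply succeed. If the assumption about the callback thread holds, any library that first imports concurrent.futures inside the script would be affected, not only swifter.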




            Unassigned Unassigned
            gdev Gurudev