Pentaho Data Integration - Kettle / PDI-15910

Allow Subjob Loops Where Number of Loops Is Unknown Prior to Runtime (the end of the job loop depends on a condition)


    Details

    • Type: Improvement
    • Status: Open
    • Severity: High
    • Resolution: Unresolved
    • Affects Version/s: 6.0.0 GA, 6.1.0 GA, 7.0.0 GA
    • Fix Version/s: Backlog
    • Component/s: Job, Job Entry
    • Labels:
    • Story Points:
      0

      Description

      The customer would like to run a subjob as a loop using an alternative to the Execute Every Row option. One suggestion that would be consistent with the rest of the product is to let the user start a subjob loop by selecting Repeat in the Start entry. We say "consistent" because selecting Repeat in the Start entry is one of the recommended ways to loop a job.

      BACKGROUND AND USE CASE

      Background

      In subjobs, a user can perform a loop by using the Execute Every Row option. However, when the number of subjob loops does not depend on the number of rows passed from a previous subjob or transformation, and instead depends on something not known until runtime (such as the return from an asynchronous REST call), the loop is difficult to implement in PDI.

      This impacts usability. Instead of simply checking the Repeat option in the Start entry, as they would for a job, the user has to either redesign the logic so that the subjob is pushed up into the main job, or schedule the job to run repeatedly, which can cause issues in real-time or near-real-time scenarios.

      Please see the Use Case for more details.

      Use Case

      A subjob is kicked off when a Jenkins job completes. The subjob and primary job must run and complete within 5 minutes of the Jenkins job's completion. Since Jenkins jobs can finish at any time, the customer would like to use the loop functionality to check for job completion every 30 seconds.
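
      To illustrate the kind of loop being requested (one whose iteration count is decided by a condition at runtime rather than by a row count), here is a minimal Python sketch. It is not PDI code and does not use any PDI or Jenkins API. The function name poll_until, its parameters, and the timeout are hypothetical; only the 30-second interval comes from the use case above. The timeout is an assumed safety bound so the loop cannot run forever.

```python
import time
from typing import Callable


def poll_until(check: Callable[[], bool],
               interval_s: float = 30.0,   # interval taken from the use case
               timeout_s: float = 300.0,   # assumed safety bound, not from the ticket
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> int:
    """Call check() repeatedly until it returns True or the timeout elapses.

    Returns the number of checks performed. Raises TimeoutError if the next
    wait would pass the deadline. sleep and clock are injectable so the loop
    can be exercised without real waiting.
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        # The exit condition is evaluated at runtime on every pass,
        # so the number of iterations is not known in advance.
        if check():
            return attempts
        # Stop if another full interval would overrun the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(f"condition not met after {attempts} checks")
        sleep(interval_s)
```

      In this sketch, check stands in for whatever tells the subjob that the Jenkins job has completed (for example, a function wrapping a REST call). A call such as poll_until(my_check) would return as soon as my_check() first returns True.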

              People

              Assignee:
              Unassigned
              Reporter:
              cbrathwaite Chantel Brathwaite (Inactive)
              Votes:
              0
              Watchers:
              2
