Details

    • Type: Bug
    • Status: Closed
    • Severity: High
    • Resolution: Fixed
    • Affects Version/s: 1.7.1.GA
    • Fix Version/s: 3.0.0.GA
    • Component/s: Scheduler
    • Labels:
      None

      Description

      We have been running Pentaho 1.7.1.GA for quite a while, executing a Quartz job every minute.
      After some days we see memory use climb steadily, up to the point of hitting an OutOfMemoryError (OOM).

      The leaked objects are instances of StandaloneSession, SolutionRepository, and RuntimeRepository.

      The leak is caused by QuartzExecute, which creates a new StandaloneSession for every job execution.
      When building the output, we eventually end up in PentahoSystem.getSolutionRepository() and PentahoSystem.getRuntimeRepository(), each of which creates a new object because the scope of both repositories is "session" in our (default) configuration.

      Both SolutionRepository and RuntimeRepository (as well as ContentRepository) have a setSession() method that puts the current session into a ThreadLocal.
      If this is called with the StandaloneSession created by QuartzExecute, we have a leak:

      The thread's ThreadLocalMap will now refer to the session (via both the new SolutionRepository and the new RuntimeRepository), and the session refers to the repositories via its attributes member.
      This cycle is never cleared, so none of these objects can be garbage collected.
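
To make the reference chain concrete, here is a minimal sketch of the cycle. All class names are hypothetical stand-ins, not the real Pentaho classes, and it assumes each repository instance holds its own ThreadLocal field, which is how we read the code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for StandaloneSession: just an attribute map.
class SketchSession {
    final Map<String, Object> attributes = new HashMap<>();
}

// Hypothetical stand-in for a session-scoped repository.
class SketchRepository {
    // Assumption: one ThreadLocal per repository instance.
    private final ThreadLocal<SketchSession> threadSession = new ThreadLocal<>();

    void setSession(SketchSession session) {
        threadSession.set(session);
    }

    SketchSession getSession() {
        return threadSession.get();
    }
}

class LeakCycleSketch {
    // Mimics one scheduled job execution on the calling (worker) thread.
    static SketchRepository executeJob() {
        SketchSession session = new SketchSession();
        SketchRepository repository = new SketchRepository();
        // session -> repository: strong reference via the attributes map.
        session.attributes.put("repository", repository);
        // thread's ThreadLocalMap -> session: strong reference as the entry value.
        repository.setSession(session);
        // Nothing removes the ThreadLocal entry when the job ends.
        // The repository is returned here only so the caller can inspect it.
        return repository;
    }

    public static void main(String[] args) {
        SketchRepository repository = executeJob();
        SketchSession retained = repository.getSession();
        System.out.println("session still bound to thread: " + (retained != null));
        System.out.println("session points back at repository: "
                + (retained.attributes.get("repository") == repository));
    }
}
```

The ThreadLocalMap only holds the ThreadLocal key weakly, but the entry value (the session) strongly reaches the repository and with it the ThreadLocal itself, so the key never becomes collectable as long as the worker thread lives.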

      Workaround that seems to work for us: set the scope of the repositories to "global" in pentaho.xml.

          Activity

          kcruzada Kurtis Cruzada added a comment - - edited

          Please do not use a globally scoped repository. This needs to be fixed in code.
          BTW, this is the best bug report ever... thank you for your detail.

          olivier_fh2 Olivier Toromanoff added a comment -

          Why is this workaround not a solution?
          What kind of "bad things" can happen if we use a globally scoped repository?
          Thanks.

          ankon Andreas Kohn added a comment -

          One bad thing seems to be that the pentaho/Navigate page no longer works properly. It seems SolutionRepositoryBase.getSession() returned null, leading to this trace:

          java.lang.NullPointerException
                  at org.pentaho.core.cache.CacheManager.getCorrectedKey(CacheManager.java:444)
                  at org.pentaho.core.cache.CacheManager.getFromSessionCache(CacheManager.java:326)
                  at org.pentaho.core.repository.SolutionRepositoryBase.getRepositoryObjectFromCache(SolutionRepositoryBase.java:627)
                  at org.pentaho.repository.filebased.solution.SolutionRepository.getFileListIterator(SolutionRepository.java:641)
                  at org.pentaho.repository.filebased.solution.SolutionRepository.getActionSequences(SolutionRepository.java:578)
                  at org.pentaho.repository.filebased.solution.SolutionRepository.getNavigationUIDocument(SolutionRepository.java:795)
                  at org.pentaho.ui.component.NavigationComponent.getXmlContent(NavigationComponent.java:94)
                  at org.pentaho.ui.XmlComponent.getContent(XmlComponent.java:46)
                  at org.apache.jsp.jsp.Admin_jsp.getAdminLinks(Admin_jsp.java:46)
          
          ankon Andreas Kohn added a comment -

          BISERVER-2639-mk1.diff contains a first attempt to break the cycle between StandaloneSession and the repositories.

          We are currently testing this, but so far it seems the sessions no longer leak.

          Change detail:
          ISessionContainer: new interface that supports setSession(IPentahoSession)
          IRuntimeRepository, IContentRepository, SolutionRepositoryBase: retrofit "implements ISessionContainer", the setSession() method already existed
          RuntimeRepository: handle setSession(null). XXX: this commits the transaction after disassociating from the session; not sure if it should do that
          QuartzExecute: destroy() the session after use
          StandaloneSession: destroy() calls setSession(null) on all ISessionContainer attributes
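
A simplified sketch of the idea behind these changes follows. It is not the patch itself: the types are hypothetical stand-ins, setSession() takes the sketch session type rather than IPentahoSession, and the try/finally placement is an assumption about what "after use" should mean.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified version of the new interface (the patch uses IPentahoSession).
interface ISessionContainer {
    void setSession(SessionSketch session);
}

// Hypothetical stand-in for StandaloneSession.
class SessionSketch {
    private final Map<String, Object> attributes = new HashMap<>();

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
    }

    // Detach this session from every attribute that may hold it.
    void destroy() {
        for (Object value : attributes.values()) {
            if (value instanceof ISessionContainer) {
                ((ISessionContainer) value).setSession(null);
            }
        }
    }
}

// Hypothetical stand-in for a session-scoped repository.
class RepositorySketch implements ISessionContainer {
    // Assumption: one ThreadLocal per repository instance.
    private final ThreadLocal<SessionSketch> threadSession = new ThreadLocal<>();

    @Override
    public void setSession(SessionSketch session) {
        if (session == null) {
            // Drop the entry from the current thread's ThreadLocalMap.
            threadSession.remove();
        } else {
            threadSession.set(session);
        }
    }

    SessionSketch getSession() {
        return threadSession.get();
    }
}

// Hypothetical stand-in for QuartzExecute.
class QuartzExecuteSketch {
    static RepositorySketch execute() {
        SessionSketch session = new SessionSketch();
        RepositorySketch repository = new RepositorySketch();
        try {
            session.setAttribute("repository", repository);
            repository.setSession(session);
            // ... the actual job work would run here ...
        } finally {
            // Break the cycle on the same thread that created it.
            session.destroy();
        }
        // Returned only so the caller can inspect the state afterwards.
        return repository;
    }
}
```

Note that destroy() has to run on the thread that called setSession(), since ThreadLocal.remove() only affects the calling thread's map.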

          ankon Andreas Kohn added a comment -

          The patch works for us in testing, please include this in the next release.

          Thank you!

          bseyler Bill Seyler added a comment -

          Implemented the community contribution (with some cleanup). It works as advertised and was a well-thought-out contribution (thanks Andreas).

          To test:
          (1) Start the Pentaho platform.
          (2) Create a report to run often (every minute).
          (3) After the platform has been running for about 5 minutes, use a profiling tool to examine the amount of memory that is being used.
          (4) Let the platform continue to run for several hours.
          (5) Re-examine the amount of memory that is being used by the platform.
          (6) Verify that the two memory readings do not differ significantly.
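
If no profiler is at hand, the two readings in steps (3) and (5) could be approximated with the standard java.lang.management API, as in the rough sketch below. The class name, the helper names, and the tolerance parameter are all assumptions for illustration, and the code would have to run inside the platform's JVM to measure anything useful.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

class HeapSampleSketch {
    // Request a garbage collection, then report the heap bytes still in use.
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc(); // effectively equivalent to System.gc(); only a request
        return memory.getHeapMemoryUsage().getUsed();
    }

    // True if the later reading exceeds the baseline by no more than the
    // given fraction (for example 0.10 for 10 percent).
    static boolean withinTolerance(long baseline, long later, double tolerance) {
        return later <= baseline * (1.0 + tolerance);
    }

    public static void main(String[] args) {
        long baseline = usedHeapBytes();
        // ... hours of scheduled job executions would happen here ...
        long later = usedHeapBytes();
        System.out.println("baseline=" + baseline + " later=" + later
                + " ok=" + withinTolerance(baseline, later, 0.10));
    }
}
```

A single pair of readings is noisy, so a profiler's heap histogram (looking for a growing count of StandaloneSession instances) remains the more reliable check.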

          jpshedesky Jared Pshedesky (Inactive) added a comment -

          I'm running this test now.

          jpshedesky Jared Pshedesky (Inactive) added a comment -

          Validated. I ran a once per minute report for 3 hours and the memory moved not a smidgen.

          ankon Andreas Kohn added a comment -

          Thanks for verifying this.


            People

            • Assignee:
              jpshedesky Jared Pshedesky (Inactive)
              Reporter:
              ankon Andreas Kohn
            • Votes:
              2
              Watchers:
              3

              Dates

              • Created:
                Updated:
                Resolved: