Details

    • Type: New Feature
    • Status: Closed
    • Priority: Severe
    • Resolution: Fixed
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None
    • Customer Case:
    • QA Validation Status:
      Validated by QA

      Description

      Add JdbcConnectionUuid connect string parameter.

      === From an email to mondrian dev list ===

      It's important to me that connection factories (the means by which Mondrian gets JDBC connections to the underlying databases, which include instances of javax.sql.DataSource, or (URL, username) credentials) can be represented as strings. It was a mistake to allow javax.sql.DataSource objects to be passed into Mondrian when creating a connection via the legacy API. olap4j made it more difficult to pass in non-Strings, and that made life painful for some people. I thought it would be possible to just register DataSources in JNDI and pass in the JNDI name, but as Marc pointed out, Pentaho has to run in containers (such as Tomcat) with read-only JNDI environments.

      Mondrian already has a DataSourceResolver SPI. This is important, and this works. The one thing it doesn't do is tell Mondrian whether two data sources point to the same database.

      Consider setting up a distributed cache. It's important that all of the participating instances of Mondrian know that they are looking at the same database instance. If they don't know it's the same database, they can't safely share their cache. If we used an SPI to determine equality, it would be difficult to ensure that the same SPI is being used on all machines. When I'm answering a support call, it's easy to forget to ask whether someone has overridden the default implementation of the SPI.

      So, how to tell whether two connection factories are the same, without introducing an SPI? We introduce a new connect string parameter, JdbcConnectionUuid. (This complements existing parameters Jdbc, JdbcUser, JdbcPassword and DataSource.) If two mondrian connections have the same JdbcConnectionUuid, Mondrian will take the client at its word that the back-end databases are identical. It will not consider the other parameters in determining equality.

      Determining whether two schemas are equal, and therefore candidates for sharing a cache, comes down to two parts: Are the connection factories equal (using JdbcConnectionUuid etc. as described above)? And are the contents of the XML schema files equal (using UseContentChecksum, Catalog, CatalogContent, DynamicSchemaProcessor, as today)? Both of these questions are answered by looking at a string.

      JdbcConnectionUuid is optional in the connection parameters. If not specified, Mondrian would use the same connection factory matching rules as today. (Internally, Mondrian will generate a Uuid so that all connections have one.)
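      The matching rule described above can be sketched in a few lines of Java. This is an illustration only, not Mondrian's implementation: the class ConnectStringSketch and its methods are hypothetical, the parser is deliberately naive (it does not handle quoted values or semicolons inside a JDBC URL), and the fallback comparison of Jdbc, JdbcUser and DataSource is a simplified stand-in for the existing matching rules. It also assumes that a connection with an explicit JdbcConnectionUuid never matches one without.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper, not part of Mondrian.
class ConnectStringSketch {

    /** Splits "Key=Value;Key=Value" into a map with lower-cased keys. */
    static Map<String, String> parse(String connectString) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String part : connectString.split(";")) {
            int eq = part.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed segments
            }
            String key = part.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            props.put(key, part.substring(eq + 1).trim());
        }
        return props;
    }

    /** True if the two connect strings should be treated as the same database. */
    static boolean sameDatabase(String a, String b) {
        Map<String, String> pa = parse(a);
        Map<String, String> pb = parse(b);
        String ua = pa.get("jdbcconnectionuuid");
        String ub = pb.get("jdbcconnectionuuid");
        if (ua != null || ub != null) {
            // A UUID was supplied: take the client at its word, ignore the rest.
            return Objects.equals(ua, ub);
        }
        // Assumed simplification of the pre-existing matching rules.
        return Objects.equals(pa.get("jdbc"), pb.get("jdbc"))
            && Objects.equals(pa.get("jdbcuser"), pb.get("jdbcuser"))
            && Objects.equals(pa.get("datasource"), pb.get("datasource"));
    }

    public static void main(String[] args) {
        String a = "Provider=mondrian;Jdbc=jdbc:mysql://host-a/foodmart;JdbcUser=alice;JdbcConnectionUuid=abc";
        String b = "Provider=mondrian;Jdbc=jdbc:mysql://host-b/foodmart;JdbcUser=bob;JdbcConnectionUuid=abc";
        System.out.println(sameDatabase(a, b));
    }
}
```

      In this sketch the two strings in main differ in host and user but share a JdbcConnectionUuid, so they are treated as the same database.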

      As its name suggests, it's a good idea if JdbcConnectionUuid is a UUID. But it doesn't need to be. It could be an MD5 hash. It could be anything the user likes. They should just make damn sure that it is unique.

      When we implement http://jira.pentaho.com/browse/MONDRIAN-1177, we will provide a means to define the UUID alongside the connection credentials.

        Issue Links

          Activity

          Julian Hyde created issue -
          Jared Cornelius made changes -
          Field Original Value New Value
          Status Open [ 1 ] Open [ 1 ]
          Priority Unknown [ 7 ] Severe [ 3 ]
          Assignee Triage [ project admin ] Julian Hyde [ jhyde ]
          Fix Version/s 3.4.x (4.8.0 GA Suite Release) [ 11481 ]
          Carlos Lopez made changes -
          Customer Case 26464
          Will Gorman made changes -
          Assignee Julian Hyde [ jhyde ] Pedro Alves [ pmalves ]
          Tiago Gomes Ferreira added a comment -
          Have some doubts on this:
          Should the current way of getting the schema-dependent part of the key be kept?
          When JdbcConnectionUuid isn't provided, should the behavior be exactly the same as before?

          To better frame the question, this is how RolapSchema.Pool.get returns a Schema:

           1. Always a new one if UseSchemaPool=false
           2. From the mapMd5ToSchema using the schema's md5 to fetch if UseContentChecksum=true
           3. From mapUrlToSchema using the schema key otherwise.

          and how the RolapSchema key is currently generated:

           1. Full schema xml if either CatalogContent or DynamicSchemaProcessor are provided.
           2. <catalogUrl>.external#<dataSourceInstanceId> if dataSource is provided
           3. <catalogUrl>.<connectionKey>.<user>.<dataSourceString> otherwise

          For the purpose of cache sharing, having the JdbcConnectionUuid enables us to merge the last two cases, but keeping the current way of building the schema part of the key still leaves two incompatible key domains for schemas that may be equal and use the same database.
          Using the full xml has the added problem of bloating log files.

          So only if always using both JdbcConnectionUuid and UseContentChecksum=true would we have a good chance of using the same cache for the same database/schema. Is this right?
          Julian Hyde added a comment -
          Internally it should use (connection key, schema key) in all 3 cases. Sounds like that changes behavior in #1 -- which is a good thing, since they could theoretically have provided the same catalog content on different databases.

          Even if they don't provide a JdbcConnectionUuid, you can create one internally (e.g. using an MD5 hash of user name and JDBC connect string, if that's what they provided). That way all connections are identified using the Uuid.

          Does that answer your questions?
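          A minimal sketch of the (connection key, schema key) idea from this comment follows. It is not the code that was committed: the class SchemaPoolKey and its method names are hypothetical, and it uses java.util.UUID.nameUUIDFromBytes (a name-based UUID built on MD5) as one possible way to derive an identifier from the user name and JDBC connect string when no JdbcConnectionUuid is supplied. How the schema key itself is computed (catalog URL, content checksum, and so on) is left to the caller.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;
import java.util.UUID;

// Hypothetical composite key for a schema pool; not Mondrian's actual class.
final class SchemaPoolKey {
    final String connectionKey;
    final String schemaKey;

    SchemaPoolKey(String connectionKey, String schemaKey) {
        this.connectionKey = Objects.requireNonNull(connectionKey);
        this.schemaKey = Objects.requireNonNull(schemaKey);
    }

    /**
     * Returns the explicit UUID if one was given; otherwise derives a stable,
     * MD5-based identifier from the user name and JDBC connect string.
     */
    static String connectionKey(String jdbcConnectionUuid, String jdbcUser, String jdbcUrl) {
        if (jdbcConnectionUuid != null && !jdbcConnectionUuid.isEmpty()) {
            return jdbcConnectionUuid;
        }
        String material = Objects.toString(jdbcUser, "") + '\n' + Objects.toString(jdbcUrl, "");
        return UUID.nameUUIDFromBytes(material.getBytes(StandardCharsets.UTF_8)).toString();
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SchemaPoolKey)) {
            return false;
        }
        SchemaPoolKey other = (SchemaPoolKey) o;
        return connectionKey.equals(other.connectionKey) && schemaKey.equals(other.schemaKey);
    }

    @Override
    public int hashCode() {
        return Objects.hash(connectionKey, schemaKey);
    }

    public static void main(String[] args) {
        String derived = connectionKey(null, "alice", "jdbc:mysql://db/foodmart");
        System.out.println(new SchemaPoolKey(derived, "schema-checksum").hashCode());
    }
}
```

          Because the derived identifier depends only on its inputs, two instances given the same user name and connect string arrive at the same connection key; the same catalog content on a different database yields a different composite key.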
          Julian Hyde added a comment -
          Fixed in master branch as of https://github.com/pentaho/mondrian/commit/5bd12011bf0e834e4ff027e9622c95d83a552d9b. (Merged webdetails' fix, then refactored.)
          Julian Hyde made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Brandon Bruce made changes -
          Assignee Pedro Alves [ pmalves ] Unassigned User [ unassigned ]
          Brandon Bruce made changes -
          Assignee Unassigned User [ unassigned ] Brandon Bruce [ bbruce ]
          Pedro Vale added a comment -
          To validate from the BA Server:

          1. Ensure the latest mondrian.jar is being used
          2. Edit the datasources.xml from:
                      <DataSourceInfo>Provider=mondrian;DataSource=SampleData</DataSourceInfo>
          to
                      <DataSourceInfo>Provider=mondrian;DataSource=SampleData;UseContentChecksum=true;JdbcConnectionUuid=SampleDataUUID</DataSourceInfo>
          3. Using the xmla4js plugin, issue a query on the sampledata catalog
          4. Check the mondrian.log file and search for a line containing

          get: catalog=solution:steel-wheels/analysis/SampleData.mondrian.xml connectionKey=null, jdbcUser=<irrelevant>, dataSourceStr=<irrelevant>, dataSource=<irrelevant>, jdbcConnectionUuid=SampleDataUUID, useSchemaPool=true, useContentChecksum=true, map-size=<irrelevant>, md5-map-size=<irrelevant>

          The relevant part is that both the jdbcConnectionUuid and the useContentChecksum values match what was added to datasources.xml.
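          The check in step 4 could be automated along these lines. This is a rough sketch, assuming the log line has the name=value layout quoted above; the class PoolLogCheck and its methods are hypothetical and not part of Mondrian or the BA Server.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical checker for the "get:" line in mondrian.log.
class PoolLogCheck {
    // name=value pairs; a value ends at the next comma or whitespace.
    private static final Pattern FIELD = Pattern.compile("(\\w+)=([^,\\s]*)");

    /** Returns the value of the named field, or null if it is absent. */
    static String field(String logLine, String name) {
        Matcher m = FIELD.matcher(logLine);
        while (m.find()) {
            if (m.group(1).equals(name)) {
                return m.group(2);
            }
        }
        return null;
    }

    /** True if the line reports the expected UUID and useContentChecksum=true. */
    static boolean matches(String logLine, String expectedUuid) {
        return expectedUuid.equals(field(logLine, "jdbcConnectionUuid"))
            && "true".equals(field(logLine, "useContentChecksum"));
    }

    public static void main(String[] args) {
        String line = "get: catalog=solution:steel-wheels/analysis/SampleData.mondrian.xml "
            + "connectionKey=null, jdbcConnectionUuid=SampleDataUUID, "
            + "useSchemaPool=true, useContentChecksum=true";
        System.out.println(matches(line, "SampleDataUUID"));
    }
}
```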
          Brandon Bruce made changes -
          Assignee Brandon Bruce [ bbruce ] Unassigned User [ unassigned ]
          Tyler Band made changes -
          Assignee Unassigned User [ unassigned ] Tyler Band [ tband ]
          Tyler Band added a comment -
          Unable to complete testing
          Tyler Band made changes -
          Assignee Tyler Band [ tband ] Unassigned User [ unassigned ]
          Brandon Bruce made changes -
          Assignee Unassigned User [ unassigned ] Brandon Bruce [ bbruce ]
          Brandon Bruce added a comment -
          This was verified by running a query through rex. As a note, you must enable mondrian logging before verifying this.
          Brandon Bruce made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          QA Validation Status Not Yet Validated Validated by QA
          Sulaiman Karmali made changes -
          Link This issue is related to PRD-4003 [ PRD-4003 ]
          Kurtis Cruzada made changes -
          Link This issue relates to BISERVER-7641 [ BISERVER-7641 ]
          Transition          Time In Source Status   Execution Times   Last Executer     Last Execution Date
          Open -> Open        5d 19h 43m              1                 Jared Cornelius   06/Sep/12 9:08 AM
          Open -> Resolved    15d 4h 44m              1                 Julian Hyde       21/Sep/12 1:52 PM
          Resolved -> Closed  20d 2h 22m              1                 Brandon Bruce     11/Oct/12 4:14 PM

            People

            • Assignee:
              Brandon Bruce
              Reporter:
              Julian Hyde
            • Votes:
              2
              Watchers:
              5

              Dates

              • Created:
                Updated:
                Resolved: