Pentaho Data Mining - Weka / DATAMINING-782

EM crashing on very large numbers due to invalid number of KMeans clusters


    Details

    • Type: Bug
    • Status: Open
    • Severity: Unknown
    • Resolution: Unresolved
    • Affects Version/s: Master
    • Fix Version/s: None
    • Component/s: Weka packages
    • Labels:
      None
    • Environment:
      weka version: 3.9.4
    • Story Points:
      0
    • Steps to Reproduce:

      Load [this dataset|https://user.informatik.uni-goettingen.de/~sherbol/MaxDouble.arff].

      Train the EM clusterer on this dataset (via buildClusterer), then assign every instance of the same dataset to a cluster with clusterInstance and distributionForInstance.

       


      Description

      EM seems to fail when clustering data with values close to the maximum double (values > 10^306). It apparently does not check whether the number of clusters it passes to SimpleKMeans during initialization is 0. In my case this causes the following exception, which I think the EM class should handle itself:

      "java.lang.Exception: Number of clusters must be > 0
      at weka.clusterers.SimpleKMeans.setNumClusters(SimpleKMeans.java:1278)
      at weka.clusterers.EM.EM_Init(EM.java:851)
      at weka.clusterers.EM.iterate(EM.java:1997)
      at weka.clusterers.EM.doEM(EM.java:1810)
      at weka.clusterers.EM.buildClusterer(EM.java:1701)
      at weka.clusterers.WEKA_EM_AtomlTest.test_MaxDouble(WEKA_EM_AtomlTest.java:4628)
      at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.base/java.lang.reflect.Method.invoke(Method.java:566)
      at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
      at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
      at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
      at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
      at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
      at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      at java.base/java.lang.Thread.run(Thread.java:834)
      "
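
      As an illustration of why values above 10^306 are numerically fragile, here is a minimal, Weka-independent Java sketch. It is an assumption that statistics of this kind are involved in the failure; the report does not confirm where inside EM the cluster count becomes 0. The class and method names (LargeValueVarianceDemo, naiveVariance) are hypothetical and not part of Weka.

```java
public class LargeValueVarianceDemo {

    // Textbook population variance: E[x^2] - (E[x])^2.
    // Hypothetical helper for illustration only; not Weka code.
    static double naiveVariance(double[] values) {
        double sum = 0.0;
        double sumOfSquares = 0.0;
        for (double v : values) {
            sum += v;
            // v * v overflows to Infinity once |v| exceeds roughly 1.34e154
            sumOfSquares += v * v;
        }
        double mean = sum / values.length;
        // For huge inputs this is Infinity - Infinity, which is NaN
        return sumOfSquares / values.length - mean * mean;
    }

    public static void main(String[] args) {
        double[] small = {1.0, 2.0, 3.0};
        double[] huge = {1.0e307, 1.5e307, 1.2e307};

        System.out.println("variance (small values): " + naiveVariance(small));

        double v = naiveVariance(huge);
        System.out.println("variance (huge values) is NaN: " + Double.isNaN(v));

        // Every ordered comparison against NaN is false, so a loop or
        // model-selection step that relies on "value > threshold" can
        // silently take the wrong branch.
        System.out.println("NaN > 0.0 evaluates to: " + (v > 0.0));
    }
}
```

      If a computation of this kind feeds the choice of the number of clusters, a NaN or infinite intermediate value could plausibly lead to an invalid count. A check on the count before it reaches SimpleKMeans.setNumClusters would at least allow a clearer error message.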


            People

            Assignee:
            project admin Triage
            Reporter:
            thaar Tobias Haar