  1. Pentaho Data Mining - Weka
  2. DATAMINING-781

Bug in SMO or Standardization with values close to MaxDouble

    Details

    • Type: Bug
    • Status: Open
    • Severity: Unknown
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Story Points:
      0
    • Steps to Reproduce:
      • Download this file
      • Run the SMO with the following parameters: -P 1E-12 -calibrator "weka.classifiers.functions.Logistic -R 1.0E-8 -M -1 -num-decimal-places 4" -C 1.0 -V -1 -K "weka.classifiers.functions.supportVector.PolyKernel -E 1.0 -C 250007" -N 1

      Command line call:

      java -Xmx1524m -classpath weka.jar weka.classifiers.functions.SMO -P 1E-12 -calibrator "weka.classifiers.functions.Logistic -R 1.0E-8 -M -1 -num-decimal-places 4" -C 1.0 -V -1 -K "weka.classifiers.functions.supportVector.PolyKernel -E 1.0 -C 250007" -N 1 -t C:\Users\sherbold\git\atoml\generated-tests\weka\src\test\resources\smokedata\MaxDouble_1_training.arff
      

      Description

      Values close to the maximum double value (Double.MAX_VALUE) crash either SMO or the unsupervised Standardize filter it invokes:

      java.lang.Exception: A NaN value was generated while standardizing attribute feature_0
          at weka.filters.unsupervised.attribute.Standardize.convertInstance(Standardize.java:247)
          at weka.filters.unsupervised.attribute.Standardize.batchFinished(Standardize.java:169)
          at weka.filters.Filter.useFilter(Filter.java:708)
          at weka.classifiers.functions.SMO.buildClassifier(SMO.java:1386)
          at weka.classifiers.evaluation.Evaluation.evaluateModel(Evaluation.java:1623)
          at weka.classifiers.Evaluation.evaluateModel(Evaluation.java:668)
          at weka.classifiers.AbstractClassifier.runClassifier(AbstractClassifier.java:141)
          at weka.classifiers.functions.SMO.main(SMO.java:2301)
      

      Strangely, the failure depends on the hyperparameters and vanishes when they are modified, which suggests that this is a corner case.
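
      One plausible way a NaN can arise while standardizing such values is plain floating-point overflow: if the sum of an attribute exceeds Double.MAX_VALUE, the mean becomes Infinity, the standard deviation becomes Infinity as well, and (x - mean) / stdDev evaluates to -Infinity / Infinity, which is NaN. The sketch below illustrates only this arithmetic. It is not Weka's Standardize code, and the class name, method name, and sample values are hypothetical (the contents of the attached ARFF file are not shown in this report).

```java
import java.util.Arrays;

// Hypothetical illustration; NOT the implementation used by weka.filters.unsupervised.attribute.Standardize.
class StandardizeOverflowSketch {

    // Naive z-score standardization using a plain sum for the mean
    // and the sample variance (n - 1 in the denominator).
    static double[] standardize(double[] values) {
        int n = values.length;

        double sum = 0.0;
        for (double v : values) {
            sum += v; // overflows to Infinity once the running sum exceeds Double.MAX_VALUE
        }
        double mean = sum / n;

        double sumSq = 0.0;
        for (double v : values) {
            double d = v - mean; // finite - Infinity = -Infinity
            sumSq += d * d;      // Infinity
        }
        double stdDev = Math.sqrt(sumSq / (n - 1));

        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = (values[i] - mean) / stdDev; // -Infinity / Infinity = NaN
        }
        return out;
    }

    public static void main(String[] args) {
        // Ordinary values standardize as expected.
        double[] small = {1.0, 2.0, 3.0};
        System.out.println(Arrays.toString(standardize(small))); // [-1.0, 0.0, 1.0]

        // Assumed sample values near the top of the double range.
        double[] huge = {Double.MAX_VALUE, Double.MAX_VALUE / 2, Double.MAX_VALUE};
        System.out.println(Arrays.toString(standardize(huge))); // [NaN, NaN, NaN]
    }
}
```

      Whether this is the exact path taken inside Weka for the attached data is an assumption; it would not by itself explain why the failure disappears when the hyperparameters change.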

            People

            Assignee:
            project admin Triage
            Reporter:
            sherbold Steffen Herbold