Group: nz.ac.waikato.cms.weka - All Dependencies


complementNaiveBayes · Class for building and using a Complement class Naive Bayes classifier. For more information see: Jason D. Rennie, Lawrence Shih, Jaime Teevan, David R. Karger: Tackling the Poor Assumptions of Naive Bayes Text Classifiers. In: ICML, 616-623, 2003. P.S.: TF, IDF and length normalization transforms, as described in the paper, can be performed through weka.filters.unsupervised.StringToWordVector.

Apr 29, 2014

clojureClassifier · Wrapper classifier for classifiers written in the Clojure language.

Apr 28, 2014
citationKNN 1.0.2

citationKNN · Modified version of the Citation kNN multi-instance classifier. For more information see: Jun Wang, Jean-Daniel Zucker: Solving Multiple-Instance Problem: A Lazy Learning Approach. In: 17th International Conference on Machine Learning, 1119-1125, 2000.

Apr 26, 2012

chiSquaredAttributeEval · Evaluates the worth of an attribute by computing the value of the chi-squared statistic with respect to the class.

Apr 27, 2014
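
The chi-squared statistic that chiSquaredAttributeEval is described as computing can be illustrated with a minimal sketch. This is not Weka's implementation: the function name `chi_squared` and the restriction to nominal attributes with no missing values are assumptions made for illustration.

```python
from collections import Counter


def chi_squared(attribute_values, class_labels):
    """Chi-squared statistic of a nominal attribute with respect to the class.

    Hypothetical helper for illustration; assumes no missing values.
    """
    n = len(attribute_values)
    # Observed count for every (attribute value, class) pair.
    observed = Counter(zip(attribute_values, class_labels))
    attr_totals = Counter(attribute_values)
    class_totals = Counter(class_labels)

    statistic = 0.0
    for a, a_count in attr_totals.items():
        for c, c_count in class_totals.items():
            # Expected count if attribute and class were independent.
            expected = a_count * c_count / n
            diff = observed[(a, c)] - expected
            statistic += diff * diff / expected
    return statistic
```

A larger value indicates a stronger association between the attribute and the class, so attributes can be ranked by this score.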

bestFirstTree · Class for building a best-first decision tree classifier. This class uses binary split for both nominal and numeric attributes. For missing values, the method of 'fractional' instances is used. For more information, see: Haijian Shi (2007). Best-first decision tree learning. Hamilton, NZ. Jerome Friedman, Trevor Hastie, Robert Tibshirani (2000). Additive logistic regression: A statistical view of boosting. Annals of Statistics. 28(2):337-407.

Apr 27, 2014
CLOPE 1.0.2

CLOPE · Yiling Yang, Xudong Guan, Jinyuan You: CLOPE: a fast and effective clustering algorithm for transactional data. In: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, 682-687, 2002.

Apr 26, 2012
winnow 1.0.2

winnow · Implements Winnow and Balanced Winnow algorithms by Littlestone. For more information, see N. Littlestone (1988). Learning quickly when irrelevant attributes abound: A new linear threshold algorithm. Machine Learning. 2:285-318; N. Littlestone (1989). Mistake bounds and logarithmic linear-threshold learning algorithms. University of California, Santa Cruz. Does classification for problems with nominal attributes (which it converts into binary attributes).

Apr 26, 2012
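
As a rough illustration of the mistake-driven, multiplicative update behind Winnow, here is a minimal sketch over binary attributes. It is not the package's code: the function names, the promotion factor `alpha=2.0`, the threshold equal to the number of attributes, and the epoch limit are all assumptions, and Balanced Winnow is not covered.

```python
def predict(weights, threshold, x):
    """Predict 1 if the summed weight of the active attributes exceeds the threshold."""
    total = sum(w for w, xi in zip(weights, x) if xi)
    return 1 if total > threshold else 0


def train_winnow(examples, n_attributes, alpha=2.0, epochs=10):
    """Train on (binary_vector, label) pairs; returns (weights, threshold).

    Hypothetical helper: alpha, the threshold choice and epochs are assumptions.
    """
    weights = [1.0] * n_attributes
    threshold = float(n_attributes)
    for _ in range(epochs):
        mistakes = 0
        for x, label in examples:
            if predict(weights, threshold, x) == label:
                continue  # weights change only on mistakes
            mistakes += 1
            # Promote active attributes on a missed positive, demote on a false positive.
            factor = alpha if label == 1 else 1.0 / alpha
            weights = [w * factor if xi else w for w, xi in zip(weights, x)]
        if mistakes == 0:
            break
    return weights, threshold
```

Because updates are multiplicative, weights of relevant attributes grow quickly while irrelevant ones stay small, which is the property the cited 1988 paper title refers to.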

votingFeatureIntervals · Classification by voting feature intervals. Intervals are constructed around each class for each attribute (basically discretization). Class counts are recorded for each interval on each attribute. Classification is by voting. For more info see: G. Demiroz, A. Guvenir: Classification by voting feature intervals. In: 9th European Conference on Machine Learning, 85-92, 1997.

Apr 26, 2012
WekaODF 1.0.4

WekaODF · WekaODF adds support to directly read from and write to spreadsheets in ODF (Open Document Format for Office Applications, ISO/IEC 26300:2006) format. ODF is used by the OpenOffice.org suite, for instance. WekaODF uses jOpenDocument (http://www.jOpenDocument.org, GPL) in order to read/write ODF spreadsheets.

May 13, 2012

userClassifier · Interactively classify through visual means. You are presented with a scatter graph of the data against two user-selectable attributes, as well as a view of the decision tree. You can create binary splits by creating polygons around data plotted on the scatter graph, as well as by allowing another classifier to take over at points in the decision tree should you see fit. For more information see: Malcolm Ware, Eibe Frank, Geoffrey Holmes, Mark Hall, Ian H. Witten (2001). Interactive machine learning: letting users build classifiers. Int. J. Hum.-Comput. Stud. 55(3):281-292.

Apr 25, 2014

timeSeriesFilters · Provides a set of filters for time series. Currently contains PAA and SAX transformation filters and a filter that converts symbolic time series to string attribute values. The time series need to be given as values of a relation-valued attribute in the ARFF file. For example data in ARFF format, check the data directory of this package.

Feb 02, 2015

tabuAndScatterSearch · Search methods contributed by Adrian Pino (ScatterSearchV1, TabuSearch). ScatterSearch: Performs a Scatter Search through the space of attribute subsets. Starts with a population of many significant and diverse subsets, and stops when the result is higher than a given threshold or there is no further improvement. For more information see: Felix Garcia Lopez (2004). Solving feature subset selection problem by a Parallel Scatter Search. Elsevier. Tabu Search: Abdel-Rahman Hedar, Jue Wang, Masao Fukushima (2006). Tabu Search for Attribute Reduction in Rough Set Theory.

Apr 26, 2012

SVMAttributeEval · Evaluates the worth of an attribute by using an SVM classifier. Attributes are ranked by the square of the weight assigned by the SVM. Attribute selection for multiclass problems is handled by ranking attributes for each class separately using a one-vs-all method and then "dealing" from the top of each pile to give a final ranking. For more information see: I. Guyon, J. Weston, S. Barnhill, V. Vapnik (2002). Gene selection for cancer classification using support vector machines. Machine Learning. 46:389-422.

Apr 26, 2012
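
The "dealing" step in the SVMAttributeEval description can be sketched as follows. This is one plausible reading, not the package's code: the function name `deal_rankings`, the input layout (one weight list per one-vs-all classifier), and the rule of skipping attributes that have already been dealt are assumptions.

```python
def deal_rankings(per_class_weights):
    """Merge per-class attribute rankings into one final ranking.

    per_class_weights: one list of attribute weights per one-vs-all classifier
    (hypothetical input format). Returns attribute indices, best first.
    """
    n_attributes = len(per_class_weights[0])
    # One "pile" per class: attribute indices sorted by squared weight, largest first.
    piles = [
        sorted(range(n_attributes), key=lambda i: w[i] ** 2, reverse=True)
        for w in per_class_weights
    ]

    final, seen = [], set()
    # Deal one card from each pile in turn, skipping attributes already ranked
    # (the skipping rule is an assumption).
    for rank in range(n_attributes):
        for pile in piles:
            attribute = pile[rank]
            if attribute not in seen:
                seen.add(attribute)
                final.append(attribute)
    return final
```

Ranking by squared weight means the sign of a weight does not matter; only its magnitude determines an attribute's position in each pile.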

simpleEducationalLearningSchemes · Simple learning schemes for educational purposes (Prism, Id3, IB1 and NaiveBayesSimple).

Apr 26, 2012

sequentialInformationalBottleneckClusterer · Cluster data using the sequential information bottleneck algorithm. Note: only the hard clustering scheme is supported. sIB assigns each instance to the cluster that has the minimum cost/distance to the instance. The trade-off beta is set to infinity, so 1/beta is zero. For more information, see: Noam Slonim, Nir Friedman, Naftali Tishby: Unsupervised document classification using sequential information maximization. In: Proceedings of the 25th International ACM SIGIR Conference on Research and Development in Information Retrieval, 129-136, 2002.

Apr 26, 2012

scriptingClassifiers · Wrapper classifiers for Jython and Groovy code. Even though the classifier is serializable, the trained classifier cannot be stored persistently. That is, one cannot store a model file and reload it later to make predictions.

Apr 26, 2012
SPegasos 1.0.2

SPegasos · Implements the stochastic variant of the Pegasos (Primal Estimated sub-GrAdient SOlver for SVM) method of Shalev-Shwartz et al. (2007). This implementation globally replaces all missing values and transforms nominal attributes into binary ones. It also normalizes all attributes, so the coefficients in the output are based on the normalized data. Can either minimize the hinge loss (SVM) or log loss (logistic regression). For more information, see S. Shalev-Shwartz, Y. Singer, N. Srebro: Pegasos: Primal Estimated sub-GrAdient SOlver for SVM. In: 24th International Conference on Machine Learning, 807-814, 2007.

Apr 26, 2012
ridor 1.0.2

ridor · An implementation of a RIpple-DOwn Rule learner. It generates a default rule first and then the exceptions for the default rule with the least (weighted) error rate. Then it generates the "best" exceptions for each exception and iterates until pure. Thus it performs a tree-like expansion of exceptions. The exceptions are a set of rules that predict classes other than the default. IREP is used to generate the exceptions. For more information about Ripple-Down Rules, see: Brian R. Gaines, Paul Compton (1995). Induction of Ripple-Down Rules Applied to Modeling Large Databases. J. Intell. Inf. Syst. 5(3):211-228.

Apr 26, 2012

racedIncrementalLogitBoost · Classifier for incremental learning of large datasets by way of racing logit-boosted committees. For more information see: Eibe Frank, Geoffrey Holmes, Richard Kirkby, Mark Hall: Racing committees for large datasets. In: Proceedings of the 5th International Conference on Discovery Science, 153-164, 2002.

Apr 26, 2012
raceSearch 1.0.2

raceSearch · Races the cross-validation error of competing attribute subsets. Use in conjunction with a ClassifierSubsetEval. RaceSearch has four modes. Forward selection races all single-attribute additions to a base set (initially no attributes), selects the winner to become the new base set, and then iterates until there is no improvement over the base set. Backward elimination is similar, but the initial base set has all attributes included and races all single-attribute deletions. Schemata search is a bit different: in each iteration a series of races is run in parallel. Each race in a set determines whether a particular attribute should be included or not, i.e. the race is between the attribute being "in" or "out". The other attributes for this race are included or excluded randomly at each point in the evaluation. As soon as one race has a clear winner (i.e. it has been decided whether a particular attribute should be in or not), the next set of races begins, using the result of the winning race from the previous iteration as the new base set. Rank race first ranks the attributes using an attribute evaluator and then races the ranking. The race includes no attributes, the top-ranked attribute, the top two attributes, the top three attributes, etc. It is also possible to generate a ranked list of attributes through the forward racing process. If generateRanking is set to true then a complete forward race will be run; that is, racing continues until all attributes have been selected. The order in which they are added determines a complete ranking of all the attributes. Racing uses paired and unpaired t-tests on cross-validation errors of competing subsets. When there is a significant difference between the means of the errors of two competing subsets, the poorer of the two can be eliminated from the race. Similarly, if there is no significant difference between the mean errors of two competing subsets and they are within some threshold of each other, then one can be eliminated from the race.

Apr 26, 2012

probabilisticSignificanceAE · Evaluates the worth of an attribute by computing the Probabilistic Significance as a two-way function (attribute-classes and classes-attribute association). For more information see: Amir Ahmad, Lipika Dey (2004). A feature selection technique for classificatory analysis.

Apr 26, 2012
RBFNetwork 1.0.8

RBFNetwork · RBFNetwork implements a normalized Gaussian radial basis function network. It uses the k-means clustering algorithm to provide the basis functions and learns either a logistic regression (discrete class problems) or linear regression (numeric class problems) on top of that. Symmetric multivariate Gaussians are fit to the data from each cluster. If the class is nominal it uses the given number of clusters per class. RBFRegressor implements radial basis function networks for regression, trained in a fully supervised manner using WEKA's Optimization class by minimizing squared error with the BFGS method. It is possible to use conjugate gradient descent rather than BFGS updates, which is faster for cases with many parameters, and to use normalized basis functions instead of unnormalized ones.

Jan 16, 2015

averagedOneDependenceEstimators · AODE achieves highly accurate classification by averaging over all of a small space of alternative naive-Bayes-like models that have weaker (and hence less detrimental) independence assumptions than naive Bayes. The resulting algorithm is computationally efficient while delivering highly accurate classification on many learning tasks. For more information, see G. Webb, J. Boughton, Z. Wang (2005). Not So Naive Bayes: Aggregating One-Dependence Estimators. Machine Learning. 58(1):5-24.

Jul 20, 2012

attributeSelectionSearchMethods · This package provides four search methods for attribute selection: ExhaustiveSearch, GeneticSearch, RandomSearch and RankSearch. See: David E. Goldberg (1989). Genetic algorithms in search, optimization and machine learning. Addison-Wesley. Mark Hall, Geoffrey Holmes (2003). Benchmarking attribute selection techniques for discrete class data mining. IEEE Transactions on Knowledge and Data Engineering. 15(6):1437-1447.

Apr 27, 2014
