Group: nz.ac.waikato.cms.weka - All Dependencies

scatterPlot3D · A visualization component for displaying a 3D scatter plot of the data using Java 3D. Requires Java 3D to be installed. This version adds built-in sampling controls to the GUI. The default sampling percentage is set so that a maximum of 5000 instances are plotted. The user can adjust this higher or lower to suit their available processing speed and memory.

Feb 04, 2019
wekaPython 1.0.18

wekaPython · Integration with CPython for Weka. Python version 2.7.x or higher is required. Also requires the following packages to be installed in Python: numpy, pandas, matplotlib and scikit-learn. This package provides a wrapper classifier and clusterer that, between them, cover 60+ scikit-learn algorithms. It also provides a general scripting step for the Knowledge Flow along with scripting plugin environments for the Explorer and Knowledge Flow.

Nov 25, 2022
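
A minimal sketch of driving the wrapper from Java code; the class name weka.classifiers.sklearn.ScikitLearnClassifier is an assumption based on the package sources, the data file is a placeholder, and a Python environment with the listed packages must be on the path:

    import weka.classifiers.AbstractClassifier;
    import weka.classifiers.Classifier;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class WekaPythonSketch {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("iris.arff"); // placeholder dataset
            data.setClassIndex(data.numAttributes() - 1);

            // Class name is an assumption from the package sources; the wrapped
            // scikit-learn estimator is chosen via the classifier's options.
            Classifier sk = AbstractClassifier.forName(
                "weka.classifiers.sklearn.ScikitLearnClassifier", new String[0]);
            sk.buildClassifier(data); // delegates training to CPython/scikit-learn
            System.out.println(sk);
        }
    }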

javaFXScatter3D · A visualization component for displaying a 3D scatter plot of the data using JavaFX 3D. This version adds built-in sampling controls to the GUI. The default sampling percentage is set so that a maximum of 5000 instances are plotted. The user can adjust this higher or lower to suit their available processing speed and memory.

Oct 31, 2019
tiny-weka 3.9.15955

tiny-weka · The Waikato Environment for Knowledge Analysis (WEKA), a machine learning workbench. This artifact represents the bare API of the developer version, with no package manager, PMML, XML or user interface. It is aimed at commercial applications that license some of WEKA's algorithms.

Mar 03, 2022
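
Because tiny-weka exposes the standard WEKA developer API, training and evaluating a classifier looks the same as with full WEKA. A minimal sketch, assuming an ARFF file with the class as the last attribute (the file name is a placeholder):

    // Maven coordinates from this listing:
    //   nz.ac.waikato.cms.weka:tiny-weka:3.9.15955
    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class TinyWekaSketch {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("iris.arff"); // placeholder ARFF file
            data.setClassIndex(data.numAttributes() - 1);  // class is the last attribute

            // 10-fold cross-validation of a J48 decision tree
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(new J48(), data, 10, new Random(1));
            System.out.println(eval.toSummaryString());
        }
    }
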
hotSpot 1.0.14

hotSpot · HotSpot learns a set of rules (displayed in a tree-like structure) that maximize/minimize a target variable/value of interest. With a nominal target, one might want to look for segments of the data where there is a high probability of a minority value occurring (given the constraint of a minimum support). For a numeric target, one might be interested in finding segments where it is higher on average than in the whole data set. For example, in a health insurance scenario, find which health insurance groups are at the highest risk (have the highest claim ratio), or which groups have the highest average insurance payout.

Aug 10, 2021
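
A minimal sketch of running HotSpot programmatically; the class weka.associations.HotSpot is an assumption from the package name, and the data file is a placeholder:

    import weka.associations.HotSpot;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class HotSpotSketch {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("insurance.arff"); // placeholder dataset

            HotSpot hs = new HotSpot();  // target attribute is selectable via options
            hs.buildAssociations(data);  // mines the tree-like rule structure
            System.out.println(hs);      // prints the rules for the target
        }
    }
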
NNge 1.0.2

NNge · Nearest-neighbor-like algorithm using non-nested generalized exemplars (which are hyperrectangles that can be viewed as if-then rules). For more information, see Brent Martin (1995). Instance-Based Learning: Nearest Neighbor With Generalization. Hamilton, New Zealand. Sylvain Roy (2002). Nearest Neighbor With Generalization. Christchurch, New Zealand.

Apr 26, 2012
realAdaBoost 1.0.2

realAdaBoost · Class for boosting a 2-class classifier using the Real AdaBoost method. For more information, see J. Friedman, T. Hastie, R. Tibshirani (2000). Additive Logistic Regression: a Statistical View of Boosting. Annals of Statistics. 28(2):337-407.

Apr 26, 2012
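
A minimal sketch of boosting a weak learner with this package, assuming the classifier class is weka.classifiers.meta.RealAdaBoost and using DecisionStump as a stand-in base learner (the data file is a placeholder; the dataset must have a 2-class target):

    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.meta.RealAdaBoost;
    import weka.classifiers.trees.DecisionStump;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class RealAdaBoostSketch {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("diabetes.arff"); // placeholder 2-class dataset
            data.setClassIndex(data.numAttributes() - 1);

            RealAdaBoost boost = new RealAdaBoost();
            boost.setClassifier(new DecisionStump()); // weak learner to boost
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(boost, data, 10, new Random(1));
            System.out.println(eval.toSummaryString());
        }
    }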

localOutlierFactor · A filter that applies the LOF (Local Outlier Factor) algorithm to compute an outlier score for each instance in the data. Can use multiple cores/CPUs to speed up the LOF computation for large datasets. Nearest neighbor search methods and distance functions are pluggable. For more information, see: Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng, Jörg Sander (2000). LOF: Identifying Density-Based Local Outliers. ACM SIGMOD Record. 29(2):93-104.

Jul 23, 2013
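
A minimal sketch of computing LOF scores with the filter, assuming the class is weka.filters.unsupervised.attribute.LOF (the data file is a placeholder; neighborhood size and parallelism are configurable via the filter's options):

    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;
    import weka.filters.Filter;
    import weka.filters.unsupervised.attribute.LOF;

    public class LOFSketch {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("data.arff"); // placeholder dataset

            LOF lof = new LOF();
            lof.setInputFormat(data);
            // Appends an attribute holding each instance's LOF outlier score
            Instances scored = Filter.useFilter(data, lof);
            System.out.println(scored);
        }
    }
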
simpleCART 1.0.2

simpleCART · Class implementing minimal cost-complexity pruning. Note when dealing with missing values, use "fractional instances" method instead of surrogate split method. For more information, see: Leo Breiman, Jerome H. Friedman, Richard A. Olshen, Charles J. Stone (1984). Classification and Regression Trees. Wadsworth International Group, Belmont, California.

Apr 26, 2012
tertius 1.0.2

tertius · Finds rules according to a confirmation measure (Tertius-type algorithm). For more information see: P. A. Flach, N. Lachiche (1999). Confirmation-Guided Discovery of First-Order Rules with Tertius. Machine Learning. 42:61-95.

Apr 26, 2012

DilcaDistance · This package implements the parameter-free version of the DILCA distance. The approach learns value-to-value distances between each pair of values for each attribute of the dataset. The distance between two values is computed indirectly from their distribution with respect to a carefully chosen set of related attributes (the context).

Apr 26, 2014
wavelet 1.0.2

wavelet · A filter for wavelet transformation. For more information see: Wikipedia (2004). Discrete wavelet transform. Kristian Sandberg (2000). The Haar wavelet transform. University of Colorado at Boulder, USA.

Apr 26, 2012

hiddenNaiveBayes · Constructs a Hidden Naive Bayes classification model with high classification accuracy and AUC. For more information refer to: H. Zhang, L. Jiang, J. Su: Hidden Naive Bayes. In: Twentieth National Conference on Artificial Intelligence, 919-924, 2005.

Apr 26, 2012

isotonicRegression · Learns an isotonic regression model. Picks the attribute that results in the lowest squared error. Missing values are not allowed. Can only deal with numeric attributes. Considers the monotonically increasing case as well as the monotonically decreasing case.

Apr 26, 2012

classAssociationRules · Class association rules algorithms (including an implementation of the CBA algorithm). For more information see: W. Li, J. Han, J. Pei: CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules. In: ICDM'01, 369-376, 2001. B. Liu, W. Hsu, Y. Ma: Integrating Classification and Association Rule Mining. In: KDD'98, 80-86, 1998.

Jul 29, 2014

generalizedSequentialPatterns · Class implementing a GSP algorithm for discovering sequential patterns in a sequential data set. The attribute identifying the distinct data sequences contained in the set can be determined by the respective option. Furthermore, the set of output results can be restricted by specifying one or more attributes that have to be contained in each element/itemset of a sequence. For further information see: Ramakrishnan Srikant, Rakesh Agrawal (1996). Mining Sequential Patterns: Generalizations and Performance Improvements.

Apr 26, 2012
phmm4weka 1.1.3

phmm4weka · This Java software implements Profile Hidden Markov Models (PHMMs) for protein classification for the WEKA workbench. Standard PHMMs and newly introduced binary PHMMs are used. In addition the software allows propositionalisation of PHMMs.

Apr 27, 2012

thresholdSelector · A metaclassifier that selects a mid-point threshold on the probability output by a Classifier. The midpoint threshold is set so that a given performance measure is optimized; currently this is the F-measure. Performance is measured either on the training data, a hold-out set or using cross-validation. In addition, the probabilities returned by the base learner can have their range expanded so that the output probabilities will reside between 0 and 1 (this is useful if the scheme normally produces probabilities in a very narrow range).

Apr 25, 2014
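
A minimal sketch of wrapping a probabilistic base learner, assuming the class is weka.classifiers.meta.ThresholdSelector (matching the class of the same name formerly shipped in the WEKA core; the data file is a placeholder):

    import weka.classifiers.functions.Logistic;
    import weka.classifiers.meta.ThresholdSelector;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class ThresholdSelectorSketch {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("credit-g.arff"); // placeholder 2-class dataset
            data.setClassIndex(data.numAttributes() - 1);

            ThresholdSelector ts = new ThresholdSelector();
            ts.setClassifier(new Logistic()); // base learner producing probabilities
            ts.buildClassifier(data);         // picks the F-measure-optimal threshold
            System.out.println(ts);
        }
    }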

lazyBayesianRules · Lazy Bayesian Rules Classifier. The naive Bayesian classifier provides a simple and effective approach to classifier learning, but its attribute independence assumption is often violated in the real world. Lazy Bayesian Rules selectively relaxes the independence assumption, achieving lower error rates over a range of learning tasks. LBR defers processing to classification time, making it a highly efficient and accurate classification algorithm when small numbers of objects are to be classified. For more information, see: Zijian Zheng, G. Webb (2000). Lazy Learning of Bayesian Rules. Machine Learning. 41(1):53-84.

Apr 26, 2012

ensembleLibrary · Manages a library of ensemble classifiers.

Apr 26, 2012

ordinalStochasticDominance · An implementation of the Ordinal Stochastic Dominance Learner. Further information regarding the OSDL algorithm can be found in: S. Lievens, B. De Baets, K. Cao-Van (2006). A Probabilistic Framework for the Design of Instance-Based Supervised Ranking Algorithms in an Ordinal Setting. Annals of Operations Research; Kim Cao-Van (2003). Supervised ranking: from semantics to algorithms; Stijn Lievens (2004). Studie en implementatie van instantie-gebaseerde algoritmen voor gesuperviseerd rangschikken [Study and implementation of instance-based algorithms for supervised ranking].

Apr 26, 2012

paceRegression · Class for building pace regression linear models and using them for prediction. Under regularity conditions, pace regression is provably optimal when the number of coefficients tends to infinity. It consists of a group of estimators that are either overall optimal or optimal under certain conditions. The current work on pace regression theory, and therefore also this implementation, does not handle missing values, non-binary nominal attributes, or the case where n - k is small, where n is the number of instances and k is the number of coefficients (the threshold used in this implementation is 20). For more information see: Wang, Y. (2000). A new approach to fitting linear models in high dimensional spaces. Hamilton, New Zealand. Wang, Y., Witten, I. H.: Modeling for optimal probability prediction. In: Proceedings of the Nineteenth International Conference in Machine Learning, Sydney, Australia, 650-657, 2002.

Apr 26, 2012

ordinalLearningMethod · An implementation of the Ordinal Learning Method (OLM). Further information regarding the algorithm and variants can be found in: Arie Ben-David (1992). Automatic Generation of Symbolic Multiattribute Ordinal Knowledge-Based DSSs: Methodology and Applications. Decision Sciences. 23:1357-1372.

Apr 26, 2012

oneClassClassifier · Performs one-class classification on a dataset. The classifier reduces the class being classified to just a single class, and learns the data without using any information from other classes. The testing stage will classify instances as 'target' or 'outlier' - so in order to calculate the outlier pass rate, the dataset must contain information from more than one class. Also, the output varies depending on whether the label 'outlier' exists in the instances used to build the classifier. If so, then 'outlier' will be predicted; if not, then the label will be considered missing when the prediction does not favour the target class. The 'outlier' class will not be used to build the model if there are instances of this class in the dataset. It can simply be used as a flag; you do not need to relabel any classes. For more information, see: Kathryn Hempstalk, Eibe Frank, Ian H. Witten: One-Class Classification by Combining Density and Class Probability Estimation. In: Proceedings of the 12th European Conference on Principles and Practice of Knowledge Discovery in Databases and 19th European Conference on Machine Learning, ECMLPKDD2008, Berlin, 505-519, 2008.

May 14, 2013
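
A minimal sketch of training on target-only data and flagging outliers afterwards, assuming the class is weka.classifiers.meta.OneClassClassifier (file names are placeholders):

    import weka.classifiers.meta.OneClassClassifier;
    import weka.core.Instance;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class OneClassSketch {
        public static void main(String[] args) throws Exception {
            Instances train = DataSource.read("target-only.arff"); // single-class data
            train.setClassIndex(train.numAttributes() - 1);

            OneClassClassifier occ = new OneClassClassifier();
            occ.buildClassifier(train);

            Instances test = DataSource.read("mixed.arff"); // contains other classes
            test.setClassIndex(test.numAttributes() - 1);
            for (Instance inst : test) {
                double pred = occ.classifyInstance(inst);
                // Per the description above, the prediction is treated as missing
                // (NaN) when no 'outlier' label was present at training time and
                // the instance does not favour the target class.
                System.out.println(Double.isNaN(pred)
                    ? "outlier" : test.classAttribute().value((int) pred));
            }
        }
    }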
