ORCA
ORCA (Ordinal Regression and Classification Algorithms) is a MATLAB framework that implements and integrates a wide range of ordinal regression methods and performance metrics from the paper “Ordinal regression methods: survey and experimental study”, published in IEEE Transactions on Knowledge and Data Engineering. ORCA also speeds up the experimental comparison of classifiers through automatic fold execution, experiment parallelisation and performance reports. A basic definition of ordinal regression can be found at Wikipedia.
As a generic experimental framework, its two main objectives are:
- To run experiments easily to facilitate the comparison between algorithms and datasets.
- To provide an easy way of including new algorithms into the framework by simply defining the training and test methods and the hyperparameters of the algorithms.
To serve these purposes, ORCA is mainly used through configuration files that describe experiments, but the methods can also be used directly through a common API.
Cite ORCA
If you use ORCA and/or associated datasets, please cite the following works:
J. Sánchez-Monedero, P. A. Gutiérrez and M. Pérez-Ortiz,
"ORCA: A Matlab/Octave Toolbox for Ordinal Regression",
Journal of Machine Learning Research. Vol. 20. Issue 125. 2019. http://jmlr.org/papers/v20/18-349.html
P.A. Gutiérrez, M. Pérez-Ortiz, J. Sánchez-Monedero, F. Fernandez-Navarro and C. Hervás-Martínez.
"Ordinal regression methods: survey and experimental study",
IEEE Transactions on Knowledge and Data Engineering, Vol. 28, January, 2016, pp. 127-146. http://dx.doi.org/10.1109/TKDE.2015.2457911
Bibtex entry:
@article{JMLR:v20:18-349,
author = {Javier S{\'a}nchez-Monedero and Pedro A. Guti{\'e}rrez and Mar{\'i}a P{\'e}rez-Ortiz},
title = {ORCA: A Matlab/Octave Toolbox for Ordinal Regression},
journal = {Journal of Machine Learning Research},
year = {2019},
volume = {20},
number = {125},
pages = {1-5},
url = {http://jmlr.org/papers/v20/18-349.html}
}
@Article{Gutierrez2015,
Title = {Ordinal regression methods: survey and experimental study},
Author = {P.A. Guti\'errez and M. P\'erez-Ortiz and J. S\'anchez-Monedero and F. Fernandez-Navarro and C. Herv\'as-Mart\'inez},
Journal = {IEEE Transactions on Knowledge and Data Engineering},
Year = {2016},
Url = {http://dx.doi.org/10.1109/TKDE.2015.2457911},
Volume = {28},
Number = {1},
pages = {127-146},
}
For more information about the paper and the ordinal datasets used please visit the associated website: http://www.uco.es/grupos/ayrna/orreview
For more information about our research group, please visit the website of the Learning and Artificial Neural Networks (AYRNA) group at the University of Córdoba (Spain).
Installation, tutorials and documentation
The documentation can be found in the doc folder and includes:
Methods included
The Algorithms folder includes the MATLAB classes for the algorithms included and the original code (if applicable). The config-files folder includes different configuration files for running all the algorithms. In order to use these files, the datasets used in the previously cited review paper are needed. To add your own method see Adding a new method to ORCA.
The running time of the algorithms was analysed in “Ordinal regression methods: survey and experimental study” (2016). From this analysis, it can be concluded that ELMOP, SVORLin and POM are the best options if computational cost is a priority. The training time of the neural network methods (NNPOM and NNOP) and GPOR is in general the highest. This cost can be acceptable for GPOR, given that it obtains very good performance for balanced ordinal datasets, while neural network-based methods are generally beaten by the ordinal SVM variants. Concerning scalability, the experimental setup in the review also included some relatively large datasets, so practitioners can check how long it took to train each of those models with the ORCA framework. In general, linear models such as POM and SVORLin perform very well in scenarios where there is plenty of data, while still having a reasonably low running time (e.g. around 10 seconds for cross-validating, training and testing on a dataset of almost 22,000 patterns). Although very high-dimensional datasets were not considered in the analysis, it is well known that SVMs can handle high-dimensional data, and given that they are among the best performing methods in ordinal regression, they might be a good choice in such scenarios.
Ordinal regression algorithms
- SVR [2]: Standard Support Vector Regression with normalised targets (considered as a naïve approach for ordinal regression since equal distances between targets are assumed).
- CSSVC [1]: Nominal SVM with the OneVsAll decomposition, where absolute costs are included as different weights for the negative class of each decomposition (it is considered as a naïve approach for ordinal regression since equal distances between targets are assumed).
- SVMOP [3,4]: Binary ordinal decomposition methodology with SVM as the base method. It imposes explicit weights over the patterns and uses a probabilistic framework for the prediction.
- ELMOP [5]: Standard Extreme Learning Machine imposing an ordinal structure in the coding scheme representing the target variable.
- POM [6]: Extension of the linear binary Logistic Regression methodology to Ordinal Classification by means of Cumulative Link Functions.
- SVOREX [7]: Ordinal formulation of the SVM paradigm, which computes discriminant parallel hyperplanes for the data and a set of thresholds by imposing explicit constraints in the optimization problem.
- SVORIM [7]: Ordinal formulation of the SVM paradigm, which computes discriminant parallel hyperplanes for the data and a set of thresholds by imposing implicit constraints in the optimization problem.
- SVORLin [7]: Linear version of the SVORIM method (considering a linear kernel instead of the Gaussian one) to check how the kernel trick affects the final performance.
- KDLOR [8]: Reformulation of the well-known Kernel Discriminant Analysis for Ordinal Regression by imposing an order constraint in the projected classes.
- NNPOM [6,9]: Neural Network based on the Proportional Odds Model (NNPOM), implementing a neural network model for ordinal regression. The model has one hidden layer and one output layer with only one neuron but as many thresholds as the number of classes minus one. The standard POM model is applied in this neuron to provide probabilistic outputs.
- NNOP [10]: Neural Network with Ordered Partitions (NNOP). This model uses the OrderedPartitions coding scheme for the labels and a decision rule based on the first node whose output is higher than a predefined threshold (T=0.5). The model has one hidden layer and one output layer with as many neurons as the number of classes minus one.
- REDSVM [11]: Augmented Binary Classification framework that solves the Ordinal Regression problem by a single binary model (SVM is applied in this case).
- ORBoost [12]: Ensemble model based on the threshold model structure, where normalised sigmoid functions are used as the base classifier. The weights parameter configures whether the All margins version is used (weights=true) or the Left-Right margin version is used (weights=false).
- OPBE [13]: Ordinal projection-based ensemble (OPBE) based on three-class decompositions, following the ordinal structure. A specific method for fusing the probabilities returned by the different three-class classifiers is implemented (product combiner, logit function and equal distribution of the probabilities). The base classifier is SVORIM, but potentially any of the methods in ORCA can be set up as the base classifier.
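Two of the mechanisms described in the list above can be illustrated with a short sketch: the cumulative link (threshold) model behind POM and NNPOM, and the ordered partitions decision rule of NNOP. This is a Python illustration only, not ORCA's MATLAB code. The function names are hypothetical, the logit link is assumed for the cumulative link function, and the label coding (output j answers "is the class at most j?") is an assumption chosen to be consistent with the "first node above the threshold" rule.

```python
import math


def pom_class_probabilities(score, thresholds):
    """Class probabilities of a cumulative logit (proportional odds) model.

    score: latent value f(x); for POM this is a linear projection of x.
    thresholds: increasing values b_1 < ... < b_{J-1} (J = number of classes).
    Assumes P(y <= j | x) = sigmoid(b_j - score).
    """
    cumulative = [1.0 / (1.0 + math.exp(-(b - score))) for b in thresholds]
    cumulative.append(1.0)  # P(y <= J | x) = 1
    probabilities = []
    previous = 0.0
    for value in cumulative:
        # P(y = j) is the difference of consecutive cumulative probabilities.
        probabilities.append(value - previous)
        previous = value
    return probabilities


def ordered_partitions_code(label, num_classes):
    """Assumed coding: J-1 binary targets, target j is 1 when label <= j."""
    return [1 if label <= j else 0 for j in range(1, num_classes)]


def ordered_partitions_decode(outputs, threshold=0.5):
    """Predict the index of the first output above the threshold.

    If no output exceeds the threshold, the last class is predicted.
    """
    for j, value in enumerate(outputs, start=1):
        if value > threshold:
            return j
    return len(outputs) + 1


probs = pom_class_probabilities(score=0.0, thresholds=[-1.0, 1.0])
predicted_class = probs.index(max(probs)) + 1  # classes are numbered from 1
```

With a single latent score and J-1 ordered thresholds, the predicted probabilities always sum to one, which is why one output neuron is enough for NNPOM, whereas NNOP needs J-1 output neurons, one per binary question.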
Partial order methods
- HPOLD [16]: Hierarchical Partial Order Label Decomposition with linear and non-linear base methods.
Nominal methods
- SVC1V1 [1]: Nominal Support Vector Machine using the OneVsOne formulation (considered as a naïve approach for ordinal regression since it ignores the order information).
- SVC1VA [1]: Nominal Support Vector Machine with the OneVsAll paradigm (considered as a naïve approach for ordinal regression since it ignores the order information).
- LIBLINEAR: Implementation of logistic regression and linear SVM based on LIBLINEAR.
The measures folder contains the MATLAB classes for the metrics used to evaluate the classifiers. The measures included in ORCA are the following (more details about the metrics can be found in [14,15]):
- MAE: Mean Absolute Error between predicted and expected categories, representing classes as integer numbers (1, 2, …).
- MZE: Mean Zero-one Error or standard classification error (1-accuracy).
- AMAE: Average MAE, considering MAEs individually calculated for each class.
- CCR: Correctly Classified Ratio or percentage of correctly classified patterns.
- GM: Geometric Mean of the sensitivities individually calculated for each class.
- MMAE: Maximum MAE, considering MAEs individually calculated for each class.
- MS: Minimum Sensitivity, representing the ratio of correctly classified patterns for the worst classified class.
- Spearman: Spearman Rho.
- Tkendall: Tau of Kendall.
- Wkappa: Weighted Kappa statistic, using ordinal weights.
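To make the definitions above concrete, the following Python sketch computes several of the listed measures from their descriptions. It is an illustration under stated assumptions, not ORCA's MATLAB implementation: the function names are hypothetical, labels are assumed to be integers starting at 1, CCR is returned as a fraction rather than a percentage, and per-class values are computed only over classes that appear in the true labels.

```python
import math


def _per_class(y_true, y_pred, fn):
    """Apply fn to the (true, predicted) pairs of each class present in y_true."""
    values = []
    for c in sorted(set(y_true)):
        pairs = [(t, p) for t, p in zip(y_true, y_pred) if t == c]
        values.append(fn(pairs))
    return values


def mae(y_true, y_pred):
    """Mean absolute difference between integer class labels."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mze(y_true, y_pred):
    """Fraction of misclassified patterns (1 - accuracy)."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def ccr(y_true, y_pred):
    """Fraction of correctly classified patterns."""
    return 1.0 - mze(y_true, y_pred)


def class_maes(y_true, y_pred):
    """MAE computed separately for each class."""
    return _per_class(
        y_true, y_pred,
        lambda pairs: sum(abs(t - p) for t, p in pairs) / len(pairs))


def class_sensitivities(y_true, y_pred):
    """Fraction of correctly classified patterns of each class."""
    return _per_class(
        y_true, y_pred,
        lambda pairs: sum(t == p for t, p in pairs) / len(pairs))


def amae(y_true, y_pred):
    maes = class_maes(y_true, y_pred)
    return sum(maes) / len(maes)


def mmae(y_true, y_pred):
    return max(class_maes(y_true, y_pred))


def gm(y_true, y_pred):
    sens = class_sensitivities(y_true, y_pred)
    return math.prod(sens) ** (1.0 / len(sens))


def ms(y_true, y_pred):
    return min(class_sensitivities(y_true, y_pred))


y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]
print(mae(y_true, y_pred))   # 0.5
print(amae(y_true, y_pred))  # 0.5
print(mmae(y_true, y_pred))  # 1.0
print(ms(y_true, y_pred))    # 0.5
```

In this small example MAE and AMAE coincide because the classes are balanced. On imbalanced datasets the per-class measures (AMAE, MMAE, GM, MS) can differ markedly from MAE and MZE, since every class contributes equally regardless of its size.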
Utilities, classes and scripts
- DataSet.m: Class for data preprocessing.
- Experiment.m: Class that runs the different experiments.
- Utilities.m: Class that pre-processes the experiment files, runs the different algorithms and produces the results.
- runtests_single.m: Script to run all the methods using the ORCA API. The results are compared with reference performance on the toy dataset in order to check that the installation is correct.
- runtests_cv.m: This script runs full experiment tests using the ORCA configuration files to describe experiments.
Datasets
The example-data folder includes partitions of several small ordinal datasets for code testing purposes. We have also collected 44 publicly available ordinal datasets from various sources. These can be downloaded from: datasets-OR-review. The link also contains data partitions as used in different papers in the literature to ease experimental comparison. The characteristics of these datasets are the following:
| Dataset | #Pat. | #Attr. | #Classes | Class distribution |
|---|---|---|---|---|
| pyrim5 (P5) | 74 | 27 | 5 | ~15 per class |
| machine5 (M5) | 209 | 7 | 5 | ~42 per class |
| housing5 (H5) | 506 | 14 | 5 | ~101 per class |
| stock5 (S5) | 700 | 9 | 5 | 140 per class |
| abalone5 (A5) | 4177 | 11 | 5 | ~836 per class |
| bank5 (B5) | 8192 | 8 | 5 | ~1639 per class |
| bank5’ (BB5) | 8192 | 32 | 5 | ~1639 per class |
| computer5 (C5) | 8192 | 12 | 5 | ~1639 per class |
| computer5’ (CC5) | 8192 | 21 | 5 | ~1639 per class |
| cal.housing5 (CH5) | 20640 | 8 | 5 | 4128 per class |
| census5 (CE5) | 22784 | 8 | 5 | ~4557 per class |
| census5’ (CEE5) | 22784 | 16 | 5 | ~4557 per class |
| pyrim10 (P10) | 74 | 27 | 10 | ~8 per class |
| machine10 (M10) | 209 | 7 | 10 | ~21 per class |
| housing10 (H10) | 506 | 14 | 10 | ~51 per class |
| stock10 (S10) | 700 | 9 | 10 | 70 per class |
| abalone10 (A10) | 4177 | 11 | 10 | ~418 per class |
| bank10 (B10) | 8192 | 8 | 10 | ~820 per class |
| bank10’ (BB10) | 8192 | 32 | 10 | ~820 per class |
| computer10 (C10) | 8192 | 12 | 10 | ~820 per class |
| computer10’ (CC10) | 8192 | 21 | 10 | ~820 per class |
| cal.housing (CH10) | 20640 | 8 | 10 | 2064 per class |
| census10 (CE10) | 22784 | 8 | 10 | ~2279 per class |
| census10’ (CEE10) | 22784 | 16 | 10 | ~2279 per class |
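The nearly equal class sizes listed above are what one would expect when a continuous regression target is split into bins of equal frequency. The text here does not state how these datasets were discretised, so the following Python sketch is an assumption offered for illustration only; the function name is hypothetical and ties between target values are broken by position rather than handled specially.

```python
from collections import Counter


def equal_frequency_labels(targets, num_classes):
    """Assign ordinal labels 1..num_classes so class sizes are as equal as possible.

    Patterns are ranked by target value, and the rank range is cut into
    num_classes consecutive blocks of (almost) the same length.
    """
    n = len(targets)
    order = sorted(range(n), key=lambda i: targets[i])
    labels = [0] * n
    for rank, index in enumerate(order):
        labels[index] = rank * num_classes // n + 1
    return labels


# Synthetic targets with the same number of patterns as pyrim (74).
targets = [((i * 37) % 74) / 74.0 for i in range(74)]
labels = equal_frequency_labels(targets, 5)
class_sizes = [Counter(labels)[c] for c in range(1, 6)]
print(class_sizes)  # [15, 15, 15, 15, 14]
```

With 74 patterns and 5 classes this gives classes of 15 or 14 patterns, in line with the "~15 per class" entry, and with 700 patterns it gives exactly 140 per class.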
| Dataset | #Pat. | #Attr. | #Classes | Class distribution |
|---|---|---|---|---|
| contact-lenses (CL) | 24 | 6 | 3 | (15,5,4) |
| pasture (PA) | 36 | 25 | 3 | (12,12,12) |
| squash-stored (SS) | 52 | 51 | 3 | (23,21,8) |
| squash-unstored (SU) | 52 | 52 | 3 | (24,24,4) |
| tae (TA) | 151 | 54 | 3 | (49,50,52) |
| newthyroid (NT) | 215 | 5 | 3 | (30,150,35) |
| balance-scale (BS) | 625 | 4 | 3 | (288,49,288) |
| SWD (SW) | 1000 | 10 | 4 | (32,352,399,217) |
| car (CA) | 1728 | 21 | 4 | (1210,384,69,65) |
| bondrate (BO) | 57 | 37 | 5 | (6,33,12,5,1) |
| toy (TO) | 300 | 2 | 5 | (35,87,79,68,31) |
| eucalyptus (EU) | 736 | 91 | 5 | (180,107,130,214,105) |
| LEV (LE) | 1000 | 4 | 5 | (93,280,403,197,27) |
| automobile (AU) | 205 | 71 | 6 | (3,22,67,54,32,27) |
| winequality-red (WR) | 1599 | 11 | 6 | (10,53,681,638,199,18) |
| ESL (ES) | 488 | 4 | 9 | (2,12,38,100,116,135,62,19,4) |
| ERA (ER) | 1000 | 4 | 9 | (92,142,181,172,158,118,88,31,18) |
| marketing | 8993 | 74 | 9 | (1745,775,667,813,722,1110,969,1308,884) |
| thyroid | 7200 | 21 | 3 | (6666,166,368) |
| winequality-white | 4898 | 11 | 7 | (20,163,1457,2198,880,175,5) |
Experiments parallelization with HTCondor
The condor folder contains the necessary files and steps for using HTCondor with our framework.
External software
ORCA makes use of the following external software implementations. For some of them, a Matlab interface has been developed through the use of MEX files.
- libsvm-weights-3.12: framework used for Support Vector Machine algorithms. The version considered was 3.12.
- libsvm-rank-2.81: implementation used for the REDSVM method. The version considered was 2.81.
- orensemble: implementation used for the ORBoost method.
- SVOR: implementation used for the SVOREX, SVORIM and SVORLin methods.
Other contributors
Apart from the authors of the paper and the authors of the implementations referenced in the “External software” section, the following people also contributed to the ORCA framework:
References
- [1] C.-W. Hsu and C.-J. Lin, “A comparison of methods for multi-class support vector machines,” IEEE Transactions on Neural Networks, vol. 13, no. 2, pp. 415–425, 2002.
- [2] A. Smola and B. Schölkopf, “A tutorial on support vector regression,” Statistics and Computing, vol. 14, no. 3, pp. 199–222, 2004.
- [3] E. Frank and M. Hall, “A simple approach to ordinal classification,” in Proceedings of the 12th European Conference on Machine Learning, ser. ECML ’01. London, UK: Springer-Verlag, 2001, pp. 145–156.
- [4] W. Waegeman and L. Boullart, “An ensemble of weighted support vector machines for ordinal regression,” International Journal of Computer Systems Science and Engineering, vol. 3, no. 1, pp. 47–51, 2009.
- [5] W.-Y. Deng, Q.-H. Zheng, S. Lian, L. Chen, and X. Wang, “Ordinal extreme learning machine,” Neurocomputing, vol. 74, no. 1–3, pp. 447–456, 2010.
- [6] P. McCullagh, “Regression models for ordinal data,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 42, no. 2, pp. 109–142, 1980.
- [7] W. Chu and S. S. Keerthi, “Support Vector Ordinal Regression,” Neural Computation, vol. 19, no. 3, pp. 792–815, 2007.
- [8] B.-Y. Sun, J. Li, D. D. Wu, X.-M. Zhang, and W.-B. Li, “Kernel discriminant learning for ordinal regression,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 6, pp. 906–910, 2010.
- [9] M. J. Mathieson, “Ordinal models for neural networks,” in Proc. 3rd Int. Conf. Neural Netw. Capital Markets, 1996, pp. 523–536.
- [10] J. Cheng, Z. Wang, and G. Pollastri, “A neural network approach to ordinal regression,” in Proc. IEEE Int. Joint Conf. Neural Netw. (IEEE World Congr. Comput. Intell.), 2008, pp. 1279-1284.
- [11] H.-T. Lin and L. Li, “Reduction from cost-sensitive ordinal ranking to weighted binary classification,” Neural Computation, vol. 24, no. 5, pp. 1329–1367, 2012.
- [12] H.-T. Lin and L. Li, “Large-margin thresholded ensembles for ordinal regression: Theory and practice,” in Proc. of the 17th Algorithmic Learning Theory International Conference, ser. Lecture Notes in Artificial Intelligence (LNAI), J. L. Balcazar, P. M. Long, and F. Stephan, Eds., vol. 4264. Springer-Verlag, October 2006, pp. 319–333.
- [13] M. Pérez-Ortiz, P. A. Gutiérrez and C. Hervás-Martínez. “Projection based ensemble learning for ordinal regression”, IEEE Transactions on Cybernetics, Vol. 44, May, 2014, pp. 681-694.
- [14] M. Cruz-Ramírez, C. Hervás-Martínez, J. Sánchez-Monedero and P. A. Gutiérrez. “Metrics to guide a multi-objective evolutionary algorithm for ordinal classification,” Neurocomputing, Vol. 135, July, 2014, pp. 21-31.
- [15] J. C. Fernandez-Caballero, F. J. Martínez-Estudillo, C. Hervás-Martínez and P. A. Gutiérrez. “Sensitivity Versus Accuracy in Multiclass Problems Using Memetic Pareto Evolutionary Neural Networks,” IEEE Transactions on Neural Networks, Vol. 21, 2010, pp. 750-770.
- [16] J. Sánchez-Monedero, M. Pérez-Ortiz, A. Sáez, P.A. Gutiérrez and C. Hervás-Martínez. “Partial order label decomposition approaches for melanoma diagnosis”. Applied Soft Computing. Vol. 64, March 2018, pp. 341-355.