A Logical Approach for Empirical Risk Minimization in Machine Learning for Data Stratification
1Taiwo, O. O., 2Awodele O., 3Kuyoro, S. O.

1,2,3Department of Computer Science, Babcock University, Ilishan-Remo, Ogun State, Nigeria

Research Journal of Mathematics and Computer Science

Data-driven Machine Learning (ML) methods capable of understanding, mimicking and aiding information-processing tasks have been applied to an increasingly wide range of domains in recent years and have achieved great success in predicting and stratifying given data instances of a problem domain. A classifier's performance is often generalized as optimal on the basis of existing performance benchmarks such as accuracy, speed, time to learn, number of features, comprehensibility, robustness, scalability and interpretability. However, these benchmarks alone do not guarantee the successful adoption of an algorithm for prediction and stratification, since adopting it may incur a risk. This paper therefore develops a logical approach for using the Empirical Risk Minimization (ERM) technique to determine the machine learning classifier with the minimum risk function for data stratification. The generalization about optimal performance was tested on the BayesNet, Multilayer Perceptron, Projective Adaptive Resonance Theory (PART) and Logistic Model Trees algorithms using existing benchmarks such as correctly classified instances, time to build, kappa statistic, sensitivity and specificity, to identify the algorithms with the strongest performance. The study showed that the PART and Logistic Model Trees algorithms perform better than the others. Hence, a logical approach that applies the Empirical Risk Minimization technique to the PART and Logistic Model Trees algorithms is presented, giving a detailed procedure for determining their empirical risk functions to aid the decision of choosing the best-fit classifier for data stratification. This serves as an additional benchmark, alongside the existing ones, for selecting an optimal algorithm for stratification and prediction.
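The comparison described above can be sketched in code. The following is a minimal illustration, not the paper's actual procedure: it computes the empirical risk of each classifier as the average 0-1 loss over a labelled sample, then selects the classifier with the minimum risk. The label vectors and the prediction vectors attributed to the two classifiers are hypothetical placeholders, not results from the study.

```python
def empirical_risk(y_true, y_pred):
    """Empirical risk under 0-1 loss: the fraction of misclassified instances."""
    assert len(y_true) == len(y_pred)
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical ground-truth labels and predictions from two trained classifiers
# evaluated on the same sample (illustrative values only).
y_true    = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
part_pred = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]   # e.g. output of a PART-style classifier
lmt_pred  = [1, 0, 0, 1, 0, 1, 0, 1, 1, 0]   # e.g. output of a Logistic Model Tree

risks = {"PART": empirical_risk(y_true, part_pred),
         "LMT":  empirical_risk(y_true, lmt_pred)}

# ERM decision rule: choose the classifier with the minimum empirical risk.
best = min(risks, key=risks.get)
```

With these illustrative vectors, PART misclassifies one instance (risk 0.1) and LMT misclassifies two (risk 0.2), so the ERM rule selects PART. In practice the same comparison would be run on the classifiers' actual predictions over a held-out sample.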

Keywords: Classification Algorithm, Machine Learning, Supervised Learning, Empirical Risk Minimization, Data Stratification

How to cite this article:
Taiwo et al. A Logical Approach for Empirical Risk Minimization in Machine Learning for Data Stratification. Research Journal of Mathematics and Computer Science, 2017; 1:3

