Algorithm for Interpreting Linear and Non-Elementary Linear Regression Models under Multicollinearity

M P Bazilevski

doi:10.22213/2410-9304-2023-3-40-47

Authors

M. P. Bazilevski Irkutsk State Transport University

Keywords:

least squares, multicollinearity, non-elementary linear regression, linear regression, interpretability, machine learning

Abstract

When building machine learning models, a growing number of researchers recognize that, in addition to a model's accuracy, its interpretability also matters, that is, the degree to which a person can understand it. A new field, interpretable machine learning, is currently taking shape. This article studies the interpretation of linear regression models. Traditionally, linear regressions are interpreted under the assumption that the explanatory variables are weakly correlated. In that case, a well-known scheme makes it possible to explain the influence of each input variable on the output variable. Often, however, the explanatory variables are highly correlated with each other. In such a situation, the usual recommendation is to exclude strongly correlated variables, which leads, first, to a decrease in model quality and, second, to a loss of integrity in the study and in the interpretation of the process or phenomenon under study. In this paper, we propose an algorithm for interpreting a linear regression constructed for any degree of correlation among the explanatory variables. When all explanatory variables are weakly correlated, the algorithm gives the traditional interpretation of linear regression. When they are strongly correlated, as follows from the theorem proved in the article, interpreting a regression with several variables reduces to interpreting an equation with one variable, without losing information about the relationships between pairs of explanatory variables. The algorithm can be used not only for ordinary linear regressions but also for non-elementary ones, which contain, in addition to the explanatory variables, pairs of them transformed by the binary operations min and max. The operation of the algorithm is demonstrated on specific examples.
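The kind of model described in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical specification y = b0 + b1*x1 + b2*x2 + b3*min(x1, x2) fitted by least squares on synthetic data with two strongly correlated explanatory variables. The specification, the data, and all names (`fit_ols`, `x1`, `x2`, `r12`) are illustrative assumptions and are not taken from the article; the sketch does not implement the proposed interpretation algorithm or the theorem.

```python
import numpy as np


def fit_ols(columns, y):
    """Least-squares fit with an intercept; returns [b0, b1, ...]."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Synthetic data (assumption): x2 is built to be strongly correlated with x1.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)

# Noise-free response from a hypothetical non-elementary specification
# that adds the pair (x1, x2) transformed by the binary operation min.
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * np.minimum(x1, x2)

# Pairwise correlation of the explanatory variables.
r12 = np.corrcoef(x1, x2)[0, 1]

# Fit the model with regressors x1, x2 and min(x1, x2).
coef = fit_ols([x1, x2, np.minimum(x1, x2)], y)

print("corr(x1, x2):", r12)
print("estimated [b0, b1, b2, b3]:", coef)
```

Only min(x1, x2) is included alongside x1 and x2 here because min(x1, x2) + max(x1, x2) = x1 + x2, so adding both transformed terms with unit scaling would make the regressors exactly linearly dependent.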

Author Biography

M. P. Bazilevski, Irkutsk State Transport University

PhD in Engineering, Associate Professor
