Multiple Linear Regression: Premises, Method and Uses

Multiple linear regression is a calculation tool used to investigate possible cause-and-effect relationships in the objects of study and to test complex hypotheses.

It is used in mathematics and statistics. This type of linear regression requires one dependent variable (in other words, the result) and two or more independent variables (that is, the causes) that follow a hierarchical order, in addition to other factors inherent to the various areas of study.

Multiple linear regression

In its simplest form, linear regression is represented by a linear function calculated from two variables, one dependent and one independent. Its most important case is the one in which the studied phenomenon follows a straight regression line.

Given a set of data (x1, y1), ..., (xn, yn) whose values correspond to a pair of random variables in direct correlation with each other, the regression line can be written as an equation of the form y = a · x + b.
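As a minimal sketch of how the coefficients a and b of that line can be obtained, the following uses the usual least-squares formulas for a single predictor. The function name fit_line and the sample data are hypothetical and serve only as illustration.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y = a*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    a = sxy / sxx            # slope
    b = mean_y - a * mean_x  # intercept
    return a, b


# Hypothetical data lying exactly on the line y = 2x + 1
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
a, b = fit_line(xs, ys)
print(f"y = {a:.2f} * x + {b:.2f}")
```

With real data the points will not lie exactly on a line, and a and b are then the values that make the sum of squared vertical distances to the line as small as possible.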

Theoretical assumptions of calculation in multiple linear regression

Any calculation by multiple linear regression depends heavily on the object studied and on the area of study, such as economics, since the variables give the formulas a complexity that varies from case to case.

This means that the more intricate the question, the more factors must be taken into account, the more data must be collected and therefore the greater the volume of elements to include in the calculation, which will make the formula larger.
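One common way to write the model, shown here as a sketch, makes that growth visible: each additional factor adds one term. The notation is an assumption introduced for illustration and is not taken from the text above (y_i is the response for observation i, x_{ij} the value of predictor j, \beta_j the coefficients, \varepsilon_i the error term, p the number of predictors and n the number of observations).

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \varepsilon_i,
\qquad i = 1, \dots, n
```

With a single predictor (p = 1) this reduces to the line y = a · x + b, with \beta_1 playing the role of a and \beta_0 the role of b.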

What all these formulas have in common, however, is a vertical axis (the ordinate, or Y axis) and a horizontal axis (the abscissa, or X axis) on which the calculated values are plotted in a Cartesian system.

From there the data are interpreted (see the section on interpreting the graphs) and conclusions or predictions are drawn. In any case, premises established before the statistical study can be used to weigh the variables, such as:

1- Weak exogeneity

It means that each predictor variable is treated as having a fixed value rather than as a random variable, so that it is not expected to change because of causes external to the model.

2- Linear character

It implies that the response must be expressed as a linear combination of the prediction coefficients (the parameters) and the values of the predictor variables, which is what is represented in the graph in the Cartesian system.

3- Homoskedasticity

The variance of the errors must be constant. That is, regardless of the values of the predictor variables, the errors of the response variable must have the same variance.

4- Independence

This applies only to the errors of the response variables, which must be uncorrelated with one another rather than forming a group of errors that follows a defined pattern.

5- Absence of multicollinearity

This applies to the independent variables, none of which should be an exact or near-exact linear combination of the others. The problem can also arise when you try to study something with very little information available: many answers then fit the data equally well, so the values admit many interpretations and ultimately do not solve the problem.
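As an illustration of the fifth premise, the sketch below flags pairs of predictors whose Pearson correlation is close to 1 in absolute value. The helper names, the 0.95 threshold and the sample data are all hypothetical. A pairwise check like this is only a first screen, since it cannot detect a predictor that is a combination of three or more others.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_u * var_v)


def collinear_pairs(predictors, threshold=0.95):
    """Return (name_a, name_b, r) for predictor pairs with |r| >= threshold.

    `predictors` maps a variable name to its list of observed values.
    The default threshold is an arbitrary illustrative choice.
    """
    names = list(predictors)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(predictors[names[i]], predictors[names[j]])
            if abs(r) >= threshold:
                flagged.append((names[i], names[j], r))
    return flagged


# Hypothetical predictors: price_eur is just price_usd rescaled by 0.9
data = {
    "price_usd": [10.0, 12.0, 15.0, 11.0, 18.0],
    "price_eur": [9.0, 10.8, 13.5, 9.9, 16.2],
    "ad_spend": [3.0, 1.0, 4.0, 1.0, 5.0],
}
for name_a, name_b, r in collinear_pairs(data):
    print(f"{name_a} and {name_b} are nearly collinear (r = {r:.3f})")
```

In this made-up example only the two price columns are flagged, because one carries no information that the other does not already contain.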

There are other premises that are taken into account, but those presented above make it clear that multiple linear regression requires a lot of information, not only for the study to be rigorous, complete and free of bias, but also for the solution to the proposed question to be concrete.

That is to say, the study must get to the point with something specific that does not lend itself to vagueness and that gives rise to as few errors as possible.

Note that multiple linear regression is not infallible and may be prone to errors and inaccuracies in computation. This is due not so much to whoever performs the study as to the fact that a given natural phenomenon is neither completely predictable nor necessarily the product of a single cause.

It often happens that an object changes suddenly, or that an event arises from the action (or inaction) of numerous elements that interact with each other.

Interpretations of the graphs

Once the data have been calculated according to the models designed in previous phases of the study, the formulas yield values that can be represented in a graph.

The Cartesian system will then show many points corresponding to the calculated variables. Some will lie closer to the axis of ordinates, others closer to the axis of abscissas. Some will be tightly grouped, others more isolated.

To appreciate how complex it can be to interpret the data in the graphs, consider, for example, Anscombe's quartet. The quartet consists of four different data sets, each plotted on a separate chart and each deserving a separate analysis.
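The point of the quartet can be sketched numerically. The code below fits a least-squares line to each of the four data sets published by Anscombe. The fit_line helper is a hypothetical name introduced for this illustration.

```python
def fit_line(xs, ys):
    """Least-squares (slope, intercept) for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Sets I to III share the same x values; set IV has its own
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}

for name, (xs, ys) in QUARTET.items():
    slope, intercept = fit_line(xs, ys)
    print(f"set {name}: y = {intercept:.2f} + {slope:.2f} * x")
```

All four sets yield practically the same line, roughly y = 3 + 0.5 · x, even though their scatter plots look completely different. This is why the fitted numbers alone are not enough and the charts have to be inspected.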

The linear fit remains, but the points in the Cartesian system must be examined very carefully to understand how the pieces of the puzzle fit together. Only then can the relevant conclusions be drawn.

There are, of course, several ways of fitting these pieces together, following various methods that are described in specialized calculation manuals.

Multiple linear regression, as already mentioned, depends on many variables that change with the object of study and the field of application, so the procedures in economics are not the same as in medicine or computer science. In all of them, however, an estimate is made: a hypothesis that is then checked at the end.
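One widely used way to check an estimate against the observed data is the coefficient of determination, R². It is not named in the text above and is offered here only as a sketch of what such a final check can look like. The function name and the sample numbers are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: what the model fails to explain
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    # Total sum of squares: variation of the data around its mean
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1 - ss_res / ss_tot


# Hypothetical observations and the values a fitted model predicted for them
observed = [2.0, 4.1, 5.9, 8.2]
predicted = [2.1, 4.0, 6.0, 8.0]
print(f"R^2 = {r_squared(observed, predicted):.4f}")
```

A value close to 1 means the model's estimates track the observations closely. A high value does not by itself show that the premises of the model hold.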

Extensions of multiple linear regression

There are several types of linear regression, such as simple and general, but there are also several variants of multiple regression adapted to various objects of study and therefore to the needs of science.

These usually handle a large number of variables, so models such as multivariate or multilevel regression are common. Each uses postulates and formulas of varying complexity, so the interpretation of their results tends to be broader in scope.

Methods of estimation

There is a wide range of procedures for estimating the parameters of a multiple linear regression from the data obtained.

Once again, everything here depends on the soundness of the model used, the calculation formulas, the number of variables, the theoretical postulates taken into account, the area of study, the algorithms implemented in specialized computer programs and, above all, the complexity of the object, phenomenon or event being analyzed.

Each estimation method uses different formulas. None is perfect, but each has particular strengths that should be matched to the statistical study being carried out.

There are many types: instrumental variables, generalized least squares, Bayesian linear regression, mixed models, Tikhonov regularization, quantile regression, the Theil-Sen estimator and a long list of other tools with which to study the data with greater precision.
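To make one of these methods concrete, here is a minimal sketch of Tikhonov regularization in its simplest form (often called ridge regression), solved through the normal equations. With the penalty lam set to 0 it reduces to ordinary least squares. The function names, the choice to leave the intercept unpenalised and the sample data are assumptions made for this illustration. Plain Python is used to keep the sketch free of dependencies. In practice a numerical library would normally do the linear algebra.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Work on an augmented copy so the inputs are left untouched
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ridge(rows, ys, lam=0.0):
    """Coefficients [b0, b1, ..., bp] that minimise the sum of squared
    residuals plus lam * (b1^2 + ... + bp^2).

    lam = 0 gives ordinary least squares; b0 (the intercept) is not penalised.
    """
    X = [[1.0] + list(row) for row in rows]  # prepend the intercept column
    k = len(X[0])
    # Normal equations: (X^T X + lam * D) beta = X^T y
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * y for r, y in zip(X, ys)) for i in range(k)]
    for i in range(1, k):
        XtX[i][i] += lam  # D is the identity with a 0 in the intercept slot
    return solve_linear_system(XtX, Xty)


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
print("least squares:", [round(b, 3) for b in fit_ridge(rows, ys)])
print("ridge, lam=1: ", [round(b, 3) for b in fit_ridge(rows, ys, lam=1.0)])
```

The penalty pulls the slope coefficients toward zero, trading a little bias for more stable estimates. This is one reason regularization is often considered when predictors are nearly collinear.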

Practical Uses

Multiple linear regression is used in various fields of study and often requires the assistance of computer programs in order to obtain more accurate results.

In this way, the margins of error that can arise from manual calculation are reduced: given the many independent and dependent variables involved, it is not surprising that this type of linear regression lends itself to mistakes, since a great deal of data and many factors are processed.

In the analysis of market trends, for example, it examines whether data such as the price of a product have risen or fallen and, above all, when and why.

The "when" is analyzed by locating significant variations in the figures over a given period of time, especially if the changes are unexpected. The "why" looks for the precise or probable factors that caused the product's retail price to rise, fall or stay the same.

Similarly, the health sciences (medicine, bioanalysis, pharmacy, epidemiology, among others) benefit from multiple linear regression, through which health indicators such as mortality rate, morbidity and birth rate are studied.

In these cases, the study can begin with observation, after which a model is built to determine whether the variation in some of these indicators is due to a specific cause, and when and why it occurs.

Finance also uses multiple linear regression to investigate the advantages and disadvantages of making certain investments. Here it is always necessary to know when the financial operations were carried out, with whom, and what the expected benefits were.

The risk levels will be higher or lower according to the various factors that are taken into account when assessing the quality of these investments, also considering the volume of monetary exchange.

However, it is in economics that this calculation tool is most used. In this science, multiple linear regression is used to predict consumption expenditure, investment expenditure, purchases, exports, imports, assets, labor demand, job offers and many more.

All of these relate to macroeconomics and microeconomics, the former being where the variables for data analysis are most abundant because they are measured at a global level.
