# Least Squares Regression

Creates a regression polynomial of the chosen order such that the sum of the squares of the differences (residuals) between the output response values predicted by the regression model and the corresponding values from the simulation model is minimized.

That is, the regression minimizes
$E=\sum _{i=1}^{n}{\left({f}_{i}^{predicted}-{f}_{i}\right)}^{2}$
where $n$ is the number of designs, ${f}_{i}^{predicted}$ is the output response value predicted by the regression model for the ith design, and ${f}_{i}$ is the output response value from the simulation of the ith design. This is achieved by finding the regression model coefficient values that set the derivative of $E$, with respect to each unknown coefficient, to zero.
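
As a minimal sketch of this idea, assume a one-factor linear model of the form $a_0 + a_1 x$ and a few made-up data points (neither comes from HyperStudy). Setting the derivatives of the sum of squared residuals with respect to $a_0$ and $a_1$ to zero gives closed-form coefficients. The names `fit_line` and `sum_squared_residuals` are hypothetical and used only for illustration.

```python
def fit_line(xs, fs):
    """Return (a0, a1) that minimize sum((a0 + a1*x - f)**2)."""
    n = len(xs)
    x_mean = sum(xs) / n
    f_mean = sum(fs) / n
    # Zeroing the derivatives with respect to a0 and a1 gives two linear
    # equations; for a single factor they can be solved in closed form.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxf = sum((x - x_mean) * (f - f_mean) for x, f in zip(xs, fs))
    a1 = sxf / sxx
    a0 = f_mean - a1 * x_mean
    return a0, a1


def sum_squared_residuals(a0, a1, xs, fs):
    """Sum of squared differences between predicted and simulated values."""
    return sum((a0 + a1 * x - f) ** 2 for x, f in zip(xs, fs))


# Illustrative data only: four designs and their simulated responses.
xs = [0.0, 1.0, 2.0, 3.0]
fs = [1.1, 2.9, 5.2, 6.8]
a0, a1 = fit_line(xs, fs)
best = sum_squared_residuals(a0, a1, xs, fs)
```

Perturbing either coefficient away from the fitted value increases the sum of squared residuals, which is what makes the fitted pair the least squares solution for this model.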

## Least Squares Regression Model

The least squares regression model in HyperStudy is the polynomial expression that relates the output response of interest to the factors that were varied.

Selecting the proper model is required to create an accurate approximation. However, this requires prior knowledge of the behavior of the output responses (linear, nonlinear, noisy, and so on) and enough runs to fit the selected model.

Types of regression models include:
Linear Regression Model
$F\left(x\right)={a}_{0}+{a}_{1}{x}_{1}+{a}_{2}{x}_{2}+\left(error\right)$
Interaction Regression Model
$F\left(x\right)={a}_{0}+{a}_{1}{x}_{1}+{a}_{2}{x}_{2}+{a}_{3}{x}_{1}{x}_{2}+\left(error\right)$
Quadratic Regression Model
$F\left(x\right)={a}_{0}+{a}_{1}{x}_{1}+{a}_{2}{x}_{2}+{a}_{3}{x}_{1}{x}_{2}+{a}_{4}{x}_{1}^{2}+{a}_{5}{x}_{2}^{2}+\left(error\right)$
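
The three model types differ only in which terms multiply the coefficients. The following sketch builds those terms for one design point with any number of factors, in the same order as the coefficients in the two-factor formulas (constant, first-order, cross, then squared terms). The function names `model_terms` and `predict` and the model labels are assumptions for illustration, not HyperStudy identifiers.

```python
from itertools import combinations


def model_terms(x, model):
    """Return the regression terms for one design point x = [x1, ..., xn]."""
    if model not in ("linear", "interaction", "quadratic"):
        raise ValueError(f"unknown model: {model}")
    terms = [1.0] + list(x)  # constant term and first-order terms
    if model in ("interaction", "quadratic"):
        # Cross terms x_i * x_j for every pair i < j
        terms += [xi * xj for xi, xj in combinations(x, 2)]
    if model == "quadratic":
        # Pure second-order terms x_i ** 2
        terms += [xi ** 2 for xi in x]
    return terms


def predict(coefficients, x, model):
    """Evaluate the polynomial: sum of coefficient * term."""
    return sum(a * t for a, t in zip(coefficients, model_terms(x, model)))
```

For two factors, `model_terms([x1, x2], "quadratic")` returns six terms, one for each of the coefficients ${a}_{0}$ through ${a}_{5}$.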

An approximation is only as good as the uniformity of the design sampling. For example, a factor sampled at only two levels can contribute only a linear relationship to the regression. Higher order polynomials can be introduced by using more levels for the factors, but using more levels results in more runs.

If $n$ is the number of input variables:
• A linear regression model requires $n+1$ runs.
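• An interaction regression model requires $\frac{\left(n+1\right)\left(n+2\right)}{2}-n$ runs.
• A quadratic regression model requires $\frac{\left(n+1\right)\left(n+2\right)}{2}$ runs.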
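
These run counts equal the number of unknown coefficients in each model. A short sketch, using the formulas this document gives for the linear, interaction, and quadratic models, can tabulate them; the function name `minimum_runs` is hypothetical.

```python
def minimum_runs(n, model):
    """Minimum number of runs needed to fit a model with n input variables."""
    # (n + 1)(n + 2) is a product of consecutive integers, so it is always
    # even and the integer division is exact.
    full_quadratic = (n + 1) * (n + 2) // 2
    if model == "linear":
        return n + 1
    if model == "interaction":
        return full_quadratic - n  # quadratic count minus the n squared terms
    if model == "quadratic":
        return full_quadratic
    raise ValueError(f"unknown model: {model}")


for n in (2, 3, 5, 10):
    counts = [minimum_runs(n, m) for m in ("linear", "interaction", "quadratic")]
    print(n, counts)
```

For two input variables this gives 3, 4, and 6 runs, which matches the number of coefficients in the two-factor linear, interaction, and quadratic formulas.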

## Usability Characteristics

• HyperStudy will create a least squares regression of any order; however, in most cases, polynomials of the 4th order or higher do not increase accuracy.
Note: A custom order can be defined from the Regression Terms tab.
• Suppress regression terms that are known to be insignificant.
• Residuals and diagnostics should be used to gain an understanding of the quality of the Fit.
• Quality of a Least Squares Regression Fit is a function of the number of runs, order of the polynomial, and the behavior of the application.
• If the residuals and diagnostics are not good for a Least Squares Regression Fit, then you can increase the order of the Fit, provided you have enough runs to fit that specific order.
Note: If $n$ is the number of input variables:
• A linear model requires $n+1$ runs.
• An interaction model requires $\frac{\left(n+1\right)\left(n+2\right)}{2}-n$ runs.
• A quadratic model requires $\frac{\left(n+1\right)\left(n+2\right)}{2}$ runs.
• If increasing the order does not improve the Fit quality, then you may want to inspect the input matrix collinearity and optionally add more runs. You should try the other available Fit methods as your application may have more non-linearity than polynomials can handle.
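
As a rough illustration of the checks mentioned in this list, the sketch below computes the coefficient of determination from the residuals and the correlation between two input columns as a simple collinearity indicator. These are common generic diagnostics chosen here as an assumption; they are not necessarily the diagnostics HyperStudy reports, and the function names are hypothetical.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(observed) / len(observed)
    ss_res = sum((f - p) ** 2 for f, p in zip(observed, predicted))
    ss_tot = sum((f - mean) ** 2 for f in observed)
    return 1.0 - ss_res / ss_tot


def correlation(col_a, col_b):
    """Pearson correlation between two columns of the input matrix."""
    mean_a = sum(col_a) / len(col_a)
    mean_b = sum(col_b) / len(col_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(col_a, col_b))
    var_a = sum((a - mean_a) ** 2 for a in col_a)
    var_b = sum((b - mean_b) ** 2 for b in col_b)
    return cov / math.sqrt(var_a * var_b)
```

A value of `r_squared` close to 1 indicates small residuals relative to the spread of the responses. A `correlation` magnitude close to 1 between two input columns suggests they are nearly collinear, in which case their coefficients cannot be estimated independently and additional runs may help.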

## Settings

Change the method settings in the Settings tab of the Specifications step.
| Parameter | Default | Range |
| --- | --- | --- |
| Regression Model | Linear | Linear, Squared, Cubic, Interaction, Full Cubic, Custom |

Description of each Regression Model option:
Linear
First order terms only.
F=A+Bx+Cy
Squared
Second order without cross terms.
F=A+Bx+Cy+Dx^2+Ey^2
Cubic
Third order without cross terms.
F=A+Bx+Cy+Dx^2+Ey^2+Fx^3+Gy^3
Interaction
Linear and the cross terms.
F=A+Bx+Cy+Dxy