HS-3000: Fit Method Comparison - Approximation on the Arm Model

Learn how to create approximations for the output responses of the arm example introduced in an earlier tutorial, and review the differences between Fit methods.

Before you begin, complete HS-2000: DOE Method Comparison: Arm Model Study or import the HS-2000.hstx archive file, available in <hst.zip>/HS-3000/.

In HS-2000: DOE Method Comparison: Arm Model Study, you learned that you can continue the study just as effectively with the six shape variables instead of all nine input variables, because the other three have little influence on the output responses. Using fewer variables saves computational effort.

In this tutorial, you will use the six shape variables:
  • Length1: Lower Bound = -0.5, Initial Bound = 0.0, Upper Bound = 2.0
  • Length2: Lower Bound = 0.0, Initial Bound = 0.0, Upper Bound = 2.0
  • Length3: Lower Bound = -1.0, Initial Bound = 0.0, Upper Bound = 1.0
  • Length4: Lower Bound = -1.0, Initial Bound = 0.0, Upper Bound = 1.0
  • Length5: Lower Bound = -1.0, Initial Bound = 0.0, Upper Bound = 1.0
  • Height: Lower Bound = -1.0, Initial Bound = 0.0, Upper Bound = 1.0

Run MELS DOE Study

In this step, you will create a Modified Extensible Lattice Sequence (MELS) DOE. You will use this matrix to create the Fits for both output responses.

MELS is a space-filling DOE designed to spread points evenly through the design space by minimizing clumps and empty regions. The minimum number of points required to create a second-order polynomial with N variables is 1.1*(N + 1)*(N + 2)/2. For the six variables used here, this gives 30.8, which corresponds to the 31 runs used in this step.
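
As a quick check of the run count, the formula above can be evaluated directly. The sketch below is illustrative only: the function name min_runs_second_order is hypothetical, and rounding up to the next whole run is an assumption made here, since a DOE cannot contain a fractional run.

```python
import math


def min_runs_second_order(n_variables, margin=1.1):
    """Minimum run count for a second-order polynomial in n_variables inputs.

    Implements margin * (N + 1) * (N + 2) / 2 from the text. Rounding up to
    a whole run is an assumption made for this sketch.
    """
    return math.ceil(margin * (n_variables + 1) * (n_variables + 2) / 2)


# Six shape variables, as used in this tutorial.
print(min_runs_second_order(6))  # prints 31
```

For N = 6 the formula evaluates to 30.8, so the sketch returns 31, which matches the Number of Runs verified in the Specifications step.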

To create approximations for use as surrogate models, you must first run a DOE that serves as the Input matrix. The DOE must be suitable for response surface creation, such as MELS.

  1. Add a DOE.
    1. In the Explorer, right-click and select Add from the context menu.
    2. In the Add dialog, select DOE and click OK.
  2. Modify input variables.
    1. Go to the Setup > Definition > Define Input Variables step.
    2. In the work area, Active column, clear the radius_1, radius_2, and radius_3 checkboxes.


    Figure 1.
  3. Define specifications.
    1. Go to the DOE 5 > Specifications step.
    2. In the work area, set the Mode to Modified Extensible Lattice Sequence (MELS).
    3. In the Settings tab, verify that the Number of Runs is set to 31.
    4. Click Apply.
  4. Evaluate tasks.
    1. Go to the DOE 5 > Evaluate step.
    2. Click Evaluate Tasks.
  5. Go to the DOE 5 > Post-Processing step.
  6. Click the Scatter tab to review a 2D scatter plot of the results from the MELS DOE.


    Figure 2. Typical Sampling of the MELS DOE with 31 Runs (length_1 vs. length_2). This visualization is a projection of 31 points distributed in six dimensions onto a two-dimensional plane.

Optional: Run DOE with Fewer Runs

In this step, you will create an optional second DOE with fewer runs to use as a Validation matrix in the Fit approach.

A Validation matrix provides information on the Fit’s prediction accuracy.
Note: Do not use MELS for the Validation matrix. Because MELS is extensible, a shorter MELS DOE reuses the first runs of the MELS Input matrix, so the validation points would duplicate input points.

In this tutorial, you will use the Hammersley method to create the Validation matrix.

  1. Add a DOE.
    1. In the Explorer, right-click and select Add from the context menu.
      The Add dialog opens.
    2. For Definition from, select DOE 5.
    3. Select DOE.
    4. Click OK.
  2. Define specifications.
    1. Go to the DOE 6 > Specifications step.
    2. In the work area, set the Mode to Hammersley.
    3. In the Settings tab, change the Number of Runs to 12.


      Figure 3.
    4. Click Apply.
  3. Evaluate tasks.
    1. Go to the DOE 6 > Evaluate step.
    2. Click Evaluate Tasks.
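
For intuition about how a Hammersley matrix fills the design space, the sketch below generates points with the textbook Hammersley construction and scales them to the variable bounds listed at the start of this tutorial. It is a minimal illustration, not HyperStudy's implementation, which may order, offset, or scramble the points differently. The function names and the bounds list are assumptions made for this example.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a non-negative integer in a base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def hammersley(n_runs, bounds):
    """Textbook Hammersley set scaled to (lower, upper) bounds per variable.

    The first coordinate is i / n_runs; the remaining coordinates use the
    radical inverse in successive prime bases.
    """
    primes = [2, 3, 5, 7, 11, 13, 17, 19]
    if len(bounds) - 1 > len(primes):
        raise ValueError("too many variables for this sketch")
    matrix = []
    for i in range(n_runs):
        unit = [i / n_runs]
        unit += [radical_inverse(i, primes[d]) for d in range(len(bounds) - 1)]
        matrix.append([lo + u * (hi - lo) for u, (lo, hi) in zip(unit, bounds)])
    return matrix


# Bounds of the six shape variables (Length1..Length5, Height) from this tutorial.
bounds = [(-0.5, 2.0), (0.0, 2.0), (-1.0, 1.0),
          (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0)]

validation_matrix = hammersley(12, bounds)
for run in validation_matrix:
    print(["%.3f" % value for value in run])
```

Unlike an extensible sequence, the first coordinate depends on the total number of runs, so a 12-run matrix of this kind is not simply the first 12 rows of a longer one.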

Run Fits

In this step, you will use the 31 runs from the MELS DOE as an Input matrix and the 12 runs from the Hammersley DOE as a Validation matrix to create four Fits using Least Squares Regression (LSR), Moving Least Squares Method (MLSM), HyperKriging (HK), and Radial Basis Function (RBF).

  1. Add a Fit.
    1. In the Explorer, right-click and select Add from the context menu.
    2. In the Add dialog, select Fit and click OK.
  2. Import matrix.
    1. Go to the Fit > Specifications step.
    2. Click Add Matrix twice.
    3. In the work area, define Fit Matrix 1 and Fit Matrix 2 by selecting the options indicated in Figure 4.
    4. Click Apply.


    Figure 4.
  3. Define specifications.
    1. In the work area, Fit Type column, select the appropriate Fit method.
      Important:

      For the Least Squares Regression (LSR) Fit, in the Settings tab, set Regression Model to Interaction.

      An Interaction regression model considers linear and cross terms, as in the function f(x,y)=A+Bx+Cy+Dxy, where A is the constant term, Bx and Cy are the linear terms, and Dxy is the cross term between the two variables.

    2. Click Apply.
  4. Evaluate tasks.
    1. Go to the Fit > Evaluate step.
    2. Click Evaluate Tasks.
  5. Go to the Fit > Post-Processing step.
  6. Click the Scatter tab to compare the original Max_Stress output response to the Fit Max_Stress.

    The scatter plot shows the Fit accuracy. The closer the points lie to the diagonal, the better the Fit. In the Max_Stress vs Max_Stress_LSR plot, some points are dispersed away from the diagonal, which indicates that the Fit has some inaccuracy. In comparison, the points in the Max_Stress vs Max_Stress_MLSM plot follow the diagonal more closely, which indicates that MLSM provides better Fit accuracy on Max_Stress.

    You will not compare HyperKriging and Radial Basis Function using scatter plots, because the results would be misleading. HyperKriging and Radial Basis Function go through the exact points by default; therefore, the scatter plot comparing the original output response to the Fit output response produces a straight line. However, this does not necessarily mean that the Fit has good predictive capability.


    Figure 5. Max_stress and Max_stress, LSR Comparison


    Figure 6. Max_stress and Max_stress, MLSM Comparison
  7. Click the Diagnostics tab to review the diagnostics of the Fit study.
    The R-Square value measures how much of the variability of the response data around its mean is captured by the Fit. If the Fit perfectly predicts the known values, R-Square reaches its maximum possible value of 1.0.


    Figure 7. Diagnostics for Max_Stress, LSR


    Figure 8. Diagnostics for Max_Stress, MLSM
    The R-Square value for an Input Matrix in HyperKriging and Radial Basis Function has no meaning, because these Fits always go through the exact data points, which results in a value of 1.0. Although the value is 1.0, this does not mean the Fit will be accurate. In HyperKriging and Radial Basis Function, the only meaningful diagnostic values are those for the Cross-validation Matrix and the Validation Matrix.


    Figure 9. Diagnostics for Max_Stress, HK


    Figure 10. Diagnostics for Max_Stress, RBF
  8. Click the Residuals tab to review the Error (and Percent Error) between the original output response and the Fit output response for each run of the Input and Testing matrices.


    Figure 11. Input Matrix Residuals on Max_Stress, LSR


    Figure 12. Testing Matrix Residuals on Max_Stress, LSR
    The Input Matrix Residual errors are slightly smaller with Least Squares Regression than they are with Moving Least Squares Method, but the Testing Matrix Residual errors are much smaller with Moving Least Squares Method.


    Figure 13. Input Matrix Residuals on Max_Stress, MLSM


    Figure 14. Testing Matrix Residuals on Max_Stress, MLSM
    The Input Matrix Residuals are meaningless for HyperKriging and Radial Basis Function, so only the Testing Matrix Residuals are shown below.


    Figure 15. Testing Matrix Residuals on Max_Stress, HK


    Figure 16. Testing Matrix Residuals on Max_Stress, RBF
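
The difference between a regression Fit and an interpolating Fit, and the reason Input matrix diagnostics are only meaningful for the former, can be illustrated with a small self-contained sketch. It fits the Interaction model f(x,y)=A+Bx+Cy+Dxy by least squares and, as a generic stand-in for interpolating methods such as HyperKriging and Radial Basis Function, a Gaussian radial basis interpolant. Everything here is an assumption for illustration: the true_response function, the sample points, the kernel width, and all function names are hypothetical, and none of this reproduces HyperStudy's own algorithms.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def r_square(actual, predicted):
    """1 - SS_residual / SS_total, with SS_total taken around the mean."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def interaction_terms(x, y):
    """Basis of the Interaction model: constant, x, y, and the cross term."""
    return [1.0, x, y, x * y]


def fit_interaction(points, responses):
    """Least squares fit of A + Bx + Cy + Dxy via the normal equations."""
    rows = [interaction_terms(x, y) for x, y in points]
    normal = [[sum(row[i] * row[j] for row in rows) for j in range(4)]
              for i in range(4)]
    moment = [sum(row[i] * f for row, f in zip(rows, responses))
              for i in range(4)]
    coeffs = solve_linear(normal, moment)
    return lambda x, y: sum(c * t for c, t in zip(coeffs, interaction_terms(x, y)))


def fit_rbf(points, responses, width=1.0):
    """Gaussian RBF interpolant that reproduces every input point."""
    def kernel(p, q):
        return math.exp(-((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) / width ** 2)
    gram = [[kernel(p, q) for q in points] for p in points]
    weights = solve_linear(gram, responses)
    return lambda x, y: sum(w * kernel((x, y), q) for w, q in zip(weights, points))


def true_response(x, y):
    """Hypothetical response; the x*x term is outside the Interaction model."""
    return 2.0 + 3.0 * x - y + 0.5 * x * y + 0.8 * x * x


input_points = [(x, y) for x in (-1.0, 0.0, 1.0) for y in (-1.0, 0.0, 1.0)]
validation_points = [(-0.5, 0.5), (0.5, -0.5), (0.25, 0.75), (-0.75, -0.25)]
input_resp = [true_response(x, y) for x, y in input_points]
valid_resp = [true_response(x, y) for x, y in validation_points]

models = {"LSR": fit_interaction(input_points, input_resp),
          "RBF": fit_rbf(input_points, input_resp)}
scores = {}
for name, model in models.items():
    r2_input = r_square(input_resp, [model(x, y) for x, y in input_points])
    r2_valid = r_square(valid_resp, [model(x, y) for x, y in validation_points])
    scores[name] = (r2_input, r2_valid)
    print(name, "input R2 = %.4f" % r2_input, "validation R2 = %.4f" % r2_valid)
```

With this setup, the interpolant reports an Input matrix R-Square of essentially 1.0 because it reproduces every input point, while the Interaction regression reports a value below 1.0 because it cannot represent the quadratic term. Only the validation column indicates how either Fit behaves at points it has not seen.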

Fit Comparison

The following tables summarize the maximum percent errors for the Input and Testing matrices.

Table 1. Input Matrix Residuals
              LSR (Interaction Regression Model)   MLSM       HK        RBF
Max_Disp      -1.18%                               -2.79%     -         -
Max_Stress    -6.72%                               -10.92%    -         -

Table 2. Testing Matrix Residuals
              LSR (Interaction Regression Model)   MLSM       HK        RBF
Max_Disp      7.26%                                -3.00%     9.57%     -2.45%
Max_Stress    34.74%                               -19.72%    35.06%    17.99%
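
As a rough illustration of how a maximum percent error like those tabulated above could be derived from a list of residuals, consider the sketch below. The sign convention (Fit value minus original value, relative to the original value) and the sample numbers are assumptions made for this example; they are not taken from the study results.

```python
def percent_errors(actual, predicted):
    """Signed percent error of each Fit value relative to the original value."""
    return [100.0 * (p - a) / a for a, p in zip(actual, predicted)]


def max_percent_error(actual, predicted):
    """Percent error with the largest magnitude, keeping its sign."""
    return max(percent_errors(actual, predicted), key=abs)


# Hypothetical original responses and Fit predictions for three runs.
actual = [100.0, 200.0, 50.0]
predicted = [110.0, 190.0, 49.0]

print(percent_errors(actual, predicted))     # per-run percent errors
print(max_percent_error(actual, predicted))  # largest by magnitude
```

Keeping the sign shows whether the Fit overpredicts or underpredicts at its worst run, which is why the tabulated values can be negative.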

The percent errors for Max_Disp are smaller than those for Max_Stress. These results indicate that the Fit approach works well for Max_Disp, but is not very effective for Max_Stress.

These findings suggest that it is best to use the Fit model obtained from MLSM for Max_Disp. An output response such as Max_Stress is a global envelope of localized effects, and the nature of such envelope-type output responses makes them difficult to capture accurately with a Fit. In contrast, the Max_Disp output response is not influenced by localized effects, so it is easier to approximate with a Fit. In this situation, it is recommended that you either increase the number of samples, which is not guaranteed to improve the accuracy, or create a series of more localized output responses that are simpler functions of the input variables; for example, several output responses that each capture the stress in a specific region. The image below highlights the areas of high stress from the runs in the Input matrix.


Figure 17.
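
To see why an envelope response is harder to fit than the localized responses it is built from, consider the following minimal sketch. The two linear regional stress functions are entirely hypothetical and use a single input variable; they only stand in for the idea that Max_Stress takes the maximum over several local effects. A straight-line fit reproduces each regional response, but not their maximum, because the location of the maximum switches from one region to the other.

```python
def linear_fit_r_square(xs, ys):
    """R-Square of the least squares straight line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def stress_region_a(x):
    """Hypothetical stress in one region; rises with the input variable."""
    return 100.0 + 40.0 * x


def stress_region_b(x):
    """Hypothetical stress in another region; falls with the input variable."""
    return 120.0 - 30.0 * x


# Eleven evenly spaced samples of a single input variable in [-1, 1].
xs = [i / 5.0 - 1.0 for i in range(11)]
local_a = [stress_region_a(x) for x in xs]
local_b = [stress_region_b(x) for x in xs]
envelope = [max(a, b) for a, b in zip(local_a, local_b)]  # global maximum

r2_region_a = linear_fit_r_square(xs, local_a)
r2_region_b = linear_fit_r_square(xs, local_b)
r2_envelope = linear_fit_r_square(xs, envelope)
print("region A: %.3f  region B: %.3f  envelope: %.3f"
      % (r2_region_a, r2_region_b, r2_envelope))
```

In this toy case each regional response is fitted almost exactly, while the envelope has a kink where the critical region changes and is fitted poorly by the same model. This is the motivation for defining several localized output responses instead of a single global maximum.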