Altair SmartWorks Analytics

 

Cleaning Columns

The Clean operation is found within the Column Changes node and allows you to remove specified characters, extra spaces, and numeric or non-alphanumeric characters from a column. This operation is applicable to fields of the data type text.

Prerequisites

  • An Execution Profile with an active session linked to the workflow

  • A Column Change node connected to a Data Frame node

  •  

Steps

  1. Configure the Column Changes node by double-clicking on the node or using the Open option provided in the node menu.

  2. The table displays.

  3. Select the string column you wish to clean and then click the Clean item from the Change menu.

Removing Specified Characters

Steps

  1. To remove a specific character from all rows in the column, click Remove Specified Characters in the Clean menu.

  2. Type the character(s) you wish to remove from the column into the Specified Characters field. When specifying multiple characters to remove, do not add a space or punctuation between characters unless you wish to remove spaces and the punctuation used as well.

  3.  

     

  4. Tick the box for Ignore case if you would like the application to ignore case when looking for the specified characters to remove. For example, if this option is selected and you specify n as the character to eliminate, all instances of N and n in the field are deleted.

  5. If you wish to create a new column for your newly cleaned rows, tick the Create New Column box and provide a name for the new column.

  6.  

     

  7. If you wish to apply the clean operation to multiple columns, click on the Apply to multiple columns link and then choose which columns to apply the same operation to. Click Apply when you are done to save your selection(s). This option is disabled if you choose to create a new column in which to display the clean rows.

  8.  

     

  9. Select Apply when you are done.

  10. You can abandon the clean operation by clicking the Cancel button instead.

    If you chose to create a new column, the new column is added to your table, as shown below. Here, all instances of ! in the column Salesperson were removed and the clean rows are displayed in the column Salesperson-Clean. Otherwise, the values in your original column are updated.

     

     

  11. To check the code that will be executed for your specified Column Changes configuration, save your specifications and then click on the Code tab of the Column Changes Node Viewer. You can also use the tab that displays to refine the code further.

  12.  

  13. To complete the Column Changes node configuration, press Save. To cancel your changes and return to the Workflow Canvas, press Discard or simply close the Column Changes Node Viewer. To execute your column changes, click the Run button.

Removing Spaces

The Remove Spaces function allows you to remove all leading, trailing, and consecutive spaces from a text column.

Steps

  1. To remove unwanted spaces from all rows in the column, click Remove Spaces in the Clean menu.

  2. Choose whether to remove all leading and trailing spaces or all consecutive spaces from your field values.

  3.  

  4. In the following example, the left-hand column represents the original column and the right-hand column represents the values obtained when leading, trailing, and consecutive spaces are removed.

  5.     

     

  6. To check the code that will be executed for your specified Column Changes configuration, save your specifications and then click on the Code tab of the Column Changes Node Viewer. You can also use the tab that displays to refine the code further.

  7.  

  8. To complete the Column Changes node configuration, press Save. To cancel your changes and return to the Workflow Canvas, press Discard or simply close the Column Changes Node Viewer. To execute your column changes, click the Run button.

Removing Numeric or Non-Alphanumeric Characters

The Remove Characters > Numeric Characters and Remove  Characters > Non-alphanumeric Characters functions allow you to remove all numeric and non-alphanumeric characters from a text column.

Steps

  1. To remove a specific type of characters from the field, select Remove Characters in the Clean menu.

  2. Choose whether to remove all numeric characters or all non-alphanumeric characters from your field values.

  3.  

  4. In the following example, the left-hand column represents the original column and the right-hand column represents the values obtained when all numeric and non-alphanumeric characters are removed.

  5.    

  6. To check the code that will be executed for your specified Column Changes configuration, save your specifications and then click on the Code tab of the Column Changes Node Viewer. You can also use the tab that displays to refine the code further.

  7.  

  8. To complete the Column Changes node configuration, press Save. To cancel your changes and return to the Workflow Canvas, press Discard or simply close the Column Changes Node Viewer. To execute your column changes, click the Run button.