Altair SmartWorks Analytics

 

Working with ML Flow

SmartWorks Analytics allows you to add ML Flow server configurations. The ML Flow server can be added as an internal connection to an Execution Profile.

 

The ML Flow server must be configured before it can be used in the Machine Learning nodes.

Steps

  1. Navigate to any Execution Profile and select the Internal Connections tabbed page.

  2. Open any of the internal connections of the selected Execution Profile and provide the ML Flow server URL.

  3. The ML Flow server URL provides details such as the experiment status, parameters, metrics, and artifacts.

    The ML Flow server is added as an internal connection and displayed in the Auto ML and Rapids AI Node Viewer.

     

    The output generated by ML Flow is stored in Ceph, a high-performance cloud storage server.

    ML Flow is used to log models, and every model run in SmartWorks Analytics creates an artifact in the Experiments view (for example: https://mlflow-<smartworks_analytics_URL>/#/). A minimal logging sketch is provided after these steps.

     

    The Experiments view provides access to the parameters and metrics of each trained model and allows models to be compared.

  4. The artifacts of any model can be obtained by clicking any of the model line items in the Experiments view:

  5. These artifacts can be registered as a new Model:

     

    The Registered Models can be found in the Registry (for example: https://mlflow-<smartworks_analytics_URL>/#/models).
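
The following is a minimal sketch of how the entries above are produced programmatically. It assumes the mlflow and scikit-learn Python libraries are available in your session, and the tracking URI and the name default_classifier are illustrative placeholders rather than fixed values. The sketch logs a parameter, a metric, and a model (which appears as an artifact of the run in the Experiments view), then registers the logged model so that it appears in the Registry:

import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression

# Placeholder: replace with your ML Flow server URL
mlflow.set_tracking_uri("https://mlflow-<smartworks_analytics_URL>")

with mlflow.start_run() as run:
    # Train a toy model; each run appears as a line item in the Experiments view
    model = LogisticRegression(C=0.5).fit([[0.0], [1.0]], [0, 1])
    mlflow.log_param("C", 0.5)                # shown under Parameters
    mlflow.log_metric("train_accuracy", 1.0)  # shown under Metrics
    mlflow.sklearn.log_model(model, "default_classifier")  # stored as a run artifact

# Register the logged model so it appears in the Registry (Registered Models)
mlflow.register_model(
    "runs:/{}/default_classifier".format(run.info.run_id),
    "default_classifier",
)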

     

SERVING MODELS WITH THE BUILT-IN ML FLOW SERVER (LOCAL SERVER EXAMPLE)

Models in the platform that have been logged with ML Flow in Spark can be served in a variety of ways, including with the built-in ML Flow model server. The following example illustrates that approach; a sketch of how such a model might be logged from Spark appears directly below.
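
A minimal logging sketch, assuming pyspark and mlflow are available in the session; the artifact path default_classifier is illustrative and mirrors the path used in the serve command later in these steps:

import mlflow
import mlflow.spark
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import VectorAssembler

spark = SparkSession.builder.appName("mlflow-logging-example").getOrCreate()

# Toy training data; the column names are illustrative
df = spark.createDataFrame(
    [(40.0, 40.0, 1.0), (25.0, 20.0, 0.0), (52.0, 60.0, 1.0)],
    ["age", "hours-per-week", "label"],
)

pipeline = Pipeline(stages=[
    VectorAssembler(inputCols=["age", "hours-per-week"], outputCol="features"),
    LogisticRegression(featuresCol="features", labelCol="label"),
])

with mlflow.start_run():
    model = pipeline.fit(df)
    # Logged as a run artifact; the serve command below points at the
    # s3 location of such an artifact
    mlflow.spark.log_model(model, "default_classifier")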

Steps

  1. Once a model is logged, it becomes available in the Experiments UI for ML Flow.

  2. The location of the model with the complete path is as shown:

     

  3. Navigate to the Execution Profiles tabbed page, click to access the Jupyter Notebook in the same session that was used to build the original model, and then select Open Jupyter Notebook from the options that are displayed.

  4. An empty Jupyter Notebook opens.

     

  5. Click the JupyterHub logo in the top-left corner and then create a new notebook so that the workflow is not locked.

  6. Use the following model serve command, replacing the s3 path with your own model path; paste it into a new Jupyter Notebook cell and run that cell:

    ! mlflow models serve -m s3://library/0/e33b63129ec646de939250fc03332974/artifacts/default_classifier -p 1234

     

    When the model is successfully served, the Notebook kernel becomes busy and the cell continues to run (denoted by a *) while the model is live.

    NOTE: Keeping the notebook in the “kernel busy” state locks the workflow that was used to train the model. This example only illustrates the capabilities of ML Flow; the method is not recommended for production serving.

     

    The model can be accessed with a POST command, sent either through the terminal using curl or through the requests library in Python.

  7. Open a new Terminal in JupyterHub to ensure that you are not locked out of the Jupyter kernel used in the previous step.

  8. Send the POST request. Use curl if it is available in your environment (curl is currently not available in the Spark cluster).

    Example with curl through the terminal:

curl -X POST -H "Content-Type:application/json; format=pandas-split" --data '{"columns":["age", "sex", "hours-per-week"],"data":[[40, "male", 40]]}' http://127.0.0.1:1234/invocations

 

Example with Python, using the requests library:

import requests

headers = {
    'Content-Type': 'application/json; format=pandas-split',
}
data = '{"columns":["age"],"data":[[45]]}'

response = requests.post('http://127.0.0.1:1234/invocations', headers=headers, data=data)
print(response.text)
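
The format=pandas-split content type tells the scoring server that the JSON body uses the pandas "split" orientation ({"columns": [...], "data": [...]}). The payload can also be built from a pandas DataFrame rather than written by hand; a minimal sketch, assuming pandas is available (the column names are taken from the curl example above):

import pandas as pd
import requests

df = pd.DataFrame([[40, "male", 40]], columns=["age", "sex", "hours-per-week"])

# orient='split' emits the {"columns": ..., "data": ...} shape expected by the
# server; index=False drops the "index" key to match the hand-written payload
response = requests.post(
    'http://127.0.0.1:1234/invocations',
    headers={'Content-Type': 'application/json; format=pandas-split'},
    data=df.to_json(orient='split', index=False),
)
print(response.text)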