Altair SmartWorks Analytics

 

Basic Serving of Sklearn Models

This topic describes how scikit-learn (sklearn) models are handled inside the Docker containers deployed on Seldon Core.

Preprocessing

All input data is converted into a pandas DataFrame before the model processes it. If the request includes a list of names, those names are used as the column labels of the DataFrame; otherwise the columns receive the default pandas integer labels (0, 1, 2, ...).

 

if names:
    df = pd.DataFrame(X, columns=names)
else:
    df = pd.DataFrame(X)
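The conversion above can be tried in isolation with the following minimal sketch. The helper name to_dataframe and the sample values are illustrative assumptions and are not part of the product code.

```python
import pandas as pd


def to_dataframe(X, names=None):
    """Hypothetical helper that mirrors the conversion step shown above."""
    if names:
        # Column labels come from the names list in the request.
        return pd.DataFrame(X, columns=names)
    # No names (or an empty list): pandas assigns integer labels 0, 1, 2, ...
    return pd.DataFrame(X)


X = [[5.1, 3.5], [6.2, 2.9]]
named = to_dataframe(X, names=["sepal_length", "sepal_width"])
unnamed = to_dataframe(X)

print(list(named.columns))    # ['sepal_length', 'sepal_width']
print(list(unnamed.columns))  # [0, 1]
```

Note that an empty names list is falsy, so it is treated the same as no names at all.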

 

 

The method to run on the model, predict or predict_proba, is selected from the meta dictionary passed in with the request. If meta contains a method key whose value is a string, that value is used as the method; otherwise the default method, predict, is used. As the prediction step in the next section shows, only the value predict_proba selects predict_proba, and any other value results in predict being run.

 

method = self.default_method
if meta and isinstance(meta.get("method", 0), str):
    method = meta["method"]
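The selection logic above can be exercised on its own with a small sketch. The function name select_method and the sample meta dictionaries are illustrative assumptions; the body follows the snippet above, with default_method standing in for self.default_method.

```python
def select_method(meta, default_method="predict"):
    """Hypothetical stand-alone version of the method selection shown above."""
    method = default_method
    # Only a string value under the "method" key overrides the default.
    if meta and isinstance(meta.get("method", 0), str):
        method = meta["method"]
    return method


print(select_method(None))                          # predict
print(select_method({}))                            # predict
print(select_method({"method": "predict_proba"}))   # predict_proba
print(select_method({"method": 1}))                 # predict (non-string value is ignored)
```

Because the check only tests for a string, an unrecognized string is passed through unchanged at this stage.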

 

 

Predictions

The selected method, predict or predict_proba, is then run on the model. The resulting predictions are reshaped into a 2D NumPy array with one row per row of the input DataFrame.

 

if method == "predict_proba":
    preds = self.model.predict_proba(df)
else:
    preds = self.model.predict(df)

result = preds.reshape(len(df), -1)
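To illustrate the effect of the reshape, the sketch below applies it to hand-written arrays that stand in for typical model outputs. The arrays are assumed sample values, not output from a real model.

```python
import numpy as np

n_rows = 3  # stands in for len(df)

# A typical predict output: one value per input row, shape (3,).
predict_output = np.array([0, 1, 1])

# A typical predict_proba output: one probability per class, shape (3, 2).
proba_output = np.array([[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]])

predict_result = predict_output.reshape(n_rows, -1)
proba_result = proba_output.reshape(n_rows, -1)

print(predict_result.shape)  # (3, 1)
print(proba_result.shape)    # (3, 2)
```

The 1D predict output becomes a single-column 2D array, while an output that is already 2D keeps its shape.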

 

 

Methods for Different Model Types

Method Type      Valid for Regression?    Valid for Classification?    Valid for Clustering?
predict          Yes                      Yes                          Some**
predict_proba    No                       Most*                        No

 

* Some sklearn classification models do not have a predict_proba method (for example, LinearSVC).

** Some sklearn clustering models do not have a predict method (for example, DBSCAN and AgglomerativeClustering).
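One way to check which methods a particular model supports before deploying it is to inspect the estimator directly. The sketch below is an illustration only: the helper supported_methods is a hypothetical name, and the estimators are an arbitrary sample of sklearn regression, classification, and clustering models.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.svm import LinearSVC


def supported_methods(model):
    """Hypothetical helper: list which of the two serving methods a model exposes."""
    candidates = ("predict", "predict_proba")
    return [name for name in candidates if callable(getattr(model, name, None))]


models = [
    LinearRegression(),     # regression
    LogisticRegression(),   # classification with probabilities
    LinearSVC(),            # classification without predict_proba
    KMeans(n_clusters=2),   # clustering with predict
    DBSCAN(),               # clustering without predict
]

support = {type(m).__name__: supported_methods(m) for m in models}
for name, methods in support.items():
    print(name, methods)
```

A model that exposes neither method, such as DBSCAN in this sample, is not covered by the handling described in this topic.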

Passing the names list in the request is optional. As a general rule, if the model includes column-specific transformations that refer to columns by name, such as a ColumnTransformer from sklearn, then the request must include the names list so that the DataFrame columns match the names the model was trained with.