Basic Serving of Sklearn Models
This topic describes how sklearn models are handled inside the Docker containers deployed on Seldon Core.
Preprocessing
All input data is converted into a pandas DataFrame before being passed to the model. If the user supplies a names list in the request, its entries are used as the column names of the DataFrame. Otherwise, pandas assigns default integer column labels.
    if names:
        df = pd.DataFrame(X, columns=names)
    else:
        df = pd.DataFrame(X)
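The conversion can be tried outside the container with a small standalone sketch. The helper name `to_dataframe` and the sample values are hypothetical and not part of the Seldon wrapper. The sketch assumes the payload arrives as a nested list.

```python
import pandas as pd


def to_dataframe(X, names=None):
    """Build a DataFrame from the payload; label columns only when names are given."""
    if names:
        return pd.DataFrame(X, columns=names)
    return pd.DataFrame(X)


# Hypothetical payload: two rows, two features.
X = [[5.1, 3.5], [6.2, 2.9]]

named = to_dataframe(X, names=["sepal_length", "sepal_width"])
unnamed = to_dataframe(X)

print(list(named.columns))    # ['sepal_length', 'sepal_width']
print(list(unnamed.columns))  # [0, 1]
```

Without names, the columns are labelled 0, 1, and so on, which is sufficient for models that only use column positions.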
Whether predict or predict_proba is called on the model depends on the meta dictionary passed in with the request. The meta dictionary is checked for a method key. If its value is predict or predict_proba, that method is run on the model. If no method is given, the default method, predict, is used.
    method = self.default_method
    if meta and isinstance(meta.get("method", 0), str):
        method = meta["method"]
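The selection logic can be isolated into a plain function to see how different meta values behave. This is a sketch only: the function name `select_method` is hypothetical, and it assumes meta is either a dictionary or None and that the default method is predict, as stated above.

```python
def select_method(meta, default_method="predict"):
    """Return the method named in meta, or the default when none is given."""
    method = default_method
    # Only a string stored under the "method" key overrides the default.
    if meta and isinstance(meta.get("method", 0), str):
        method = meta["method"]
    return method


print(select_method({"method": "predict_proba"}))  # predict_proba
print(select_method({}))                           # predict
print(select_method(None))                         # predict
print(select_method({"method": 1}))                # predict
```

A missing meta, an empty meta, and a non-string method value all fall back to the default.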
Predictions
Depending on the method chosen by the user, the model's predict or predict_proba method is executed. The predictions are then reshaped into a 2D numpy array with one row per input row.
    if method == "predict_proba":
        preds = self.model.predict_proba(df)
    else:
        preds = self.model.predict(df)
    result = preds.reshape(len(df), -1)
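To illustrate what the reshape does to each kind of output, the following sketch fits a small classifier and compares shapes. The toy data and the choice of LogisticRegression are assumptions made for illustration and are not taken from the Seldon wrapper.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one feature, two classes.
df = pd.DataFrame([[0.0], [1.0], [2.0], [3.0]])
y = [0, 0, 1, 1]
model = LogisticRegression().fit(df, y)

labels = model.predict(df)       # 1D array, one label per row
probs = model.predict_proba(df)  # 2D array, one column per class

labels_2d = labels.reshape(len(df), -1)
probs_2d = probs.reshape(len(df), -1)

print(labels.shape, "->", labels_2d.shape)  # (4,) -> (4, 1)
print(probs.shape, "->", probs_2d.shape)    # (4, 2) -> (4, 2)
```

The reshape turns the 1D output of predict into a single-column 2D array and leaves the already 2D output of predict_proba unchanged, so both methods return the same kind of structure.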
Methods for Different Model Types
Method Type | Valid for Regression? | Valid for Classification? | Valid for Clustering?
predict | Yes | Yes | Some**
predict_proba | No | Most* | No
* Some sklearn classification models do not have a predict_proba method (for example, LinearSVC).
** Some sklearn clustering models do not have a predict method (for example, DBSCAN).
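Before deploying a model, it can be useful to check which of the two methods it exposes. The sketch below does this with `hasattr` on a few common sklearn estimators. The selection of estimators is illustrative only, and the check is a suggestion, not something the Seldon wrapper is documented to perform.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.svm import LinearSVC

models = [LinearRegression(), LogisticRegression(), LinearSVC(), KMeans(), DBSCAN()]

# Map each estimator name to (has predict, has predict_proba).
support = {
    type(m).__name__: (hasattr(m, "predict"), hasattr(m, "predict_proba"))
    for m in models
}

for name, (has_predict, has_proba) in support.items():
    print(f"{name}: predict={has_predict}, predict_proba={has_proba}")
```

LinearSVC and DBSCAN are the two cases covered by the footnotes: the first is a classifier without predict_proba, the second a clustering model without predict.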
Passing the names list in the request is optional. As a general rule, if the model includes column-specific transformations, such as an sklearn ColumnTransformer that selects columns by name, the user needs to pass the names list in the request so that the DataFrame columns match those the model was trained on.
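The following sketch shows a model of this kind. It assumes a pipeline whose ColumnTransformer selects columns by name. The column names, data values, and pipeline layout are all hypothetical.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical training data with named columns.
names = ["age", "city"]
train = pd.DataFrame(
    [[25, "NYC"], [40, "LA"], [33, "NYC"], [58, "LA"]], columns=names
)
y = [0, 1, 0, 1]

# Each transformer is bound to a column by name, not by position.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])
model = Pipeline([("preprocess", preprocess), ("clf", LogisticRegression())])
model.fit(train, y)

# At request time, the names list restores the labels the pipeline expects.
X = [[30, "NYC"], [50, "LA"]]
df = pd.DataFrame(X, columns=names)
preds = model.predict(df).reshape(len(df), -1)
print(preds.shape)  # (2, 1)
```

If the same payload were wrapped with `pd.DataFrame(X)` alone, the columns would be labelled 0 and 1, and the name-based selection would have nothing to match against.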