Creating models that are accelerated by Intel(R) Extension for scikit-learn
Introduction
This guide demonstrates how, under certain conditions, Intel(R) Extension for Scikit-learn (also scikit-learn-intelex or sklearnex) can be used to speed up inference of Scikit-learn models.
The extension supports most of Scikit-learn’s classical machine learning algorithms, such as k-nearest neighbors, support vector machines, and linear/logistic regression. Stock Scikit-learn implementations are used where no optimized version is available, making this package 100% compatible with existing code. Note that while compatibility is assured by continuous testing, equivalence of results between the two packages is not guaranteed. In fact, because the implementations are independent, intermediate results differ in many cases. An up-to-date list of supported algorithms can be found in the official documentation.
Intel(R) Extension for Scikit-learn accelerates Scikit-learn algorithms by using the latest hardware features and optimized caching and threading strategies. Find more details in Intel’s blog posts on Medium (1, 2). In many cases, optimizations translate to hardware from other vendors, albeit with smaller performance gains.
For this example, we train two simple sklearn.neighbors.KNeighborsClassifier instances, one with and one without using sklearnex, and compare inference times. Afterward, we upload both models to the Hugging Face Model Hub. Hugging Face Hub supports sklearnex-optimized models, meaning the achieved speedup carries over to Inference API users.
Imports
First, we import everything required for the rest of this document.
import os
import pickle
from pathlib import Path
from tempfile import NamedTemporaryFile, mkdtemp
from time import perf_counter
from uuid import uuid4
from huggingface_hub import delete_repo, whoami
from sklearn.datasets import make_classification
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearnex.neighbors import KNeighborsClassifier as KNeighborsClassifierOptimized
from skops import card, hub_utils
Data
Next, we create some generic data. A dataset of 50k rows x 15 columns is big enough to showcase a performance gain from using sklearnex. Generally speaking, larger datasets will benefit more from the sklearnex optimizations. More details can be found in the official README.
X, y = make_classification(
    n_samples=50_000,
    n_features=15,
    n_informative=15,
    n_redundant=0,
    n_clusters_per_class=1,
    shuffle=False,
    random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
Training the stock model
Now we can train a stock Scikit-learn sklearn.neighbors.KNeighborsClassifier:
clf = KNeighborsClassifier(3, n_jobs=-1)
start = perf_counter()
clf.fit(X_train, y_train)
print(f"Training finished in {perf_counter() - start:.2f}s")
Training finished in 0.13s
Training the optimized model
Now we fit the optimized algorithm. Note that rather than importing the estimator from sklearnex, we could also import and call patch_sklearn() before importing the Scikit-learn estimators. Find more details in the documentation.
clf_opt = KNeighborsClassifierOptimized(3, n_jobs=-1)
start = perf_counter()
clf_opt.fit(X_train, y_train)
print(f"Training finished in {perf_counter() - start:.2f}s")
Training finished in 0.08s
We are not comparing the k-NN fit times, since this is not a compute-intensive task and both are typically very fast.
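To illustrate why fitting is cheap compared to inference, here is a minimal brute-force sketch written with the standard library only. The class name TinyKNN and the toy data are hypothetical, and this is not how Scikit-learn or sklearnex implement k-NN (both may build index structures during fit). It only shows the general pattern: fit stores the data, while predict_proba computes one distance per stored sample for every query.

```python
from collections import Counter
from math import dist


class TinyKNN:
    """Hypothetical brute-force k-NN classifier, for illustration only."""

    def __init__(self, n_neighbors=3):
        self.n_neighbors = n_neighbors

    def fit(self, X, y):
        # "Training" only stores the data; no distances are computed here.
        self.X_ = list(X)
        self.y_ = list(y)
        self.classes_ = sorted(set(self.y_))
        return self

    def predict_proba(self, X):
        probas = []
        for query in X:
            # Inference does the real work: one distance per stored sample.
            nearest = sorted(
                zip(self.X_, self.y_), key=lambda pair: dist(query, pair[0])
            )[: self.n_neighbors]
            votes = Counter(label for _, label in nearest)
            probas.append([votes[c] / self.n_neighbors for c in self.classes_])
        return probas


knn = TinyKNN(n_neighbors=3).fit(
    [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1)],
    [0, 0, 0, 1, 1],
)
probas = knn.predict_proba([(0.05, 0.05), (0.95, 1.0)])
print(probas)
```

With a brute-force approach like this, the work at inference time grows with the product of the number of queries and the number of stored samples, which is why the comparison below focuses on predict_proba().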
Comparing inference times
Now to the interesting part: We measure the execution time of
predict_proba()
for the two models.
start = perf_counter()
y_proba = clf.predict_proba(X_test)
t_stock = perf_counter() - start
log_loss_score = log_loss(y_test, y_proba)
print(
    f"[stock scikit-learn] Inference took t_stock = {t_stock:.2f}s with a "
    f"log-loss score of {log_loss_score:.3f}"
)
start = perf_counter()
y_proba = clf_opt.predict_proba(X_test)
t_opt = perf_counter() - start
log_loss_score = log_loss(y_test, y_proba)
print(
    f"[sklearnex] Inference took t_opt = {t_opt:.2f}s with a log-loss score of"
    f" {log_loss_score:.3f}"
)
print(f"t_stock / t_opt = {t_stock/t_opt:.1f}")
[stock scikit-learn] Inference took t_stock = 5.55s with a log-loss score of 0.217
[sklearnex] Inference took t_opt = 2.26s with a log-loss score of 0.217
t_stock / t_opt = 2.5
We see that inference using sklearnex is about 2.5 times faster in this run while achieving the same log-loss score.
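A single perf_counter() measurement can be noisy, so it may be worth repeating the call and reporting a robust statistic. The following is a minimal sketch of such a helper; the function name time_call and the repeats parameter are hypothetical and not part of the original example, and the built-in sum stands in for a model's predict_proba().

```python
from statistics import median
from time import perf_counter


def time_call(func, *args, repeats=5):
    """Return (last result, median wall-clock seconds) over several runs."""
    durations = []
    result = None
    for _ in range(repeats):
        start = perf_counter()
        result = func(*args)
        durations.append(perf_counter() - start)
    return result, median(durations)


# Stand-in workload; with the models above this could be
# time_call(clf.predict_proba, X_test) and time_call(clf_opt.predict_proba, X_test).
total, seconds = time_call(sum, range(100_000))
print(f"sum = {total}, median time = {seconds:.6f}s")
```

The median is used here because it is less sensitive to a single slow run than the mean; taking the minimum would be another reasonable choice.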
Save and upload the models
Let’s save all required files to disk and initialize Hugging Face Model Hub repositories.
# replace with your own token or set it as an environment variable before
# running the script
token = os.environ["HF_HUB_TOKEN"]
with NamedTemporaryFile(mode="bw", prefix="stock-", suffix=".pkl") as fp:
    pickle.dump(clf, file=fp)
    # flush buffered bytes so the complete pickle is on disk before it is copied
    fp.flush()
    stock_repo = mkdtemp(prefix="stock-")
    hub_utils.init(
        model=fp.name,
        requirements=["scikit-learn=1.2.1"],
        dst=stock_repo,
        task="tabular-classification",
        data=X_test,
    )
with NamedTemporaryFile(mode="bw", prefix="opt-", suffix=".pkl") as fp:
    pickle.dump(clf_opt, file=fp)
    # flush buffered bytes so the complete pickle is on disk before it is copied
    fp.flush()
    opt_repo = mkdtemp(prefix="opt-")
    hub_utils.init(
        model=fp.name,
        requirements=["scikit-learn=1.2.1", "scikit-learn-intelex=2023.0.1"],
        dst=opt_repo,
        task="tabular-classification",
        data=X_test,
        use_intelex=True,
    )
# Create Model cards with the most basic information
clf_card = card.Card(clf, metadata=card.metadata_from_config(Path(stock_repo)))
clf_card.metadata.license = "mit"
limitations = "This model is not ready to be used in production."
model_description = (
    "This is a `KNeighborsClassifier` model trained on synthetic data. It is "
    "trained with the stock scikit-learn algorithm and part of a "
    "demonstration, showing how Intel(R) Extension for scikit-learn can be "
    "used to speed up model inference times."
)
model_card_authors = "skops_user"
citation_bibtex = "**BibTeX**\n\n```\n@inproceedings{...,year={2020}}\n```"
clf_card.add(
    **{
        "Citation": citation_bibtex,
        "Model Card Authors": model_card_authors,
        "Model description": model_description,
        "Model description/Intended uses & limitations": limitations,
    }
)
clf_card.save(Path(stock_repo) / "README.md")
clf_opt_card = card.Card(clf_opt, metadata=card.metadata_from_config(Path(opt_repo)))
model_description = (
    "This is a `KNeighborsClassifier` model trained on synthetic data. It is "
    "trained with the Intel(R) extension for scikit-learn optimized version of "
    "the algorithm, and part of a demonstration, showing how Intel(R) "
    "Extension for scikit-learn can be used to speed up model inference times."
)
# add the sections to the card of the optimized model, not the stock one
clf_opt_card.add(
    **{
        "Citation": citation_bibtex,
        "Model Card Authors": model_card_authors,
        "Model description": model_description,
        "Model description/Intended uses & limitations": limitations,
    }
)
clf_opt_card.save(Path(opt_repo) / "README.md")
# Push everything to the Model hub
user_name = whoami(token=token)["name"]
uuid = uuid4()
repo_id_stock = f"{user_name}/knn-example-stock-{uuid}"
repo_id_opt = f"{user_name}/knn-example-intelex-{uuid}"
print(f"Pushing skl model to: {repo_id_stock}")
hub_utils.push(
    repo_id=repo_id_stock,
    source=stock_repo,
    token=token,
    commit_message="Add scikit-learn KNN model example",
    create_remote=True,
    private=False,
)
print(f"Pushing sklearnex model to: {repo_id_opt}")
hub_utils.push(
    repo_id=repo_id_opt,
    source=opt_repo,
    token=token,
    commit_message="Add scikit-learn-intelex KNN model example",
    create_remote=True,
    private=False,
)
Pushing skl model to: skops-ci/knn-example-stock-af989f34-21bb-4420-8b86-57c04a2bf2de
stock-wfqsdd1_.pkl: 0%| | 0.00/5.32M [00:00<?, ?B/s]
Upload 1 LFS files: 0%| | 0/1 [00:00<?, ?it/s]
stock-wfqsdd1_.pkl: 20%|#9 | 1.05M/5.32M [00:00<00:00, 10.3MB/s]
stock-wfqsdd1_.pkl: 100%|##########| 5.32M/5.32M [00:00<00:00, 8.31MB/s]
Upload 1 LFS files: 100%|##########| 1/1 [00:00<00:00, 1.36it/s]
Pushing sklearnex model to: skops-ci/knn-example-intelex-af989f34-21bb-4420-8b86-57c04a2bf2de
Upload 1 LFS files: 0%| | 0/1 [00:00<?, ?it/s]
opt-6uv_7tpw.pkl: 0%| | 0.00/9.65M [00:00<?, ?B/s]
opt-6uv_7tpw.pkl: 100%|##########| 9.65M/9.65M [00:00<00:00, 50.1MB/s]
Upload 1 LFS files: 100%|##########| 1/1 [00:00<00:00, 3.15it/s]
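The models in this section are pickled into temporary files that are read back by name while still open. A buffered file may hold back its last bytes until it is flushed or closed, so a quick round-trip check can be useful before uploading. The sketch below uses only the standard library and a plain dictionary as a hypothetical stand-in for a fitted model; it assumes a POSIX system, since reopening an open NamedTemporaryFile by name may fail on Windows.

```python
import pickle
from tempfile import NamedTemporaryFile

# Hypothetical placeholder for a fitted estimator such as clf or clf_opt.
model_stand_in = {"n_neighbors": 3, "classes": [0, 1]}

with NamedTemporaryFile(mode="bw", suffix=".pkl") as fp:
    pickle.dump(model_stand_in, file=fp)
    # Make sure buffered bytes reach the file before reading it by name.
    fp.flush()
    with open(fp.name, "rb") as reader:
        restored = pickle.load(reader)

print(restored == model_stand_in)
```

As with any pickle file, only load files from sources you trust.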
Delete Repository
At the end, we can delete the created repositories again using delete_repo. For more information, please refer to the documentation of the huggingface_hub library.
delete_repo(repo_id=repo_id_stock, token=token)
delete_repo(repo_id=repo_id_opt, token=token)
Total running time of the script: (0 minutes 13.907 seconds)