API Reference
This is the class and function reference of skops.
skops.hub_utils: Hugging Face Hub Integration
- skops.hub_utils.add_files(*files, dst, exist_ok=False)[source]
Add files to initialized repo.
After having called hub_utils.init(), use this function to add arbitrary files to be uploaded in addition to the model and model card.
In particular, it can be useful to upload the script itself that produces those artifacts by calling hub_utils.add_files([__file__], dst=...).
- Parameters:
- *files: str or Path
The files to be added.
- dst: str or Path
Path to the initialized repo, same as used during hub_utils.init().
- exist_ok: bool (default=False)
Whether it’s okay or not to add a file that already exists. If True, overwrite the files, otherwise raise a FileExistsError.
- Raises:
- FileNotFoundError
When the target folder or the files to be added are not found.
- FileExistsError
When a file is added that already exists at the target location and exist_ok=False.
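The exist_ok contract described above can be pictured with a small standard-library sketch. Note that copy_into_repo is a hypothetical helper written for illustration, not the skops implementation; it only mirrors the documented FileNotFoundError and FileExistsError behavior.

```python
import shutil
import tempfile
from pathlib import Path


def copy_into_repo(*files, dst, exist_ok=False):
    """Hypothetical stand-in mirroring the documented add_files contract."""
    dst = Path(dst)
    if not dst.is_dir():
        raise FileNotFoundError(f"Target folder {dst} does not exist")
    for file in map(Path, files):
        if not file.is_file():
            raise FileNotFoundError(f"File {file} could not be found")
        target = dst / file.name
        if target.exists() and not exist_ok:
            raise FileExistsError(f"{target} already exists")
        shutil.copy2(file, target)


# Temporary folders stand in for a script location and an initialized repo.
src_dir = Path(tempfile.mkdtemp())
repo_dir = Path(tempfile.mkdtemp())
script = src_dir / "train.py"
script.write_text("print('training')\n")

copy_into_repo(script, dst=repo_dir)  # first copy succeeds
try:
    copy_into_repo(script, dst=repo_dir)  # second copy is refused
    refused = False
except FileExistsError:
    refused = True
copy_into_repo(script, dst=repo_dir, exist_ok=True)  # explicit overwrite
```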
- skops.hub_utils.download(*, repo_id, dst, revision=None, token=None, keep_cache=True, **kwargs)[source]
Download a repository into a directory.
The directory needs to be an empty or a non-existing one.
- Parameters:
- repo_id: str
The ID of the Hugging Face Hub repository in the form of OWNER/REPO_NAME.
- dst: str, or Path
The directory to which the files are downloaded.
- revision: str, optional
The revision of the project to download. This can be a git tag, branch, or a git commit hash. By default the latest revision of the default branch is downloaded.
- token: str, optional
The token to be used to download the files. Only required if the repository is private.
- keep_cache: bool, default=True
Whether the cached data should be kept or removed after download. By default a copy of the cached files will be created in the dst folder. If False, the cache will be removed after the contents are copied. Note that the cache is git based and by default new files are only downloaded if there is a new revision of them on the hub. If you keep the cache, the old files are not removed after downloading the newer versions of them.
- kwargs: dict
Other parameters to be passed to huggingface_hub.snapshot_download().
- Returns:
- None
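Since the destination has to be empty or non-existing, it may help to check the folder before calling download. The following is a minimal sketch; is_valid_download_target is a hypothetical helper, not part of skops.

```python
import tempfile
from pathlib import Path


def is_valid_download_target(dst):
    """Return True if dst does not exist yet or is an empty directory."""
    dst = Path(dst)
    if not dst.exists():
        return True
    return dst.is_dir() and not any(dst.iterdir())


base = Path(tempfile.mkdtemp())
fresh = base / "not-created-yet"  # does not exist, so it is acceptable
(base / "leftover.txt").write_text("old file")  # base is now non-empty
```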
- skops.hub_utils.get_config(path)[source]
Returns the configuration of a project.
- Parameters:
- path: str
The path to the directory holding the project and its config.json configuration file.
- Returns:
- config: dict
A dictionary which holds the configs of the project.
- skops.hub_utils.get_model_output(repo_id, data, token=None)[source]
Returns the output of the model using Hugging Face Hub’s inference API.
See the User Guide for more details.
Deprecated since version 0.9: Will be removed in version 0.10. Use huggingface_hub.InferenceClient instead.
- Parameters:
- repo_id: str
The ID of the Hugging Face Hub repository in the form of OWNER/REPO_NAME.
- data: Any
The input to be given to the model. This can be a pandas.DataFrame or a numpy.ndarray. If possible, you should always pass a pandas.DataFrame with correct column names.
- token: str, optional
The token to be used to call the inference API. Only required if the repository is private.
- Returns:
- output: numpy.ndarray
The output of the model.
Notes
If there are warnings or exceptions during inference, this function raises a RuntimeError including the original errors and warnings returned from the server.
Also note that if the model repo is private, the inference API is not available.
- skops.hub_utils.get_requirements(path)[source]
Returns the requirements of a project.
- Parameters:
- path: str
The path to the directory holding the project and its config.json configuration file.
- Returns:
- requirements: list of str
The list of requirements which can be passed to the package manager to be installed.
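As a sketch of how such a list might be handed to a package manager, the snippet below builds a pip command from it. The requirement strings are placeholders rather than real output of get_requirements, and pip_install_command is a hypothetical helper.

```python
import sys


def pip_install_command(requirements):
    """Build the argument list for installing the given requirements."""
    return [sys.executable, "-m", "pip", "install", *requirements]


# Placeholder values; in practice this list would come from get_requirements().
requirements = ["scikit-learn==1.2.0", "pandas==1.5.0"]
command = pip_install_command(requirements)
```

The resulting list could then be passed to subprocess.run(command, check=True) to perform the installation.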
- skops.hub_utils.init(*, model, requirements, dst, task, data, model_format='auto', use_intelex=False)[source]
Initialize a scikit-learn based Hugging Face repo.
Given a pickled model and a set of required packages, this function initializes a folder to be a valid Hugging Face scikit-learn based repo.
- Parameters:
- model: str, or Path
The path to a model pickle file.
- requirements: list of str
A list of required packages. The versions are then extracted from the current environment.
- dst: str, or Path
The path to a non-existing or empty folder which is to be initialized.
- task: str
The task of the model, which determines the input and output type of the model. It can be one of: tabular-classification, tabular-regression, text-classification, text-regression.
- data: array-like
The input to the model. This is used for two purposes:
- Save an example input to the model, which is used by Hugging Face’s backend and shown in the widget of the model’s page.
- Store the columns and their order of the input, which is used by Hugging Face’s backend to pass the data in the right form to the model.
The first 3 input values are used as example inputs.
If task is "tabular-classification" or "tabular-regression", the data needs to be a pandas.DataFrame or a numpy.ndarray. If task is "text-classification" or "text-regression", the data needs to be a list of strings.
- model_format: str (default="auto")
The format the model was persisted in. Can be "auto", "skops" or "pickle". Defaults to "auto", which relies on the file extension.
- use_intelex: bool (default=False)
Whether to enable scikit-learn-intelex. This can accelerate some sklearn models by a large factor with the right hardware. In most cases, enabling this option should not break any code, even if the model was not initially trained with scikit-learn intelex and even if the hardware does not support it. For more info, see https://intel.github.io/scikit-learn-intelex/.
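The relationship between task and data described above can be sketched as follows. This is an illustration only: example_inputs is a hypothetical helper, the ValueError choice is an assumption, and only the text-task check is implemented so that the sketch stays free of pandas and numpy.

```python
TABULAR_TASKS = {"tabular-classification", "tabular-regression"}
TEXT_TASKS = {"text-classification", "text-regression"}


def example_inputs(task, data, n_examples=3):
    """Check task/data compatibility for text tasks and pick example inputs."""
    if task not in TABULAR_TASKS | TEXT_TASKS:
        raise ValueError(f"Unsupported task: {task!r}")
    if task in TEXT_TASKS:
        if not isinstance(data, list) or not all(isinstance(x, str) for x in data):
            raise ValueError("Text tasks expect a list of strings")
    # Tabular inputs (DataFrame or ndarray) can be sliced by rows the same way.
    return data[:n_examples]


texts = ["good", "bad", "fine", "great"]
examples = example_inputs("text-classification", texts)  # first 3 values
```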
- skops.hub_utils.push(*, repo_id, source, token=None, commit_message=None, create_remote=False, private=None)[source]
Pushes the contents of a model repo to Hugging Face Hub.
This function validates the contents of the folder before pushing it to the Hub.
- Parameters:
- repo_id: str
The ID of the destination repository in the form of OWNER/REPO_NAME.
- source: str or Path
A folder where the contents of the model repo are located.
- token: str, optional
A token to push to the Hub. If not provided, the user should be already logged in using huggingface-cli login.
- commit_message: str, optional
The commit message to be used when pushing to the repo.
- create_remote: bool, default=False
Whether to create the remote repository if it doesn’t exist. If the remote repository doesn’t exist and this parameter is False, it raises an error. Otherwise it checks if the remote repository exists, and creates it if it doesn’t.
- private: bool, default=None
Whether the remote repository should be public or private. If True or False is passed, this method will set the private/public status of the remote repository, regardless of it already existing or not. If None, no change is applied.
Added in version 0.3.
- Returns:
- None
- Raises:
- TypeError
This function raises a TypeError if the contents of the source folder do not make a valid Hugging Face Hub scikit-learn based repo.
- skops.hub_utils.update_env(*, path, requirements=None)[source]
Update the environment requirements of a repo.
This function takes the path to the repo, and updates the requirements of running the scikit-learn based model in the repo.
- Parameters:
- path: str, or Path
The path to an existing local repo.
- requirements: list of str, optional
The list of required packages for the model. If none is passed, the list of existing requirements is used and their versions are updated.
skops.card: Model Card Utilities
- class skops.card.Card(model, model_diagram='auto', metadata=None, template='skops', trusted=False)[source]
Model card class that will be used to generate model card.
This class can be used to write information and plots to model card and save it. This class by default generates an interactive plot of the model and a table of hyperparameters. Some sections are added by default.
- Parameters:
- model: pathlib.Path, str, or sklearn estimator object
Path/str of the model or the actual model instance that will be documented. If a Path or str is provided, model will be loaded.
- model_diagram: bool or "auto" or str, default="auto"
If using the skops template, setting this to True or "auto" will add the model diagram, as generated by scikit-learn, to the default section, i.e. "Model description/Training Procedure/Model Plot". Passing a string to model_diagram will instead use that string as the section name for the diagram. Set to False to not include the model diagram.
If using a non-skops template, passing "auto" won’t add the model diagram because there is no pre-defined section to put it. The model diagram can, however, always be added later using Card.add_model_plot().
- metadata: ModelCardData, optional
huggingface_hub.ModelCardData object. The contents of this object are saved as metadata at the beginning of the output file, and used by Hugging Face Hub.
You can use metadata_from_config() to create an instance pre-populated with necessary information based on the contents of the config.json file, which itself is created by skops.hub_utils.init().
- template: "skops", dict, or None (default="skops")
Whether to add default sections or not. The template can be a predefined template, which at the moment can only be the string "skops", which is a template provided by skops that is geared towards typical sklearn models. If you don’t want any prefilled sections, just pass None. If you want custom prefilled sections, pass a dict, where keys are the sections and values are the contents of the sections. Note that when you use no template or a custom template, some methods will not work, e.g. Card.add_metrics(), since it’s not clear where to put the metrics when there is no template or a custom template.
- trusted: bool, default=False
Passed to skops.io.load() if the model is a file path and it’s a skops file.
Examples
>>> from sklearn.metrics import (
...     ConfusionMatrixDisplay,
...     confusion_matrix,
...     accuracy_score,
...     f1_score
... )
>>> import tempfile
>>> from pathlib import Path
>>> from sklearn.datasets import load_iris
>>> from sklearn.linear_model import LogisticRegression
>>> from skops.card import Card
>>> X, y = load_iris(return_X_y=True)
>>> model = LogisticRegression(solver="liblinear", random_state=0).fit(X, y)
>>> model_card = Card(model)
>>> model_card.metadata.license = "mit"
>>> y_pred = model.predict(X)
>>> model_card.add_metrics(**{
...     "accuracy": accuracy_score(y, y_pred),
...     "f1 score": f1_score(y, y_pred, average="micro"),
... })
Card(...)
>>> cm = confusion_matrix(y, y_pred, labels=model.classes_)
>>> disp = ConfusionMatrixDisplay(
...     confusion_matrix=cm,
...     display_labels=model.classes_
... )
>>> disp.plot()
<sklearn.metrics._plot.confusion_matrix.ConfusionMatrixDisplay object at ...>
>>> tmp_path = Path(tempfile.mkdtemp(prefix="skops-"))
>>> disp.figure_.savefig(tmp_path / "confusion_matrix.png")
...
>>> model_card.add_plot(**{
...     "Model description/Confusion Matrix": tmp_path / "confusion_matrix.png"
... })
Card(...)
>>> # add new content to the existing section "Model description"
>>> model_card.add(**{"Model description": "This is the best model"})
Card(...)
>>> # add content to a new section
>>> model_card.add(**{"A new section": "Please rate my model"})
Card(...)
>>> # add new subsection to an existing section by using "/"
>>> model_card.add(**{"Model description/Model name": "This model is called Bob"})
Card(
  model=LogisticRegression(random_state=0, solver='liblinear'),
  ...
)
>>> # save the card to a README.md file
>>> model_card.save(tmp_path / "README.md")
- Attributes:
- model: estimator object
The scikit-learn compatible model that will be documented.
- metadata: ModelCardData
Metadata to be stored at the beginning of the saved model card, as metadata to be understood by the Hugging Face Hub.
Methods
- add([folded]): Add new section(s) to the model card.
- add_fairlearn_metric_frame(metric_frame[, ...]): Add a fairlearn.metrics.MetricFrame table to the model card.
- add_get_started_code([section, description, ...]): Add getting started code.
- add_hyperparams([section, description]): Add the model's hyperparameters as a table.
- add_metrics([section, description]): Add metric values to the model card.
- add_model_plot([section, description]): Add a model plot.
- add_permutation_importances(...[, ...]): Plots permutation importance and saves it to model card.
- add_plot(*[, description, alt_text, folded]): Add plots to the model card.
- add_table(*[, description, folded]): Add a table to the model card.
- delete(key): Delete a section from the model card.
- get_model(): Returns sklearn estimator object.
- get_toc(): Get the table of contents for the model card.
- render(): Render the final model card as a string.
- save(path[, copy_files]): Save the model card.
- select(key): Select a section from the model card.
- add(folded=False, **kwargs)[source]
Add new section(s) to the model card.
Add one or multiple sections to the model card. The section names are taken from the keys and the contents are taken from the values.
To add to an existing section, use a "/" in the section name, e.g.: card.add(**{"Existing section/New section": "content"}).
If the parent section does not exist, it will be added automatically.
To add a section with "/" in its title (i.e. not intended as a subsection), escape the slash like so, "\/", e.g.: card.add(**{"A section with\/a slash in the title": "content"}).
If a section of the given name already exists, its content will be overwritten.
- Parameters:
- folded: bool
Whether to fold the sections by default or not.
- **kwargs: dict
The keys of the dictionary serve as the section title and the values as the section content. It’s possible to add to existing sections.
- Returns:
- self: object
Card object.
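The section naming rules for add ("/" for nesting, "\/" for a literal slash, automatic parent creation, overwriting existing content) can be illustrated with a plain nested dict. This is a sketch under assumptions: split_section_name and add_section are hypothetical helpers, and the dict layout is not the data structure skops uses internally.

```python
import re


def split_section_name(name):
    """Split on unescaped slashes, then unescape the escaped ones."""
    parts = re.split(r"(?<!\\)/", name)
    return [part.replace("\\/", "/") for part in parts]


def add_section(tree, name, content):
    """Insert content into a nested dict, creating missing parents on the way."""
    *parents, leaf = split_section_name(name)
    node = tree
    for parent in parents:
        entry = node.setdefault(parent, {"content": "", "subsections": {}})
        node = entry["subsections"]
    section = node.setdefault(leaf, {"content": "", "subsections": {}})
    section["content"] = content  # existing content is overwritten


card = {}
add_section(card, "Existing section/New section", "content")
add_section(card, "A section with\\/a slash in the title", "more content")
```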
- add_fairlearn_metric_frame(metric_frame, table_name='Fairlearn MetricFrame Table', transpose=True, description=None)[source]
Add a fairlearn.metrics.MetricFrame table to the model card. The table contains the difference, group_max, group_min, and ratio for each metric.
- Parameters:
- metric_frame: MetricFrame
The Fairlearn MetricFrame to add to the model card.
- table_name: str
The desired name of the table section in the model card.
- transpose: bool, default=True
Whether to transpose the table or not.
- description: str | None (default=None)
An optional description to be added before the table.
- Returns:
- self: Card
The model card with the metric frame added.
Notes
You can check fairlearn’s documentation on how to work with MetricFrame objects.
- add_get_started_code(section='How to Get Started with the Model', description=None, file_name=None, model_format=None)[source]
Add getting started code
This code can be copied by users to load the model and make predictions with it.
- Parameters:
- section: str (default="How to Get Started with the Model")
The section that the code for loading the model should be added to. By default, the section is set to fit the skops model card template. If you’re using a different template, you may have to choose a different section name.
- description: str or None, default=None
An optional description to be added before the code. If you’re using the default skops template, a standard text is used. Pass a string here if you want to use your own text instead. Leave this empty to not add any description.
- file_name: str or None, default=None
The file name of the model. If no file name is indicated, there will be an attempt to read the file name from the card’s metadata. If that fails, an error is raised and you have to pass this argument explicitly.
- model_format: "skops", "pickle", or None, default=None
The model format used to store the model. If no format is indicated, there will be an attempt to read the model format from the card’s metadata. If that fails, an error is raised and you have to pass this argument explicitly.
- Returns:
- self: object
Card object.
- add_hyperparams(section='Model description/Training Procedure/Hyperparameters', description=None)[source]
Add the model’s hyperparameters as a table
- Parameters:
- section: str (default="Model description/Training Procedure/Hyperparameters")
The section that the hyperparameters should be added to. By default, the section is set to fit the skops model card template. If you’re using a different template, you may have to choose a different section name.
- description: str or None, default=None
An optional description to be added before the hyperparameters. If you’re using the default skops template, a standard text is used. Pass a string here if you want to use your own text instead. Leave this empty to not add any description.
- Returns:
- self: object
Card object.
- add_metrics(section='Model description/Evaluation Results', description=None, **kwargs)[source]
Add metric values to the model card.
All metrics will be collected in, and then formatted to, a table.
- Parameters:
- section: str (default="Model description/Evaluation Results")
The section that metrics should be added to. By default, the section is set to fit the skops model card template. If you’re using a different template, you may have to choose a different section name.
- description: str or None, default=None
An optional description to be added before the metrics. If you’re using the default skops template, a standard text is used. Pass a string here if you want to use your own text instead. Leave this empty to not add any description.
- **kwargs: dict
A dictionary of the form {metric name: metric value}.
- Returns:
- self: object
Card object.
- add_model_plot(section='Model description/Training Procedure/Model Plot', description=None)[source]
Add a model plot
Use sklearn model visualization to create a diagram of the model. See the sklearn model visualization docs.
The model diagram is not added if the card class was instantiated with model_diagram=False.
- Parameters:
- section: str (default="Model description/Training Procedure/Model Plot")
The section that the model plot should be added to. By default, the section is set to fit the skops model card template. If you’re using a different template, you may have to choose a different section name.
- description: str or None, default=None
An optional description to be added before the model plot. If you’re using the default skops template, a standard text is used. Pass a string here if you want to use your own text instead. Leave this empty to not add any description.
- Returns:
- self: object
Card object.
- add_permutation_importances(permutation_importances, columns, plot_file='permutation_importances.png', plot_name='Permutation Importances', overwrite=False, description=None)[source]
Plots permutation importance and saves it to model card.
- Parameters:
- permutation_importances: sklearn.utils.Bunch
Output of sklearn.inspection.permutation_importance().
- columns: str, list or pandas.Index
Column names of the data used to generate importances.
- plot_file: str or pathlib.Path
Filename for the plot.
- plot_name: str
Name of the plot.
- overwrite: bool (default=False)
Whether to overwrite the permutation importance plot file, if a plot by that name already exists.
- description: str | None (default=None)
An optional description to be added before the plot.
- Returns:
- self: object
Card object.
- add_plot(*, description=None, alt_text=None, folded=False, **kwargs)[source]
Add plots to the model card.
The plot should be saved on the file system and the path passed as value.
- Parameters:
- description: str or None (default=None)
If a string is passed as description, it is shown before the figure. If multiple figures are added with one call, they all get the same description. To add multiple figures with different descriptions, call this method multiple times.
- alt_text: str or None (default=None)
If a string is passed as alt_text, it is used as the alternative text for the figure (i.e. what is shown if the figure cannot be rendered). If this argument is None, the alt_text will just be the same as the section title. If multiple figures are added with one call, they all get the same alt text. To add multiple figures with different alt texts, call this method multiple times.
- folded: bool (default=False)
If set to True, the plot will be enclosed in a details tag. That means the content is folded by default and users have to click to show the content. This option is useful if the added plot is large.
- **kwargs: dict
The arguments should be of the form name=plot_path, where name is the name of the plot and section, and plot_path is the path to the plot on the file system (either a str or pathlib.Path), relative to the root of the project. The plots should have already been saved under the project’s folder.
- Returns:
- self: object
Card object.
- add_table(*, description=None, folded=False, **kwargs)[source]
Add a table to the model card.
This can be especially useful when you are using cross validation with sklearn. E.g. you can directly pass the result from calling sklearn.model_selection.cross_validate() or the cv_results_ attribute from any of the hyperparameter searches, such as sklearn.model_selection.GridSearchCV.
Moreover, you can pass any pandas.DataFrame to this method and it will be rendered in the model card. You may consider selecting only a part of the table if it’s too big:
    search = GridSearchCV(...)
    search.fit(X, y)
    df = pd.DataFrame(search.cv_results_)
    # show only top 10 highest scores
    df = df.sort_values(["mean_test_score"], ascending=False).head(10)
    model_card = skops.card.Card(...)
    model_card.add_table(**{"Hyperparameter search results top 10": df})
- Parameters:
- description: str or None (default=None)
If a string is passed as description, it is shown before the table. If multiple tables are added with one call, they all get the same description. To add multiple tables with different descriptions, call this method multiple times.
- folded: bool (default=False)
If set to True, the table will be enclosed in a details tag. That means the content is folded by default and users have to click to show the content. This option is useful if the added table is large.
- **kwargs: dict
The keys should be strings, which will be used as the section headers, and the values should be tables. Tables can be either dicts with the keys being strings that represent the column names, and the values being lists that represent the entries for each row. Alternatively, the table can be a pandas.DataFrame. The table must not be empty.
- Returns:
- self: object
Card object.
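To make the dict form of a table concrete, the sketch below renders a {column name: list of entries} dict as a markdown table. dict_to_markdown_table is a hypothetical helper and the exact markdown that skops emits may differ; the point is only the column-oriented shape and the non-empty requirement.

```python
def dict_to_markdown_table(table):
    """Render a {column name: list of row entries} dict as a markdown table."""
    if not table or not all(len(col) for col in table.values()):
        raise ValueError("The table must not be empty")
    columns = list(table)
    lines = [
        "| " + " | ".join(columns) + " |",
        "|" + "|".join("---" for _ in columns) + "|",
    ]
    # zip(*values) turns the column lists into one tuple per row.
    for row in zip(*table.values()):
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


# Hypothetical values, shaped like a small excerpt of cross-validation results.
results = {"split": [0, 1], "test_score": [0.9, 0.8]}
markdown = dict_to_markdown_table(results)
```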
- delete(key)[source]
Delete a section from the model card.
To delete a subsection of an existing section, use a "/" in the section name, e.g.: card.delete("Existing section/New section").
Alternatively, a list of strings can be passed: card.delete(["Existing section", "New section"]).
- Parameters:
- key: str or list of str
The name of the (sub)section to delete. When deleting a subsection, either use a "/" in the name to separate the parent and child sections, or pass a list of strings.
- Raises:
- KeyError
If the given section name was not found, a KeyError is raised.
- get_model()[source]
Returns sklearn estimator object.
If the model is already loaded, return it as is. If the model attribute is a Path/str, load the model and return it.
- Returns:
- model: BaseEstimator
The model instance.
- get_toc()[source]
Get the table of contents for the model card.
- Returns:
- toc: str
The table of contents for the model card formatted as a markdown string. Example:
- Model description
  - Intended uses & limitations
  - Training Procedure
    - Hyperparameters
    - Model Plot
  - Evaluation Results
- How to Get Started with the Model
- Model Card Authors
- Model Card Contact
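A table of contents of this kind is a nested markdown list. As a minimal sketch, assuming the sections are held in a nested dict (an assumption made for illustration; render_toc is a hypothetical helper and the skops internals may differ), it could be produced like this:

```python
def render_toc(sections, indent=0):
    """Turn a nested {title: subsections} dict into markdown list lines."""
    lines = []
    for title, subsections in sections.items():
        lines.append("  " * indent + "- " + title)
        lines.extend(render_toc(subsections, indent + 1))
    return lines


sections = {
    "Model description": {
        "Intended uses & limitations": {},
        "Training Procedure": {"Hyperparameters": {}, "Model Plot": {}},
        "Evaluation Results": {},
    },
    "How to Get Started with the Model": {},
}
toc = "\n".join(render_toc(sections))
```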
- render()[source]
Render the final model card as a string.
- Returns:
- result: str
The rendered model card with all placeholders filled and all extra sections inserted.
- save(path, copy_files=False)[source]
Save the model card.
This method renders the model card in markdown format and then saves it as the specified file.
- Parameters:
- path: Path
Filepath to save your card.
- plot_path: str
Filepath to save the plots. Use this when saving the model card before creating the repository. Without this path the README will have an absolute path to the plot that won’t exist in the repository.
Notes
The keys in model card metadata can be seen here.
- select(key)[source]
Select a section from the model card.
To select a subsection of an existing section, use a "/" in the section name, e.g.: card.select("Main section/Subsection").
Alternatively, multiple select calls can be chained: card.select("Main section").select("Subsection").
- Parameters:
- key: str
The name of the (sub)section to select. When selecting a subsection, either use a "/" in the name to separate the parent and child sections, or chain multiple select calls.
- Returns:
- self: Section
A dataclass containing all information relevant to the selected section. Those are the title, the content, and subsections (in a dict).
- Raises:
- KeyError
If the given section name was not found, a KeyError is raised.
- skops.card.metadata_from_config(config_path)[source]
Construct a ModelCardData object from a config.json file.
Most information needed for the metadata section of a README.md file on Hugging Face Hub is included in the config.json file. This utility function constructs a huggingface_hub.ModelCardData object which can then be passed to the Card object.
This method populates the following attributes of the instance:
- library_name: It needs to be "sklearn" for scikit-learn compatible models.
- tags: Set to a list, containing "sklearn" and the task of the model. You can then add more tags to this list.
- widget: It is populated with the example data to be used by the widget component of the Hugging Face Hub widget, on the model’s repository page.
- Parameters:
- config_path: str, or Path
Filepath to the config.json file, or the folder including that file.
- Returns:
- card_data: huggingface_hub.ModelCardData
huggingface_hub.ModelCardData object.
- skops.card.parse_modelcard(path)[source]
Read a model card and return a Card object
This allows users to load a dumped model card and continue to edit it.
Using this function requires pandoc to be installed. Please follow these instructions: https://pandoc.org/installing.html
- Parameters:
- path: str or pathlib.Path
The path to the existing model card.
- Returns:
- card: skops.card.Card
The model card object.
Notes
There are some known limitations to the parser that may result in the model card generated from the parsed file not being 100% identical to the original model card:
- In markdown, bold and italic text can be encoded in different fashions, e.g. _like this_ or *like this* for italic text. Pandoc doesn’t differentiate between the two. Therefore, the output may use one method where the original card used the other. When rendered, the two results should, however, be the same.
- Table alignment may be different. At the moment, skops does not make use of column alignment information in tables, so that may differ.
- Quote symbols may differ, e.g. it’s becoming it's.
- The number of empty lines may differ, e.g. two empty lines being transformed into one empty line.
- The optional title of links is not preserved, as e.g. in [text](https://example.com "this disappears").
- Trailing whitespace is removed.
- Tab indentation may be removed, e.g. in raw html.
- The yaml part of the model card can have some non-semantic differences, like omitting optional quotation marks.
For these reasons, please don’t expect the output of a parsed card to be 100% identical to the original input. However, none of the listed changes makes any _semantic_ difference. If you find that there is a semantic difference in the output, please open an issue on GitHub.
Examples
>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> from skops.card import Card
>>> from skops.card import parse_modelcard
>>> X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
>>> y = np.dot(X, np.array([1, 2])) + 3
>>> regr = LinearRegression().fit(X, y)
>>> card = Card(regr)
>>> card.save("README.md")
>>> # later, load the card again
>>> parsed_card = parse_modelcard("README.md")
>>> # continue editing the card
>>> parsed_card.add(**{"My new section": "My new content"})
Card(...)
>>> # overwrite old card with new one
>>> parsed_card.save("README.md")
skops.io: Secure persistence
- skops.io.dump(obj, file, *, compression=0, compresslevel=None)[source]
Save an object using the skops persistence format.
Skops aims at providing a secure persistence feature that does not rely on pickle, which is inherently insecure. For more information, please visit the Secure persistence with skops documentation.
- Parameters:
- obj: object
The object to be saved. Usually a scikit-learn compatible model.
- file: str, path, or file-like object
The file name. A zip archive will automatically be created. As a matter of convention, we recommend to use the ".skops" file extension, e.g. dump(model, "my-model.skops").
- compression: int, default=zipfile.ZIP_STORED
The compression method to use. See zipfile.ZipFile for more information.
Added in version 0.7.
- compresslevel: int, default=None
The compression level to use. See zipfile.ZipFile for more information.
Added in version 0.7.
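The compression and compresslevel values are the ones defined by the standard zipfile module, where the default of 0 corresponds to zipfile.ZIP_STORED (no compression). The sketch below uses only zipfile on a stand-in payload to show the effect of these two settings; it does not call skops, and archive_size is a hypothetical helper.

```python
import io
import zipfile

payload = b"0" * 10_000  # highly compressible stand-in for model data


def archive_size(compression, compresslevel=None):
    """Write payload into an in-memory zip archive and return its size."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(
        buffer, "w", compression=compression, compresslevel=compresslevel
    ) as archive:
        archive.writestr("payload.bin", payload)
    return len(buffer.getvalue())


stored = archive_size(zipfile.ZIP_STORED)  # no compression (value 0)
deflated = archive_size(zipfile.ZIP_DEFLATED, compresslevel=9)  # strongest deflate
```

With the same constants, a call such as dump(model, "my-model.skops", compression=zipfile.ZIP_DEFLATED, compresslevel=9) would request a compressed archive.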
- skops.io.dumps(obj, *, compression=0, compresslevel=None)[source]
Save an object using the skops persistence format as a bytes object.
- Parameters:
- obj: object
The object to be saved. Usually a scikit-learn compatible model.
- compression: int, default=zipfile.ZIP_STORED
The compression method to use. See zipfile.ZipFile for more information.
Added in version 0.7.
- compresslevel: int, default=None
The compression level to use. See zipfile.ZipFile for more information.
Added in version 0.7.
- skops.io.get_untrusted_types(*, data=None, file=None)[source]
Get a list of untrusted types in a skops dump.
- Parameters:
- data: bytes
The data to be checked, in bytes format.
- file: str or Path
The file to be checked.
- Returns:
- untrusted_types: list of str
The list of untrusted types in the dump.
Notes
Only one of data or file should be passed.
- skops.io.load(file, trusted=False)[source]
Load an object saved with the skops persistence format.
Skops aims at providing a secure persistence feature that does not rely on pickle, which is inherently insecure. For more information, please visit the Secure persistence with skops documentation.
Warning
This feature is heavily under development, which means the API is unstable and there might be security issues at the moment. Therefore, use caution when loading files from sources you don’t trust.
- Parameters:
- file: str or pathlib.Path
The file name of the object to be loaded.
- trusted: bool, or list of str, default=False
If True, the object will be loaded without any security checks. If False, the object will be loaded only if there are only trusted objects in the dumped file. If a list of strings, the object will be loaded only if there are only trusted objects and objects of types listed in trusted in the dumped file.
- Returns:
- instance: object
The loaded object.
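The three possible values of trusted amount to a simple decision rule, sketched below. may_load is a hypothetical helper that returns a bool for illustration; it is not how skops implements the check, and the type names are made-up placeholders.

```python
def may_load(types_in_file, default_trusted, trusted=False):
    """Decide whether loading is allowed under the documented trusted rules."""
    if trusted is True:
        return True  # no security checks at all
    # A list of type names extends the default set; False adds nothing.
    extra = set(trusted) if isinstance(trusted, (list, tuple)) else set()
    allowed = set(default_trusted) | extra
    return set(types_in_file) <= allowed


# Placeholder type names standing in for the contents of a dumped file.
defaults = {"builtins.list", "numpy.ndarray"}
found = ["numpy.ndarray", "mypkg.CustomTransformer"]
```

In practice, the names to put into the list can be obtained from get_untrusted_types and reviewed before they are passed as trusted.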
- skops.io.loads(data, trusted=False)[source]
Load an object saved with the skops persistence format from a bytes object.
Warning
This feature is heavily under development, which means the API is unstable and there might be security issues at the moment. Therefore, use caution when loading files from sources you don’t trust.
- Parameters:
- data: bytes
The dumped data to be loaded in bytes format.
- trusted: bool, or list of str, default=False
If True, the object will be loaded without any security checks. If False, the object will be loaded only if there are only trusted objects in the dumped file. If a list of strings, the object will be loaded only if there are only trusted objects and objects of types listed in trusted in the dumped file.
- Returns:
- instance: object
The loaded object.
- skops.io.visualize(file, *, show='all', trusted=False, sink=<function pretty_print_tree>, **kwargs)[source]
Visualize the contents of a skops file.
Shows the schema of a skops file as a tree view. In particular, highlights untrusted nodes. A node is considered untrusted if at least one of its child nodes is untrusted.
Visualizing the tree using the default visualization function requires the rich library, which can be installed as:
python -m pip install rich
If passing a custom visualization function to sink, rich is not required.
- Parameters:
- file: str or pathlib.Path
The file name of the object to be loaded.
- show: "all" or "untrusted" or "trusted"
Whether to print all nodes, only untrusted nodes, or only trusted nodes.
- trusted: bool, or list of str, default=False
If True, all nodes will be treated as trusted. If False, only default types are trusted. If a list of strings, where those strings describe the trusted types, these types are trusted on top of the default trusted types.
- sink: function (default=pretty_print_tree)
This function should take at least two arguments, an iterator of NodeInfo instances and an indicator of what to show. The NodeInfo contains the information about the node, namely:
- the level of nesting (int)
- the key of the node (str)
- the value of the node as a string representation (str)
- the safety of the node and its children
The show argument is explained above. Any additional kwargs passed to visualize will also be passed to sink.
The default sink is pretty_print_tree(), which takes these additional parameters:
- tag_safe: The tag used to mark trusted nodes (default="", i.e. no tag)
- tag_unsafe: The tag used to mark untrusted nodes (default="[UNSAFE]")
- use_colors: Whether to colorize the nodes (default=True). Colors require the rich package to be installed.
- color_safe: Color to use for trusted nodes (default="green")
- color_unsafe: Color to use for untrusted nodes (default="red")
- color_child_unsafe: Color to use for nodes that are trusted but that have untrusted child nodes (default="yellow")
So if you don’t want to have colored output, just pass use_colors=False to visualize. The colors themselves, such as "red" and "green", refer to the standard colors used by rich.
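A custom sink might look roughly like the sketch below. FakeNodeInfo is a hypothetical stand-in whose attribute names (level, key, val, is_safe) are assumptions made for this example; the real NodeInfo may name its fields differently. The sink also returns its lines so that the result is easy to inspect, whereas a real sink would more likely print them.

```python
from dataclasses import dataclass


@dataclass
class FakeNodeInfo:
    """Hypothetical stand-in; the real NodeInfo attribute names may differ."""
    level: int
    key: str
    val: str
    is_safe: bool


def plain_text_sink(nodes, show, **kwargs):
    """Collect one indented line per node, honoring the show filter."""
    lines = []
    for node in nodes:
        if show == "trusted" and not node.is_safe:
            continue
        if show == "untrusted" and node.is_safe:
            continue
        tag = "" if node.is_safe else " [UNSAFE]"
        lines.append("  " * node.level + f"{node.key}: {node.val}{tag}")
    return lines


nodes = [
    FakeNodeInfo(0, "root", "Pipeline", False),
    FakeNodeInfo(1, "scaler", "StandardScaler", True),
    FakeNodeInfo(1, "custom", "mypkg.Custom", False),
]
lines = plain_text_sink(iter(nodes), "untrusted")
```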