API Reference

This is the class and function reference of skops.

skops.hf_hub: Hugging Face Hub Integration

skops.card: Model Card Utilities

class skops.card.Card(model, model_format=None, model_diagram='auto', template='skops', trusted=None, allow_pickle=False)[source]

Model card class used to generate a model card.

This class can be used to write information and plots to a model card and save it. By default, it generates an interactive plot of the model and a table of hyperparameters. Some sections are added by default.

Parameters:
model: pathlib.Path, str, or sklearn estimator object

Path/str of the model or the actual model instance that will be documented. If a Path or str is provided, the model will be loaded. Note that a “get started” code block will be added to the card only if the model is a Path or str.

model_format: Literal[“pickle”, “skops”] or None (default=None)

The format of the model file. If None, the format will be inferred from the file extension of the model file if possible.

model_diagram: bool or “auto” or str, default=”auto”

If using the skops template, setting this to True or "auto" will add the model diagram, as generated by scikit-learn, to the default section, i.e. “Model description/Training Procedure/Model Plot”. Passing a string to model_diagram will instead use that string as the section name for the diagram. Set to False to not include the model diagram.

If using a non-skops template, passing "auto" won’t add the model diagram because there is no pre-defined section to put it. The model diagram can, however, always be added later using Card.add_model_plot().

template: “skops”, dict, or None (default=”skops”)

Whether to add default sections or not. The template can be a predefined template, which at the moment can only be the string "skops", which is a template provided by skops that is geared towards typical sklearn models. If you don’t want any prefilled sections, just pass None. If you want custom prefilled sections, pass a dict, where keys are the sections and values are the contents of the sections. Note that when you use no template or a custom template, some methods will not work, e.g. Card.add_metrics(), since it’s not clear where to put the metrics when there is no template or a custom template.

trusted: list of str, default=None

Passed to skops.io.load() if the model is a file path and it’s a skops file.

allow_pickle: bool, default=False

If True, allows loading models using joblib.load. This may lead to security issues if the model file is not trustworthy.

Attributes:
model: estimator object

The scikit-learn compatible model that will be documented.

Methods

add([folded])

Add new section(s) to the model card.

add_fairlearn_metric_frame(metric_frame[, ...])

Add a fairlearn.metrics.MetricFrame table to the model card.

add_hyperparams([section, description])

Add the model's hyperparameters as a table

add_metrics([section, description])

Add metric values to the model card.

add_model_plot([section, description])

Add a model plot

add_permutation_importances(...[, ...])

Plots permutation importance and saves it to model card.

add_plot(*[, description, alt_text, folded])

Add plots to the model card.

add_table(*[, description, folded])

Add a table to the model card.

delete(key)

Delete a section from the model card.

get_model()

Returns sklearn estimator object.

get_toc()

Get the table of contents for the model card.

render()

Render the final model card as a string.

save(path[, copy_files])

Save the model card.

select(key)

Select a section from the model card.

Examples

>>> from sklearn.metrics import (
...     ConfusionMatrixDisplay,
...     confusion_matrix,
...     accuracy_score,
...     f1_score
... )
>>> import tempfile
>>> from pathlib import Path
>>> from sklearn.datasets import load_iris
>>> from sklearn.linear_model import LogisticRegression
>>> from skops.card import Card
>>> X, y = load_iris(return_X_y=True)
>>> model = LogisticRegression(solver="saga", random_state=0).fit(X, y)
>>> model_card = Card(model)
>>> y_pred = model.predict(X)
>>> model_card.add_metrics(**{
...     "accuracy": accuracy_score(y, y_pred),
...     "f1 score": f1_score(y, y_pred, average="micro"),
... })
Card(...)
>>> cm = confusion_matrix(y, y_pred, labels=model.classes_)
>>> disp = ConfusionMatrixDisplay(
...     confusion_matrix=cm,
...     display_labels=model.classes_
... )
>>> disp.plot()
<sklearn.metrics._plot.confusion_matrix.ConfusionMatrixDisplay object at ...>
>>> tmp_path = Path(tempfile.mkdtemp(prefix="skops-"))
>>> disp.figure_.savefig(tmp_path / "confusion_matrix.png")
...
>>> model_card.add_plot(**{
...     "Model description/Confusion Matrix": tmp_path / "confusion_matrix.png"
... })
Card(...)
>>> # add new content to the existing section "Model description"
>>> model_card.add(**{"Model description": "This is the best model"})
Card(...)
>>> # add content to a new section
>>> model_card.add(**{"A new section": "Please rate my model"})
Card(...)
>>> # add new subsection to an existing section by using "/"
>>> model_card.add(**{"Model description/Model name": "This model is called Bob"})
Card(
  model=LogisticRegression(random_state=0, solver='saga'),
  ...
)
>>> # save the card to a README.md file
>>> model_card.save(tmp_path / "README.md")

add(folded=False, **kwargs)[source]

Add new section(s) to the model card.

Add one or multiple sections to the model card. The section names are taken from the keys and the contents are taken from the values.

To add to an existing section, use a "/" in the section name, e.g.:

card.add(**{"Existing section/New section": "content"}).

If the parent section does not exist, it will be added automatically.

To add a section with "/" in its title (i.e. not intended as a subsection), escape the slash like so, "\/", e.g.:

card.add(**{"A section with\/a slash in the title": "content"}).

If a section of the given name already exists, its content will be overwritten.
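The naming rules above (a "/" separates parent and child sections, while "\/" stands for a literal slash) can be illustrated with a small sketch. The helper `split_section_path` is hypothetical and not part of skops; it only shows one way such a name could be split on unescaped slashes.

```python
import re


def split_section_path(name: str) -> list[str]:
    """Split a section name on "/" while honoring the "\\/" escape.

    Hypothetical helper for illustration only; not part of the skops API.
    """
    # Split only on slashes that are NOT preceded by a backslash.
    parts = re.split(r"(?<!\\)/", name)
    # Turn each escaped "\/" back into a literal slash in the title.
    return [part.replace("\\/", "/") for part in parts]


nested = split_section_path("Existing section/New section")
escaped = split_section_path("A section with\\/a slash in the title")
print(nested)
print(escaped)
```

The first call yields a parent and a child title, while the second yields a single title that contains a slash.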

Parameters:
folded: bool

Whether to fold the sections by default or not.

**kwargs: dict

The keys of the dictionary serve as the section title and the values as the section content. It’s possible to add to existing sections.

Returns:
self: object

Card object.

add_fairlearn_metric_frame(metric_frame, table_name='Fairlearn MetricFrame Table', transpose=True, description=None)[source]

Add a fairlearn.metrics.MetricFrame table to the model card. The table contains the difference, group_max, group_min, and ratio for each metric.

Parameters:
metric_frame: MetricFrame

The Fairlearn MetricFrame to add to the model card.

table_name: str

The desired name of the table section in the model card.

transpose: bool, default=True

Whether to transpose the table or not.

description: str | None (default=None)

An optional description to be added before the table.

Returns:
self: Card

The model card with the metric frame added.

Notes

You can check fairlearn’s documentation on how to work with `MetricFrame`s.

add_hyperparams(section='Model description/Training Procedure/Hyperparameters', description=None)[source]

Add the model’s hyperparameters as a table

Parameters:
section: str (default=”Model description/Training Procedure/Hyperparameters”)

The section that the hyperparameters should be added to. By default, the section is set to fit the skops model card template. If you’re using a different template, you may have to choose a different section name.

description: str or None, default=None

An optional description to be added before the hyperparameters. If you’re using the default skops template, a standard text is used. Pass a string here if you want to use your own text instead. Leave this empty to not add any description.

Returns:
self: object

Card object.

add_metrics(section='Model description/Evaluation Results', description=None, **kwargs)[source]

Add metric values to the model card.

All metrics will be collected in, and then formatted to, a table.

Parameters:
section: str (default=”Model description/Evaluation Results”)

The section that metrics should be added to. By default, the section is set to fit the skops model card template. If you’re using a different template, you may have to choose a different section name.

description: str or None, default=None

An optional description to be added before the metrics. If you’re using the default skops template, a standard text is used. Pass a string here if you want to use your own text instead. Leave this empty to not add any description.

**kwargs: dict

A dictionary of the form {metric name: metric value}.

Returns:
self: object

Card object.
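As a rough illustration of how a {metric name: metric value} mapping could be turned into a table, the sketch below formats such a dict as a markdown table. The function `metrics_to_markdown` and the column headers "Metric" and "Value" are assumptions made for this example; the table that skops actually renders may be laid out differently.

```python
def metrics_to_markdown(metrics: dict[str, float]) -> str:
    """Format metrics as a two-column markdown table (hypothetical helper)."""
    # Header row and separator row of a markdown table.
    lines = ["| Metric | Value |", "|--------|-------|"]
    # One table row per metric, in insertion order.
    for name, value in metrics.items():
        lines.append(f"| {name} | {value} |")
    return "\n".join(lines)


table = metrics_to_markdown({"accuracy": 0.96, "f1 score": 0.96})
print(table)
```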

add_model_plot(section='Model description/Training Procedure/Model Plot', description=None)[source]

Add a model plot

Use sklearn model visualization to create a diagram of the model. See the sklearn model visualization docs.

The model diagram is not added if the card class was instantiated with model_diagram=False.

Parameters:
section: str (default=”Model description/Training Procedure/Model Plot”)

The section that the model plot should be added to. By default, the section is set to fit the skops model card template. If you’re using a different template, you may have to choose a different section name.

description: str or None, default=None

An optional description to be added before the model plot. If you’re using the default skops template, a standard text is used. Pass a string here if you want to use your own text instead. Leave this empty to not add any description.

Returns:
self: object

Card object.

add_permutation_importances(permutation_importances, columns, plot_file='permutation_importances.png', plot_name='Permutation Importances', overwrite=False, description=None)[source]

Plots permutation importance and saves it to model card.

Parameters:
permutation_importances: sklearn.utils.Bunch

Output of sklearn.inspection.permutation_importance().

columns: str, list or pandas.Index

Column names of the data used to generate importances.

plot_file: str or pathlib.Path

Filename for the plot.

plot_name: str

Name of the plot.

overwrite: bool (default=False)

Whether to overwrite the permutation importance plot file, if a plot by that name already exists.

description: str | None (default=None)

An optional description to be added before the plot.

Returns:
self: object

Card object.

add_plot(*, description=None, alt_text=None, folded=False, **kwargs)[source]

Add plots to the model card.

The plot should be saved on the file system and the path passed as value.

Parameters:
description: str or None (default=None)

If a string is passed as description, it is shown before the figure. If multiple figures are added with one call, they all get the same description. To add multiple figures with different descriptions, call this method multiple times.

alt_text: str or None (default=None)

If a string is passed as alt_text, it is used as the alternative text for the figure (i.e. what is shown if the figure cannot be rendered). If this argument is None, the alt_text will just be the same as the section title. If multiple figures are added with one call, they all get the same alt text. To add multiple figures with different alt texts, call this method multiple times.

folded: bool (default=False)

If set to True, the plot will be enclosed in a details tag. That means the content is folded by default and users have to click to show the content. This option is useful if the added plot is large.

**kwargs: dict

The arguments should be of the form name=plot_path, where name is the name of the plot and section, and plot_path is the path to the plot on the file system (either a str or pathlib.Path), relative to the root of the project. The plots should have already been saved under the project’s folder.

Returns:
self: object

Card object.
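The interplay of alt_text and folded can be pictured with a short sketch. The function `render_plot` is hypothetical and is not how skops renders plots internally; in particular, the "Click to expand" summary text is an assumption. It only shows the alt text falling back to the section title and the optional details wrapper.

```python
from typing import Optional


def render_plot(
    title: str,
    path: str,
    alt_text: Optional[str] = None,
    folded: bool = False,
) -> str:
    """Render a plot reference as markdown (hypothetical helper)."""
    # Fall back to the section title when no alt text is given.
    alt = alt_text if alt_text is not None else title
    image = f"![{alt}]({path})"
    if not folded:
        return image
    # Wrap the image in a details tag so it is collapsed by default.
    # The summary text is an assumption made for this example.
    return (
        "<details>\n<summary> Click to expand </summary>\n\n"
        f"{image}\n\n</details>"
    )


plain = render_plot("Confusion Matrix", "confusion_matrix.png")
hidden = render_plot(
    "Confusion Matrix", "confusion_matrix.png", alt_text="CM", folded=True
)
print(plain)
print(hidden)
```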

add_table(*, description=None, folded=False, **kwargs)[source]

Add a table to the model card.

This can be especially useful when you are using cross validation with sklearn. E.g. you can directly pass the result from calling sklearn.model_selection.cross_validate() or the cv_results_ attribute from any of the hyperparameter searches, such as sklearn.model_selection.GridSearchCV.

Moreover, you can pass any pandas.DataFrame to this method and it will be rendered in the model card. You may consider selecting only a part of the table if it’s too big:

import pandas as pd
import skops.card
from sklearn.model_selection import GridSearchCV

search = GridSearchCV(...)
search.fit(X, y)
df = pd.DataFrame(search.cv_results_)
# show only the 10 highest scores
df = df.sort_values(["mean_test_score"], ascending=False).head(10)
model_card = skops.card.Card(...)
model_card.add_table(**{"Hyperparameter search results top 10": df})

Parameters:
description: str or None (default=None)

If a string is passed as description, it is shown before the table. If multiple tables are added with one call, they all get the same description. To add multiple tables with different descriptions, call this method multiple times.

folded: bool (default=False)

If set to True, the table will be enclosed in a details tag. That means the content is folded by default and users have to click to show the content. This option is useful if the added table is large.

**kwargs: dict

The keys should be strings, which will be used as the section headers, and the values should be tables. A table can be either a dict, with the keys being strings that represent the column names and the values being lists that represent the entries for each row, or a pandas.DataFrame. The table must not be empty.

Returns:
self: object

Card object.
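The dict form of a table accepted by add_table maps column names to lists of row entries. The sketch below checks that shape before a table would be passed on. `check_table` is a hypothetical helper, and since this reference only states that the table must not be empty, the equal-length rule is an assumption rather than documented skops behavior.

```python
def check_table(table: dict[str, list]) -> int:
    """Return the number of rows of a column-oriented table.

    Hypothetical helper; raises ValueError for empty or ragged tables.
    """
    if not table:
        raise ValueError("The table must not be empty.")
    # Every column should contribute the same number of rows (assumption).
    lengths = {len(column) for column in table.values()}
    if len(lengths) != 1:
        raise ValueError("All columns must have the same number of rows.")
    n_rows = lengths.pop()
    if n_rows == 0:
        raise ValueError("The table must not be empty.")
    return n_rows


# Shaped like a small cross-validation result: column name -> row entries.
cv_like = {"split": [0, 1, 2], "test_score": [0.91, 0.94, 0.93]}
n_rows = check_table(cv_like)
print(n_rows)
```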

delete(key)[source]

Delete a section from the model card.

To delete a subsection of an existing section, use a "/" in the section name, e.g.:

card.delete("Existing section/New section").

Alternatively, a list of strings can be passed:

card.delete(["Existing section", "New section"]).

Parameters:
key: str or list of str

The name of the (sub)section to delete. When deleting a subsection, either use a "/" in the name to separate the parent and child sections, or pass a list of strings.

Raises:
KeyError

If the given section name was not found, a KeyError is raised.

get_model()[source]

Returns sklearn estimator object.

If the model is already loaded, return it as is. If the model attribute is a Path/str, load the model and return it.

Returns:
model: BaseEstimator

The model instance.

get_toc()[source]

Get the table of contents for the model card.

Returns:
toc: str

The table of contents for the model card formatted as a markdown string. Example:

  • Model description
    • Intended uses & limitations

    • Training Procedure
      • Hyperparameters

      • Model Plot

    • Evaluation Results

  • How to Get Started with the Model

  • Model Card Authors

  • Model Card Contact
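A nested table of contents like the one above can be derived by walking the section hierarchy. The following sketch assumes that sections are held in a plain nested dict, which is a simplification chosen for this example rather than the structure skops uses internally; `build_toc` is a hypothetical name.

```python
def build_toc(sections: dict, indent: int = 0) -> list[str]:
    """Return markdown list lines for a nested dict of section titles."""
    lines = []
    for title, subsections in sections.items():
        # Two spaces of indentation per nesting level.
        lines.append("  " * indent + f"- {title}")
        # Recurse into the children of this section.
        lines.extend(build_toc(subsections, indent + 1))
    return lines


sections = {
    "Model description": {
        "Intended uses & limitations": {},
        "Training Procedure": {"Hyperparameters": {}, "Model Plot": {}},
        "Evaluation Results": {},
    },
    "How to Get Started with the Model": {},
}
toc = "\n".join(build_toc(sections))
print(toc)
```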

render()[source]

Render the final model card as a string.

Returns:
result: str

The rendered model card with all placeholders filled and all extra sections inserted.

save(path, copy_files=False)[source]

Save the model card.

This method renders the model card in markdown format and then saves it as the specified file.

Parameters:
path: Path

Filepath to save your card.

plot_path: str

Filepath to save the plots. Use this when saving the model card before creating the repository. Without this path the README will have an absolute path to the plot that won’t exist in the repository.

select(key)[source]

Select a section from the model card.

To select a subsection of an existing section, use a "/" in the section name, e.g.:

card.select("Main section/Subsection").

Alternatively, multiple select calls can be chained:

card.select("Main section").select("Subsection").

Parameters:
key: str

The name of the (sub)section to select. When selecting a subsection, either use a "/" in the name to separate the parent and child sections, or chain multiple select calls.

Returns:
self: Section

A dataclass containing all information relevant to the selected section. Those are the title, the content, and subsections (in a dict).

Raises:
KeyError

If the given section name was not found, a KeyError is raised.
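The selection behavior can be pictured with a minimal stand-in. The `Section` dataclass below mirrors only what this reference states (a title, content, and subsections kept in a dict); it is not the skops class, and the free function `select` is a hypothetical simplification that does not handle escaped slashes.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Section:
    """Stand-in for the section dataclass described above (not the skops class)."""

    title: str
    content: str = ""
    subsections: Dict[str, "Section"] = field(default_factory=dict)


def select(sections: Dict[str, Section], key: str) -> Section:
    """Follow a "/"-separated key through nested sections."""
    current = sections
    for part in key.split("/"):
        # A plain dict lookup raises KeyError for unknown section names.
        section = current[part]
        current = section.subsections
    return section


root = {
    "Main section": Section(
        title="Main section",
        subsections={"Subsection": Section("Subsection", "Some text")},
    )
}
picked = select(root, "Main section/Subsection")
print(picked.title, "->", picked.content)
```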

skops.card.parse_modelcard(path)[source]

Read a model card and return a Card object

This allows users to load a dumped model card and continue to edit it.

Using this function requires pandoc to be installed. Please follow these instructions:

https://pandoc.org/installing.html

Parameters:
path: str or pathlib.Path

The path to the existing model card.

Returns:
card: skops.card.Card

The model card object.

Notes

There are some known limitations to the parser that may result in the model card generated from the parsed file not being 100% identical to the original model card:

  • In markdown, bold and italic text can be encoded in different fashions, e.g. _like this_ or *like this* for italic text. Pandoc doesn’t differentiate between the two. Therefore, the output may use one method where the original card used the other. When rendered, the two results should, however, be the same.

  • Table alignment may be different. At the moment, skops does not make use of column alignment information in tables, so that may differ.

  • Quote symbols may differ, e.g. it’s becoming it's.

  • The number of empty lines may differ, e.g. two empty lines being transformed into one empty line.

  • The optional title of links is not preserved, as e.g. in [text](https://example.com “this disappears”)

  • Trailing whitespace is removed.

  • Tab indentation may be removed, e.g. in raw html.

  • The yaml part of the model card can have some non-semantic differences, like omitting optional quotation marks.

For these reasons, please don’t expect the output of a parsed card to be 100% identical to the original input. However, none of the listed changes makes any _semantic_ difference. If you find that there is a semantic difference in the output, please open an issue on GitHub.
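When comparing an original card with a parsed and re-rendered one, it can help to first normalize the purely cosmetic differences listed above. The helper `normalize_card_text` below is a hypothetical sketch that covers only three of them (quote symbols, trailing whitespace, and runs of empty lines); it has no connection to the skops parser.

```python
import re


def normalize_card_text(text: str) -> str:
    """Remove a few non-semantic differences between two card texts."""
    # Replace typographic quotes with their ASCII counterparts.
    quotes = {"\u2019": "'", "\u2018": "'", "\u201c": '"', "\u201d": '"'}
    for fancy, plain in quotes.items():
        text = text.replace(fancy, plain)
    # Strip trailing whitespace from every line.
    text = "\n".join(line.rstrip() for line in text.splitlines())
    # Collapse runs of empty lines into a single empty line.
    return re.sub(r"\n{3,}", "\n\n", text)


original = "# Title  \n\n\nIt\u2019s a model card.\t\n"
reparsed = "# Title\n\nIt's a model card.\n"
same = normalize_card_text(original) == normalize_card_text(reparsed)
print(same)
```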

Examples

>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> from skops.card import Card
>>> from skops.card import parse_modelcard
>>> X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
>>> y = np.dot(X, np.array([1, 2])) + 3
>>> regr = LinearRegression().fit(X, y)
>>> card = Card(regr)
>>> card.save("README.md")
>>> # later, load the card again
>>> parsed_card = parse_modelcard("README.md")
>>> # continue editing the card
>>> parsed_card.add(**{"My new section": "My new content"})
Card(...)
>>> # overwrite old card with new one
>>> parsed_card.save("README.md")

skops.io: Secure persistence

skops.io.dump(obj, file, *, compression=0, compresslevel=None)[source]

Save an object using the skops persistence format.

Skops aims at providing a secure persistence feature that does not rely on pickle, which is inherently insecure. For more information, please visit the Secure persistence with skops documentation.

Parameters:
obj: object

The object to be saved. Usually a scikit-learn compatible model.

file: str, path, or file-like object

The file name. A zip archive will automatically be created. As a matter of convention, we recommend using the “.skops” file extension, e.g. dump(model, "my-model.skops").

compression: int, default=zipfile.ZIP_STORED

The compression method to use. See zipfile.ZipFile for more information.

Added in version 0.7.

compresslevel: int, default=None

The compression level to use. See zipfile.ZipFile for more information.

Added in version 0.7.
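The compression and compresslevel arguments take the values defined by the standard zipfile module. The sketch below uses only zipfile, not skops, to show what the default of 0 means and how a compressing method changes the archive size. The helper `zipped_size`, the payload, and the entry name "payload.bin" are made up for the example.

```python
import io
import zipfile


def zipped_size(payload: bytes, compression: int, compresslevel=None) -> int:
    """Write payload into an in-memory zip archive and return its size."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(
        buffer, mode="w", compression=compression, compresslevel=compresslevel
    ) as archive:
        archive.writestr("payload.bin", payload)
    return len(buffer.getvalue())


payload = b"0123456789" * 10_000  # highly repetitive, so it compresses well
stored = zipped_size(payload, zipfile.ZIP_STORED)
deflated = zipped_size(payload, zipfile.ZIP_DEFLATED, compresslevel=9)
print(zipfile.ZIP_STORED, stored, deflated)
```

Going by the signature above, the same constants would be passed to dump, for instance compression=zipfile.ZIP_DEFLATED together with a compresslevel.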

skops.io.dumps(obj, *, compression=0, compresslevel=None)[source]

Save an object using the skops persistence format as a bytes object.

Parameters:
obj: object

The object to be saved. Usually a scikit-learn compatible model.

compression: int, default=zipfile.ZIP_STORED

The compression method to use. See zipfile.ZipFile for more information.

Added in version 0.7.

compresslevel: int, default=None

The compression level to use. See zipfile.ZipFile for more information.

Added in version 0.7.

skops.io.get_untrusted_types(*, data=None, file=None)[source]

Get a list of untrusted types in a skops dump.

Parameters:
data: bytes

The data to be checked, in bytes format.

file: str or Path

The file to be checked.

Returns:
untrusted_types: list of str

The list of untrusted types in the dump.

Notes

Only one of data or file should be passed.

skops.io.load(file, trusted=None)[source]

Load an object saved with the skops persistence format.

Skops aims at providing a secure persistence feature that does not rely on pickle, which is inherently insecure. For more information, please visit the Secure persistence with skops documentation.

Parameters:
file: str or pathlib.Path

The file name of the object to be loaded.

trusted: list of str, default=None

The object will be loaded only if there are only trusted objects and objects of types listed in trusted in the dumped file.

Returns:
instance: object

The loaded object.
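The trust rule for loading can be read as a simple allow-list check. The sketch below is conceptual and does not use skops: `find_untrusted`, `DEFAULT_TRUSTED`, and the type names are all assumptions made for illustration, and the real set of types trusted by default is defined by skops itself.

```python
# Assumed stand-in for the types that are trusted without being listed.
DEFAULT_TRUSTED = {"builtins.list", "numpy.ndarray"}


def find_untrusted(found_types, trusted=None):
    """Return the types that are neither trusted by default nor listed."""
    allowed = DEFAULT_TRUSTED | set(trusted or [])
    return sorted(set(found_types) - allowed)


found = ["numpy.ndarray", "mypackage.CustomTransformer"]
# Without an explicit list, the custom type blocks loading.
blocked = find_untrusted(found)
# After reviewing it, the user lists the type as trusted.
allowed_now = find_untrusted(found, trusted=["mypackage.CustomTransformer"])
print(blocked, allowed_now)
```

In practice, the list returned by get_untrusted_types is what one would review before passing selected entries as trusted.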

skops.io.loads(data, trusted=None)[source]

Load an object saved with the skops persistence format from a bytes object.

Parameters:
data: bytes

The dumped data to be loaded in bytes format.

trusted: list of str, default=None

The object will be loaded only if there are only trusted objects and objects of types listed in trusted in the dumped file.

Returns:
instance: object

The loaded object.

skops.io.visualize(file, *, show='all', trusted=None, sink=<function pretty_print_tree>, **kwargs)[source]

Visualize the contents of a skops file.

Shows the schema of a skops file as a tree view. In particular, highlights untrusted nodes. A node is considered untrusted if at least one of its child nodes is untrusted.

Visualizing the tree using the default visualization function requires the rich library, which can be installed as:

python -m pip install rich

If passing a custom visualization function to sink, rich is not required.

Parameters:
file: str or pathlib.Path

The file name of the object to be loaded.

show: “all” or “untrusted” or “trusted”

Whether to print all nodes, only untrusted nodes, or only trusted nodes.

trusted: list of str, default=None

The object will be loaded only if there are only trusted objects and objects of types listed in trusted in the dumped file.

sink: function (default=:func:`~pretty_print_tree`)

This function should take at least two arguments, an iterator of NodeInfo instances and an indicator of what to show. The NodeInfo contains the information about the node, namely:

  • the level of nesting (int)

  • the key of the node (str)

  • the value of the node as a string representation (str)

  • the safety of the node and its children

The show argument is explained above. Any additional kwargs passed to visualize will also be passed to sink.

The default sink is pretty_print_tree(), which takes these additional parameters:

  • tag_safe: The tag used to mark trusted nodes (default=””, i.e. no tag)

  • tag_unsafe: The tag used to mark untrusted nodes (default=”[UNSAFE]”)

  • use_colors: Whether to colorize the nodes (default=True). Colors require the rich package to be installed.

  • color_safe: Color to use for trusted nodes (default=”cyan”)

  • color_unsafe: Color to use for untrusted nodes (default=”orange1”)

  • color_child_unsafe: Color to use for nodes that are trusted but that have untrusted child nodes (default=”magenta”)

So if you don’t want to have colored output, just pass use_colors=False to visualize. The colors themselves, such as “orange1” and “cyan”, refer to the standard colors used by rich.
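A custom sink only needs to accept the node iterator and the show indicator, plus any extra keyword arguments. The sketch below is a guess at what such a function could look like. `StandInNodeInfo` is a stand-in dataclass defined here so that the example runs without skops, and its attribute names (level, key, val, is_safe) are assumptions based on the list above, so the real NodeInfo attributes should be checked before reusing this.

```python
from dataclasses import dataclass


@dataclass
class StandInNodeInfo:
    """Stand-in for NodeInfo; the attribute names are assumptions."""

    level: int  # level of nesting
    key: str  # key of the node
    val: str  # string representation of the node's value
    is_safe: bool  # safety of the node and its children


def plain_text_sink(nodes_iter, show, **kwargs):
    """Print an indented, uncolored tree and return the printed lines."""
    lines = []
    for node in nodes_iter:
        # Honor the "all" / "untrusted" / "trusted" indicator.
        if show == "untrusted" and node.is_safe:
            continue
        if show == "trusted" and not node.is_safe:
            continue
        tag = "" if node.is_safe else " [UNSAFE]"
        lines.append("  " * node.level + f"{node.key}: {node.val}{tag}")
    print("\n".join(lines))
    return lines


nodes = [
    StandInNodeInfo(0, "root", "Pipeline", False),
    StandInNodeInfo(1, "steps", "list", True),
    StandInNodeInfo(1, "clf", "mypackage.CustomClassifier", False),
]
shown = plain_text_sink(iter(nodes), "untrusted")
```

Such a function would be handed to visualize through the sink argument, in which case rich is not needed.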