.. _model_card:
Model Cards for scikit-learn
============================
This library allows you to automatically create model cards for your models.
A model card is a short document explaining what the model does, how it was
trained, and what its limitations are. The Hugging Face Hub expects a
``README.md`` file that begins with a certain set of metadata, followed by the
content of the model card in markdown format.
Metadata
--------
The metadata part of the file needs to follow the specification expected by
the Hugging Face Hub. It includes simple attributes of your model, such as the
task you're solving, the dataset you trained the model on, evaluation results,
and more.
Here's an example of the metadata section of the ``README.md`` file:

.. code-block:: yaml

    ---
    library_name: sklearn
    tags:
    - tabular-classification
    license: mit
    datasets:
    - breast-cancer
    metrics:
    - accuracy
    ---

``skops`` creates this section of the file for you, and you almost never need
to touch it yourself.
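
To make the layout of the file concrete, here is a minimal sketch that
separates the metadata from the markdown content using only the Python
standard library. The helper ``split_readme`` is hypothetical and not part of
``skops``; it assumes the metadata is delimited by two ``---`` lines at the top
of the file, as in the example above.

```python
def split_readme(text):
    """Split README text into (metadata, content).

    Hypothetical helper, not part of skops: assumes the metadata block is
    delimited by two ``---`` lines at the very top of the file.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        # No metadata block: everything is model card content.
        return "", text
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            metadata = "\n".join(lines[1:i])
            content = "\n".join(lines[i + 1:]).lstrip("\n")
            return metadata, content
    raise ValueError("metadata block is opened but never closed")


readme = """---
library_name: sklearn
tags:
- tabular-classification
license: mit
---

# Model description

A short description of the model.
"""

metadata, content = split_readme(readme)
print(metadata)
print(content)
```

In practice ``skops`` writes both parts for you; the sketch only shows where
one part ends and the other begins.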
Model Card Content
------------------
The markdown part does not need to follow any specification regarding its
content, which gives the user a lot of flexibility. The markdown part of the
``README.md`` file comes with a couple of defaults provided by ``skops``, which
include the following slots for free text sections:

- ``"Model description"``: A description of the model.
- ``"Intended uses & limitations"``: Intended use of the model, its limitations,
  and potential biases. This section should also include the risks of using the
  model in certain domains, if relevant.
- ``"How to Get Started with the Model"``: Code the user can run to load and use
  the model.
- ``"Model Card Authors"``: Authors of the model card.
- ``"Model Card Contact"``: Contact information of people who can be reached in
  case of questions about the model or the model card.
- ``"Citation"``: BibTeX-style citations for the model or for resources used to
  train the model.
- ``"Evaluation Results"``: Evaluation results that are later parsed as a table
  by :class:`skops.card.Card`.

The template also contains the following sections that are automatically
generated by ``skops``:

- ``"Hyperparameters"``: Hyperparameters of the model.
- ``"Model Plot"``: A diagram of the model, most relevant in case the model is
  a complex scikit-learn :class:`~sklearn.pipeline.Pipeline`.

Furthermore, it is possible to add plots and tables to the model card. To add
plots, save them on disk and then add them to the card by passing the file path
to the :meth:`.Card.add_plot` method. To add tables, use the
:meth:`.Card.add_table` method and pass either a dictionary, whose keys are the
column headers and whose values are lists of row entries, or a pandas
``DataFrame``. To add permutation importance results, pass your importances to
:meth:`.Card.add_permutation_importances`, which creates a boxplot and writes
it to the model card for you. If you want to have multiple importance plots,
pass a file name and a title for each plot.
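
The dictionary format for tables maps each column header to the list of
entries in that column. The sketch below illustrates that shape by rendering
such a dictionary as a markdown table with plain Python.
``dict_to_markdown_table`` is a hypothetical helper for illustration only; it
is not how ``skops`` renders tables, and the exact formatting produced by
:meth:`.Card.add_table` may differ. The values are placeholders, not real
results.

```python
def dict_to_markdown_table(columns):
    """Render {header: [row entries]} as a markdown table (illustrative only)."""
    headers = list(columns)
    n_rows = {len(values) for values in columns.values()}
    if len(n_rows) > 1:
        raise ValueError("all columns must have the same number of entries")
    lines = [
        "| " + " | ".join(headers) + " |",
        "|" + "|".join("---" for _ in headers) + "|",
    ]
    # zip(*values) turns the column lists into rows.
    for row in zip(*columns.values()):
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


# Placeholder values, used only to show the expected dictionary shape.
scores = {"split": ["train", "test"], "accuracy": [0.98, 0.95]}
table = dict_to_markdown_table(scores)
print(table)
```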
To add content to an existing subsection, or to create a new subsection, use
``"/"`` in the section name to separate a section from its subsections. For
example, suppose you would like to add a subsection called ``"Figures"`` to the
existing section ``"Model description"``, and then add some subsections with
plots below it. You can call the :meth:`.Card.add` and :meth:`.Card.add_plot`
methods like this:

.. code-block:: python

    card.add(**{"Model description/Figures": "Here are some nice figures"})
    card.add_plot(**{
        "Model description/Figures/Confusion Matrix": "path-to-confusion-matrix.png",
        "Model description/Figures/ROC": "path-to-roc.png",
    })

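
One way to picture the ``"/"`` convention is as a path into a tree of nested
sections. The following sketch models that idea with plain dictionaries;
``add_section`` and its data layout are hypothetical and only meant to
illustrate the naming scheme, not the internals of :class:`skops.card.Card`.

```python
def add_section(tree, name, content):
    """Store content under a "/"-separated section name in a nested dict."""
    *parents, leaf = name.split("/")
    node = tree
    for part in parents:
        # Create missing parent sections on the way down.
        parent = node.setdefault(part, {"content": "", "subsections": {}})
        node = parent["subsections"]
    section = node.setdefault(leaf, {"content": "", "subsections": {}})
    section["content"] = content
    return tree


sections = {}
add_section(sections, "Model description", "A description of the model.")
add_section(sections, "Model description/Figures", "Here are some nice figures")
add_section(sections, "Model description/Figures/ROC", "path-to-roc.png")

figures = sections["Model description"]["subsections"]["Figures"]
print(figures["content"])
print(list(figures["subsections"]))
```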
Furthermore, you can select existing sections (as well as their subsections)
using :meth:`.Card.select`, and you can delete sections using
:meth:`.Card.delete`:

.. code-block:: python

    section = card.select("Model description/Figures")
    print(section.content)  # 'Here are some nice figures'
    print(section.subsections)
    card.delete("Model description/Figures/ROC")

To see how you can use the API in ``skops`` to create a model card, please
refer to :ref:`sphx_glr_auto_examples_plot_model_card.py`.
You can also fold sections after adding them to the model card. This is useful
if you have a lot of content in a section that you don't want to show by
default. To fold a section, you can use the :attr:`.Section.folded` property:

.. code-block:: python

    section = card.select("Model description/Figures")
    section.folded = True

After setting :attr:`.Section.folded` to ``True``, the section will be collapsed by default
when the model card is rendered.
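
Collapsible content in markdown is commonly expressed with the HTML
``<details>`` element, which many markdown renderers display as an expandable
block. The sketch below shows what such a rendering could look like. The helper
``wrap_as_details`` and the summary text are assumptions made for illustration,
and the exact output produced by ``skops`` may differ.

```python
def wrap_as_details(text, folded):
    """Return text unchanged, or wrapped in <details> when folded is True."""
    if not folded:
        return text
    return (
        "<details>\n"
        "<summary> Click to expand </summary>\n\n"
        f"{text}\n\n"
        "</details>"
    )


rendered = wrap_as_details("Here are some nice figures", folded=True)
print(rendered)
```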
Saving and Loading Model Cards
------------------------------
Once you have finished creating and modifying the model card, you can save it
using the :meth:`.Card.save` method:

.. code-block:: python

    card.save("README.md")

This renders the content of the model card to markdown format and stores it in
the indicated file. It is now ready to be uploaded to Hugging Face Hub.
If you have a finished model card but want to load it to make some
modifications, you can use the function :func:`skops.card.parse_modelcard`.
This function parses the model card back into a :class:`.Card` instance that
you can work on further:

.. code-block:: python

    from skops import card

    # Parse the existing file, modify the card, and write it back.
    model_card = card.parse_modelcard("README.md")
    model_card.add(**{"A new section": "Some new content"})
    model_card.save("README.md")

When the card is parsed, some minor details of the model card can change. For
example, a table column alignment that differs from the default may be reset,
and excess empty lines or trailing whitespace may be removed. However, the
content itself should be exactly the same. All known deviations are documented
in the docs of :func:`skops.card.parse_modelcard`.
For the parsing part, we rely on pandoc. If you haven't installed it, please
follow the pandoc installation instructions. The advantage of using pandoc is
that it's very mature and that it supports many different document formats.
Therefore, it should be possible to parse model cards even if they use a format
other than markdown, for instance reStructuredText, org, or asciidoc. For
saving, we only support markdown for now.
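
Since parsing depends on an external program, it can be useful to check that
pandoc is available before attempting to parse a card. A minimal sketch using
only the standard library is shown below. It assumes the executable is named
``pandoc`` and is discoverable on ``PATH``; ``pandoc_available`` is a
hypothetical helper and not part of ``skops``.

```python
import shutil


def pandoc_available(executable="pandoc"):
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


if pandoc_available():
    print("pandoc found, model cards can be parsed")
else:
    print("pandoc not found, install it before parsing model cards")
```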