Model Cards for scikit-learn

This library lets you automatically create model cards for your models. A model card is a short piece of documentation explaining what the model does, how it was trained, and what its limitations are. Hugging Face Hub expects a README.md file that begins with a set of metadata, followed by the content of the model card in markdown format. The metadata section makes models searchable on the Hub and gets the inference API and the widgets on the website working.

Metadata

The metadata part of the file needs to follow the specifications here. It includes simple attributes of your model, such as the task you’re solving, the dataset you trained the model with, evaluation results, and more. When the model is hosted on the Hub, metadata such as the task name or dataset helps others discover your model. The task identifiers should follow the task taxonomy defined by Hugging Face Hub, because this enables the inference widget on the model page. Examples of task identifiers are "tabular-classification" and "text-regression".

Here’s an example of the metadata section of the README.md file:

---
library_name: sklearn
tags:
- tabular-classification
license: mit
datasets:
- breast-cancer
metrics:
- accuracy
---

skops creates this section of the file for you, and you almost never need to touch it yourself.
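To make the file layout concrete, here is a minimal sketch that separates the metadata block from the markdown body using only the Python standard library. The helper name split_front_matter is hypothetical and is not part of skops or of the Hub tooling. It assumes that the metadata is enclosed between two "---" lines at the very top of the file, as in the example above.

```python
def split_front_matter(readme_text):
    """Split README text into (metadata_block, markdown_body).

    Assumption: the metadata is delimited by two lines that consist only
    of '---', and the first delimiter is the first line of the file.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", readme_text  # no metadata block found
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            metadata = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:])
            return metadata, body
    return "", readme_text  # opening delimiter without a closing one


readme = """---
library_name: sklearn
tags:
- tabular-classification
license: mit
---

# Model description
"""

metadata, body = split_front_matter(readme)
print(metadata)
```

The metadata block is returned as raw text here. Turning it into a dictionary would require a YAML parser, which this sketch deliberately leaves out.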

Model Card Content

The markdown part does not need to follow any specification regarding the information it contains, which gives the user a lot of flexibility. skops provides a couple of defaults for the markdown part of the README.md file, which include the following slots for free-text sections:

  • "Model description": A description of the model.

  • "Intended uses & limitations": Intended use for the model, limitations and potential biases. This section should also include risks of using models in certain domains if relevant.

  • "How to Get Started with the Model": Code the user can run to load and use the model.

  • "Model Card Authors": Authors of the model card.

  • "Model Card Contact": Contact information of the people who can be reached in case of questions about the model or the model card.

  • "Citation": BibTeX-style citations for the model or for resources used to train the model.

  • "Evaluation Results": Evaluation results that are later parsed as a table by skops.card.Card.

The template also contains the following sections that are automatically generated by skops:

  • "Hyperparameters": Hyperparameters of the model.

  • "Model Plot": A diagram of the model, most relevant in case the model is a complex scikit-learn Pipeline.

You can also add plots and tables to the model card. To add a plot, save it to disk and pass its path to the Card.add_plot() method. To add a table, pass either a dictionary whose keys are the column headers and whose values are lists of row entries, or a pandas DataFrame, to the Card.add_table() method. To add permutation importance results, pass your importances to Card.add_permutation_importances(); this creates a boxplot and writes it to the model card for you. If you want more than one importance plot, pass a file name and a title for each plot.
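The dictionary shape for tables is easy to get wrong, so here is a small illustration of it. The helper dict_to_markdown_table is hypothetical and only shows how a column-oriented dictionary maps to a markdown table. It is not how skops renders tables, and the scores are made-up values.

```python
def dict_to_markdown_table(table):
    """Render a column-oriented dict as a markdown table (illustrative only)."""
    headers = list(table)
    columns = [table[header] for header in headers]
    n_rows = len(columns[0]) if columns else 0
    if any(len(column) != n_rows for column in columns):
        raise ValueError("all columns must have the same number of rows")
    lines = [
        "| " + " | ".join(str(header) for header in headers) + " |",
        "|" + "|".join("---" for _ in headers) + "|",
    ]
    # zip(*columns) turns the column lists into one tuple per row.
    for row in zip(*columns):
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


# Keys are the column headers, values are the lists of row entries.
# The numbers below are invented for the sake of the example.
scores = {"split": ["train", "test"], "accuracy": [0.98, 0.95]}
print(dict_to_markdown_table(scores))
```

A dictionary of this shape is what the paragraph above describes as input for Card.add_table(). Every list must have the same length, since each position corresponds to one row.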

To add content to an existing subsection, or to create a new subsection, use a "/" to indicate the subsection. For example, to add a subsection called "Figures" to the existing section "Model description", and then add subsections with plots below it, you can call the Card.add() and Card.add_plot() methods like this:

card.add(**{"Model description/Figures": "Here are some nice figures"})
card.add_plot(**{
    "Model description/Figures/Confusion Matrix": "path-to-confusion-matrix.png",
    "Model description/Figures/ROC": "path-to-roc.png",
})

Furthermore, you can select existing sections (as well as their subsections) using Card.select(), and you can delete sections using Card.delete():

section = card.select("Model description/Figures")
print(section.content)  # 'Here are some nice figures'
print(section.subsections)
card.delete("Model description/Figures/ROC")
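To build some intuition for the "/" addressing scheme, the following toy model stores sections in nested dictionaries. It is only a sketch of the idea: the functions add_section, select_section, and delete_section are hypothetical, and the actual data structure used inside skops may look quite different.

```python
def _new_section():
    # Each section holds its own content plus a mapping of subsections.
    return {"content": "", "subsections": {}}


def add_section(sections, path, content):
    """Store content under a '/'-separated path, creating parents as needed."""
    *parents, name = path.split("/")
    node = sections
    for parent in parents:
        node = node.setdefault(parent, _new_section())["subsections"]
    node.setdefault(name, _new_section())["content"] = content


def select_section(sections, path):
    """Return the section stored under a '/'-separated path."""
    *parents, name = path.split("/")
    node = sections
    for parent in parents:
        node = node[parent]["subsections"]
    return node[name]


def delete_section(sections, path):
    """Remove the section stored under a '/'-separated path."""
    *parents, name = path.split("/")
    node = sections
    for parent in parents:
        node = node[parent]["subsections"]
    del node[name]


sections = {}
add_section(sections, "Model description", "A description of the model.")
add_section(sections, "Model description/Figures", "Here are some nice figures")
add_section(sections, "Model description/Figures/ROC", "path-to-roc.png")

figures = select_section(sections, "Model description/Figures")
print(figures["content"])
print(list(figures["subsections"]))
delete_section(sections, "Model description/Figures/ROC")
```

Each path component names one level of nesting, so deleting a leaf leaves its parent section and the parent's content untouched.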

To see how you can use the API in skops to create a model card, please refer to scikit-learn model cards.

You can also fold sections after adding them to the model card. This is useful if you have a lot of content in a section that you don’t want to show by default. To fold a section, you can use the Section.folded property:

section = card.select("Model description/Figures")
section.folded = True

After setting Section.folded to True, the section will be collapsed by default when the model card is rendered.
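One common way to get a block that is collapsed by default in rendered markdown is the HTML details element. The sketch below assumes that mechanism. The function render_section and the summary text are hypothetical, and the exact markup that skops emits for folded sections may differ.

```python
def render_section(title, content, folded=False, level=1):
    """Render a section as markdown, optionally wrapped in <details>."""
    heading = "#" * level + " " + title
    if folded:
        # Browsers show only the <summary> line until the reader expands it.
        content = (
            "<details>\n<summary>Click to expand</summary>\n\n"
            + content
            + "\n\n</details>"
        )
    return heading + "\n\n" + content


print(render_section("Figures", "Here are some nice figures", folded=True, level=2))
```

The heading stays outside the collapsed block in this sketch, so readers can still see that the section exists before expanding it.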

Saving and Loading Model Cards

Once you have finished creating and modifying the model card, you can save it using the Card.save() method:

card.save("README.md")

This renders the content of the model card to markdown format and stores it in the indicated file. It is now ready to be uploaded to Hugging Face Hub.

If you have a finished model card but want to load it to make some modifications, you can use the function skops.card.parse_modelcard(). This function parses the model card back into a Card instance that you can work on further:

from skops import card
model_card = card.parse_modelcard("README.md")
model_card.add(**{"A new section": "Some new content"})
model_card.save("README.md")

Parsing can change some minor details of the model card. For example, table column alignment that differs from the default may be reset, and excess empty lines or trailing whitespace may be removed. The content itself, however, should be exactly the same. All known deviations are documented in the parse_modelcard docs.
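If you want to check that a round trip through parsing and saving changed nothing but whitespace, you can compare normalized versions of the two files. The helper normalize_markdown below is a hypothetical sketch using only the standard library. It covers trailing whitespace and excess empty lines, and it does not account for changes in table column alignment.

```python
def normalize_markdown(text):
    """Strip trailing whitespace and collapse runs of empty lines into one."""
    result = []
    for line in text.splitlines():
        line = line.rstrip()
        if line == "" and result and result[-1] == "":
            continue  # skip repeated empty lines
        result.append(line)
    # Drop empty lines at the very start and end of the document.
    while result and result[0] == "":
        result.pop(0)
    while result and result[-1] == "":
        result.pop()
    return "\n".join(result)


before = "# Title  \n\n\n\nSome text\t\n"
after = "# Title\n\nSome text\n"
print(normalize_markdown(before) == normalize_markdown(after))
```

In practice, before and after would be the contents of the README.md file read prior to parsing and after saving again.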

For the parsing part, we rely on pandoc. If you haven’t installed it, please follow these instructions. The advantage of using pandoc is that it’s a very mature library and that it supports many different document formats. Therefore, it should be possible to parse model cards even if they use a format that’s not markdown, for instance reStructuredText, org, or asciidoc. For saving, we only support markdown for now.
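Before parsing a model card, it can be convenient to check whether pandoc is available. The following sketch uses only the standard library and assumes that pandoc is found through the PATH environment variable; skops itself may locate or validate pandoc differently. The function name find_pandoc is hypothetical.

```python
import shutil


def find_pandoc():
    """Return the path of the pandoc executable, or None if it is not on PATH."""
    return shutil.which("pandoc")


pandoc_path = find_pandoc()
if pandoc_path is None:
    print("pandoc was not found on PATH; parsing model cards may not work")
else:
    print(f"found pandoc at {pandoc_path}")
```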