.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_california_housing.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_plot_california_housing.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_california_housing.py:


Improve your data science workflow with skops
=============================================

.. GENERATED FROM PYTHON SOURCE LINES 7-9

Introduction
------------

.. GENERATED FROM PYTHON SOURCE LINES 11-14

The goal of this exercise is to go through a semi-realistic data science
and machine learning task and develop a practical solution for it. We
will learn about the following topics:

.. GENERATED FROM PYTHON SOURCE LINES 16-24

- Perform *exploratory data analysis*
- Do some non-trivial *feature engineering*
- Explain how the feature engineering informs the *choice of machine
  learning model* and vice versa
- Show how to make use of a couple of *advanced scikit-learn* features
  and explain why we use them
- Create a *model card* that provides useful information about the model
- Share the model by uploading it to the *Hugging Face Hub*

.. GENERATED FROM PYTHON SOURCE LINES 26-28

Imports
-------

.. GENERATED FROM PYTHON SOURCE LINES 30-33

Before we start, we need to import a couple of packages. In particular,
to run this code, we need to have the following 3rd party packages: jupyter,
matplotlib, pandas, scikit-learn, skops.

.. GENERATED FROM PYTHON SOURCE LINES 35-37

So if you want to run this exercise yourself and these packages are not
installed yet in your Python environment, you should run:

.. GENERATED FROM PYTHON SOURCE LINES 39-40

``python -m pip install jupyter matplotlib pandas scikit-learn skops``

.. GENERATED FROM PYTHON SOURCE LINES 42-70

.. code-block:: Python

    from operator import itemgetter
    from pathlib import Path
    from tempfile import mkdtemp

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from matplotlib.patches import Rectangle
    from sklearn.compose import ColumnTransformer
    from sklearn.datasets import fetch_california_housing
    from sklearn.dummy import DummyRegressor
    from sklearn.ensemble import (
        GradientBoostingRegressor,
        RandomForestRegressor,
        StackingRegressor,
    )
    from sklearn.inspection import DecisionBoundaryDisplay, permutation_importance
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import get_scorer
    from sklearn.model_selection import GridSearchCV, cross_val_predict, train_test_split
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import FunctionTransformer
    from sklearn.tree import DecisionTreeRegressor

    from skops import card
    from skops import io as sio


.. GENERATED FROM PYTHON SOURCE LINES 71-73

.. code-block:: Python

    plt.style.use("seaborn-v0_8")


.. GENERATED FROM PYTHON SOURCE LINES 74-76

Analyzing the dataset
---------------------

.. GENERATED FROM PYTHON SOURCE LINES 78-80

Fetch the data
~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 82-86

First of all, let’s load our dataset. For this exercise, we use the
California Housing dataset. It can be downloaded using the
``fetch_california_housing`` function from scikit-learn. The first time it is
called, this function downloads the dataset; on subsequent calls, it loads
the cached version.

.. GENERATED FROM PYTHON SOURCE LINES 88-91

We will make use of the option to set ``as_frame=True``, which will return the
data as a pandas ``DataFrame``. This will make our work much easier than
dealing with a numpy array, which it would return otherwise.

.. GENERATED FROM PYTHON SOURCE LINES 93-95

.. code-block:: Python

    data = fetch_california_housing(as_frame=True)


.. GENERATED FROM PYTHON SOURCE LINES 96-98

The dataset comes with a description. If you are not already familiar with
the dataset, it’s always a good idea to read the included description.

.. GENERATED FROM PYTHON SOURCE LINES 100-102

.. code-block:: Python

    print(data.DESCR)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    .. _california_housing_dataset:

    California Housing dataset
    --------------------------

    **Data Set Characteristics:**

    :Number of Instances: 20640

    :Number of Attributes: 8 numeric, predictive attributes and the target

    :Attribute Information:
        - MedInc        median income in block group
        - HouseAge      median house age in block group
        - AveRooms      average number of rooms per household
        - AveBedrms     average number of bedrooms per household
        - Population    block group population
        - AveOccup      average number of household members
        - Latitude      block group latitude
        - Longitude     block group longitude

    :Missing Attribute Values: None

    This dataset was obtained from the StatLib repository.
    https://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html

    The target variable is the median house value for California districts,
    expressed in hundreds of thousands of dollars ($100,000).

    This dataset was derived from the 1990 U.S. census, using one row per census
    block group. A block group is the smallest geographical unit for which the U.S.
    Census Bureau publishes sample data (a block group typically has a population
    of 600 to 3,000 people).

    A household is a group of people residing within a home. Since the average
    number of rooms and bedrooms in this dataset are provided per household, these
    columns may take surprisingly large values for block groups with few households
    and many empty houses, such as vacation resorts.

    It can be downloaded/loaded using the
    :func:`sklearn.datasets.fetch_california_housing` function.

    .. rubric:: References

    - Pace, R. Kelley and Ronald Barry, Sparse Spatial Autoregressions,
      Statistics and Probability Letters, 33:291-297, 1997.


.. GENERATED FROM PYTHON SOURCE LINES 103-105

Exploratory data analysis
~~~~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 107-109

Now it’s time to start exploring the dataset. First of all, let’s
determine what the target for this task is:

.. GENERATED FROM PYTHON SOURCE LINES 111-114

.. code-block:: Python

    target_col = data.target_names[0]
    print(target_col)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    MedHouseVal


.. GENERATED FROM PYTHON SOURCE LINES 115-119

The target column is called "MedHouseVal" and from the
description, we know it designates "the median house value for
California districts, expressed in hundreds of thousands of
dollars".

.. GENERATED FROM PYTHON SOURCE LINES 121-123

Next let’s extract the actual data, which, as mentioned, is contained in
a pandas ``DataFrame``:

.. GENERATED FROM PYTHON SOURCE LINES 125-127

.. code-block:: Python

    df = data["frame"]


.. GENERATED FROM PYTHON SOURCE LINES 128-132

For now, we leave the target variable inside the ``DataFrame``, as
this will facilitate the upcoming analysis. Once we get to modeling, we
should of course separate the target data to avoid accidentally training
on the target.
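
As a minimal sketch of that later separation step, assuming a toy
``DataFrame`` with made-up values (the names ``toy_df``, ``X_toy`` and
``y_toy`` are hypothetical and not used elsewhere in this exercise), the
target column could be split off like this:

.. code-block:: Python

    import pandas as pd

    # Toy stand-in for the real frame; the values are made up for illustration.
    toy_df = pd.DataFrame(
        {
            "MedInc": [8.3, 7.2, 3.8],
            "HouseAge": [41.0, 52.0, 52.0],
            "MedHouseVal": [4.5, 3.5, 3.4],
        }
    )
    toy_target_col = "MedHouseVal"

    # Features: every column except the target. Target: only that column.
    X_toy = toy_df.drop(columns=[toy_target_col])
    y_toy = toy_df[toy_target_col]

    print(list(X_toy.columns))
    print(y_toy.shape)

Keeping the two objects apart makes it harder to pass the target to a model
as a feature by accident.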

.. GENERATED FROM PYTHON SOURCE LINES 134-135

Let’s peek at some properties of the data.

.. GENERATED FROM PYTHON SOURCE LINES 137-139

.. code-block:: Python

    df.shape


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    (20640, 9)


.. GENERATED FROM PYTHON SOURCE LINES 140-142

.. code-block:: Python

    df.head()


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>MedInc</th>
          <th>HouseAge</th>
          <th>AveRooms</th>
          <th>AveBedrms</th>
          <th>Population</th>
          <th>AveOccup</th>
          <th>Latitude</th>
          <th>Longitude</th>
          <th>MedHouseVal</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>0</th>
          <td>8.3252</td>
          <td>41.0</td>
          <td>6.984127</td>
          <td>1.023810</td>
          <td>322.0</td>
          <td>2.555556</td>
          <td>37.88</td>
          <td>-122.23</td>
          <td>4.526</td>
        </tr>
        <tr>
          <th>1</th>
          <td>8.3014</td>
          <td>21.0</td>
          <td>6.238137</td>
          <td>0.971880</td>
          <td>2401.0</td>
          <td>2.109842</td>
          <td>37.86</td>
          <td>-122.22</td>
          <td>3.585</td>
        </tr>
        <tr>
          <th>2</th>
          <td>7.2574</td>
          <td>52.0</td>
          <td>8.288136</td>
          <td>1.073446</td>
          <td>496.0</td>
          <td>2.802260</td>
          <td>37.85</td>
          <td>-122.24</td>
          <td>3.521</td>
        </tr>
        <tr>
          <th>3</th>
          <td>5.6431</td>
          <td>52.0</td>
          <td>5.817352</td>
          <td>1.073059</td>
          <td>558.0</td>
          <td>2.547945</td>
          <td>37.85</td>
          <td>-122.25</td>
          <td>3.413</td>
        </tr>
        <tr>
          <th>4</th>
          <td>3.8462</td>
          <td>52.0</td>
          <td>6.281853</td>
          <td>1.081081</td>
          <td>565.0</td>
          <td>2.181467</td>
          <td>37.85</td>
          <td>-122.25</td>
          <td>3.422</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 143-145

.. code-block:: Python

    df.describe()


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>MedInc</th>
          <th>HouseAge</th>
          <th>AveRooms</th>
          <th>AveBedrms</th>
          <th>Population</th>
          <th>AveOccup</th>
          <th>Latitude</th>
          <th>Longitude</th>
          <th>MedHouseVal</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>count</th>
          <td>20640.000000</td>
          <td>20640.000000</td>
          <td>20640.000000</td>
          <td>20640.000000</td>
          <td>20640.000000</td>
          <td>20640.000000</td>
          <td>20640.000000</td>
          <td>20640.000000</td>
          <td>20640.000000</td>
        </tr>
        <tr>
          <th>mean</th>
          <td>3.870671</td>
          <td>28.639486</td>
          <td>5.429000</td>
          <td>1.096675</td>
          <td>1425.476744</td>
          <td>3.070655</td>
          <td>35.631861</td>
          <td>-119.569704</td>
          <td>2.068558</td>
        </tr>
        <tr>
          <th>std</th>
          <td>1.899822</td>
          <td>12.585558</td>
          <td>2.474173</td>
          <td>0.473911</td>
          <td>1132.462122</td>
          <td>10.386050</td>
          <td>2.135952</td>
          <td>2.003532</td>
          <td>1.153956</td>
        </tr>
        <tr>
          <th>min</th>
          <td>0.499900</td>
          <td>1.000000</td>
          <td>0.846154</td>
          <td>0.333333</td>
          <td>3.000000</td>
          <td>0.692308</td>
          <td>32.540000</td>
          <td>-124.350000</td>
          <td>0.149990</td>
        </tr>
        <tr>
          <th>25%</th>
          <td>2.563400</td>
          <td>18.000000</td>
          <td>4.440716</td>
          <td>1.006079</td>
          <td>787.000000</td>
          <td>2.429741</td>
          <td>33.930000</td>
          <td>-121.800000</td>
          <td>1.196000</td>
        </tr>
        <tr>
          <th>50%</th>
          <td>3.534800</td>
          <td>29.000000</td>
          <td>5.229129</td>
          <td>1.048780</td>
          <td>1166.000000</td>
          <td>2.818116</td>
          <td>34.260000</td>
          <td>-118.490000</td>
          <td>1.797000</td>
        </tr>
        <tr>
          <th>75%</th>
          <td>4.743250</td>
          <td>37.000000</td>
          <td>6.052381</td>
          <td>1.099526</td>
          <td>1725.000000</td>
          <td>3.282261</td>
          <td>37.710000</td>
          <td>-118.010000</td>
          <td>2.647250</td>
        </tr>
        <tr>
          <th>max</th>
          <td>15.000100</td>
          <td>52.000000</td>
          <td>141.909091</td>
          <td>34.066667</td>
          <td>35682.000000</td>
          <td>1243.333333</td>
          <td>41.950000</td>
          <td>-114.310000</td>
          <td>5.000010</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 146-148

Scaling the target
^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 150-155

Before we continue working with the data, let’s do a small adjustment.
From the description, we know that the target variable, the median house
price, is expressed in units of $100,000. For our first row, that means
the actual price is $452,600, not $4.526 (which would be very cheap, even
for 1990).
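
The conversion for the first row shown by ``df.head()`` (a ``MedHouseVal`` of
4.526) can be written out as a small worked example. The variable names are
only illustrative:

.. code-block:: Python

    # First row of the table above: MedHouseVal is given in units of $100,000.
    first_row_value = 4.526
    unit_in_dollars = 100_000

    # Round to whole dollars to avoid floating point noise.
    price_in_dollars = round(first_row_value * unit_in_dollars)
    print(f"${price_in_dollars:,}")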

.. GENERATED FROM PYTHON SOURCE LINES 157-161

In theory, the unit should not matter for our work, but let’s still
convert it to dollars. When we search for existing solutions to this task,
most people work with dollar values, and using a different unit here would
make comparison to those results unnecessarily difficult.

.. GENERATED FROM PYTHON SOURCE LINES 163-165

.. code-block:: Python

    df["MedHouseVal"] = 100_000 * df["MedHouseVal"]


.. GENERATED FROM PYTHON SOURCE LINES 166-168

Differences to other versions of the dataset
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 170-178

Another notable difference in this particular dataset is that some
columns are already averages. E.g. we find the column AveRooms, which is
the average number of rooms per household of this group of houses. In
other versions of the dataset (like the one on
`Kaggle <https://www.kaggle.com/datasets/camnugent/california-housing-prices>`__),
we will, however, find the total number of rooms and the number of
households. So here, some feature engineering was already performed by
calculating the averages. This is fine and we can keep it like this.
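
To make this concrete, here is a rough sketch of how such averages could be
derived from totals. The column names ``total_rooms``, ``total_bedrooms`` and
``households`` are an assumption about the other variant of the dataset, and
the values are illustrative ones chosen so that the results match the first
two rows shown above:

.. code-block:: Python

    import pandas as pd

    # Illustrative totals per block group; the column names are assumed.
    totals = pd.DataFrame(
        {
            "total_rooms": [880.0, 7099.0],
            "total_bedrooms": [129.0, 1106.0],
            "households": [126.0, 1138.0],
        }
    )

    # Divide each total by the number of households to get the averages.
    averages = pd.DataFrame(
        {
            "AveRooms": totals["total_rooms"] / totals["households"],
            "AveBedrms": totals["total_bedrooms"] / totals["households"],
        }
    )
    print(averages.round(3))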

.. GENERATED FROM PYTHON SOURCE LINES 180-182

Missing values
^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 184-187

Furthermore, in the description, we find: Missing Attribute Values:
None. This probably means that there are no missing values, but let’s
check ourselves just to be certain:

.. GENERATED FROM PYTHON SOURCE LINES 190-192

.. code-block:: Python

    df.isna().any()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    MedInc         False
    HouseAge       False
    AveRooms       False
    AveBedrms      False
    Population     False
    AveOccup       False
    Latitude       False
    Longitude      False
    MedHouseVal    False
    dtype: bool


.. GENERATED FROM PYTHON SOURCE LINES 193-196

Indeed, the dataset contains no missing values. If it did, we could have
made use of the `imputation features of
sklearn <https://scikit-learn.org/stable/modules/impute.html>`__.
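
For illustration only, this is roughly what imputation could look like on a
made-up array with one missing entry. It uses ``SimpleImputer`` with the
median strategy, which is just one of several options and is not needed for
this dataset:

.. code-block:: Python

    import numpy as np
    from sklearn.impute import SimpleImputer

    # Toy feature matrix with one missing entry (made-up values).
    X_missing = np.array([[1.0, 10.0], [2.0, np.nan], [3.0, 30.0]])

    # Replace each missing value with the median of its column.
    imputer = SimpleImputer(strategy="median")
    X_filled = imputer.fit_transform(X_missing)
    print(X_filled)

In a real project, the imputer would typically be placed inside a
``Pipeline`` so that the medians are learned from the training data only.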

.. GENERATED FROM PYTHON SOURCE LINES 198-200

Distributions of variables
^^^^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 202-206

It’s always useful to take a look at the distributions of our variables.
This is best achieved by visual inspection, which is why we will create
a couple of plots. For this exercise, we use matplotlib for plotting,
since it’s very powerful and popular. It also works well with pandas.

.. GENERATED FROM PYTHON SOURCE LINES 208-209

First we want to focus on the following columns:

.. GENERATED FROM PYTHON SOURCE LINES 212-214

.. code-block:: Python

    cols = ["MedInc", "HouseAge", "AveRooms", "AveBedrms", "Population", "AveOccup"]


.. GENERATED FROM PYTHON SOURCE LINES 215-218

These are all our feature variables except for the geospatial features,
"Longitude" and "Latitude". We will analyze those more
closely later.

.. GENERATED FROM PYTHON SOURCE LINES 220-222

The most basic plot we can create for analyzing the distribution is the
histogram. So let’s plot the histograms for each of those variables.

.. GENERATED FROM PYTHON SOURCE LINES 224-229

Before we take a closer look at the data, just a few words on how we
create the plots. Since we have 6 variables, it would be convenient to
plot the data in a 3x2 grid. That’s why we create a matplotlib figure with 6
subplots, using 3 rows and 2 columns. The resulting ``axes``
variable is a 3x2 numpy array that contains the individual subplots.

.. GENERATED FROM PYTHON SOURCE LINES 231-235

We also want to make use of the pandas plotting method, which we can
call using ``df.plot``. This uses matplotlib under the hood, so
it’s not strictly needed. But the nice thing is that pandas provides
some extra convenience, e.g. by automatically labeling the axes.

.. GENERATED FROM PYTHON SOURCE LINES 237-242

In order for pandas to plot onto our created figure with its subplots, we pass
the ``ax`` argument to ``df.plot``. This tells pandas to plot onto this
subplot, instead of creating a new plot. Another little trick is to
``flatten`` the subplot array while looping over it. That way, we don’t need
to take care of looping over its two dimensions separately.
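
The flattening trick can be illustrated without any plotting. In this small
sketch, a 3x2 array of strings stands in for the array of subplots (the names
``grid`` and ``names`` are made up for the illustration):

.. code-block:: Python

    import numpy as np

    # Stand-in for the 3x2 ``axes`` array: strings instead of real subplots.
    grid = np.array([["r0c0", "r0c1"], ["r1c0", "r1c1"], ["r2c0", "r2c1"]])
    names = ["MedInc", "HouseAge", "AveRooms", "AveBedrms", "Population", "AveOccup"]

    # flatten() walks the grid row by row, so one loop covers all six cells.
    pairs = list(zip(grid.flatten(), names))
    for cell, name in pairs:
        print(cell, name)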

.. GENERATED FROM PYTHON SOURCE LINES 244-246

Finally, we should also call ``plt.tight_layout()`` to prevent the subplots
from overlapping.

.. GENERATED FROM PYTHON SOURCE LINES 248-254

.. code-block:: Python


    fig, axes = plt.subplots(3, 2, figsize=(8, 12))
    for ax, col in zip(axes.flatten(), cols):
        df.plot(kind="hist", y=col, bins=100, title=col, ax=ax, legend=None)
    plt.tight_layout()


.. image-sg:: /auto_examples/images/sphx_glr_plot_california_housing_001.png
   :alt: MedInc, HouseAge, AveRooms, AveBedrms, Population, AveOccup
   :srcset: /auto_examples/images/sphx_glr_plot_california_housing_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 255-260

Already, we can make some interesting observations about the data. For
"MedInc", but especially for "HouseAge", we see a large
bin to the right side of the distribution. This could be an indicator
that values might have been clipped. Let’s look at "HouseAge"
more closely:

.. GENERATED FROM PYTHON SOURCE LINES 262-264

.. code-block:: Python

    df["HouseAge"].describe()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    count    20640.000000
    mean        28.639486
    std         12.585558
    min          1.000000
    25%         18.000000
    50%         29.000000
    75%         37.000000
    max         52.000000
    Name: HouseAge, dtype: float64


.. GENERATED FROM PYTHON SOURCE LINES 265-267

.. code-block:: Python

    df["HouseAge"].value_counts().head().to_frame("count")


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>count</th>
        </tr>
        <tr>
          <th>HouseAge</th>
          <th></th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>52.0</th>
          <td>1273</td>
        </tr>
        <tr>
          <th>36.0</th>
          <td>862</td>
        </tr>
        <tr>
          <th>35.0</th>
          <td>824</td>
        </tr>
        <tr>
          <th>16.0</th>
          <td>771</td>
        </tr>
        <tr>
          <th>17.0</th>
          <td>698</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 268-274

So we see that there are 1273 samples with a house age of 52, which also
happens to be the maximum value. Could this be coincidence? Perhaps, but
it’s unlikely. The more likely explanation is that houses that were
older than 52 years were just clipped at 52. In the context of the
problem we’re trying to solve, this doesn’t make a big difference, so we
will just accept this peculiarity.
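
A simple way to look for this kind of clipping is to check what share of the
samples sits exactly at the maximum. The following sketch uses made-up ages
and clips them at 52 to imitate the suspected effect:

.. code-block:: Python

    import pandas as pd

    # Made-up ages: true values above 52 are recorded as 52 (clipping).
    true_ages = pd.Series([10, 25, 37, 52, 60, 75, 90, 18, 44, 66])
    recorded = true_ages.clip(upper=52)

    # Share of samples sitting exactly at the maximum recorded value.
    share_at_max = (recorded == recorded.max()).mean()
    print(f"{share_at_max:.0%} of samples equal the maximum of {recorded.max()}")

In the real data, the corresponding share is 1273 out of 20640 samples, or
roughly 6%, which is a lot for a single value at the very edge of the range.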

.. GENERATED FROM PYTHON SOURCE LINES 276-280

Next we can see in the histograms above that for "AveRooms", "AveBedrms",
"Population", and "AveOccup", the bins are squished to the left. This means
that there is a fat right tail in the distribution, which might be
problematic. When looking at the description, we find a potential explanation:

.. GENERATED FROM PYTHON SOURCE LINES 282-287

A household is a group of people residing within a home. Since the
average number of rooms and bedrooms in this dataset are provided per
household, these columns may take surprisingly large values for block
groups with few households and many empty houses, such as vacation
resorts.

.. GENERATED FROM PYTHON SOURCE LINES 290-301

So what should we do about this? For the purpose of plotting the values,
we should certainly think about removing these extreme values. For the
machine learning model we will train later, the answer is: it depends.
When we use a model like linear regression or a neural network, these
extreme values can be problematic and it would make sense to scale the
values to have a more uniform or normal distribution. When using
decision tree-based models, these extreme values are not problematic
though. A decision tree will split the data into those samples that are
less than or greater than a certain value; it doesn’t matter how much
smaller or greater the values are. Since we will actually rely on
tree-based models, let’s leave the data as is (except for plotting).
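
The claim about decision trees can be checked with a small experiment. This
is only a sketch on made-up, strongly skewed data: one tree is fit on the raw
feature and one on its logarithm, and their predictions on the training data
are compared:

.. code-block:: Python

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    # Strongly right-skewed toy feature and a noisy target (made-up data).
    x = (np.arange(1, 201) ** 2).astype(float).reshape(-1, 1)
    y = np.log(x).ravel() + rng.normal(scale=0.1, size=200)

    tree_raw = DecisionTreeRegressor(max_depth=3, random_state=0).fit(x, y)
    tree_log = DecisionTreeRegressor(max_depth=3, random_state=0).fit(np.log(x), y)

    # The log transform keeps the ordering of the samples, so both trees
    # partition the training data in the same way.
    same = np.allclose(tree_raw.predict(x), tree_log.predict(np.log(x)))
    print(same)

A linear model fit on the same two versions of the feature would, in
contrast, generally give different predictions.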

.. GENERATED FROM PYTHON SOURCE LINES 303-311

When it comes to plotting, how can we deal with these extreme values? We
could scale the data, e.g. by taking a ``log``. This can be
achieved by passing ``logx=True`` to the plotting method. However,
the log scale makes it harder to read the actual values of the data.
Instead, let’s use a more brute force approach of simply excluding the
1% largest values. For this, we calculate the 99th percentile of the
values and exclude all values that exceed that percentile. Apart from
that change, the plots are the same as above:

.. GENERATED FROM PYTHON SOURCE LINES 314-321

.. code-block:: Python

    fig, axes = plt.subplots(3, 2, figsize=(8, 12))
    for ax, col in zip(axes.flatten(), cols):
        quantile = df[col].quantile(0.99)
        mask = df[col] < quantile
        df[mask].plot(kind="hist", y=col, bins=100, title=col, ax=ax, legend=None)
    plt.tight_layout()


.. image-sg:: /auto_examples/images/sphx_glr_plot_california_housing_002.png
   :alt: MedInc, HouseAge, AveRooms, AveBedrms, Population, AveOccup
   :srcset: /auto_examples/images/sphx_glr_plot_california_housing_002.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 322-325

With extremely large values excluded, we can see that the variables are
distributed in a very reasonable fashion. With a bit of squinting, the
distributions almost look Gaussian, which is what we like to see.

.. GENERATED FROM PYTHON SOURCE LINES 327-329

Correlations with the target
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 331-336

Now it’s time to look at how our features are correlated with our
target. After all, we plan on predicting the target based on the
features, and even though correlation with a target is not necessary for
a feature to be helpful, it’s a strong indicator. As before, we will
filter out extreme values.

.. GENERATED FROM PYTHON SOURCE LINES 338-350

.. code-block:: Python


    fig, axes = plt.subplots(3, 2, figsize=(8, 12))
    for ax, col in zip(axes.flatten(), cols):
        quantile = df[col].quantile(0.99)
        df_subset = df[df[col] < quantile]
        correlation = df_subset[[col, target_col]].corr().iloc[0, 1]
        title = f"Pearson correlation coefficient: {correlation:.3f}"
        df_subset.plot(
            kind="scatter", x=col, y=target_col, s=1.5, alpha=0.05, ax=ax, title=title
        )
    plt.tight_layout()


.. image-sg:: /auto_examples/images/sphx_glr_plot_california_housing_003.png
   :alt: Pearson correlation coefficient: 0.672, Pearson correlation coefficient: 0.040, Pearson correlation coefficient: 0.329, Pearson correlation coefficient: -0.090, Pearson correlation coefficient: -0.033, Pearson correlation coefficient: -0.280
   :srcset: /auto_examples/images/sphx_glr_plot_california_housing_003.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 351-355

Again, let’s try to improve our understanding of the dataset by visually
inspecting the outcomes. The first thing to notice is perhaps that our
target, the "MedHouseVal", seems to be clipped at $500,000. Let’s
remember that and return to it later.

.. GENERATED FROM PYTHON SOURCE LINES 357-361

Next we should notice that apart from "MedInc", none of the variables seem to
strongly correlate with the target. If this were a real business problem to
solve at a company, now would be a good time to get ahold of a domain expert
and verify that this is expected.

.. GENERATED FROM PYTHON SOURCE LINES 363-366

We also calculate the Pearson correlation coefficient and show it in the plot
titles, but honestly, it cannot tell us much we can’t already determine by
visual inspection.
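
For reference, the coefficient itself is simple to compute. The sketch below
uses made-up numbers and compares a manual calculation with
``np.corrcoef``:

.. code-block:: Python

    import numpy as np

    # Made-up feature and target values.
    feature = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    target = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    # Pearson r: covariance divided by the product of the standard deviations.
    fc = feature - feature.mean()
    tc = target - target.mean()
    r_manual = (fc * tc).sum() / np.sqrt((fc**2).sum() * (tc**2).sum())

    # np.corrcoef returns the full correlation matrix; entry [0, 1] is r.
    r_numpy = np.corrcoef(feature, target)[0, 1]
    print(round(r_manual, 4), round(r_numpy, 4))

Note that the coefficient only measures linear association, which is one
reason why looking at the scatter plots remains worthwhile.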

.. GENERATED FROM PYTHON SOURCE LINES 368-370

Geospatial features
^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 372-376

Now it’s time to take a look at the geospatial data, namely
"Longitude" and "Latitude". We already know that our
dataset is limited to the US state of California, so we should expect
the data to be exclusively in that area.

.. GENERATED FROM PYTHON SOURCE LINES 378-382

Before actually plotting the data, let’s take a quick break and form a
hypothesis. We can reasonably expect that housing prices should be high in
metropolitan areas. For California, this could be around Los Angeles and the
Bay Area. Does our data reflect that?

.. GENERATED FROM PYTHON SOURCE LINES 384-390

To answer this question, we can plot the target variable as a function of its
coordinates. Since we deal with longitude and latitude, and since it’s
reasonably likely that the Earth is not flat, it is not quite correct to just
plot the data as is. It would be more accurate to use a projection to map the
coordinates to 2 dimensions, e.g. by using `geopandas
<https://geopandas.org/en/stable/>`__.

.. GENERATED FROM PYTHON SOURCE LINES 392-395

For our purposes, however, despite the size of California, we should be able
to get away without using any projections. So let’s simplify our life by just
using raw longitude and latitude.
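
To get a feeling for how much distortion we accept, we can estimate how a
degree of longitude compares to a degree of latitude at the mean latitude of
the dataset. This is a back-of-the-envelope sketch that treats the Earth as a
sphere, not a proper projection:

.. code-block:: Python

    import math

    # Mean latitude of the dataset, taken from the summary table above.
    mean_latitude_deg = 35.63

    # On a sphere, one degree of longitude spans cos(latitude) times the
    # distance of one degree of latitude.
    lon_scale = math.cos(math.radians(mean_latitude_deg))
    print(f"{lon_scale:.3f}")

So at this latitude, a degree of longitude covers roughly 80% of the ground
distance of a degree of latitude, which means the raw plot is somewhat
stretched horizontally.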

.. GENERATED FROM PYTHON SOURCE LINES 397-414

.. code-block:: Python

    fig, ax = plt.subplots(figsize=(10, 8))
    df.plot(
        kind="scatter",
        x="Longitude",
        y="Latitude",
        c=target_col,
        title="House value by location",
        cmap="coolwarm",
        s=2.5,
        ax=ax,
    )
    inset = (-122.5, 37.5)
    rect = Rectangle(
        inset, 0.5, 0.5, linewidth=1, edgecolor="k", facecolor="none", alpha=0.5
    )
    ax.add_patch(rect)


.. image-sg:: /auto_examples/images/sphx_glr_plot_california_housing_004.png
   :alt: House value by location
   :srcset: /auto_examples/images/sphx_glr_plot_california_housing_004.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    <matplotlib.patches.Rectangle object at 0x774b13b6f080>


.. GENERATED FROM PYTHON SOURCE LINES 415-416

This is an interesting plot. Let’s try to draw some conclusions.

.. GENERATED FROM PYTHON SOURCE LINES 418-423

First of all, we see that the locations are not evenly distributed across the
state. There are big patches without any data, especially on the eastern side
of the state, whereas the western coast is more highly populated. Data is also
more sparse in the mountainous area, where we can expect population density to
be lower.

.. GENERATED FROM PYTHON SOURCE LINES 425-430

Regarding our initial hypothesis about house prices, we can indeed find
clusters of high prices exceeding $400,000, whereas the majority of the data
points fall below $200,000 (the median is roughly $180,000). After checking
on a map, the high-priced areas do seem to be around Los Angeles and the Bay
Area, but there are other high-priced areas as well, e.g. in San Diego.

.. GENERATED FROM PYTHON SOURCE LINES 432-440

From this, we can already start thinking about some possibly interesting
features. Some variations of this dataset contain, for instance, a
variable indicating the closeness to the ocean, since it seems that
areas on the coast are more expensive on average. Another interesting
feature could be the distance to key cities like Los Angeles and San
Francisco. Here distance should probably be measured in terms of how
long it takes to drive there by car, not just purely in terms of spatial
distance – after all, most people only have a car and not a helicopter.
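
The purely spatial variant of such a feature is easy to sketch with the
haversine formula. The city coordinates below are approximate and assumed for
illustration, and the helper ``haversine_km`` is hypothetical, not part of
any library used in this exercise:

.. code-block:: Python

    import math


    def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
        """Great-circle distance between two points given in degrees."""
        phi1, phi2 = math.radians(lat1), math.radians(lat2)
        dphi = phi2 - phi1
        dlam = math.radians(lon2 - lon1)
        a = (
            math.sin(dphi / 2) ** 2
            + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
        )
        return 2 * radius_km * math.asin(math.sqrt(a))


    # Approximate city-center coordinates (assumed for illustration).
    los_angeles = (34.05, -118.24)
    san_francisco = (37.77, -122.42)

    # Coordinates of the first row of the dataset, taken from the table above.
    block = (37.88, -122.23)

    dist_block_sf = haversine_km(*block, *san_francisco)
    dist_block_la = haversine_km(*block, *los_angeles)
    print(round(dist_block_sf), round(dist_block_la))

Driving time, by contrast, cannot be derived from the coordinates alone.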

.. GENERATED FROM PYTHON SOURCE LINES 442-447

As you can imagine, getting this type of data requires additional data
sources and probably a lot of extra work. Therefore, we won’t take this
route. However, we will develop specific geospatial features below,
which will probably capture most of the information we need for this
task.

.. GENERATED FROM PYTHON SOURCE LINES 449-453

Before advancing further, we should zoom into the geospatial data, given how
difficult it is to see details on the plot above. For that, we take a small
inset of that figure (marked with a rectangle), which seems to be an
interesting spot, and plot it below:

.. GENERATED FROM PYTHON SOURCE LINES 456-469

.. code-block:: Python

    fig, ax = plt.subplots(figsize=(10, 8))
    df.plot(
        kind="scatter",
        x="Longitude",
        y="Latitude",
        c=target_col,
        title="House value by location",
        cmap="coolwarm",
        ax=ax,
    )
    ax.set_xlim([inset[0], inset[0] + 0.5])
    ax.set_ylim([inset[1], inset[1] + 0.5])


.. image-sg:: /auto_examples/images/sphx_glr_plot_california_housing_005.png
   :alt: House value by location
   :srcset: /auto_examples/images/sphx_glr_plot_california_housing_005.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    (37.5, 38.0)


.. GENERATED FROM PYTHON SOURCE LINES 470-476

What can we take away from this plot? First of all, it shows that even
though there are clear patterns of high and low price areas, they can
also be situated very close to each other. This could already give us
the idea that prices in the *neighborhood* could be a strong predictor
for our target variable, but the size of the neighborhood should not be
too large.

.. GENERATED FROM PYTHON SOURCE LINES 478-486

Another interesting pattern we see is that the data points are distributed on
a regular grid. This is not what we would expect if, say, each point
represented a community, town, or city, since those are not evenly spaced.
Instead, this indicates that the data was aggregated with a certain spatial
resolution. Also, given the gaps, it could be reasonable to assume that data
points with too few houses were removed from the dataset, maybe for privacy
concerns. Again, talking to a domain expert would help us better understand
the reason.
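
As a rough illustration of how such a spatial resolution could be estimated,
the sketch below uses synthetic longitudes snapped to a hypothetical 0.01
degree grid (an assumption made for this example, not a value verified on the
real dataset) and recovers the grid step from the gaps between distinct
values:

.. code-block:: Python

    import numpy as np

    rng = np.random.default_rng(0)
    # Synthetic longitudes, snapped to an assumed 0.01 degree grid
    lon = np.round(rng.uniform(-122.5, -122.0, size=500), 2)

    # Gaps between the sorted distinct values; the smallest gap is an
    # estimate of the grid resolution
    steps = np.diff(np.unique(lon))
    resolution = round(float(steps.min()), 2)

    # On a regular grid, every gap is a whole multiple of the resolution
    is_regular = np.allclose(steps / resolution, np.round(steps / resolution))
    print(resolution, is_regular)

The same check could be applied to the real ``Longitude`` and ``Latitude``
columns to see whether they follow a common step size.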

.. GENERATED FROM PYTHON SOURCE LINES 488-489

Anyway, we should keep this regular spatial distribution in mind for later.

.. GENERATED FROM PYTHON SOURCE LINES 491-495

A final observation about the coordinates. We might believe that for a
given longitude and latitude, there is exactly one sample (or 0, if not
present). However, there are duplicates when it comes to coordinates, as
shown below:

.. GENERATED FROM PYTHON SOURCE LINES 497-499

.. code-block:: Python

    df.duplicated(["Longitude", "Latitude"]).sum()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    np.int64(8050)


.. GENERATED FROM PYTHON SOURCE LINES 500-502

These rows are not completely duplicated, however, as there are no
duplicates when considering all columns:

.. GENERATED FROM PYTHON SOURCE LINES 504-506

.. code-block:: Python

    df.duplicated().sum()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    np.int64(0)


.. GENERATED FROM PYTHON SOURCE LINES 507-508

At the most extreme end, we find coordinates with 15 samples:

.. GENERATED FROM PYTHON SOURCE LINES 510-512

.. code-block:: Python

    df[["Longitude", "Latitude"]].value_counts().to_frame("count").reset_index().head()


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>Longitude</th>
          <th>Latitude</th>
          <th>count</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>0</th>
          <td>-122.41</td>
          <td>37.80</td>
          <td>15</td>
        </tr>
        <tr>
          <th>1</th>
          <td>-122.42</td>
          <td>37.80</td>
          <td>11</td>
        </tr>
        <tr>
          <th>2</th>
          <td>-122.44</td>
          <td>37.78</td>
          <td>11</td>
        </tr>
        <tr>
          <th>3</th>
          <td>-122.27</td>
          <td>37.85</td>
          <td>10</td>
        </tr>
        <tr>
          <th>4</th>
          <td>-122.44</td>
          <td>37.80</td>
          <td>10</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 513-515

Overall, for more than a third of coordinates, we find more than one
data point:

.. GENERATED FROM PYTHON SOURCE LINES 517-519

.. code-block:: Python

    (df[["Longitude", "Latitude"]].value_counts() > 1).mean()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    np.float64(0.3457505957108816)


.. GENERATED FROM PYTHON SOURCE LINES 520-528

Again, if this were a real business case, we should investigate this
further and find someone who can explain this peculiarity in the data.
One possible explanation could be that those are samples for the same
location but from different points in time. If this were true, we would
have to make adjustments, e.g. by considering the time when we make a
train/test split. But we have no possibility to verify that time is the
explanation. As is, we just accept this fact and keep it in mind for
later.
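
To make the idea of a time-aware split more concrete, here is a minimal
sketch. It assumes a hypothetical ``year`` column, which does **not** exist in
the real dataset, and uses made-up values purely for illustration:

.. code-block:: Python

    import pandas as pd

    # Toy data: several samples share coordinates but differ in the
    # hypothetical "year" column (not part of the real dataset)
    toy = pd.DataFrame(
        {
            "Longitude": [-122.41, -122.41, -122.41, -122.42, -122.42, -122.44],
            "Latitude": [37.80, 37.80, 37.80, 37.80, 37.80, 37.78],
            "year": [1988, 1989, 1990, 1988, 1990, 1989],
            "MedHouseVal": [310_000, 325_000, 340_000, 280_000, 300_000, 410_000],
        }
    )

    # Train on the past, test on the most recent period, so that no
    # information from the future leaks into training
    cutoff = 1990
    toy_train = toy[toy["year"] < cutoff]
    toy_test = toy[toy["year"] >= cutoff]
    print(len(toy_train), len(toy_test))

With such a split, a location can appear in both sets, but the test samples
are always newer than the training samples.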

.. GENERATED FROM PYTHON SOURCE LINES 530-532

Target variable
^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 534-536

Finally, we should not forget to take a look at the target variable
itself. Again, let’s start with plotting its distribution:

.. GENERATED FROM PYTHON SOURCE LINES 538-541

.. code-block:: Python


    df.plot(kind="hist", y=target_col, bins=100)


.. image-sg:: /auto_examples/images/sphx_glr_plot_california_housing_006.png
   :alt: plot california housing
   :srcset: /auto_examples/images/sphx_glr_plot_california_housing_006.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    <Axes: ylabel='Frequency'>


.. GENERATED FROM PYTHON SOURCE LINES 542-544

As we have already established earlier, we find that the target data
seems to be clipped. Let’s take a closer look:

.. GENERATED FROM PYTHON SOURCE LINES 546-548

.. code-block:: Python

    df[target_col].value_counts().head()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    MedHouseVal
    500001.0    965
    137500.0    122
    162500.0    117
    112500.0    103
    187500.0     93
    Name: count, dtype: int64


.. GENERATED FROM PYTHON SOURCE LINES 549-555

Is it possible that this would occur naturally in the data? Sometimes,
we may find strange patterns. To give an example, if there was a law
that sales above a certain value are taxed differently, we could expect
prices to cluster at this value. But this appears to be very unlikely
here, especially since prices seem to be rounded to the closest $100, as
we can see from the following probe:

.. GENERATED FROM PYTHON SOURCE LINES 557-566

.. code-block:: Python

    (
        (df[target_col] % 100)
        .round()
        .value_counts()
        .to_frame()
        .reset_index()
        .rename(columns={"index": f"{target_col} % $100", target_col: "count"})
    )


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>count</th>
          <th>count</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>0</th>
          <td>0.0</td>
          <td>18396</td>
        </tr>
        <tr>
          <th>1</th>
          <td>100.0</td>
          <td>1275</td>
        </tr>
        <tr>
          <th>2</th>
          <td>1.0</td>
          <td>965</td>
        </tr>
        <tr>
          <th>3</th>
          <td>99.0</td>
          <td>4</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 567-572

So if we take the house price modulo $100, we find that it’s almost always 0
(well, we see a few 100’s, but that’s just a rounding issue). Then we see 965
prices ending in 1, which are exactly those 965 samples we found with a price
of $500,001. Finally, we see 4 prices ending in 99. Let’s take a look:

.. GENERATED FROM PYTHON SOURCE LINES 574-576

.. code-block:: Python

    df[np.isclose(df[target_col] % 100, 99)][target_col].to_frame()


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>MedHouseVal</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>2521</th>
          <td>14999.0</td>
        </tr>
        <tr>
          <th>2799</th>
          <td>14999.0</td>
        </tr>
        <tr>
          <th>9188</th>
          <td>14999.0</td>
        </tr>
        <tr>
          <th>19802</th>
          <td>14999.0</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 577-579

.. code-block:: Python

    df[target_col].min()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    np.float64(14999.000000000002)


.. GENERATED FROM PYTHON SOURCE LINES 580-585

So these four samples actually correspond to the lowest prices in the dataset.
Therefore, it is very reasonable to assume that the dataset creators decided
to set a maximum price of $500,000 and a minimum price of $15,000, with all
prices falling outside that range being set to the max price + $1 or the min
price - $1, respectively. For the prices within the range, they decided to
round to $100.
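
The following sketch mimics this assumed preprocessing on a handful of made-up
prices. The exact procedure used by the dataset creators is unknown; the
values and the order of operations here are assumptions for illustration:

.. code-block:: Python

    import numpy as np

    # Made-up raw prices, two below and one above the assumed limits
    raw = np.array([9_200.0, 14_980.0, 87_349.0, 162_512.0, 499_960.0, 730_000.0])

    # Round to the nearest $100 ...
    rounded = np.round(raw, -2)
    # ... and mark out-of-range prices with the limit +/- $1
    processed = np.where(
        raw > 500_000, 500_001.0, np.where(raw < 15_000, 14_999.0, rounded)
    )

    # The remainder modulo 100 reveals which prices were clipped
    print(processed % 100)

Prices inside the range leave a remainder of 0, while the clipped ones leave
1 (upper limit) or 99 (lower limit), which matches the pattern seen above.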

.. GENERATED FROM PYTHON SOURCE LINES 587-596

When it comes to our task of predicting the target, this clipping could
be dangerous. We cannot know by how much the actual price was clipped,
especially when it comes to high prices. For a machine learning model,
it could become extraordinarily hard to predict these high prices,
because even though the features may, for instance, indicate a price of
$1,000,000 for one sample and $500,100 for another, the model is supposed to
predict the same price for both. For this reason, let us remove the
clipped data from the dataset. Whether that’s a good idea will in
reality depend on the use case that we’re trying to solve.

.. GENERATED FROM PYTHON SOURCE LINES 598-600

Training a machine learning model
---------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 602-605

Now that we have gained a good understanding of our data, we move to the
next phase, which is training a machine learning model to predict the
target.

.. GENERATED FROM PYTHON SOURCE LINES 607-609

Remove samples with clipped target data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 611-613

As a first step, we will remove the samples where the target data has
been clipped, as discussed above.

.. GENERATED FROM PYTHON SOURCE LINES 616-622

.. code-block:: Python

    mask = (15_000 <= df[target_col]) & (df[target_col] <= 500_000)
    print(
        f"Discarding {(1 - mask).sum()} ({100 * (1 - mask.mean()):.1f}%) of rows because"
        " the target is clipped."
    )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Discarding 969 (4.7%) of rows because the target is clipped.


.. GENERATED FROM PYTHON SOURCE LINES 623-625

Train/test split
~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 627-635

Then we should split our data into a training and a test set. As every
data scientist should know, we need to evaluate the model on a different
set than we use for training. It would be even better if we created
three sets of data, train/valid/test, and only used the test set at the
very end. For the training part, we could use cross-validation to get
more reliable results, given that our dataset is not that big. For the
purpose of this exercise, we make our lives simple by only using a
single train/test split though.
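
For reference, here is a minimal sketch of what cross-validation could look
like. It runs on synthetic data with an arbitrarily chosen estimator, so the
names ``X_toy`` and ``y_toy`` and the resulting scores are illustrative only:

.. code-block:: Python

    import numpy as np
    from sklearn.model_selection import KFold, cross_val_score
    from sklearn.neighbors import KNeighborsRegressor

    rng = np.random.default_rng(0)
    # Synthetic regression problem with two features
    X_toy = rng.uniform(size=(200, 2))
    y_toy = X_toy[:, 0] + X_toy[:, 1] + rng.normal(scale=0.05, size=200)

    # 5-fold cross-validation: each sample is used for validation exactly once
    cv = KFold(n_splits=5, shuffle=True, random_state=0)
    scores = cross_val_score(KNeighborsRegressor(), X_toy, y_toy, cv=cv)
    print(scores.mean(), scores.std())

The spread of the five scores gives an idea of how much a single train/test
split could vary by chance.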

.. GENERATED FROM PYTHON SOURCE LINES 637-642

As to the split itself, we just split the data randomly using sklearn’s
``train_test_split``. There is nothing in the data that would
suggest we need to perform a non-random split but again, this will vary
from use case to use case. Note that we set the ``random_state``
to make the results reproducible.

.. GENERATED FROM PYTHON SOURCE LINES 645-647

.. code-block:: Python

    df_train, df_test = train_test_split(df[mask], random_state=0)


.. GENERATED FROM PYTHON SOURCE LINES 648-650

.. code-block:: Python

    df_train.shape, df_test.shape


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    ((14753, 9), (4918, 9))


.. GENERATED FROM PYTHON SOURCE LINES 651-655

After performing the split, it is now a good time to remove the target
data from the rest of the ``DataFrame``, so that we don’t run the
risk of accidentally training on the target. We use the ``pop``
method for that.

.. GENERATED FROM PYTHON SOURCE LINES 658-661

.. code-block:: Python

    y_train = df_train.pop(target_col).values
    y_test = df_test.pop(target_col).values


.. GENERATED FROM PYTHON SOURCE LINES 662-664

Feature engineering
~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 666-669

Now let’s get to the feature engineering part. Here it’s worth it to
think a little bit about what type of ML model we plan to use and how
that decision should inform the feature engineering.

.. GENERATED FROM PYTHON SOURCE LINES 671-677

In general, we know that for tabular tasks such as this one, ensembles of
decision trees perform exceptionally well, without the need for a lot of
tuning. Examples would be random forest or gradient boosting. We should not
try to challenge this conventional wisdom: this class of models is an
excellent choice for this task as well. There is, however, a caveat. Let’s
explore it more closely.

.. GENERATED FROM PYTHON SOURCE LINES 679-685

We know that the geospatial features are crucial for our task, since we saw
that there is a very strong relationship between the house price of a given
sample and the house price of its neighbors. On the other hand, we saw that
all the other variables, save for "MedInc", are probably not strong
predictors. We thus need to ensure we can make the best use of "Longitude" and
"Latitude".

.. GENERATED FROM PYTHON SOURCE LINES 687-696

If we use a tree-based model, can we just input the longitude and
latitude as features and be done with it? In theory, this would indeed
work. However, let’s remember how decision trees work. According to a
certain criterion, they split the data at a certain value. As an
example, we could find that the first split of a tree is to split the
data into samples with latitude less than 34 and latitude greater than
34, which is roughly the median. Each time we split the data along
longitude and latitude, we can imagine the map being divided into four
quadrants. So far, so good.

.. GENERATED FROM PYTHON SOURCE LINES 698-702

The problem now comes with the patchiness of the data. Looking again at the
house value by location plot further above, we can immediately see that we
would need *a lot of splits* to capture the neighborhood relationship we find
in the data. How many splits would we need? Let’s take a look.

.. GENERATED FROM PYTHON SOURCE LINES 704-710

To study this question, we will take the ``DecisionTreeRegressor`` from
sklearn and fit it on the longitude and latitude features. The number of
splits is bounded by the ``max_depth`` parameter: each additional level of
depth can split every existing region once more. That is, for a depth of 10,
we get up to ``2**10=1024`` regions (in reality, the number could be lower,
depending on the data and the other parameters of the tree).
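
The relationship between depth and the number of regions can be checked with
a small sketch on synthetic two-dimensional data (``X_toy`` and ``y_toy`` are
made up for this example). It compares the number of leaves of a fitted tree
with the upper bound ``2**max_depth``:

.. code-block:: Python

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    # Synthetic "map": two coordinates and a wavy target
    X_toy = rng.uniform(size=(5000, 2))
    y_toy = np.sin(10 * X_toy[:, 0]) * np.cos(10 * X_toy[:, 1])

    n_leaves = {}
    for max_depth in [1, 2, 5, 10]:
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X_toy, y_toy)
        # Each leaf corresponds to one rectangular region of the map
        n_leaves[max_depth] = tree.get_n_leaves()
        print(max_depth, n_leaves[max_depth], 2**max_depth)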

.. GENERATED FROM PYTHON SOURCE LINES 712-718

To get a feeling of what that means in practice, let us first fit 4 decision
trees, using ``max_depth`` values of 1, 2, 5, and 10. Then we plot the
*decision boundary* of the tree after fitting it. We use sklearn’s
`DecisionBoundaryDisplay
<https://scikit-learn.org/stable/modules/generated/sklearn.inspection.DecisionBoundaryDisplay.html>`__
to plot the data. Here is what we get:

.. GENERATED FROM PYTHON SOURCE LINES 720-738

.. code-block:: Python

    _, axes = plt.subplots(2, 2, figsize=(8, 8))
    max_depths = [1, 2, 5, 10]
    for max_depth, ax in zip(max_depths, axes.flatten()):
        dt = DecisionTreeRegressor(random_state=0, max_depth=max_depth)
        dt.fit(df_train[["Longitude", "Latitude"]], y_train)
        DecisionBoundaryDisplay.from_estimator(
            dt,
            df_train[["Longitude", "Latitude"]],
            cmap="coolwarm",
            response_method="predict",
            ax=ax,
            xlabel="Longitude",
            ylabel="Latitude",
            grid_resolution=1000,
        )
        ax.set_title(f"Decision boundary for a depth of {max_depth}")
    plt.tight_layout()


.. image-sg:: /auto_examples/images/sphx_glr_plot_california_housing_007.png
   :alt: Decision boundary for a depth of 1, Decision boundary for a depth of 2, Decision boundary for a depth of 5, Decision boundary for a depth of 10
   :srcset: /auto_examples/images/sphx_glr_plot_california_housing_007.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 739-741

As we have described, with a depth of 1, we split our map in half; with a
depth of 2, we get 4 quadrants. This is shown in the first row.

.. GENERATED FROM PYTHON SOURCE LINES 743-748

In the second row, we see the outcomes for depths of 5 and 10. Especially for
10, we see this resulting in quite a few "boxes" that represent neighborhoods
with similar prices. But even with this relatively high number of splits, the
"resolution" of the resulting map is quite bad, lumping together many areas
with quite different prices.

.. GENERATED FROM PYTHON SOURCE LINES 750-759

So what can we do about that? The easiest solution would be to increase
the ``max_depth`` parameter to such a high value that we can
actually model the spatial relationship well enough. But this has some
disadvantages. First of all, a high ``max_depth`` parameter can
result in overfitting on the training data. Remember that we also have
other variables that we want to include, and the tree will also split on
those. Second of all, a high ``max_depth`` makes the model bigger
and slower. Depending on the application, this could be a problem for
productionizing the model.
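
Both disadvantages can be illustrated with a minimal sketch on synthetic,
noisy data (all names and values below are made up for this example). It
compares a shallow tree with a tree whose depth is not limited:

.. code-block:: Python

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    # Synthetic data: a smooth signal plus a fair amount of noise
    X_toy = rng.uniform(size=(1000, 2))
    y_toy = np.sin(6 * X_toy[:, 0]) + rng.normal(scale=0.5, size=1000)
    X_tr, X_te, y_tr, y_te = train_test_split(X_toy, y_toy, random_state=0)

    results = {}
    for max_depth in [3, None]:  # None means the depth is not limited
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X_tr, y_tr)
        results[max_depth] = {
            "train_r2": tree.score(X_tr, y_tr),
            "test_r2": tree.score(X_te, y_te),
            "n_nodes": tree.tree_.node_count,
        }
        print(max_depth, results[max_depth])

The unlimited tree fits the training data almost perfectly but has a much
larger gap between training and test score, and it contains far more nodes.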

.. GENERATED FROM PYTHON SOURCE LINES 761-767

Another solution is that we could use an ensemble of decision trees. The
result of that will be multiple maps like those above, layered on top of each
other. Although this will certainly help, it’s still not a perfect solution,
since it would still require quite a lot of trees to achieve a good
resolution. Imagine a dataset with many more samples and much more
fine-grained data: we would need a giant ensemble to fit it.

.. GENERATED FROM PYTHON SOURCE LINES 769-777

At the end of the day, we have to admit that decision tree-based models
are just not the best fit when it comes to modeling geospatial
relationships. Can we think of a model that is better able to model this
type of data? Why, of course we can. The k-nearest neighbor (KNN) family
of models should be a perfect fit for this, since we want to model
*neighborhood* relationships. To see this in action, let’s again plot
the decision boundaries, this time using the
``KNeighborsRegressor`` from sklearn:

.. GENERATED FROM PYTHON SOURCE LINES 780-781

This controls the level of parallelism; feel free to set it to a higher number.

.. GENERATED FROM PYTHON SOURCE LINES 781-783

.. code-block:: Python

    N_JOBS = 1


.. GENERATED FROM PYTHON SOURCE LINES 784-799

.. code-block:: Python

    _, ax = plt.subplots(figsize=(5, 5))
    knn = KNeighborsRegressor(n_jobs=N_JOBS)
    knn.fit(df_train[["Longitude", "Latitude"]], y_train)
    DecisionBoundaryDisplay.from_estimator(
        knn,
        df_train[["Longitude", "Latitude"]],
        cmap="coolwarm",
        response_method="predict",
        ax=ax,
        xlabel="Longitude",
        ylabel="Latitude",
        grid_resolution=1000,
    )
    ax.set_title("Decision boundary of KNN")


.. image-sg:: /auto_examples/images/sphx_glr_plot_california_housing_008.png
   :alt: Decision boundary of KNN
   :srcset: /auto_examples/images/sphx_glr_plot_california_housing_008.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Text(0.5, 1.0, 'Decision boundary of KNN')


.. GENERATED FROM PYTHON SOURCE LINES 800-809

Now if we compare this to the decision boundaries of the decision trees, even
with a depth of 10, it’s a completely different picture. The KNN model, by its
very nature, can easily model very fine-grained spatial differences. The
granularity will depend on the ``k`` part of the name, which indicates the
number of neighbors that are used to make the prediction. At the one extreme,
for ``k=1``, we would only consider a single data point, resulting in a very
spotty map. At the other extreme, when ``k`` is the size of the total dataset,
we would simply average across all data points. Choosing a good value for
``k`` is thus important.
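
The two extremes can be verified with a short sketch on synthetic data
(``X_toy`` and ``y_toy`` are made up for this example):

.. code-block:: Python

    import numpy as np
    from sklearn.neighbors import KNeighborsRegressor

    rng = np.random.default_rng(0)
    X_toy = rng.uniform(size=(50, 2))
    y_toy = rng.normal(size=50)

    # k=1: each training point is its own nearest neighbor
    knn_1 = KNeighborsRegressor(n_neighbors=1).fit(X_toy, y_toy)
    # k=n: every prediction averages over the whole dataset
    knn_all = KNeighborsRegressor(n_neighbors=len(X_toy)).fit(X_toy, y_toy)

    pred_1 = knn_1.predict(X_toy)
    pred_all = knn_all.predict(X_toy)
    print(np.allclose(pred_1, y_toy), np.allclose(pred_all, y_toy.mean()))

With ``k=1`` the model simply memorizes the training targets, whereas with
``k`` equal to the dataset size it returns the global mean everywhere.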

.. GENERATED FROM PYTHON SOURCE LINES 811-815

We will now go into more details of the KNN model. If you’re wondering
how this is related to feature engineering, it will become clear later,
as we will actually use the KNN model for the purpose of creating a new
feature. So please be patient.

.. GENERATED FROM PYTHON SOURCE LINES 817-819

Aside: distance metrics
^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 821-825

Before we explore this problem further, let’s talk about distance
metrics. For a KNN model to work, we need to define the distance metric
it uses to determine the closest neighbors. By default,
``KNeighborsRegressor`` uses the Euclidean distance.

.. GENERATED FROM PYTHON SOURCE LINES 827-833

Earlier, we saw that our data points are distributed on a regular grid.
This means, if we take a specific sample, and if we assume that it’s
neighboring spots are not empty, there should be 4 neighbors at exactly
the same distance, namely the neighbors directly north, east, south, and
west. Similarly, neighbors 5 to 8, in the directions NE, SE, SW, and NW,
should also have the exact same distances.
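
On an idealized grid, this expectation can be sketched as follows. The grid
step of 0.01 is an assumption chosen for illustration, and
``NearestNeighbors`` is used here only to query distances:

.. code-block:: Python

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    step = 0.01  # assumed grid spacing, for illustration only
    # Build a perfectly regular 5x5 grid of points
    xs, ys = np.meshgrid(np.arange(5) * step, np.arange(5) * step)
    grid = np.column_stack([xs.ravel(), ys.ravel()])

    # Query the 9 closest points to the center (the center itself included)
    nn = NearestNeighbors(n_neighbors=9).fit(grid)
    center = np.array([[2 * step, 2 * step]])
    distances, _ = nn.kneighbors(center)

    # Distances expressed in units of the grid step
    print(np.round(distances[0] / step, 3))

The first distance is 0 (the point itself), followed by four neighbors at one
grid step and four at ``sqrt(2)`` grid steps.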

.. GENERATED FROM PYTHON SOURCE LINES 835-837

Let’s check if this is true. For this, we fit a KNN and then use the
``kneighbors`` method to return the closest neighbors.

.. GENERATED FROM PYTHON SOURCE LINES 840-843

.. code-block:: Python

    knn = KNeighborsRegressor(20)
    knn.fit(df[["Longitude", "Latitude"]], df[target_col])


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-container-id-7 {
      /* Definition of color scheme common for light and dark mode */
      --sklearn-color-text: #000;
      --sklearn-color-text-muted: #666;
      --sklearn-color-line: gray;
      /* Definition of color scheme for unfitted estimators */
      --sklearn-color-unfitted-level-0: #fff5e6;
      --sklearn-color-unfitted-level-1: #f6e4d2;
      --sklearn-color-unfitted-level-2: #ffe0b3;
      --sklearn-color-unfitted-level-3: chocolate;
      /* Definition of color scheme for fitted estimators */
      --sklearn-color-fitted-level-0: #f0f8ff;
      --sklearn-color-fitted-level-1: #d4ebff;
      --sklearn-color-fitted-level-2: #b3dbfd;
      --sklearn-color-fitted-level-3: cornflowerblue;
    }

    #sk-container-id-7.light {
      /* Specific color for light theme */
      --sklearn-color-text-on-default-background: black;
      --sklearn-color-background: white;
      --sklearn-color-border-box: black;
      --sklearn-color-icon: #696969;
    }

    #sk-container-id-7.dark {
      --sklearn-color-text-on-default-background: white;
      --sklearn-color-background: #111;
      --sklearn-color-border-box: white;
      --sklearn-color-icon: #878787;
    }

    #sk-container-id-7 {
      color: var(--sklearn-color-text);
    }

    #sk-container-id-7 pre {
      padding: 0;
    }

    #sk-container-id-7 input.sk-hidden--visually {
      border: 0;
      clip: rect(1px 1px 1px 1px);
      clip: rect(1px, 1px, 1px, 1px);
      height: 1px;
      margin: -1px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1px;
    }

    #sk-container-id-7 div.sk-dashed-wrapped {
      border: 1px dashed var(--sklearn-color-line);
      margin: 0 0.4em 0.5em 0.4em;
      box-sizing: border-box;
      padding-bottom: 0.4em;
      background-color: var(--sklearn-color-background);
    }

    #sk-container-id-7 div.sk-container {
      /* jupyter's `normalize.less` sets `[hidden] { display: none; }`
         but bootstrap.min.css set `[hidden] { display: none !important; }`
         so we also need the `!important` here to be able to override the
         default hidden behavior on the sphinx rendered scikit-learn.org.
         See: https://github.com/scikit-learn/scikit-learn/issues/21755 */
      display: inline-block !important;
      position: relative;
    }

    #sk-container-id-7 div.sk-text-repr-fallback {
      display: none;
    }

    div.sk-parallel-item,
    div.sk-serial,
    div.sk-item {
      /* draw centered vertical line to link estimators */
      background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));
      background-size: 2px 100%;
      background-repeat: no-repeat;
      background-position: center center;
    }

    /* Parallel-specific style estimator block */

    #sk-container-id-7 div.sk-parallel-item::after {
      content: "";
      width: 100%;
      border-bottom: 2px solid var(--sklearn-color-text-on-default-background);
      flex-grow: 1;
    }

    #sk-container-id-7 div.sk-parallel {
      display: flex;
      align-items: stretch;
      justify-content: center;
      background-color: var(--sklearn-color-background);
      position: relative;
    }

    #sk-container-id-7 div.sk-parallel-item {
      display: flex;
      flex-direction: column;
    }

    #sk-container-id-7 div.sk-parallel-item:first-child::after {
      align-self: flex-end;
      width: 50%;
    }

    #sk-container-id-7 div.sk-parallel-item:last-child::after {
      align-self: flex-start;
      width: 50%;
    }

    #sk-container-id-7 div.sk-parallel-item:only-child::after {
      width: 0;
    }

    /* Serial-specific style estimator block */

    #sk-container-id-7 div.sk-serial {
      display: flex;
      flex-direction: column;
      align-items: center;
      background-color: var(--sklearn-color-background);
      padding-right: 1em;
      padding-left: 1em;
    }


    /* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is
    clickable and can be expanded/collapsed.
    - Pipeline and ColumnTransformer use this feature and define the default style
    - Estimators will overwrite some part of the style using the `sk-estimator` class
    */

    /* Pipeline and ColumnTransformer style (default) */

    #sk-container-id-7 div.sk-toggleable {
      /* Default theme specific background. It is overwritten whether we have a
      specific estimator or a Pipeline/ColumnTransformer */
      background-color: var(--sklearn-color-background);
    }

    /* Toggleable label */
    #sk-container-id-7 label.sk-toggleable__label {
      cursor: pointer;
      display: flex;
      width: 100%;
      margin-bottom: 0;
      padding: 0.5em;
      box-sizing: border-box;
      text-align: center;
      align-items: center;
      justify-content: center;
      gap: 0.5em;
    }

    #sk-container-id-7 label.sk-toggleable__label .caption {
      font-size: 0.6rem;
      font-weight: lighter;
      color: var(--sklearn-color-text-muted);
    }

    #sk-container-id-7 label.sk-toggleable__label-arrow:before {
      /* Arrow on the left of the label */
      content: "▸";
      float: left;
      margin-right: 0.25em;
      color: var(--sklearn-color-icon);
    }

    #sk-container-id-7 label.sk-toggleable__label-arrow:hover:before {
      color: var(--sklearn-color-text);
    }

    /* Toggleable content - dropdown */

    #sk-container-id-7 div.sk-toggleable__content {
      display: none;
      text-align: left;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-7 div.sk-toggleable__content.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-7 div.sk-toggleable__content pre {
      margin: 0.2em;
      border-radius: 0.25em;
      color: var(--sklearn-color-text);
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-7 div.sk-toggleable__content.fitted pre {
      /* unfitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-7 input.sk-toggleable__control:checked~div.sk-toggleable__content {
      /* Expand drop-down */
      display: block;
      width: 100%;
      overflow: visible;
    }

    #sk-container-id-7 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {
      content: "▾";
    }

    /* Pipeline/ColumnTransformer-specific style */

    #sk-container-id-7 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-7 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator-specific style */

    /* Colorize estimator box */
    #sk-container-id-7 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-7 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    #sk-container-id-7 div.sk-label label.sk-toggleable__label,
    #sk-container-id-7 div.sk-label label {
      /* The background is the default theme color */
      color: var(--sklearn-color-text-on-default-background);
    }

    /* On hover, darken the color of the background */
    #sk-container-id-7 div.sk-label:hover label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    /* Label box, darken color on hover, fitted */
    #sk-container-id-7 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator label */

    #sk-container-id-7 div.sk-label label {
      font-family: monospace;
      font-weight: bold;
      line-height: 1.2em;
    }

    #sk-container-id-7 div.sk-label-container {
      text-align: center;
    }

    /* Estimator-specific */
    #sk-container-id-7 div.sk-estimator {
      font-family: monospace;
      border: 1px dotted var(--sklearn-color-border-box);
      border-radius: 0.25em;
      box-sizing: border-box;
      margin-bottom: 0.5em;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-7 div.sk-estimator.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    /* on hover */
    #sk-container-id-7 div.sk-estimator:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-7 div.sk-estimator.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Specification for estimator info (e.g. "i" and "?") */

    /* Common style for "i" and "?" */

    .sk-estimator-doc-link,
    a:link.sk-estimator-doc-link,
    a:visited.sk-estimator-doc-link {
      float: right;
      font-size: smaller;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1em;
      height: 1em;
      width: 1em;
      text-decoration: none !important;
      margin-left: 0.5em;
      text-align: center;
      /* unfitted */
      border: var(--sklearn-color-unfitted-level-3) 1pt solid;
      color: var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted,
    a:link.sk-estimator-doc-link.fitted,
    a:visited.sk-estimator-doc-link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3) 1pt solid;
      color: var(--sklearn-color-fitted-level-3);
    }

    /* On hover */
    div.sk-estimator:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover,
    div.sk-label-container:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-unfitted-level-0);
      text-decoration: none;
    }

    div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover,
    div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-fitted-level-0);
      text-decoration: none;
    }

    /* Span, style for the box shown on hovering the info icon */
    .sk-estimator-doc-link span {
      display: none;
      z-index: 9999;
      position: relative;
      font-weight: normal;
      right: .2ex;
      padding: .5ex;
      margin: .5ex;
      width: min-content;
      min-width: 20ex;
      max-width: 50ex;
      color: var(--sklearn-color-text);
      box-shadow: 2pt 2pt 4pt #999;
      /* unfitted */
      background: var(--sklearn-color-unfitted-level-0);
      border: .5pt solid var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted span {
      /* fitted */
      background: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3);
    }

    .sk-estimator-doc-link:hover span {
      display: block;
    }

    /* "?"-specific style due to the `<a>` HTML tag */

    #sk-container-id-7 a.estimator_doc_link {
      float: right;
      font-size: 1rem;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1rem;
      height: 1rem;
      width: 1rem;
      text-decoration: none;
      /* unfitted */
      color: var(--sklearn-color-unfitted-level-1);
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
    }

    #sk-container-id-7 a.estimator_doc_link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    #sk-container-id-7 a.estimator_doc_link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    #sk-container-id-7 a.estimator_doc_link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
    }

    .estimator-table {
        font-family: monospace;
    }

    .estimator-table summary {
        padding: .5rem;
        cursor: pointer;
    }

    .estimator-table summary::marker {
        font-size: 0.7rem;
    }

    .estimator-table details[open] {
        padding-left: 0.1rem;
        padding-right: 0.1rem;
        padding-bottom: 0.3rem;
    }

    .estimator-table .parameters-table {
        margin-left: auto !important;
        margin-right: auto !important;
        margin-top: 0;
    }

    .estimator-table .parameters-table tr:nth-child(odd) {
        background-color: #fff;
    }

    .estimator-table .parameters-table tr:nth-child(even) {
        background-color: #f6f6f6;
    }

    .estimator-table .parameters-table tr:hover {
        background-color: #e0e0e0;
    }

    .estimator-table table td {
        border: 1px solid rgba(106, 105, 104, 0.232);
    }

    /*
        `table td`is set in notebook with right text-align.
        We need to overwrite it.
    */
    .estimator-table table td.param {
        text-align: left;
        position: relative;
        padding: 0;
    }

    .user-set td {
        color:rgb(255, 94, 0);
        text-align: left !important;
    }

    .user-set td.value {
        color:rgb(255, 94, 0);
        background-color: transparent;
    }

    .default td {
        color: black;
        text-align: left !important;
    }

    .user-set td i,
    .default td i {
        color: black;
    }

    /*
        Styles for parameter documentation links
        We need styling for visited so jupyter doesn't overwrite it
    */
    a.param-doc-link,
    a.param-doc-link:link,
    a.param-doc-link:visited {
        text-decoration: underline dashed;
        text-underline-offset: .3em;
        color: inherit;
        display: block;
        padding: .5em;
    }

    /* "hack" to make the entire area of the cell containing the link clickable */
    a.param-doc-link::before {
        position: absolute;
        content: "";
        inset: 0;
    }

    .param-doc-description {
        display: none;
        position: absolute;
        z-index: 9999;
        left: 0;
        padding: .5ex;
        margin-left: 1.5em;
        color: var(--sklearn-color-text);
        box-shadow: .3em .3em .4em #999;
        width: max-content;
        text-align: left;
        max-height: 10em;
        overflow-y: auto;

        /* unfitted */
        background: var(--sklearn-color-unfitted-level-0);
        border: thin solid var(--sklearn-color-unfitted-level-3);
    }

    /* Fitted state for parameter tooltips */
    .fitted .param-doc-description {
        /* fitted */
        background: var(--sklearn-color-fitted-level-0);
        border: thin solid var(--sklearn-color-fitted-level-3);
    }

    .param-doc-link:hover .param-doc-description {
        display: block;
    }

    .copy-paste-icon {
        background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCA0NDggNTEyIj48IS0tIUZvbnQgQXdlc29tZSBGcmVlIDYuNy4yIGJ5IEBmb250YXdlc29tZSAtIGh0dHBzOi8vZm9udGF3ZXNvbWUuY29tIExpY2Vuc2UgLSBodHRwczovL2ZvbnRhd2Vzb21lLmNvbS9saWNlbnNlL2ZyZWUgQ29weXJpZ2h0IDIwMjUgRm9udGljb25zLCBJbmMuLS0+PHBhdGggZD0iTTIwOCAwTDMzMi4xIDBjMTIuNyAwIDI0LjkgNS4xIDMzLjkgMTQuMWw2Ny45IDY3LjljOSA5IDE0LjEgMjEuMiAxNC4xIDMzLjlMNDQ4IDMzNmMwIDI2LjUtMjEuNSA0OC00OCA0OGwtMTkyIDBjLTI2LjUgMC00OC0yMS41LTQ4LTQ4bDAtMjg4YzAtMjYuNSAyMS41LTQ4IDQ4LTQ4ek00OCAxMjhsODAgMCAwIDY0LTY0IDAgMCAyNTYgMTkyIDAgMC0zMiA2NCAwIDAgNDhjMCAyNi41LTIxLjUgNDgtNDggNDhMNDggNTEyYy0yNi41IDAtNDgtMjEuNS00OC00OEwwIDE3NmMwLTI2LjUgMjEuNS00OCA0OC00OHoiLz48L3N2Zz4=);
        background-repeat: no-repeat;
        background-size: 14px 14px;
        background-position: 0;
        display: inline-block;
        width: 14px;
        height: 14px;
        cursor: pointer;
    }
    </style><body><div id="sk-container-id-7" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>KNeighborsRegressor(n_neighbors=20)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-15" type="checkbox" checked><label for="sk-estimator-id-15" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>KNeighborsRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html">?<span>Documentation for KNeighborsRegressor</span></a><span class="sk-estimator-doc-link fitted">i<span>Fitted</span></span></div></label><div class="sk-toggleable__content fitted" data-param-prefix="">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_neighbors',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_neighbors,-int%2C%20default%3D5">
                n_neighbors
                <span class="param-doc-description">n_neighbors: int, default=5<br><br>Number of neighbors to use by default for :meth:`kneighbors` queries.</span>
            </a>
        </td>
                <td class="value">20</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=weights,-%7B%27uniform%27%2C%20%27distance%27%7D%2C%20callable%20or%20None%2C%20default%3D%27uniform%27">
                weights
                <span class="param-doc-description">weights: {'uniform', 'distance'}, callable or None, default='uniform'<br><br>Weight function used in prediction.  Possible values:<br><br>- 'uniform' : uniform weights.  All points in each neighborhood<br>  are weighted equally.<br>- 'distance' : weight points by the inverse of their distance.<br>  in this case, closer neighbors of a query point will have a<br>  greater influence than neighbors which are further away.<br>- [callable] : a user-defined function which accepts an<br>  array of distances, and returns an array of the same shape<br>  containing the weights.<br><br>Uniform weights are used by default.<br><br>See the following example for a demonstration of the impact of<br>different weighting schemes on predictions:<br>:ref:`sphx_glr_auto_examples_neighbors_plot_regression.py`.</span>
            </a>
        </td>
                <td class="value">&#x27;uniform&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('algorithm',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=algorithm,-%7B%27auto%27%2C%20%27ball_tree%27%2C%20%27kd_tree%27%2C%20%27brute%27%7D%2C%20default%3D%27auto%27">
                algorithm
                <span class="param-doc-description">algorithm: {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'<br><br>Algorithm used to compute the nearest neighbors:<br><br>- 'ball_tree' will use :class:`BallTree`<br>- 'kd_tree' will use :class:`KDTree`<br>- 'brute' will use a brute-force search.<br>- 'auto' will attempt to decide the most appropriate algorithm<br>  based on the values passed to :meth:`fit` method.<br><br>Note: fitting on sparse input will override the setting of<br>this parameter, using brute force.</span>
            </a>
        </td>
                <td class="value">&#x27;auto&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('leaf_size',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=leaf_size,-int%2C%20default%3D30">
                leaf_size
                <span class="param-doc-description">leaf_size: int, default=30<br><br>Leaf size passed to BallTree or KDTree.  This can affect the<br>speed of the construction and query, as well as the memory<br>required to store the tree.  The optimal value depends on the<br>nature of the problem.</span>
            </a>
        </td>
                <td class="value">30</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('p',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=p,-float%2C%20default%3D2">
                p
                <span class="param-doc-description">p: float, default=2<br><br>Power parameter for the Minkowski metric. When p = 1, this is<br>equivalent to using manhattan_distance (l1), and euclidean_distance<br>(l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.</span>
            </a>
        </td>
                <td class="value">2</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric,-str%2C%20DistanceMetric%20object%20or%20callable%2C%20default%3D%27minkowski%27">
                metric
                <span class="param-doc-description">metric: str, DistanceMetric object or callable, default='minkowski'<br><br>Metric to use for distance computation. Default is "minkowski", which<br>results in the standard Euclidean distance when p = 2. See the<br>documentation of `scipy.spatial.distance<br><https://docs.scipy.org/doc/scipy/reference/spatial.distance.html>`_ and<br>the metrics listed in<br>:class:`~sklearn.metrics.pairwise.distance_metrics` for valid metric<br>values.<br><br>If metric is "precomputed", X is assumed to be a distance matrix and<br>must be square during fit. X may be a :term:`sparse graph`, in which<br>case only "nonzero" elements may be considered neighbors.<br><br>If metric is a callable function, it takes two arrays representing 1D<br>vectors as inputs and must return one value indicating the distance<br>between those vectors. This works for Scipy's metrics, but is less<br>efficient than passing the metric name as a string.<br><br>If metric is a DistanceMetric object, it will be passed directly to<br>the underlying computation routines.</span>
            </a>
        </td>
                <td class="value">&#x27;minkowski&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric_params',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric_params,-dict%2C%20default%3DNone">
                metric_params
                <span class="param-doc-description">metric_params: dict, default=None<br><br>Additional keyword arguments for the metric function.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of parallel jobs to run for neighbors search.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.<br>Doesn't affect :meth:`fit` method.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div><script>function copyToClipboard(text, element) {
        // Get the parameter prefix from the closest toggleable content
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const fullParamName = paramPrefix ? `${paramPrefix}${text}` : text;

        const originalStyle = element.style;
        const computedStyle = window.getComputedStyle(element);
        const originalWidth = computedStyle.width;
        const originalHTML = element.innerHTML.replace('Copied!', '');

        navigator.clipboard.writeText(fullParamName)
            .then(() => {
                element.style.width = originalWidth;
                element.style.color = 'green';
                element.innerHTML = "Copied!";

                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            })
            .catch(err => {
                console.error('Failed to copy:', err);
                element.style.color = 'red';
                element.innerHTML = "Failed!";
                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            });
        return false;
    }

    document.querySelectorAll('.copy-paste-icon').forEach(function(element) {
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const paramName = element.parentElement.nextElementSibling
            .textContent.trim().split(' ')[0];
        const fullParamName = paramPrefix ? `${paramPrefix}${paramName}` : paramName;

        element.setAttribute('title', fullParamName);
    });


    /**
     * Adapted from Skrub
     * https://github.com/skrub-data/skrub/blob/403466d1d5d4dc76a7ef569b3f8228db59a31dc3/skrub/_reporting/_data/templates/report.js#L789
     * @returns "light" or "dark"
     */
    function detectTheme(element) {
        const body = document.querySelector('body');

        // Check VSCode theme
        const themeKindAttr = body.getAttribute('data-vscode-theme-kind');
        const themeNameAttr = body.getAttribute('data-vscode-theme-name');

        if (themeKindAttr && themeNameAttr) {
            const themeKind = themeKindAttr.toLowerCase();
            const themeName = themeNameAttr.toLowerCase();

            if (themeKind.includes("dark") || themeName.includes("dark")) {
                return "dark";
            }
            if (themeKind.includes("light") || themeName.includes("light")) {
                return "light";
            }
        }

        // Check Jupyter theme
        if (body.getAttribute('data-jp-theme-light') === 'false') {
            return 'dark';
        } else if (body.getAttribute('data-jp-theme-light') === 'true') {
            return 'light';
        }

        // Guess based on a parent element's color
        const color = window.getComputedStyle(element.parentNode, null).getPropertyValue('color');
        const match = color.match(/^rgb\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*$/i);
        if (match) {
            const [r, g, b] = [
                parseFloat(match[1]),
                parseFloat(match[2]),
                parseFloat(match[3])
            ];

            // https://en.wikipedia.org/wiki/HSL_and_HSV#Lightness
            const luma = 0.299 * r + 0.587 * g + 0.114 * b;

            if (luma > 180) {
                // If the text is very bright we have a dark theme
                return 'dark';
            }
            if (luma < 75) {
                // If the text is very dark we have a light theme
                return 'light';
            }
            // Otherwise fall back to the next heuristic.
        }

        // Fallback to system preference
        return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
    }


    function forceTheme(elementId) {
        const estimatorElement = document.querySelector(`#${elementId}`);
        if (estimatorElement === null) {
            console.error(`Element with id ${elementId} not found.`);
        } else {
            const theme = detectTheme(estimatorElement);
            estimatorElement.classList.add(theme);
        }
    }

    forceTheme('sk-container-id-7');</script></body>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 844-849

.. code-block:: Python

    distances = knn.kneighbors(
        df[["Longitude", "Latitude"]].iloc[[123]], return_distance=True
    )[0].round(5)
    print(distances)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[0.      0.01    0.01    0.01    0.01    0.01    0.01    0.01    0.01
      0.01    0.01    0.01    0.01    0.01414 0.01414 0.01414 0.01414 0.01414
      0.01414 0.01414]]


.. GENERATED FROM PYTHON SOURCE LINES 850-857

So the distances are indeed discrete, with many samples sharing exactly
the same distance, such as 0.01. The closest neighbor has a distance of
0, which is not surprising, as this is the sample itself. However,
contrary to what we expected, we don’t find exactly four equally distant
closest neighbors. Instead, for this sample we find 12 samples with a
distance of exactly 0.01 (i.e. exactly one step to the north, east,
south, or west). How come?
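
To double check the count, the distances can be tallied by value. The
snippet below is only a small sketch: it re-creates the array from the
printed output above by hand instead of querying the fitted model, and
the names ``printed_distances`` and ``tally`` are made up for this
illustration.

```python
import numpy as np

# Distances copied by hand from the printed output above:
# 1 x 0.0, 12 x 0.01 and 7 x 0.01414 (20 neighbors in total).
printed_distances = np.array([[0.0] + [0.01] * 12 + [0.01414] * 7])

# Count how often each distinct distance occurs.
values, counts = np.unique(printed_distances, return_counts=True)
tally = {float(v): int(c) for v, c in zip(values, counts)}
print(tally)
```

The tally shows one neighbor at distance 0 (the sample itself), 12 at
distance 0.01 and 7 at distance 0.01414.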

.. GENERATED FROM PYTHON SOURCE LINES 859-862

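Remember that we found duplicate coordinates in our data earlier? This
is most likely such a case: if several samples share the same
neighboring coordinates, each of them counts as a separate neighbor at
the same distance. So let’s remove the duplicates and determine the
distances again: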

.. GENERATED FROM PYTHON SOURCE LINES 865-869

.. code-block:: Python

    knn = KNeighborsRegressor(20)
    df_no_dup = df.drop_duplicates(["Longitude", "Latitude"]).copy()
    knn.fit(df_no_dup[["Longitude", "Latitude"]], df_no_dup[target_col])


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-container-id-8 {
      /* Definition of color scheme common for light and dark mode */
      --sklearn-color-text: #000;
      --sklearn-color-text-muted: #666;
      --sklearn-color-line: gray;
      /* Definition of color scheme for unfitted estimators */
      --sklearn-color-unfitted-level-0: #fff5e6;
      --sklearn-color-unfitted-level-1: #f6e4d2;
      --sklearn-color-unfitted-level-2: #ffe0b3;
      --sklearn-color-unfitted-level-3: chocolate;
      /* Definition of color scheme for fitted estimators */
      --sklearn-color-fitted-level-0: #f0f8ff;
      --sklearn-color-fitted-level-1: #d4ebff;
      --sklearn-color-fitted-level-2: #b3dbfd;
      --sklearn-color-fitted-level-3: cornflowerblue;
    }

    #sk-container-id-8.light {
      /* Specific color for light theme */
      --sklearn-color-text-on-default-background: black;
      --sklearn-color-background: white;
      --sklearn-color-border-box: black;
      --sklearn-color-icon: #696969;
    }

    #sk-container-id-8.dark {
      --sklearn-color-text-on-default-background: white;
      --sklearn-color-background: #111;
      --sklearn-color-border-box: white;
      --sklearn-color-icon: #878787;
    }

    #sk-container-id-8 {
      color: var(--sklearn-color-text);
    }

    #sk-container-id-8 pre {
      padding: 0;
    }

    #sk-container-id-8 input.sk-hidden--visually {
      border: 0;
      clip: rect(1px 1px 1px 1px);
      clip: rect(1px, 1px, 1px, 1px);
      height: 1px;
      margin: -1px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1px;
    }

    #sk-container-id-8 div.sk-dashed-wrapped {
      border: 1px dashed var(--sklearn-color-line);
      margin: 0 0.4em 0.5em 0.4em;
      box-sizing: border-box;
      padding-bottom: 0.4em;
      background-color: var(--sklearn-color-background);
    }

    #sk-container-id-8 div.sk-container {
      /* jupyter's `normalize.less` sets `[hidden] { display: none; }`
         but bootstrap.min.css set `[hidden] { display: none !important; }`
         so we also need the `!important` here to be able to override the
         default hidden behavior on the sphinx rendered scikit-learn.org.
         See: https://github.com/scikit-learn/scikit-learn/issues/21755 */
      display: inline-block !important;
      position: relative;
    }

    #sk-container-id-8 div.sk-text-repr-fallback {
      display: none;
    }

    div.sk-parallel-item,
    div.sk-serial,
    div.sk-item {
      /* draw centered vertical line to link estimators */
      background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));
      background-size: 2px 100%;
      background-repeat: no-repeat;
      background-position: center center;
    }

    /* Parallel-specific style estimator block */

    #sk-container-id-8 div.sk-parallel-item::after {
      content: "";
      width: 100%;
      border-bottom: 2px solid var(--sklearn-color-text-on-default-background);
      flex-grow: 1;
    }

    #sk-container-id-8 div.sk-parallel {
      display: flex;
      align-items: stretch;
      justify-content: center;
      background-color: var(--sklearn-color-background);
      position: relative;
    }

    #sk-container-id-8 div.sk-parallel-item {
      display: flex;
      flex-direction: column;
    }

    #sk-container-id-8 div.sk-parallel-item:first-child::after {
      align-self: flex-end;
      width: 50%;
    }

    #sk-container-id-8 div.sk-parallel-item:last-child::after {
      align-self: flex-start;
      width: 50%;
    }

    #sk-container-id-8 div.sk-parallel-item:only-child::after {
      width: 0;
    }

    /* Serial-specific style estimator block */

    #sk-container-id-8 div.sk-serial {
      display: flex;
      flex-direction: column;
      align-items: center;
      background-color: var(--sklearn-color-background);
      padding-right: 1em;
      padding-left: 1em;
    }


    /* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is
    clickable and can be expanded/collapsed.
    - Pipeline and ColumnTransformer use this feature and define the default style
    - Estimators will overwrite some part of the style using the `sk-estimator` class
    */

    /* Pipeline and ColumnTransformer style (default) */

    #sk-container-id-8 div.sk-toggleable {
      /* Default theme specific background. It is overwritten whether we have a
      specific estimator or a Pipeline/ColumnTransformer */
      background-color: var(--sklearn-color-background);
    }

    /* Toggleable label */
    #sk-container-id-8 label.sk-toggleable__label {
      cursor: pointer;
      display: flex;
      width: 100%;
      margin-bottom: 0;
      padding: 0.5em;
      box-sizing: border-box;
      text-align: center;
      align-items: center;
      justify-content: center;
      gap: 0.5em;
    }

    #sk-container-id-8 label.sk-toggleable__label .caption {
      font-size: 0.6rem;
      font-weight: lighter;
      color: var(--sklearn-color-text-muted);
    }

    #sk-container-id-8 label.sk-toggleable__label-arrow:before {
      /* Arrow on the left of the label */
      content: "▸";
      float: left;
      margin-right: 0.25em;
      color: var(--sklearn-color-icon);
    }

    #sk-container-id-8 label.sk-toggleable__label-arrow:hover:before {
      color: var(--sklearn-color-text);
    }

    /* Toggleable content - dropdown */

    #sk-container-id-8 div.sk-toggleable__content {
      display: none;
      text-align: left;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-8 div.sk-toggleable__content.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-8 div.sk-toggleable__content pre {
      margin: 0.2em;
      border-radius: 0.25em;
      color: var(--sklearn-color-text);
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-8 div.sk-toggleable__content.fitted pre {
      /* unfitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-8 input.sk-toggleable__control:checked~div.sk-toggleable__content {
      /* Expand drop-down */
      display: block;
      width: 100%;
      overflow: visible;
    }

    #sk-container-id-8 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {
      content: "▾";
    }

    /* Pipeline/ColumnTransformer-specific style */

    #sk-container-id-8 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-8 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator-specific style */

    /* Colorize estimator box */
    #sk-container-id-8 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-8 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    #sk-container-id-8 div.sk-label label.sk-toggleable__label,
    #sk-container-id-8 div.sk-label label {
      /* The background is the default theme color */
      color: var(--sklearn-color-text-on-default-background);
    }

    /* On hover, darken the color of the background */
    #sk-container-id-8 div.sk-label:hover label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    /* Label box, darken color on hover, fitted */
    #sk-container-id-8 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator label */

    #sk-container-id-8 div.sk-label label {
      font-family: monospace;
      font-weight: bold;
      line-height: 1.2em;
    }

    #sk-container-id-8 div.sk-label-container {
      text-align: center;
    }

    /* Estimator-specific */
    #sk-container-id-8 div.sk-estimator {
      font-family: monospace;
      border: 1px dotted var(--sklearn-color-border-box);
      border-radius: 0.25em;
      box-sizing: border-box;
      margin-bottom: 0.5em;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-8 div.sk-estimator.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    /* on hover */
    #sk-container-id-8 div.sk-estimator:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-8 div.sk-estimator.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Specification for estimator info (e.g. "i" and "?") */

    /* Common style for "i" and "?" */

    .sk-estimator-doc-link,
    a:link.sk-estimator-doc-link,
    a:visited.sk-estimator-doc-link {
      float: right;
      font-size: smaller;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1em;
      height: 1em;
      width: 1em;
      text-decoration: none !important;
      margin-left: 0.5em;
      text-align: center;
      /* unfitted */
      border: var(--sklearn-color-unfitted-level-3) 1pt solid;
      color: var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted,
    a:link.sk-estimator-doc-link.fitted,
    a:visited.sk-estimator-doc-link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3) 1pt solid;
      color: var(--sklearn-color-fitted-level-3);
    }

    /* On hover */
    div.sk-estimator:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover,
    div.sk-label-container:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-unfitted-level-0);
      text-decoration: none;
    }

    div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover,
    div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-fitted-level-0);
      text-decoration: none;
    }

    /* Span, style for the box shown on hovering the info icon */
    .sk-estimator-doc-link span {
      display: none;
      z-index: 9999;
      position: relative;
      font-weight: normal;
      right: .2ex;
      padding: .5ex;
      margin: .5ex;
      width: min-content;
      min-width: 20ex;
      max-width: 50ex;
      color: var(--sklearn-color-text);
      box-shadow: 2pt 2pt 4pt #999;
      /* unfitted */
      background: var(--sklearn-color-unfitted-level-0);
      border: .5pt solid var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted span {
      /* fitted */
      background: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3);
    }

    .sk-estimator-doc-link:hover span {
      display: block;
    }

    /* "?"-specific style due to the `<a>` HTML tag */

    #sk-container-id-8 a.estimator_doc_link {
      float: right;
      font-size: 1rem;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1rem;
      height: 1rem;
      width: 1rem;
      text-decoration: none;
      /* unfitted */
      color: var(--sklearn-color-unfitted-level-1);
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
    }

    #sk-container-id-8 a.estimator_doc_link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    #sk-container-id-8 a.estimator_doc_link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    #sk-container-id-8 a.estimator_doc_link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
    }

    .estimator-table {
        font-family: monospace;
    }

    .estimator-table summary {
        padding: .5rem;
        cursor: pointer;
    }

    .estimator-table summary::marker {
        font-size: 0.7rem;
    }

    .estimator-table details[open] {
        padding-left: 0.1rem;
        padding-right: 0.1rem;
        padding-bottom: 0.3rem;
    }

    .estimator-table .parameters-table {
        margin-left: auto !important;
        margin-right: auto !important;
        margin-top: 0;
    }

    .estimator-table .parameters-table tr:nth-child(odd) {
        background-color: #fff;
    }

    .estimator-table .parameters-table tr:nth-child(even) {
        background-color: #f6f6f6;
    }

    .estimator-table .parameters-table tr:hover {
        background-color: #e0e0e0;
    }

    .estimator-table table td {
        border: 1px solid rgba(106, 105, 104, 0.232);
    }

    /*
        `table td` is set in notebook with right text-align.
        We need to overwrite it.
    */
    .estimator-table table td.param {
        text-align: left;
        position: relative;
        padding: 0;
    }

    .user-set td {
        color:rgb(255, 94, 0);
        text-align: left !important;
    }

    .user-set td.value {
        color:rgb(255, 94, 0);
        background-color: transparent;
    }

    .default td {
        color: black;
        text-align: left !important;
    }

    .user-set td i,
    .default td i {
        color: black;
    }

    /*
        Styles for parameter documentation links
        We need styling for visited so jupyter doesn't overwrite it
    */
    a.param-doc-link,
    a.param-doc-link:link,
    a.param-doc-link:visited {
        text-decoration: underline dashed;
        text-underline-offset: .3em;
        color: inherit;
        display: block;
        padding: .5em;
    }

    /* "hack" to make the entire area of the cell containing the link clickable */
    a.param-doc-link::before {
        position: absolute;
        content: "";
        inset: 0;
    }

    .param-doc-description {
        display: none;
        position: absolute;
        z-index: 9999;
        left: 0;
        padding: .5ex;
        margin-left: 1.5em;
        color: var(--sklearn-color-text);
        box-shadow: .3em .3em .4em #999;
        width: max-content;
        text-align: left;
        max-height: 10em;
        overflow-y: auto;

        /* unfitted */
        background: var(--sklearn-color-unfitted-level-0);
        border: thin solid var(--sklearn-color-unfitted-level-3);
    }

    /* Fitted state for parameter tooltips */
    .fitted .param-doc-description {
        /* fitted */
        background: var(--sklearn-color-fitted-level-0);
        border: thin solid var(--sklearn-color-fitted-level-3);
    }

    .param-doc-link:hover .param-doc-description {
        display: block;
    }

    .copy-paste-icon {
        background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCA0NDggNTEyIj48IS0tIUZvbnQgQXdlc29tZSBGcmVlIDYuNy4yIGJ5IEBmb250YXdlc29tZSAtIGh0dHBzOi8vZm9udGF3ZXNvbWUuY29tIExpY2Vuc2UgLSBodHRwczovL2ZvbnRhd2Vzb21lLmNvbS9saWNlbnNlL2ZyZWUgQ29weXJpZ2h0IDIwMjUgRm9udGljb25zLCBJbmMuLS0+PHBhdGggZD0iTTIwOCAwTDMzMi4xIDBjMTIuNyAwIDI0LjkgNS4xIDMzLjkgMTQuMWw2Ny45IDY3LjljOSA5IDE0LjEgMjEuMiAxNC4xIDMzLjlMNDQ4IDMzNmMwIDI2LjUtMjEuNSA0OC00OCA0OGwtMTkyIDBjLTI2LjUgMC00OC0yMS41LTQ4LTQ4bDAtMjg4YzAtMjYuNSAyMS41LTQ4IDQ4LTQ4ek00OCAxMjhsODAgMCAwIDY0LTY0IDAgMCAyNTYgMTkyIDAgMC0zMiA2NCAwIDAgNDhjMCAyNi41LTIxLjUgNDgtNDggNDhMNDggNTEyYy0yNi41IDAtNDgtMjEuNS00OC00OEwwIDE3NmMwLTI2LjUgMjEuNS00OCA0OC00OHoiLz48L3N2Zz4=);
        background-repeat: no-repeat;
        background-size: 14px 14px;
        background-position: 0;
        display: inline-block;
        width: 14px;
        height: 14px;
        cursor: pointer;
    }
    </style><body><div id="sk-container-id-8" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>KNeighborsRegressor(n_neighbors=20)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-16" type="checkbox" checked><label for="sk-estimator-id-16" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>KNeighborsRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html">?<span>Documentation for KNeighborsRegressor</span></a><span class="sk-estimator-doc-link fitted">i<span>Fitted</span></span></div></label><div class="sk-toggleable__content fitted" data-param-prefix="">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_neighbors',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_neighbors,-int%2C%20default%3D5">
                n_neighbors
                <span class="param-doc-description">n_neighbors: int, default=5<br><br>Number of neighbors to use by default for :meth:`kneighbors` queries.</span>
            </a>
        </td>
                <td class="value">20</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=weights,-%7B%27uniform%27%2C%20%27distance%27%7D%2C%20callable%20or%20None%2C%20default%3D%27uniform%27">
                weights
                <span class="param-doc-description">weights: {'uniform', 'distance'}, callable or None, default='uniform'<br><br>Weight function used in prediction.  Possible values:<br><br>- 'uniform' : uniform weights.  All points in each neighborhood<br>  are weighted equally.<br>- 'distance' : weight points by the inverse of their distance.<br>  in this case, closer neighbors of a query point will have a<br>  greater influence than neighbors which are further away.<br>- [callable] : a user-defined function which accepts an<br>  array of distances, and returns an array of the same shape<br>  containing the weights.<br><br>Uniform weights are used by default.<br><br>See the following example for a demonstration of the impact of<br>different weighting schemes on predictions:<br>:ref:`sphx_glr_auto_examples_neighbors_plot_regression.py`.</span>
            </a>
        </td>
                <td class="value">&#x27;uniform&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('algorithm',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=algorithm,-%7B%27auto%27%2C%20%27ball_tree%27%2C%20%27kd_tree%27%2C%20%27brute%27%7D%2C%20default%3D%27auto%27">
                algorithm
                <span class="param-doc-description">algorithm: {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'<br><br>Algorithm used to compute the nearest neighbors:<br><br>- 'ball_tree' will use :class:`BallTree`<br>- 'kd_tree' will use :class:`KDTree`<br>- 'brute' will use a brute-force search.<br>- 'auto' will attempt to decide the most appropriate algorithm<br>  based on the values passed to :meth:`fit` method.<br><br>Note: fitting on sparse input will override the setting of<br>this parameter, using brute force.</span>
            </a>
        </td>
                <td class="value">&#x27;auto&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('leaf_size',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=leaf_size,-int%2C%20default%3D30">
                leaf_size
                <span class="param-doc-description">leaf_size: int, default=30<br><br>Leaf size passed to BallTree or KDTree.  This can affect the<br>speed of the construction and query, as well as the memory<br>required to store the tree.  The optimal value depends on the<br>nature of the problem.</span>
            </a>
        </td>
                <td class="value">30</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('p',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=p,-float%2C%20default%3D2">
                p
                <span class="param-doc-description">p: float, default=2<br><br>Power parameter for the Minkowski metric. When p = 1, this is<br>equivalent to using manhattan_distance (l1), and euclidean_distance<br>(l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.</span>
            </a>
        </td>
                <td class="value">2</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric,-str%2C%20DistanceMetric%20object%20or%20callable%2C%20default%3D%27minkowski%27">
                metric
                <span class="param-doc-description">metric: str, DistanceMetric object or callable, default='minkowski'<br><br>Metric to use for distance computation. Default is "minkowski", which<br>results in the standard Euclidean distance when p = 2. See the<br>documentation of `scipy.spatial.distance<br><https://docs.scipy.org/doc/scipy/reference/spatial.distance.html>`_ and<br>the metrics listed in<br>:class:`~sklearn.metrics.pairwise.distance_metrics` for valid metric<br>values.<br><br>If metric is "precomputed", X is assumed to be a distance matrix and<br>must be square during fit. X may be a :term:`sparse graph`, in which<br>case only "nonzero" elements may be considered neighbors.<br><br>If metric is a callable function, it takes two arrays representing 1D<br>vectors as inputs and must return one value indicating the distance<br>between those vectors. This works for Scipy's metrics, but is less<br>efficient than passing the metric name as a string.<br><br>If metric is a DistanceMetric object, it will be passed directly to<br>the underlying computation routines.</span>
            </a>
        </td>
                <td class="value">&#x27;minkowski&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric_params',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric_params,-dict%2C%20default%3DNone">
                metric_params
                <span class="param-doc-description">metric_params: dict, default=None<br><br>Additional keyword arguments for the metric function.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of parallel jobs to run for neighbors search.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.<br>Doesn't affect :meth:`fit` method.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div><script>function copyToClipboard(text, element) {
        // Get the parameter prefix from the closest toggleable content
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const fullParamName = paramPrefix ? `${paramPrefix}${text}` : text;

        const originalStyle = element.style;
        const computedStyle = window.getComputedStyle(element);
        const originalWidth = computedStyle.width;
        const originalHTML = element.innerHTML.replace('Copied!', '');

        navigator.clipboard.writeText(fullParamName)
            .then(() => {
                element.style.width = originalWidth;
                element.style.color = 'green';
                element.innerHTML = "Copied!";

                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            })
            .catch(err => {
                console.error('Failed to copy:', err);
                element.style.color = 'red';
                element.innerHTML = "Failed!";
                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            });
        return false;
    }

    document.querySelectorAll('.copy-paste-icon').forEach(function(element) {
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const paramName = element.parentElement.nextElementSibling
            .textContent.trim().split(' ')[0];
        const fullParamName = paramPrefix ? `${paramPrefix}${paramName}` : paramName;

        element.setAttribute('title', fullParamName);
    });


    /**
     * Adapted from Skrub
     * https://github.com/skrub-data/skrub/blob/403466d1d5d4dc76a7ef569b3f8228db59a31dc3/skrub/_reporting/_data/templates/report.js#L789
     * @returns "light" or "dark"
     */
    function detectTheme(element) {
        const body = document.querySelector('body');

        // Check VSCode theme
        const themeKindAttr = body.getAttribute('data-vscode-theme-kind');
        const themeNameAttr = body.getAttribute('data-vscode-theme-name');

        if (themeKindAttr && themeNameAttr) {
            const themeKind = themeKindAttr.toLowerCase();
            const themeName = themeNameAttr.toLowerCase();

            if (themeKind.includes("dark") || themeName.includes("dark")) {
                return "dark";
            }
            if (themeKind.includes("light") || themeName.includes("light")) {
                return "light";
            }
        }

        // Check Jupyter theme
        if (body.getAttribute('data-jp-theme-light') === 'false') {
            return 'dark';
        } else if (body.getAttribute('data-jp-theme-light') === 'true') {
            return 'light';
        }

        // Guess based on a parent element's color
        const color = window.getComputedStyle(element.parentNode, null).getPropertyValue('color');
        const match = color.match(/^rgb\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*$/i);
        if (match) {
            const [r, g, b] = [
                parseFloat(match[1]),
                parseFloat(match[2]),
                parseFloat(match[3])
            ];

            // https://en.wikipedia.org/wiki/HSL_and_HSV#Lightness
            const luma = 0.299 * r + 0.587 * g + 0.114 * b;

            if (luma > 180) {
                // If the text is very bright we have a dark theme
                return 'dark';
            }
            if (luma < 75) {
                // If the text is very dark we have a light theme
                return 'light';
            }
            // Otherwise fall back to the next heuristic.
        }

        // Fallback to system preference
        return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
    }


    function forceTheme(elementId) {
        const estimatorElement = document.querySelector(`#${elementId}`);
        if (estimatorElement === null) {
            console.error(`Element with id ${elementId} not found.`);
        } else {
            const theme = detectTheme(estimatorElement);
            estimatorElement.classList.add(theme);
        }
    }

    forceTheme('sk-container-id-8');</script></body>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 870-875

.. code-block:: Python

    distances = knn.kneighbors(
        df[["Longitude", "Latitude"]].iloc[[123]], return_distance=True
    )[0].round(5)
    print(distances)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[0.      0.01    0.01    0.01    0.01    0.01414 0.01414 0.01414 0.01414
      0.02    0.02    0.02    0.02    0.02236 0.02236 0.02236 0.02236 0.02236
      0.02236 0.02828]]


.. GENERATED FROM PYTHON SOURCE LINES 876-878

This looks much better. After the exact match at distance 0, we now find
exactly 4 neighbors tied for the closest distance, 4 tied for the second
closest distance, and so on.
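
The distances printed above (0.01, 0.01414, 0.02, 0.02236, ...) are what
we would expect if the coordinates lie on a regular grid with a spacing
of 0.01 degrees. This is an assumption inferred from the output, not
something the dataset documents here. As a minimal sketch, the following
builds a hypothetical, fully populated 5x5 grid (``step``, ``points`` and
``query`` are illustrative names, not part of the example) and counts how
many points share each Euclidean distance to the center:

.. code-block:: Python

    import math
    from collections import Counter

    # Hypothetical lattice with 0.01 spacing, centered on the query point.
    step = 0.01
    points = [(i * step, j * step) for i in range(-2, 3) for j in range(-2, 3)]
    query = (0.0, 0.0)

    # Euclidean distance from the query to every lattice point, rounded like
    # the output above so that ties compare as equal.
    distances = sorted(round(math.dist(query, p), 5) for p in points)

    # Count how many points are tied at each distance.
    tie_counts = Counter(distances)
    for distance, count in sorted(tie_counts.items()):
        print(f"{distance:.5f}: {count} point(s)")

On this idealized grid, the first three non-zero distances are each shared
by 4 points, matching the real output. At 0.02236 the full grid has 8
points, whereas the real data shows only 6 among the 20 nearest neighbors,
presumably because not every grid position contains a sample.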

.. GENERATED FROM PYTHON SOURCE LINES 880-886

The pattern should become even clearer when we switch the metric from
Euclidean to Manhattan distance. Here, the Manhattan distance measures
how far apart two points are if we can only take steps along the
north-south or east-west axis. To use this metric, we set the ``p``
parameter of ``KNeighborsRegressor`` to 1:

.. GENERATED FROM PYTHON SOURCE LINES 888-891

.. code-block:: Python

    # p=1 selects the Manhattan (L1) distance for the default Minkowski metric
    knn = KNeighborsRegressor(20, p=1)
    knn.fit(df_no_dup[["Longitude", "Latitude"]], df_no_dup[target_col])


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-container-id-9 {
      /* Definition of color scheme common for light and dark mode */
      --sklearn-color-text: #000;
      --sklearn-color-text-muted: #666;
      --sklearn-color-line: gray;
      /* Definition of color scheme for unfitted estimators */
      --sklearn-color-unfitted-level-0: #fff5e6;
      --sklearn-color-unfitted-level-1: #f6e4d2;
      --sklearn-color-unfitted-level-2: #ffe0b3;
      --sklearn-color-unfitted-level-3: chocolate;
      /* Definition of color scheme for fitted estimators */
      --sklearn-color-fitted-level-0: #f0f8ff;
      --sklearn-color-fitted-level-1: #d4ebff;
      --sklearn-color-fitted-level-2: #b3dbfd;
      --sklearn-color-fitted-level-3: cornflowerblue;
    }

    #sk-container-id-9.light {
      /* Specific color for light theme */
      --sklearn-color-text-on-default-background: black;
      --sklearn-color-background: white;
      --sklearn-color-border-box: black;
      --sklearn-color-icon: #696969;
    }

    #sk-container-id-9.dark {
      --sklearn-color-text-on-default-background: white;
      --sklearn-color-background: #111;
      --sklearn-color-border-box: white;
      --sklearn-color-icon: #878787;
    }

    #sk-container-id-9 {
      color: var(--sklearn-color-text);
    }

    #sk-container-id-9 pre {
      padding: 0;
    }

    #sk-container-id-9 input.sk-hidden--visually {
      border: 0;
      clip: rect(1px 1px 1px 1px);
      clip: rect(1px, 1px, 1px, 1px);
      height: 1px;
      margin: -1px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1px;
    }

    #sk-container-id-9 div.sk-dashed-wrapped {
      border: 1px dashed var(--sklearn-color-line);
      margin: 0 0.4em 0.5em 0.4em;
      box-sizing: border-box;
      padding-bottom: 0.4em;
      background-color: var(--sklearn-color-background);
    }

    #sk-container-id-9 div.sk-container {
      /* jupyter's `normalize.less` sets `[hidden] { display: none; }`
         but bootstrap.min.css set `[hidden] { display: none !important; }`
         so we also need the `!important` here to be able to override the
         default hidden behavior on the sphinx rendered scikit-learn.org.
         See: https://github.com/scikit-learn/scikit-learn/issues/21755 */
      display: inline-block !important;
      position: relative;
    }

    #sk-container-id-9 div.sk-text-repr-fallback {
      display: none;
    }

    div.sk-parallel-item,
    div.sk-serial,
    div.sk-item {
      /* draw centered vertical line to link estimators */
      background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));
      background-size: 2px 100%;
      background-repeat: no-repeat;
      background-position: center center;
    }

    /* Parallel-specific style estimator block */

    #sk-container-id-9 div.sk-parallel-item::after {
      content: "";
      width: 100%;
      border-bottom: 2px solid var(--sklearn-color-text-on-default-background);
      flex-grow: 1;
    }

    #sk-container-id-9 div.sk-parallel {
      display: flex;
      align-items: stretch;
      justify-content: center;
      background-color: var(--sklearn-color-background);
      position: relative;
    }

    #sk-container-id-9 div.sk-parallel-item {
      display: flex;
      flex-direction: column;
    }

    #sk-container-id-9 div.sk-parallel-item:first-child::after {
      align-self: flex-end;
      width: 50%;
    }

    #sk-container-id-9 div.sk-parallel-item:last-child::after {
      align-self: flex-start;
      width: 50%;
    }

    #sk-container-id-9 div.sk-parallel-item:only-child::after {
      width: 0;
    }

    /* Serial-specific style estimator block */

    #sk-container-id-9 div.sk-serial {
      display: flex;
      flex-direction: column;
      align-items: center;
      background-color: var(--sklearn-color-background);
      padding-right: 1em;
      padding-left: 1em;
    }


    /* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is
    clickable and can be expanded/collapsed.
    - Pipeline and ColumnTransformer use this feature and define the default style
    - Estimators will overwrite some part of the style using the `sk-estimator` class
    */

    /* Pipeline and ColumnTransformer style (default) */

    #sk-container-id-9 div.sk-toggleable {
      /* Default theme specific background. It is overwritten whether we have a
      specific estimator or a Pipeline/ColumnTransformer */
      background-color: var(--sklearn-color-background);
    }

    /* Toggleable label */
    #sk-container-id-9 label.sk-toggleable__label {
      cursor: pointer;
      display: flex;
      width: 100%;
      margin-bottom: 0;
      padding: 0.5em;
      box-sizing: border-box;
      text-align: center;
      align-items: center;
      justify-content: center;
      gap: 0.5em;
    }

    #sk-container-id-9 label.sk-toggleable__label .caption {
      font-size: 0.6rem;
      font-weight: lighter;
      color: var(--sklearn-color-text-muted);
    }

    #sk-container-id-9 label.sk-toggleable__label-arrow:before {
      /* Arrow on the left of the label */
      content: "▸";
      float: left;
      margin-right: 0.25em;
      color: var(--sklearn-color-icon);
    }

    #sk-container-id-9 label.sk-toggleable__label-arrow:hover:before {
      color: var(--sklearn-color-text);
    }

    /* Toggleable content - dropdown */

    #sk-container-id-9 div.sk-toggleable__content {
      display: none;
      text-align: left;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-9 div.sk-toggleable__content.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-9 div.sk-toggleable__content pre {
      margin: 0.2em;
      border-radius: 0.25em;
      color: var(--sklearn-color-text);
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-9 div.sk-toggleable__content.fitted pre {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-9 input.sk-toggleable__control:checked~div.sk-toggleable__content {
      /* Expand drop-down */
      display: block;
      width: 100%;
      overflow: visible;
    }

    #sk-container-id-9 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {
      content: "▾";
    }

    /* Pipeline/ColumnTransformer-specific style */

    #sk-container-id-9 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-9 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator-specific style */

    /* Colorize estimator box */
    #sk-container-id-9 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-9 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    #sk-container-id-9 div.sk-label label.sk-toggleable__label,
    #sk-container-id-9 div.sk-label label {
      /* The background is the default theme color */
      color: var(--sklearn-color-text-on-default-background);
    }

    /* On hover, darken the color of the background */
    #sk-container-id-9 div.sk-label:hover label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    /* Label box, darken color on hover, fitted */
    #sk-container-id-9 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator label */

    #sk-container-id-9 div.sk-label label {
      font-family: monospace;
      font-weight: bold;
      line-height: 1.2em;
    }

    #sk-container-id-9 div.sk-label-container {
      text-align: center;
    }

    /* Estimator-specific */
    #sk-container-id-9 div.sk-estimator {
      font-family: monospace;
      border: 1px dotted var(--sklearn-color-border-box);
      border-radius: 0.25em;
      box-sizing: border-box;
      margin-bottom: 0.5em;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-9 div.sk-estimator.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    /* on hover */
    #sk-container-id-9 div.sk-estimator:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-9 div.sk-estimator.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Specification for estimator info (e.g. "i" and "?") */

    /* Common style for "i" and "?" */

    .sk-estimator-doc-link,
    a:link.sk-estimator-doc-link,
    a:visited.sk-estimator-doc-link {
      float: right;
      font-size: smaller;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1em;
      height: 1em;
      width: 1em;
      text-decoration: none !important;
      margin-left: 0.5em;
      text-align: center;
      /* unfitted */
      border: var(--sklearn-color-unfitted-level-3) 1pt solid;
      color: var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted,
    a:link.sk-estimator-doc-link.fitted,
    a:visited.sk-estimator-doc-link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3) 1pt solid;
      color: var(--sklearn-color-fitted-level-3);
    }

    /* On hover */
    div.sk-estimator:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover,
    div.sk-label-container:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-unfitted-level-0);
      text-decoration: none;
    }

    div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover,
    div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-fitted-level-0);
      text-decoration: none;
    }

    /* Span, style for the box shown on hovering the info icon */
    .sk-estimator-doc-link span {
      display: none;
      z-index: 9999;
      position: relative;
      font-weight: normal;
      right: .2ex;
      padding: .5ex;
      margin: .5ex;
      width: min-content;
      min-width: 20ex;
      max-width: 50ex;
      color: var(--sklearn-color-text);
      box-shadow: 2pt 2pt 4pt #999;
      /* unfitted */
      background: var(--sklearn-color-unfitted-level-0);
      border: .5pt solid var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted span {
      /* fitted */
      background: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3);
    }

    .sk-estimator-doc-link:hover span {
      display: block;
    }

    /* "?"-specific style due to the `<a>` HTML tag */

    #sk-container-id-9 a.estimator_doc_link {
      float: right;
      font-size: 1rem;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1rem;
      height: 1rem;
      width: 1rem;
      text-decoration: none;
      /* unfitted */
      color: var(--sklearn-color-unfitted-level-1);
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
    }

    #sk-container-id-9 a.estimator_doc_link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    #sk-container-id-9 a.estimator_doc_link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    #sk-container-id-9 a.estimator_doc_link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
    }

    .estimator-table {
        font-family: monospace;
    }

    .estimator-table summary {
        padding: .5rem;
        cursor: pointer;
    }

    .estimator-table summary::marker {
        font-size: 0.7rem;
    }

    .estimator-table details[open] {
        padding-left: 0.1rem;
        padding-right: 0.1rem;
        padding-bottom: 0.3rem;
    }

    .estimator-table .parameters-table {
        margin-left: auto !important;
        margin-right: auto !important;
        margin-top: 0;
    }

    .estimator-table .parameters-table tr:nth-child(odd) {
        background-color: #fff;
    }

    .estimator-table .parameters-table tr:nth-child(even) {
        background-color: #f6f6f6;
    }

    .estimator-table .parameters-table tr:hover {
        background-color: #e0e0e0;
    }

    .estimator-table table td {
        border: 1px solid rgba(106, 105, 104, 0.232);
    }

    /*
        `table td`is set in notebook with right text-align.
        We need to overwrite it.
    */
    .estimator-table table td.param {
        text-align: left;
        position: relative;
        padding: 0;
    }

    .user-set td {
        color:rgb(255, 94, 0);
        text-align: left !important;
    }

    .user-set td.value {
        color:rgb(255, 94, 0);
        background-color: transparent;
    }

    .default td {
        color: black;
        text-align: left !important;
    }

    .user-set td i,
    .default td i {
        color: black;
    }

    /*
        Styles for parameter documentation links
        We need styling for visited so jupyter doesn't overwrite it
    */
    a.param-doc-link,
    a.param-doc-link:link,
    a.param-doc-link:visited {
        text-decoration: underline dashed;
        text-underline-offset: .3em;
        color: inherit;
        display: block;
        padding: .5em;
    }

    /* "hack" to make the entire area of the cell containing the link clickable */
    a.param-doc-link::before {
        position: absolute;
        content: "";
        inset: 0;
    }

    .param-doc-description {
        display: none;
        position: absolute;
        z-index: 9999;
        left: 0;
        padding: .5ex;
        margin-left: 1.5em;
        color: var(--sklearn-color-text);
        box-shadow: .3em .3em .4em #999;
        width: max-content;
        text-align: left;
        max-height: 10em;
        overflow-y: auto;

        /* unfitted */
        background: var(--sklearn-color-unfitted-level-0);
        border: thin solid var(--sklearn-color-unfitted-level-3);
    }

    /* Fitted state for parameter tooltips */
    .fitted .param-doc-description {
        /* fitted */
        background: var(--sklearn-color-fitted-level-0);
        border: thin solid var(--sklearn-color-fitted-level-3);
    }

    .param-doc-link:hover .param-doc-description {
        display: block;
    }

    .copy-paste-icon {
        background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCA0NDggNTEyIj48IS0tIUZvbnQgQXdlc29tZSBGcmVlIDYuNy4yIGJ5IEBmb250YXdlc29tZSAtIGh0dHBzOi8vZm9udGF3ZXNvbWUuY29tIExpY2Vuc2UgLSBodHRwczovL2ZvbnRhd2Vzb21lLmNvbS9saWNlbnNlL2ZyZWUgQ29weXJpZ2h0IDIwMjUgRm9udGljb25zLCBJbmMuLS0+PHBhdGggZD0iTTIwOCAwTDMzMi4xIDBjMTIuNyAwIDI0LjkgNS4xIDMzLjkgMTQuMWw2Ny45IDY3LjljOSA5IDE0LjEgMjEuMiAxNC4xIDMzLjlMNDQ4IDMzNmMwIDI2LjUtMjEuNSA0OC00OCA0OGwtMTkyIDBjLTI2LjUgMC00OC0yMS41LTQ4LTQ4bDAtMjg4YzAtMjYuNSAyMS41LTQ4IDQ4LTQ4ek00OCAxMjhsODAgMCAwIDY0LTY0IDAgMCAyNTYgMTkyIDAgMC0zMiA2NCAwIDAgNDhjMCAyNi41LTIxLjUgNDgtNDggNDhMNDggNTEyYy0yNi41IDAtNDgtMjEuNS00OC00OEwwIDE3NmMwLTI2LjUgMjEuNS00OCA0OC00OHoiLz48L3N2Zz4=);
        background-repeat: no-repeat;
        background-size: 14px 14px;
        background-position: 0;
        display: inline-block;
        width: 14px;
        height: 14px;
        cursor: pointer;
    }
    </style><body><div id="sk-container-id-9" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>KNeighborsRegressor(n_neighbors=20, p=1)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-17" type="checkbox" checked><label for="sk-estimator-id-17" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>KNeighborsRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html">?<span>Documentation for KNeighborsRegressor</span></a><span class="sk-estimator-doc-link fitted">i<span>Fitted</span></span></div></label><div class="sk-toggleable__content fitted" data-param-prefix="">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_neighbors',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_neighbors,-int%2C%20default%3D5">
                n_neighbors
                <span class="param-doc-description">n_neighbors: int, default=5<br><br>Number of neighbors to use by default for :meth:`kneighbors` queries.</span>
            </a>
        </td>
                <td class="value">20</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=weights,-%7B%27uniform%27%2C%20%27distance%27%7D%2C%20callable%20or%20None%2C%20default%3D%27uniform%27">
                weights
                <span class="param-doc-description">weights: {'uniform', 'distance'}, callable or None, default='uniform'<br><br>Weight function used in prediction.  Possible values:<br><br>- 'uniform' : uniform weights.  All points in each neighborhood<br>  are weighted equally.<br>- 'distance' : weight points by the inverse of their distance.<br>  in this case, closer neighbors of a query point will have a<br>  greater influence than neighbors which are further away.<br>- [callable] : a user-defined function which accepts an<br>  array of distances, and returns an array of the same shape<br>  containing the weights.<br><br>Uniform weights are used by default.<br><br>See the following example for a demonstration of the impact of<br>different weighting schemes on predictions:<br>:ref:`sphx_glr_auto_examples_neighbors_plot_regression.py`.</span>
            </a>
        </td>
                <td class="value">&#x27;uniform&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('algorithm',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=algorithm,-%7B%27auto%27%2C%20%27ball_tree%27%2C%20%27kd_tree%27%2C%20%27brute%27%7D%2C%20default%3D%27auto%27">
                algorithm
                <span class="param-doc-description">algorithm: {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'<br><br>Algorithm used to compute the nearest neighbors:<br><br>- 'ball_tree' will use :class:`BallTree`<br>- 'kd_tree' will use :class:`KDTree`<br>- 'brute' will use a brute-force search.<br>- 'auto' will attempt to decide the most appropriate algorithm<br>  based on the values passed to :meth:`fit` method.<br><br>Note: fitting on sparse input will override the setting of<br>this parameter, using brute force.</span>
            </a>
        </td>
                <td class="value">&#x27;auto&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('leaf_size',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=leaf_size,-int%2C%20default%3D30">
                leaf_size
                <span class="param-doc-description">leaf_size: int, default=30<br><br>Leaf size passed to BallTree or KDTree.  This can affect the<br>speed of the construction and query, as well as the memory<br>required to store the tree.  The optimal value depends on the<br>nature of the problem.</span>
            </a>
        </td>
                <td class="value">30</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('p',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=p,-float%2C%20default%3D2">
                p
                <span class="param-doc-description">p: float, default=2<br><br>Power parameter for the Minkowski metric. When p = 1, this is<br>equivalent to using manhattan_distance (l1), and euclidean_distance<br>(l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.</span>
            </a>
        </td>
                <td class="value">1</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric,-str%2C%20DistanceMetric%20object%20or%20callable%2C%20default%3D%27minkowski%27">
                metric
                <span class="param-doc-description">metric: str, DistanceMetric object or callable, default='minkowski'<br><br>Metric to use for distance computation. Default is "minkowski", which<br>results in the standard Euclidean distance when p = 2. See the<br>documentation of `scipy.spatial.distance<br><https://docs.scipy.org/doc/scipy/reference/spatial.distance.html>`_ and<br>the metrics listed in<br>:class:`~sklearn.metrics.pairwise.distance_metrics` for valid metric<br>values.<br><br>If metric is "precomputed", X is assumed to be a distance matrix and<br>must be square during fit. X may be a :term:`sparse graph`, in which<br>case only "nonzero" elements may be considered neighbors.<br><br>If metric is a callable function, it takes two arrays representing 1D<br>vectors as inputs and must return one value indicating the distance<br>between those vectors. This works for Scipy's metrics, but is less<br>efficient than passing the metric name as a string.<br><br>If metric is a DistanceMetric object, it will be passed directly to<br>the underlying computation routines.</span>
            </a>
        </td>
                <td class="value">&#x27;minkowski&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric_params',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric_params,-dict%2C%20default%3DNone">
                metric_params
                <span class="param-doc-description">metric_params: dict, default=None<br><br>Additional keyword arguments for the metric function.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of parallel jobs to run for neighbors search.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.<br>Doesn't affect :meth:`fit` method.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div><script>function copyToClipboard(text, element) {
        // Get the parameter prefix from the closest toggleable content
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const fullParamName = paramPrefix ? `${paramPrefix}${text}` : text;

        const originalStyle = element.style;
        const computedStyle = window.getComputedStyle(element);
        const originalWidth = computedStyle.width;
        const originalHTML = element.innerHTML.replace('Copied!', '');

        navigator.clipboard.writeText(fullParamName)
            .then(() => {
                element.style.width = originalWidth;
                element.style.color = 'green';
                element.innerHTML = "Copied!";

                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            })
            .catch(err => {
                console.error('Failed to copy:', err);
                element.style.color = 'red';
                element.innerHTML = "Failed!";
                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            });
        return false;
    }

    document.querySelectorAll('.copy-paste-icon').forEach(function(element) {
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const paramName = element.parentElement.nextElementSibling
            .textContent.trim().split(' ')[0];
        const fullParamName = paramPrefix ? `${paramPrefix}${paramName}` : paramName;

        element.setAttribute('title', fullParamName);
    });


    /**
     * Adapted from Skrub
     * https://github.com/skrub-data/skrub/blob/403466d1d5d4dc76a7ef569b3f8228db59a31dc3/skrub/_reporting/_data/templates/report.js#L789
     * @returns "light" or "dark"
     */
    function detectTheme(element) {
        const body = document.querySelector('body');

        // Check VSCode theme
        const themeKindAttr = body.getAttribute('data-vscode-theme-kind');
        const themeNameAttr = body.getAttribute('data-vscode-theme-name');

        if (themeKindAttr && themeNameAttr) {
            const themeKind = themeKindAttr.toLowerCase();
            const themeName = themeNameAttr.toLowerCase();

            if (themeKind.includes("dark") || themeName.includes("dark")) {
                return "dark";
            }
            if (themeKind.includes("light") || themeName.includes("light")) {
                return "light";
            }
        }

        // Check Jupyter theme
        if (body.getAttribute('data-jp-theme-light') === 'false') {
            return 'dark';
        } else if (body.getAttribute('data-jp-theme-light') === 'true') {
            return 'light';
        }

        // Guess based on a parent element's color
        const color = window.getComputedStyle(element.parentNode, null).getPropertyValue('color');
        const match = color.match(/^rgb\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*$/i);
        if (match) {
            const [r, g, b] = [
                parseFloat(match[1]),
                parseFloat(match[2]),
                parseFloat(match[3])
            ];

            // https://en.wikipedia.org/wiki/HSL_and_HSV#Lightness
            const luma = 0.299 * r + 0.587 * g + 0.114 * b;

            if (luma > 180) {
                // If the text is very bright we have a dark theme
                return 'dark';
            }
            if (luma < 75) {
                // If the text is very dark we have a light theme
                return 'light';
            }
            // Otherwise fall back to the next heuristic.
        }

        // Fallback to system preference
        return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
    }


    function forceTheme(elementId) {
        const estimatorElement = document.querySelector(`#${elementId}`);
        if (estimatorElement === null) {
            console.error(`Element with id ${elementId} not found.`);
        } else {
            const theme = detectTheme(estimatorElement);
            estimatorElement.classList.add(theme);
        }
    }

    forceTheme('sk-container-id-9');</script></body>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 892-896

.. code-block:: Python

    distances = knn.kneighbors(
        df[["Longitude", "Latitude"]].iloc[[123]], return_distance=True
    )[0].round(5)
    print(distances)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[0.   0.01 0.01 0.01 0.01 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.03
      0.03 0.03 0.03 0.03 0.03 0.03]]


.. GENERATED FROM PYTHON SOURCE LINES 897-902

Again, we find 4 neighbors with a distance of exactly 0.01, as expected.
For distance 0.02, we actually find 8 neighbors. This makes sense,
because in Manhattan space, the neighbor north-north of the sample has
the same distance as the neighbor north-east, etc., as both require two
steps to be reached.
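
As a quick sanity check of those counts, the following sketch enumerates
integer offsets on an idealized, fully populated grid and counts how many
lie at a given Manhattan distance. The grid and the helper
``count_at_manhattan_distance`` are illustrative assumptions, not part of
the dataset or of scikit-learn.

.. code-block:: Python

    from itertools import product

    # Offsets (dx, dy) around a sample, measured in grid steps. In the
    # dataset, one step would correspond to 0.01 degrees (assumption: the
    # grid is fully populated, which the real data is not).
    offsets = list(product(range(-3, 4), repeat=2))


    def count_at_manhattan_distance(d):
        """Count the grid offsets whose Manhattan distance equals d."""
        return sum(1 for dx, dy in offsets if abs(dx) + abs(dy) == d)


    print(count_at_manhattan_distance(1))  # 4
    print(count_at_manhattan_distance(2))  # 8

On such an idealized grid, one step reaches 4 offsets and two steps reach
8, which matches the counts in the output above.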

.. GENERATED FROM PYTHON SOURCE LINES 904-905

What does this mean for us? There are two important considerations:

.. GENERATED FROM PYTHON SOURCE LINES 907-922

1. When we train a KNN model with a ``k`` that’s not exactly equal
   to 4, 8, etc., we run into some problems. If we take, for instance,
   ``k=3``, and the 4 closest neighbors are equally close,
   ``KNeighborsRegressor`` will actually pick an arbitrary set of
   3 from those 4 possible candidates. This is generally something we want
   to avoid in our ML models. In practice, however, the problem isn’t so
   bad, because we have duplicates and because some neighbor spots are
   empty, as we saw earlier. This makes it less likely that many data
   points exhibit different behavior for a very specific ``k``.
2. The second consideration is that we should figure out which metric is
   best for our use case, Euclidean or Manhattan. If people were
   traveling by air, Euclidean makes most sense. If they traveled on a
   rectangular grid (like the streets in Manhattan, hence the name),
   Manhattan makes more sense. In practice, we have neither of those. So
   let’s just use what works better!
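
The first consideration can be illustrated with a small toy example. The
four points and targets below are made up for illustration: all four
training points are exactly one step away from the query, yet with
``n_neighbors=3`` only three of them are returned. Which three is an
implementation detail, so the sketch does not assume a particular result.

.. code-block:: Python

    import numpy as np
    from sklearn.neighbors import KNeighborsRegressor

    # Toy data (hypothetical): four points, each one grid step from the origin.
    X_toy = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
    y_toy = np.array([10.0, 20.0, 30.0, 40.0])
    query = np.array([[0.0, 0.0]])

    knn_tie = KNeighborsRegressor(n_neighbors=3, p=1).fit(X_toy, y_toy)
    dist, idx = knn_tie.kneighbors(query)

    print(dist)  # three distances, all equal to 1
    print(idx)  # indices of the 3 chosen points; the 4th is left out
    print(knn_tie.predict(query))  # depends on which point was dropped

Since the targets differ, the prediction changes depending on which of the
tied neighbors is dropped.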

.. GENERATED FROM PYTHON SOURCE LINES 924-926

Determining the best hyper-parameters for the KNN model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 928-932

With all that in mind, let’s try to find good hyper-parameters for our
KNN regressor. As mentioned, we should check the value for ``k``
and we should check ``p=1`` and ``p=2`` (i.e. Manhattan and
Euclidean distance).

.. GENERATED FROM PYTHON SOURCE LINES 934-941

On top of those, let’s check one more hyper-parameter, namely
``weights``. What this means is that when the model finds the
``k`` nearest neighbors, should it simply predict the average
target of those values or should closer neighbors have a higher weight
than more remote neighbors? To use the former approach, we can set
``weights='uniform'``, for the latter, we set it to
``'distance'``.
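
To make the difference concrete, here is a minimal sketch on made-up
one-dimensional data (the values are assumptions chosen so that the
arithmetic is easy to follow). With ``'distance'``, each neighbor is
weighted by the inverse of its distance to the query.

.. code-block:: Python

    import numpy as np
    from sklearn.neighbors import KNeighborsRegressor

    # Toy data (hypothetical): two training points at 0 and 4.
    X_toy = np.array([[0.0], [4.0]])
    y_toy = np.array([0.0, 100.0])
    query = np.array([[1.0]])  # distance 1 to the first point, 3 to the second

    uniform = KNeighborsRegressor(n_neighbors=2, weights="uniform").fit(X_toy, y_toy)
    weighted = KNeighborsRegressor(n_neighbors=2, weights="distance").fit(X_toy, y_toy)

    pred_uniform = uniform.predict(query)[0]
    pred_weighted = weighted.predict(query)[0]

    # The same weighted average computed by hand with inverse distances.
    d = np.abs(X_toy[:, 0] - query[0, 0])
    manual = np.sum(y_toy / d) / np.sum(1 / d)

    print(pred_uniform)  # 50.0, the plain average
    print(pred_weighted)  # 25.0, pulled towards the closer neighbor
    print(manual)  # 25.0

The closer neighbor has target 0, so the distance-weighted prediction lies
much nearer to 0 than the plain average does.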

.. GENERATED FROM PYTHON SOURCE LINES 943-946

As for the ``k`` parameter, in sklearn, it’s called
``n_neighbors``. Let’s check the values from 1 to 25, as well as
some higher values, 50, 75, 100, 150, and 200.

.. GENERATED FROM PYTHON SOURCE LINES 948-953

Regarding the metric, we use root mean squared error (RMSE). Other
metrics would work as well; the right choice depends on the use case.
Here we choose RMSE because it is the one most commonly used for this
dataset, so if we want to compare our results with the results from
others, it makes sense to use the same metric.

.. GENERATED FROM PYTHON SOURCE LINES 955-959

(Note: For scoring, we use ``'neg_root_mean_squared_error'``,
i.e. the negative RMSE. This is because by convention, sklearn considers
higher values to be better. For RMSE, however, lower values are better.
To circumvent that, we just use the negative RMSE.)
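
The sign convention can be checked directly. The sketch below uses a
``DummyRegressor`` on made-up targets (both are assumptions for
illustration only) and compares the scorer's value with an RMSE computed
by hand.

.. code-block:: Python

    import numpy as np
    from sklearn.dummy import DummyRegressor
    from sklearn.metrics import get_scorer, mean_squared_error

    # Toy data (hypothetical): the features are irrelevant for DummyRegressor.
    X_toy = np.zeros((4, 1))
    y_toy = np.array([1.0, 2.0, 3.0, 6.0])

    # Always predicts the mean of the training targets, here 3.0.
    model = DummyRegressor(strategy="mean").fit(X_toy, y_toy)

    scorer = get_scorer("neg_root_mean_squared_error")
    score = scorer(model, X_toy, y_toy)
    rmse = np.sqrt(mean_squared_error(y_toy, model.predict(X_toy)))

    print(rmse)  # positive, lower is better
    print(score)  # same magnitude with a negative sign, higher is better

When reading grid search results, flipping the sign of the reported score
gives back the RMSE.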

.. GENERATED FROM PYTHON SOURCE LINES 961-966

To check all the different parameter combinations and to calculate the
RMSE out of fold, we rely on sklearn’s ``GridSearchCV``. We won’t
explain what it does here, since there already are so many tutorials out
there that discuss grid search. Suffice it to say, this is exactly what
we need for our problem.

.. GENERATED FROM PYTHON SOURCE LINES 968-977

.. code-block:: Python

    knn = KNeighborsRegressor()
    params = {
        "weights": ["uniform", "distance"],
        "p": [1, 2],
        "n_neighbors": list(range(1, 26)) + [50, 75, 100, 150, 200],
    }
    search = GridSearchCV(knn, params, scoring="neg_root_mean_squared_error")
    search.fit(df_train[["Longitude", "Latitude"]], y_train)


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-container-id-10 {
      /* Definition of color scheme common for light and dark mode */
      --sklearn-color-text: #000;
      --sklearn-color-text-muted: #666;
      --sklearn-color-line: gray;
      /* Definition of color scheme for unfitted estimators */
      --sklearn-color-unfitted-level-0: #fff5e6;
      --sklearn-color-unfitted-level-1: #f6e4d2;
      --sklearn-color-unfitted-level-2: #ffe0b3;
      --sklearn-color-unfitted-level-3: chocolate;
      /* Definition of color scheme for fitted estimators */
      --sklearn-color-fitted-level-0: #f0f8ff;
      --sklearn-color-fitted-level-1: #d4ebff;
      --sklearn-color-fitted-level-2: #b3dbfd;
      --sklearn-color-fitted-level-3: cornflowerblue;
    }

    #sk-container-id-10.light {
      /* Specific color for light theme */
      --sklearn-color-text-on-default-background: black;
      --sklearn-color-background: white;
      --sklearn-color-border-box: black;
      --sklearn-color-icon: #696969;
    }

    #sk-container-id-10.dark {
      --sklearn-color-text-on-default-background: white;
      --sklearn-color-background: #111;
      --sklearn-color-border-box: white;
      --sklearn-color-icon: #878787;
    }

    #sk-container-id-10 {
      color: var(--sklearn-color-text);
    }

    #sk-container-id-10 pre {
      padding: 0;
    }

    #sk-container-id-10 input.sk-hidden--visually {
      border: 0;
      clip: rect(1px 1px 1px 1px);
      clip: rect(1px, 1px, 1px, 1px);
      height: 1px;
      margin: -1px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1px;
    }

    #sk-container-id-10 div.sk-dashed-wrapped {
      border: 1px dashed var(--sklearn-color-line);
      margin: 0 0.4em 0.5em 0.4em;
      box-sizing: border-box;
      padding-bottom: 0.4em;
      background-color: var(--sklearn-color-background);
    }

    #sk-container-id-10 div.sk-container {
      /* jupyter's `normalize.less` sets `[hidden] { display: none; }`
         but bootstrap.min.css set `[hidden] { display: none !important; }`
         so we also need the `!important` here to be able to override the
         default hidden behavior on the sphinx rendered scikit-learn.org.
         See: https://github.com/scikit-learn/scikit-learn/issues/21755 */
      display: inline-block !important;
      position: relative;
    }

    #sk-container-id-10 div.sk-text-repr-fallback {
      display: none;
    }

    div.sk-parallel-item,
    div.sk-serial,
    div.sk-item {
      /* draw centered vertical line to link estimators */
      background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));
      background-size: 2px 100%;
      background-repeat: no-repeat;
      background-position: center center;
    }

    /* Parallel-specific style estimator block */

    #sk-container-id-10 div.sk-parallel-item::after {
      content: "";
      width: 100%;
      border-bottom: 2px solid var(--sklearn-color-text-on-default-background);
      flex-grow: 1;
    }

    #sk-container-id-10 div.sk-parallel {
      display: flex;
      align-items: stretch;
      justify-content: center;
      background-color: var(--sklearn-color-background);
      position: relative;
    }

    #sk-container-id-10 div.sk-parallel-item {
      display: flex;
      flex-direction: column;
    }

    #sk-container-id-10 div.sk-parallel-item:first-child::after {
      align-self: flex-end;
      width: 50%;
    }

    #sk-container-id-10 div.sk-parallel-item:last-child::after {
      align-self: flex-start;
      width: 50%;
    }

    #sk-container-id-10 div.sk-parallel-item:only-child::after {
      width: 0;
    }

    /* Serial-specific style estimator block */

    #sk-container-id-10 div.sk-serial {
      display: flex;
      flex-direction: column;
      align-items: center;
      background-color: var(--sklearn-color-background);
      padding-right: 1em;
      padding-left: 1em;
    }


    /* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is
    clickable and can be expanded/collapsed.
    - Pipeline and ColumnTransformer use this feature and define the default style
    - Estimators will overwrite some part of the style using the `sk-estimator` class
    */

    /* Pipeline and ColumnTransformer style (default) */

    #sk-container-id-10 div.sk-toggleable {
      /* Default theme-specific background. It is overwritten depending on whether
      we have a specific estimator or a Pipeline/ColumnTransformer */
      background-color: var(--sklearn-color-background);
    }

    /* Toggleable label */
    #sk-container-id-10 label.sk-toggleable__label {
      cursor: pointer;
      display: flex;
      width: 100%;
      margin-bottom: 0;
      padding: 0.5em;
      box-sizing: border-box;
      text-align: center;
      align-items: center;
      justify-content: center;
      gap: 0.5em;
    }

    #sk-container-id-10 label.sk-toggleable__label .caption {
      font-size: 0.6rem;
      font-weight: lighter;
      color: var(--sklearn-color-text-muted);
    }

    #sk-container-id-10 label.sk-toggleable__label-arrow:before {
      /* Arrow on the left of the label */
      content: "▸";
      float: left;
      margin-right: 0.25em;
      color: var(--sklearn-color-icon);
    }

    #sk-container-id-10 label.sk-toggleable__label-arrow:hover:before {
      color: var(--sklearn-color-text);
    }

    /* Toggleable content - dropdown */

    #sk-container-id-10 div.sk-toggleable__content {
      display: none;
      text-align: left;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-10 div.sk-toggleable__content.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-10 div.sk-toggleable__content pre {
      margin: 0.2em;
      border-radius: 0.25em;
      color: var(--sklearn-color-text);
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-10 div.sk-toggleable__content.fitted pre {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-10 input.sk-toggleable__control:checked~div.sk-toggleable__content {
      /* Expand drop-down */
      display: block;
      width: 100%;
      overflow: visible;
    }

    #sk-container-id-10 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {
      content: "▾";
    }

    /* Pipeline/ColumnTransformer-specific style */

    #sk-container-id-10 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-10 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator-specific style */

    /* Colorize estimator box */
    #sk-container-id-10 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-10 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    #sk-container-id-10 div.sk-label label.sk-toggleable__label,
    #sk-container-id-10 div.sk-label label {
      /* The background is the default theme color */
      color: var(--sklearn-color-text-on-default-background);
    }

    /* On hover, darken the color of the background */
    #sk-container-id-10 div.sk-label:hover label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    /* Label box, darken color on hover, fitted */
    #sk-container-id-10 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator label */

    #sk-container-id-10 div.sk-label label {
      font-family: monospace;
      font-weight: bold;
      line-height: 1.2em;
    }

    #sk-container-id-10 div.sk-label-container {
      text-align: center;
    }

    /* Estimator-specific */
    #sk-container-id-10 div.sk-estimator {
      font-family: monospace;
      border: 1px dotted var(--sklearn-color-border-box);
      border-radius: 0.25em;
      box-sizing: border-box;
      margin-bottom: 0.5em;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-10 div.sk-estimator.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    /* on hover */
    #sk-container-id-10 div.sk-estimator:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-10 div.sk-estimator.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Specification for estimator info (e.g. "i" and "?") */

    /* Common style for "i" and "?" */

    .sk-estimator-doc-link,
    a:link.sk-estimator-doc-link,
    a:visited.sk-estimator-doc-link {
      float: right;
      font-size: smaller;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1em;
      height: 1em;
      width: 1em;
      text-decoration: none !important;
      margin-left: 0.5em;
      text-align: center;
      /* unfitted */
      border: var(--sklearn-color-unfitted-level-3) 1pt solid;
      color: var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted,
    a:link.sk-estimator-doc-link.fitted,
    a:visited.sk-estimator-doc-link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3) 1pt solid;
      color: var(--sklearn-color-fitted-level-3);
    }

    /* On hover */
    div.sk-estimator:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover,
    div.sk-label-container:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-unfitted-level-0);
      text-decoration: none;
    }

    div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover,
    div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-fitted-level-0);
      text-decoration: none;
    }

    /* Span, style for the box shown on hovering the info icon */
    .sk-estimator-doc-link span {
      display: none;
      z-index: 9999;
      position: relative;
      font-weight: normal;
      right: .2ex;
      padding: .5ex;
      margin: .5ex;
      width: min-content;
      min-width: 20ex;
      max-width: 50ex;
      color: var(--sklearn-color-text);
      box-shadow: 2pt 2pt 4pt #999;
      /* unfitted */
      background: var(--sklearn-color-unfitted-level-0);
      border: .5pt solid var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted span {
      /* fitted */
      background: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3);
    }

    .sk-estimator-doc-link:hover span {
      display: block;
    }

    /* "?"-specific style due to the `<a>` HTML tag */

    #sk-container-id-10 a.estimator_doc_link {
      float: right;
      font-size: 1rem;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1rem;
      height: 1rem;
      width: 1rem;
      text-decoration: none;
      /* unfitted */
      color: var(--sklearn-color-unfitted-level-1);
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
    }

    #sk-container-id-10 a.estimator_doc_link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    #sk-container-id-10 a.estimator_doc_link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    #sk-container-id-10 a.estimator_doc_link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
    }

    .estimator-table {
        font-family: monospace;
    }

    .estimator-table summary {
        padding: .5rem;
        cursor: pointer;
    }

    .estimator-table summary::marker {
        font-size: 0.7rem;
    }

    .estimator-table details[open] {
        padding-left: 0.1rem;
        padding-right: 0.1rem;
        padding-bottom: 0.3rem;
    }

    .estimator-table .parameters-table {
        margin-left: auto !important;
        margin-right: auto !important;
        margin-top: 0;
    }

    .estimator-table .parameters-table tr:nth-child(odd) {
        background-color: #fff;
    }

    .estimator-table .parameters-table tr:nth-child(even) {
        background-color: #f6f6f6;
    }

    .estimator-table .parameters-table tr:hover {
        background-color: #e0e0e0;
    }

    .estimator-table table td {
        border: 1px solid rgba(106, 105, 104, 0.232);
    }

    /*
        `table td` is set in the notebook with right text-align.
        We need to overwrite it.
    */
    .estimator-table table td.param {
        text-align: left;
        position: relative;
        padding: 0;
    }

    .user-set td {
        color:rgb(255, 94, 0);
        text-align: left !important;
    }

    .user-set td.value {
        color:rgb(255, 94, 0);
        background-color: transparent;
    }

    .default td {
        color: black;
        text-align: left !important;
    }

    .user-set td i,
    .default td i {
        color: black;
    }

    /*
        Styles for parameter documentation links
        We need styling for visited so jupyter doesn't overwrite it
    */
    a.param-doc-link,
    a.param-doc-link:link,
    a.param-doc-link:visited {
        text-decoration: underline dashed;
        text-underline-offset: .3em;
        color: inherit;
        display: block;
        padding: .5em;
    }

    /* "hack" to make the entire area of the cell containing the link clickable */
    a.param-doc-link::before {
        position: absolute;
        content: "";
        inset: 0;
    }

    .param-doc-description {
        display: none;
        position: absolute;
        z-index: 9999;
        left: 0;
        padding: .5ex;
        margin-left: 1.5em;
        color: var(--sklearn-color-text);
        box-shadow: .3em .3em .4em #999;
        width: max-content;
        text-align: left;
        max-height: 10em;
        overflow-y: auto;

        /* unfitted */
        background: var(--sklearn-color-unfitted-level-0);
        border: thin solid var(--sklearn-color-unfitted-level-3);
    }

    /* Fitted state for parameter tooltips */
    .fitted .param-doc-description {
        /* fitted */
        background: var(--sklearn-color-fitted-level-0);
        border: thin solid var(--sklearn-color-fitted-level-3);
    }

    .param-doc-link:hover .param-doc-description {
        display: block;
    }

    .copy-paste-icon {
        background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCA0NDggNTEyIj48IS0tIUZvbnQgQXdlc29tZSBGcmVlIDYuNy4yIGJ5IEBmb250YXdlc29tZSAtIGh0dHBzOi8vZm9udGF3ZXNvbWUuY29tIExpY2Vuc2UgLSBodHRwczovL2ZvbnRhd2Vzb21lLmNvbS9saWNlbnNlL2ZyZWUgQ29weXJpZ2h0IDIwMjUgRm9udGljb25zLCBJbmMuLS0+PHBhdGggZD0iTTIwOCAwTDMzMi4xIDBjMTIuNyAwIDI0LjkgNS4xIDMzLjkgMTQuMWw2Ny45IDY3LjljOSA5IDE0LjEgMjEuMiAxNC4xIDMzLjlMNDQ4IDMzNmMwIDI2LjUtMjEuNSA0OC00OCA0OGwtMTkyIDBjLTI2LjUgMC00OC0yMS41LTQ4LTQ4bDAtMjg4YzAtMjYuNSAyMS41LTQ4IDQ4LTQ4ek00OCAxMjhsODAgMCAwIDY0LTY0IDAgMCAyNTYgMTkyIDAgMC0zMiA2NCAwIDAgNDhjMCAyNi41LTIxLjUgNDgtNDggNDhMNDggNTEyYy0yNi41IDAtNDgtMjEuNS00OC00OEwwIDE3NmMwLTI2LjUgMjEuNS00OCA0OC00OHoiLz48L3N2Zz4=);
        background-repeat: no-repeat;
        background-size: 14px 14px;
        background-position: 0;
        display: inline-block;
        width: 14px;
        height: 14px;
        cursor: pointer;
    }
    </style><body><div id="sk-container-id-10" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>GridSearchCV(estimator=KNeighborsRegressor(),
                 param_grid={&#x27;n_neighbors&#x27;: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
                                             13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
                                             23, 24, 25, 50, 75, 100, 150, 200],
                             &#x27;p&#x27;: [1, 2], &#x27;weights&#x27;: [&#x27;uniform&#x27;, &#x27;distance&#x27;]},
                 scoring=&#x27;neg_root_mean_squared_error&#x27;)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-18" type="checkbox" ><label for="sk-estimator-id-18" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>GridSearchCV</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.model_selection.GridSearchCV.html">?<span>Documentation for GridSearchCV</span></a><span class="sk-estimator-doc-link fitted">i<span>Fitted</span></span></div></label><div class="sk-toggleable__content fitted" data-param-prefix="">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('estimator',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.model_selection.GridSearchCV.html#:~:text=estimator,-estimator%20object">
                estimator
                <span class="param-doc-description">estimator: estimator object<br><br>This is assumed to implement the scikit-learn estimator interface.<br>Either estimator needs to provide a ``score`` function,<br>or ``scoring`` must be passed.</span>
            </a>
        </td>
                <td class="value">KNeighborsRegressor()</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('param_grid',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.model_selection.GridSearchCV.html#:~:text=param_grid,-dict%20or%20list%20of%20dictionaries">
                param_grid
                <span class="param-doc-description">param_grid: dict or list of dictionaries<br><br>Dictionary with parameters names (`str`) as keys and lists of<br>parameter settings to try as values, or a list of such<br>dictionaries, in which case the grids spanned by each dictionary<br>in the list are explored. This enables searching over any sequence<br>of parameter settings.</span>
            </a>
        </td>
                <td class="value">{&#x27;n_neighbors&#x27;: [1, 2, ...], &#x27;p&#x27;: [1, 2], &#x27;weights&#x27;: [&#x27;uniform&#x27;, &#x27;distance&#x27;]}</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('scoring',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.model_selection.GridSearchCV.html#:~:text=scoring,-str%2C%20callable%2C%20list%2C%20tuple%20or%20dict%2C%20default%3DNone">
                scoring
                <span class="param-doc-description">scoring: str, callable, list, tuple or dict, default=None<br><br>Strategy to evaluate the performance of the cross-validated model on<br>the test set.<br><br>If `scoring` represents a single score, one can use:<br><br>- a single string (see :ref:`scoring_string_names`);<br>- a callable (see :ref:`scoring_callable`) that returns a single value;<br>- `None`, the `estimator`'s<br>  :ref:`default evaluation criterion <scoring_api_overview>` is used.<br><br>If `scoring` represents multiple scores, one can use:<br><br>- a list or tuple of unique strings;<br>- a callable returning a dictionary where the keys are the metric<br>  names and the values are the metric scores;<br>- a dictionary with metric names as keys and callables as values.<br><br>See :ref:`multimetric_grid_search` for an example.</span>
            </a>
        </td>
                <td class="value">&#x27;neg_root_mean_squared_error&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.model_selection.GridSearchCV.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>Number of jobs to run in parallel.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.<br><br>.. versionchanged:: v0.20<br>   `n_jobs` default changed from 1 to None</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('refit',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.model_selection.GridSearchCV.html#:~:text=refit,-bool%2C%20str%2C%20or%20callable%2C%20default%3DTrue">
                refit
                <span class="param-doc-description">refit: bool, str, or callable, default=True<br><br>Refit an estimator using the best found parameters on the whole<br>dataset.<br><br>For multiple metric evaluation, this needs to be a `str` denoting the<br>scorer that would be used to find the best parameters for refitting<br>the estimator at the end.<br><br>Where there are considerations other than maximum score in<br>choosing a best estimator, ``refit`` can be set to a function which<br>returns the selected ``best_index_`` given ``cv_results_``. In that<br>case, the ``best_estimator_`` and ``best_params_`` will be set<br>according to the returned ``best_index_`` while the ``best_score_``<br>attribute will not be available.<br><br>The refitted estimator is made available at the ``best_estimator_``<br>attribute and permits using ``predict`` directly on this<br>``GridSearchCV`` instance.<br><br>Also for multiple metric evaluation, the attributes ``best_index_``,<br>``best_score_`` and ``best_params_`` will only be available if<br>``refit`` is set and all of them will be determined w.r.t this specific<br>scorer.<br><br>See ``scoring`` parameter to know more about multiple metric<br>evaluation.<br><br>See :ref:`sphx_glr_auto_examples_model_selection_plot_grid_search_digits.py`<br>to see how to design a custom selection strategy using a callable<br>via `refit`.<br><br>See :ref:`this example<br><sphx_glr_auto_examples_model_selection_plot_grid_search_refit_callable.py>`<br>for an example of how to use ``refit=callable`` to balance model<br>complexity and cross-validated score.<br><br>.. versionchanged:: 0.20<br>    Support for callable added.</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('cv',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.model_selection.GridSearchCV.html#:~:text=cv,-int%2C%20cross-validation%20generator%20or%20an%20iterable%2C%20default%3DNone">
                cv
                <span class="param-doc-description">cv: int, cross-validation generator or an iterable, default=None<br><br>Determines the cross-validation splitting strategy.<br>Possible inputs for cv are:<br><br>- None, to use the default 5-fold cross validation,<br>- integer, to specify the number of folds in a `(Stratified)KFold`,<br>- :term:`CV splitter`,<br>- An iterable yielding (train, test) splits as arrays of indices.<br><br>For integer/None inputs, if the estimator is a classifier and ``y`` is<br>either binary or multiclass, :class:`StratifiedKFold` is used. In all<br>other cases, :class:`KFold` is used. These splitters are instantiated<br>with `shuffle=False` so the splits will be the same across calls.<br><br>Refer :ref:`User Guide <cross_validation>` for the various<br>cross-validation strategies that can be used here.<br><br>.. versionchanged:: 0.22<br>    ``cv`` default value if None changed from 3-fold to 5-fold.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.model_selection.GridSearchCV.html#:~:text=verbose,-int">
                verbose
                <span class="param-doc-description">verbose: int<br><br>Controls the verbosity: the higher, the more messages.<br><br>- >1 : the computation time for each fold and parameter candidate is<br>  displayed;<br>- >2 : the score is also displayed;<br>- >3 : the fold and candidate parameter indexes are also displayed<br>  together with the starting time of the computation.</span>
            </a>
        </td>
                <td class="value">0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('pre_dispatch',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.model_selection.GridSearchCV.html#:~:text=pre_dispatch,-int%2C%20or%20str%2C%20default%3D%272%2An_jobs%27">
                pre_dispatch
                <span class="param-doc-description">pre_dispatch: int, or str, default='2*n_jobs'<br><br>Controls the number of jobs that get dispatched during parallel<br>execution. Reducing this number can be useful to avoid an<br>explosion of memory consumption when more jobs get dispatched<br>than CPUs can process. This parameter can be:<br><br>- None, in which case all the jobs are immediately created and spawned. Use<br>  this for lightweight and fast-running jobs, to avoid delays due to on-demand<br>  spawning of the jobs<br>- An int, giving the exact number of total jobs that are spawned<br>- A str, giving an expression as a function of n_jobs, as in '2*n_jobs'</span>
            </a>
        </td>
                <td class="value">&#x27;2*n_jobs&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('error_score',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.model_selection.GridSearchCV.html#:~:text=error_score,-%27raise%27%20or%20numeric%2C%20default%3Dnp.nan">
                error_score
                <span class="param-doc-description">error_score: 'raise' or numeric, default=np.nan<br><br>Value to assign to the score if an error occurs in estimator fitting.<br>If set to 'raise', the error is raised. If a numeric value is given,<br>FitFailedWarning is raised. This parameter does not affect the refit<br>step, which will always raise the error.</span>
            </a>
        </td>
                <td class="value">nan</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('return_train_score',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.model_selection.GridSearchCV.html#:~:text=return_train_score,-bool%2C%20default%3DFalse">
                return_train_score
                <span class="param-doc-description">return_train_score: bool, default=False<br><br>If ``False``, the ``cv_results_`` attribute will not include training<br>scores.<br>Computing training scores is used to get insights on how different<br>parameter settings impact the overfitting/underfitting trade-off.<br>However computing the scores on the training set can be computationally<br>expensive and is not strictly required to select the parameters that<br>yield the best generalization performance.<br><br>.. versionadded:: 0.19<br><br>.. versionchanged:: 0.21<br>    Default value was changed from ``True`` to ``False``</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-19" type="checkbox" ><label for="sk-estimator-id-19" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>best_estimator_: KNeighborsRegressor</div></div></label><div class="sk-toggleable__content fitted" data-param-prefix="best_estimator___"><pre>KNeighborsRegressor(n_neighbors=6)</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-20" type="checkbox" ><label for="sk-estimator-id-20" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>KNeighborsRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html">?<span>Documentation for KNeighborsRegressor</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="best_estimator___">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_neighbors',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_neighbors,-int%2C%20default%3D5">
                n_neighbors
                <span class="param-doc-description">n_neighbors: int, default=5<br><br>Number of neighbors to use by default for :meth:`kneighbors` queries.</span>
            </a>
        </td>
                <td class="value">6</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=weights,-%7B%27uniform%27%2C%20%27distance%27%7D%2C%20callable%20or%20None%2C%20default%3D%27uniform%27">
                weights
                <span class="param-doc-description">weights: {'uniform', 'distance'}, callable or None, default='uniform'<br><br>Weight function used in prediction.  Possible values:<br><br>- 'uniform' : uniform weights.  All points in each neighborhood<br>  are weighted equally.<br>- 'distance' : weight points by the inverse of their distance.<br>  in this case, closer neighbors of a query point will have a<br>  greater influence than neighbors which are further away.<br>- [callable] : a user-defined function which accepts an<br>  array of distances, and returns an array of the same shape<br>  containing the weights.<br><br>Uniform weights are used by default.<br><br>See the following example for a demonstration of the impact of<br>different weighting schemes on predictions:<br>:ref:`sphx_glr_auto_examples_neighbors_plot_regression.py`.</span>
            </a>
        </td>
                <td class="value">&#x27;uniform&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('algorithm',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=algorithm,-%7B%27auto%27%2C%20%27ball_tree%27%2C%20%27kd_tree%27%2C%20%27brute%27%7D%2C%20default%3D%27auto%27">
                algorithm
                <span class="param-doc-description">algorithm: {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'<br><br>Algorithm used to compute the nearest neighbors:<br><br>- 'ball_tree' will use :class:`BallTree`<br>- 'kd_tree' will use :class:`KDTree`<br>- 'brute' will use a brute-force search.<br>- 'auto' will attempt to decide the most appropriate algorithm<br>  based on the values passed to :meth:`fit` method.<br><br>Note: fitting on sparse input will override the setting of<br>this parameter, using brute force.</span>
            </a>
        </td>
                <td class="value">&#x27;auto&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('leaf_size',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=leaf_size,-int%2C%20default%3D30">
                leaf_size
                <span class="param-doc-description">leaf_size: int, default=30<br><br>Leaf size passed to BallTree or KDTree.  This can affect the<br>speed of the construction and query, as well as the memory<br>required to store the tree.  The optimal value depends on the<br>nature of the problem.</span>
            </a>
        </td>
                <td class="value">30</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('p',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=p,-float%2C%20default%3D2">
                p
                <span class="param-doc-description">p: float, default=2<br><br>Power parameter for the Minkowski metric. When p = 1, this is<br>equivalent to using manhattan_distance (l1), and euclidean_distance<br>(l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.</span>
            </a>
        </td>
                <td class="value">2</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric,-str%2C%20DistanceMetric%20object%20or%20callable%2C%20default%3D%27minkowski%27">
                metric
                <span class="param-doc-description">metric: str, DistanceMetric object or callable, default='minkowski'<br><br>Metric to use for distance computation. Default is "minkowski", which<br>results in the standard Euclidean distance when p = 2. See the<br>documentation of `scipy.spatial.distance<br><https://docs.scipy.org/doc/scipy/reference/spatial.distance.html>`_ and<br>the metrics listed in<br>:class:`~sklearn.metrics.pairwise.distance_metrics` for valid metric<br>values.<br><br>If metric is "precomputed", X is assumed to be a distance matrix and<br>must be square during fit. X may be a :term:`sparse graph`, in which<br>case only "nonzero" elements may be considered neighbors.<br><br>If metric is a callable function, it takes two arrays representing 1D<br>vectors as inputs and must return one value indicating the distance<br>between those vectors. This works for Scipy's metrics, but is less<br>efficient than passing the metric name as a string.<br><br>If metric is a DistanceMetric object, it will be passed directly to<br>the underlying computation routines.</span>
            </a>
        </td>
                <td class="value">&#x27;minkowski&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric_params',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric_params,-dict%2C%20default%3DNone">
                metric_params
                <span class="param-doc-description">metric_params: dict, default=None<br><br>Additional keyword arguments for the metric function.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of parallel jobs to run for neighbors search.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.<br>Doesn't affect :meth:`fit` method.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div></div></div></div></div></div><script>function copyToClipboard(text, element) {
        // Get the parameter prefix from the closest toggleable content
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const fullParamName = paramPrefix ? `${paramPrefix}${text}` : text;

        const originalStyle = element.style;
        const computedStyle = window.getComputedStyle(element);
        const originalWidth = computedStyle.width;
        const originalHTML = element.innerHTML.replace('Copied!', '');

        navigator.clipboard.writeText(fullParamName)
            .then(() => {
                element.style.width = originalWidth;
                element.style.color = 'green';
                element.innerHTML = "Copied!";

                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            })
            .catch(err => {
                console.error('Failed to copy:', err);
                element.style.color = 'red';
                element.innerHTML = "Failed!";
                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            });
        return false;
    }

    document.querySelectorAll('.copy-paste-icon').forEach(function(element) {
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const paramName = element.parentElement.nextElementSibling
            .textContent.trim().split(' ')[0];
        const fullParamName = paramPrefix ? `${paramPrefix}${paramName}` : paramName;

        element.setAttribute('title', fullParamName);
    });


    /**
     * Adapted from Skrub
     * https://github.com/skrub-data/skrub/blob/403466d1d5d4dc76a7ef569b3f8228db59a31dc3/skrub/_reporting/_data/templates/report.js#L789
     * @returns "light" or "dark"
     */
    function detectTheme(element) {
        const body = document.querySelector('body');

        // Check VSCode theme
        const themeKindAttr = body.getAttribute('data-vscode-theme-kind');
        const themeNameAttr = body.getAttribute('data-vscode-theme-name');

        if (themeKindAttr && themeNameAttr) {
            const themeKind = themeKindAttr.toLowerCase();
            const themeName = themeNameAttr.toLowerCase();

            if (themeKind.includes("dark") || themeName.includes("dark")) {
                return "dark";
            }
            if (themeKind.includes("light") || themeName.includes("light")) {
                return "light";
            }
        }

        // Check Jupyter theme
        if (body.getAttribute('data-jp-theme-light') === 'false') {
            return 'dark';
        } else if (body.getAttribute('data-jp-theme-light') === 'true') {
            return 'light';
        }

        // Guess based on a parent element's color
        const color = window.getComputedStyle(element.parentNode, null).getPropertyValue('color');
        const match = color.match(/^rgb\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*$/i);
        if (match) {
            const [r, g, b] = [
                parseFloat(match[1]),
                parseFloat(match[2]),
                parseFloat(match[3])
            ];

            // https://en.wikipedia.org/wiki/HSL_and_HSV#Lightness
            const luma = 0.299 * r + 0.587 * g + 0.114 * b;

            if (luma > 180) {
                // If the text is very bright we have a dark theme
                return 'dark';
            }
            if (luma < 75) {
                // If the text is very dark we have a light theme
                return 'light';
            }
            // Otherwise fall back to the next heuristic.
        }

        // Fallback to system preference
        return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
    }


    function forceTheme(elementId) {
        const estimatorElement = document.querySelector(`#${elementId}`);
        if (estimatorElement === null) {
            console.error(`Element with id ${elementId} not found.`);
        } else {
            const theme = detectTheme(estimatorElement);
            estimatorElement.classList.add(theme);
        }
    }

    forceTheme('sk-container-id-10');</script></body>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 978-983

Fortunately, the grid search only takes a few seconds because the model
is quick, the dataset size is small, and we only test 120 different
parameter combinations. If the grid search took too long, we could
switch to ``RandomizedSearchCV`` or ``HalvingGridSearchCV``
from sklearn, or use a Bayesian optimization method from another package.
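
A minimal sketch of the randomized alternative could look as follows. It
assumes a synthetic stand-in for the coordinates and target, and the
parameter ranges and ``n_iter`` are illustrative choices, not the ones used
in this example. Instead of trying every combination,
``RandomizedSearchCV`` samples a fixed number of candidates:

```python
import numpy as np
from scipy.stats import randint
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for two coordinate features and a target (assumption).
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(500, 2))
y_demo = (
    np.sin(4 * X_demo[:, 0])
    + np.cos(4 * X_demo[:, 1])
    + rng.normal(scale=0.1, size=500)
)

# Distributions (or lists) to sample from; the ranges are illustrative only.
param_distributions = {
    "n_neighbors": randint(1, 50),  # integers from 1 to 49
    "weights": ["uniform", "distance"],
    "p": [1, 2],
}

random_search = RandomizedSearchCV(
    KNeighborsRegressor(),
    param_distributions,
    n_iter=20,  # number of sampled candidates, instead of the full grid
    scoring="neg_root_mean_squared_error",
    random_state=0,
)
random_search.fit(X_demo, y_demo)
print(random_search.best_params_)
```

The number of fits is now controlled by ``n_iter`` rather than by the size
of the grid, which is what makes this approach attractive when a single fit
is slow.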

.. GENERATED FROM PYTHON SOURCE LINES 985-987

Now, let’s put the results of the grid search into a pandas
``DataFrame`` and inspect the top results.

.. GENERATED FROM PYTHON SOURCE LINES 990-995

.. code-block:: Python

    df_cv = pd.DataFrame(search.cv_results_)
    df_cv.sort_values("rank_test_score")[
        ["param_n_neighbors", "param_weights", "param_p", "mean_test_score"]
    ].head(10)


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;">
          <th></th>
          <th>param_n_neighbors</th>
          <th>param_weights</th>
          <th>param_p</th>
          <th>mean_test_score</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>22</th>
          <td>6</td>
          <td>uniform</td>
          <td>2</td>
          <td>-48792.277422</td>
        </tr>
        <tr>
          <th>26</th>
          <td>7</td>
          <td>uniform</td>
          <td>2</td>
          <td>-48908.663856</td>
        </tr>
        <tr>
          <th>18</th>
          <td>5</td>
          <td>uniform</td>
          <td>2</td>
          <td>-48939.112420</td>
        </tr>
        <tr>
          <th>20</th>
          <td>6</td>
          <td>uniform</td>
          <td>1</td>
          <td>-48960.566567</td>
        </tr>
        <tr>
          <th>24</th>
          <td>7</td>
          <td>uniform</td>
          <td>1</td>
          <td>-49049.354866</td>
        </tr>
        <tr>
          <th>30</th>
          <td>8</td>
          <td>uniform</td>
          <td>2</td>
          <td>-49125.740942</td>
        </tr>
        <tr>
          <th>16</th>
          <td>5</td>
          <td>uniform</td>
          <td>1</td>
          <td>-49158.915396</td>
        </tr>
        <tr>
          <th>28</th>
          <td>8</td>
          <td>uniform</td>
          <td>1</td>
          <td>-49264.347864</td>
        </tr>
        <tr>
          <th>34</th>
          <td>9</td>
          <td>uniform</td>
          <td>2</td>
          <td>-49299.733688</td>
        </tr>
        <tr>
          <th>32</th>
          <td>9</td>
          <td>uniform</td>
          <td>1</td>
          <td>-49463.991846</td>
        </tr>
      </tbody>
    </table>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 996-1000

Okay, so there is a clear trend for small ``k`` values in the
range of 5 to 9 to fare better. Also, ``'uniform'`` weights seem to
be clearly superior. When it comes to the distance metric, no clear
trend is discernible.

.. GENERATED FROM PYTHON SOURCE LINES 1002-1009

To get a better picture, it is often a good idea to plot the grid search
results, instead of just picking the best result blindly, especially if
the outcome is noisy. For this purpose we will plot four lines, one for
each combination of ``weights`` and ``p``, all showing the
RMSE as a function of ``n_neighbors``. Note that we plot
``n_neighbors`` on a log scale, because we want to have higher
resolution for small values.

.. GENERATED FROM PYTHON SOURCE LINES 1011-1028

.. code-block:: Python


    fig, ax = plt.subplots()
    for weight in params["weights"]:  # type: ignore
        for p in params["p"]:  # type: ignore
            query = f"param_weights=='{weight}' & param_p=={p}"
            df_subset = df_cv.query(query)
            df_subset.plot(
                x="param_n_neighbors",
                y="mean_test_score",
                xlabel="Log of the n_neighbors parameter",
                ylabel="negative RMSE",
                label=query,
                ax=ax,
                marker="o",
                ms=5,
                logx=True,
            )


.. image-sg:: /auto_examples/images/sphx_glr_plot_california_housing_009.png
   :alt: plot california housing
   :srcset: /auto_examples/images/sphx_glr_plot_california_housing_009.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 1029-1033

So both for ``weights=='uniform'`` and for
``weights=='distance'``, we see that ``p==2`` is slightly
better than, or equally good as, ``p==1``. It will thus be safe to
use ``p==2``.

.. GENERATED FROM PYTHON SOURCE LINES 1035-1042

Furthermore, we see that for low values of ``n_neighbors``, it is
better to have ``weights=='uniform'``, while for large values of
``n_neighbors``, ``weights=='distance'`` is better. This is
perhaps not too surprising: If we consider a lot of neighbors, many of
which are already quite far away, we should give lower weight to those
remote neighbors. When we only look at 5 neighbors, this isn’t
necessary, since their distances will be quite similar anyway.
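
To make the difference between the two weighting schemes concrete, here is
a small toy sketch. The three one-dimensional training points are made up
for illustration and are not part of the housing data. Two neighbors are
close to the query point and one is far away:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Toy data (assumption): two close neighbors and one remote neighbor.
X_toy = np.array([[0.0], [1.0], [10.0]])
y_toy = np.array([0.0, 1.0, 10.0])
query_point = np.array([[0.5]])

knn_uniform = KNeighborsRegressor(n_neighbors=3, weights="uniform")
knn_distance = KNeighborsRegressor(n_neighbors=3, weights="distance")

# Uniform weights: plain average of all three targets.
pred_uniform = knn_uniform.fit(X_toy, y_toy).predict(query_point)[0]
# Distance weights: each target is weighted by 1 / distance to the query.
pred_distance = knn_distance.fit(X_toy, y_toy).predict(query_point)[0]

print(f"uniform: {pred_uniform:.3f}, distance: {pred_distance:.3f}")
```

With uniform weights, the remote point at 10 pulls the prediction far away
from the two close neighbors, whereas with distance weights its influence
is strongly reduced.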

.. GENERATED FROM PYTHON SOURCE LINES 1044-1049

Another observation is that the curve for the RMSE as a function of
``n_neighbors`` seems to be quite smooth. This tells us two
things: First of all, the metric isn’t very noisy, at least using the
5-fold cross validation that the grid search uses by default. This is
good: we don’t want the outcome to be too dependent on randomness.

.. GENERATED FROM PYTHON SOURCE LINES 1051-1056

Moreover, we don’t see any strange jumps at ``n_neighbors`` values
like 4 or 8. If you remember the problem we discussed earlier of
arbitrary points being chosen when neighbors have the exact same
distance, this could have been a concern. Since we don’t see it, it’s
probably not as bad as we might have feared.
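
Besides judging the smoothness of the curve by eye, the fold-to-fold spread
of the metric can be read from the ``std_test_score`` column of
``cv_results_``. The following sketch assumes synthetic data and a small
illustrative grid, so the numbers it prints say nothing about the housing
dataset:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data (assumption).
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(500, 2))
y_demo = np.sin(4 * X_demo[:, 0]) + rng.normal(scale=0.1, size=500)

demo_search = GridSearchCV(
    KNeighborsRegressor(),
    {"n_neighbors": [2, 4, 8, 16]},  # illustrative grid
    scoring="neg_root_mean_squared_error",
)
demo_search.fit(X_demo, y_demo)

# mean_test_score is the average over the folds, std_test_score the spread.
df_demo = pd.DataFrame(demo_search.cv_results_)
print(df_demo[["param_n_neighbors", "mean_test_score", "std_test_score"]])
```

If the standard deviation is small compared to the differences between the
mean scores, the ranking of the candidates is unlikely to be an artifact of
how the folds were split.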

.. GENERATED FROM PYTHON SOURCE LINES 1058-1062

So now that we have a good idea about what good hyper-parameters for KNN
are, let’s see how good the predictions from the model are. For
this, we plot the true target as a function of the predictions, which we
calculate out of fold using sklearn’s ``cross_val_predict``.

.. GENERATED FROM PYTHON SOURCE LINES 1065-1074

.. code-block:: Python

    knn = KNeighborsRegressor(**search.best_params_)
    avg_neighbor_val = cross_val_predict(knn, df_train[["Longitude", "Latitude"]], y_train)
    fig, ax = plt.subplots(figsize=(5, 5))
    ax.scatter(avg_neighbor_val, y_train, alpha=0.05, s=1.5)
    ax.set_xlabel(
        f"median house price of {search.best_params_['n_neighbors']} closest neighbors"
    )
    ax.set_ylabel(target_col)


.. image-sg:: /auto_examples/images/sphx_glr_plot_california_housing_010.png
   :alt: plot california housing
   :srcset: /auto_examples/images/sphx_glr_plot_california_housing_010.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Text(-5.402777777777777, 0.5, 'MedHouseVal')


.. GENERATED FROM PYTHON SOURCE LINES 1075-1082

As we can see, the correlation is quite good between the two. This means
that our KNN model can explain a lot of the variability in the target
using only the coordinates, none of the other features. In particular,
we can easily see that this correlation is much better than the
correlation of any of the features that we plotted above. **Would it not
be nice if we could use the KNN predictions as a feature for a more
powerful model?**

.. GENERATED FROM PYTHON SOURCE LINES 1084-1090

Why would we want to use the KNN predictions as a feature for another
model, instead of just being content with using the KNN as the final
predictor? The problem is that with KNN, it is very hard to incorporate
the remaining features. Although we can be sure that those features are
not as important, they should still contain useful signal to improve the
model.

.. GENERATED FROM PYTHON SOURCE LINES 1092-1098

The reason why it is difficult to include arbitrary features like
"MedInc" or "HouseAge" into a KNN model is that they are
on a completely different scale than the coordinates, so we would need
to normalize all the data. Even then, we know that our other features
are not very uniformly distributed, and a uniform distribution is
something that KNN generally benefits from.
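
The scale problem can be illustrated with a small sketch. It assumes
made-up data with one informative feature on a small scale and one pure
noise feature on a large scale; the column names and value ranges are
hypothetical and unrelated to the housing dataset:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Made-up data (assumption): the target depends only on the small-scale feature.
rng = np.random.default_rng(0)
n_samples = 500
informative = rng.uniform(0, 1, size=n_samples)
large_scale_noise = rng.normal(0, 1000, size=n_samples)
X_mixed = np.column_stack([informative, large_scale_noise])
y_mixed = 10 * informative + rng.normal(scale=0.1, size=n_samples)

# Without scaling, distances are dominated by the large-scale noise feature.
knn_raw = KNeighborsRegressor()
# With scaling, both features contribute comparably to the distance.
knn_scaled = make_pipeline(StandardScaler(), KNeighborsRegressor())

# The default score of a regressor in cross_val_score is R^2.
r2_raw = cross_val_score(knn_raw, X_mixed, y_mixed).mean()
r2_scaled = cross_val_score(knn_scaled, X_mixed, y_mixed).mean()
print(f"R^2 without scaling: {r2_raw:.2f}, with scaling: {r2_scaled:.2f}")
```

Note that scaling only puts the features on comparable units. It does not
tell us how much each feature *should* count in the distance.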

.. GENERATED FROM PYTHON SOURCE LINES 1100-1105

But even if everything were on the same scale and evenly distributed,
should we weight a distance of 0.1 in the "MedInc" space the
same as a distance of 0.1 in "Longitude" space? Probably not. It
just doesn’t make sense to put two completely different measurements
into the same KNN model.

.. GENERATED FROM PYTHON SOURCE LINES 1107-1112

This is why we will use the KNN predictions as a *feature* for more
appropriate models. This also explains why this is part of the feature
engineering section. Doing this correctly is not quite trivial, but
below we will see how we can use the tools that sklearn provides to
achieve this goal.

.. GENERATED FROM PYTHON SOURCE LINES 1114-1116

Training ML models
~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 1118-1121

Now that we have gained important insights into the data and also formed
a plan when it comes to feature engineering, we can start with the
training of the predictive models.

.. GENERATED FROM PYTHON SOURCE LINES 1123-1125

Dummy model
^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 1127-1132

As a first step, we start with a dummy model, i.e. a model that isn’t
allowed to learn anything from the input data. Often, it’s a good idea
to try a dummy model first, as it gives us an idea of the worst score
we should expect. If our proper models cannot beat the dummy
model, it most likely means we have done something seriously wrong.

.. GENERATED FROM PYTHON SOURCE LINES 1134-1141

For this dataset, what is the appropriate dummy model? We are trying to
minimize the root mean squared error. This is the same as trying to
minimize the mean squared error (the root is just a monotonic
transformation). And to minimize the mean squared error when we don’t
know anything about the input data, we have to predict the *mean* of
the target. For our convenience, sklearn provides a
``DummyRegressor`` that will do exactly this for us:

.. GENERATED FROM PYTHON SOURCE LINES 1143-1146

.. code-block:: Python

    dummy = DummyRegressor()
    dummy.fit(df_train, y_train)


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-container-id-11 {
      /* Definition of color scheme common for light and dark mode */
      --sklearn-color-text: #000;
      --sklearn-color-text-muted: #666;
      --sklearn-color-line: gray;
      /* Definition of color scheme for unfitted estimators */
      --sklearn-color-unfitted-level-0: #fff5e6;
      --sklearn-color-unfitted-level-1: #f6e4d2;
      --sklearn-color-unfitted-level-2: #ffe0b3;
      --sklearn-color-unfitted-level-3: chocolate;
      /* Definition of color scheme for fitted estimators */
      --sklearn-color-fitted-level-0: #f0f8ff;
      --sklearn-color-fitted-level-1: #d4ebff;
      --sklearn-color-fitted-level-2: #b3dbfd;
      --sklearn-color-fitted-level-3: cornflowerblue;
    }

    #sk-container-id-11.light {
      /* Specific color for light theme */
      --sklearn-color-text-on-default-background: black;
      --sklearn-color-background: white;
      --sklearn-color-border-box: black;
      --sklearn-color-icon: #696969;
    }

    #sk-container-id-11.dark {
      --sklearn-color-text-on-default-background: white;
      --sklearn-color-background: #111;
      --sklearn-color-border-box: white;
      --sklearn-color-icon: #878787;
    }

    #sk-container-id-11 {
      color: var(--sklearn-color-text);
    }

    #sk-container-id-11 pre {
      padding: 0;
    }

    #sk-container-id-11 input.sk-hidden--visually {
      border: 0;
      clip: rect(1px 1px 1px 1px);
      clip: rect(1px, 1px, 1px, 1px);
      height: 1px;
      margin: -1px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1px;
    }

    #sk-container-id-11 div.sk-dashed-wrapped {
      border: 1px dashed var(--sklearn-color-line);
      margin: 0 0.4em 0.5em 0.4em;
      box-sizing: border-box;
      padding-bottom: 0.4em;
      background-color: var(--sklearn-color-background);
    }

    #sk-container-id-11 div.sk-container {
      /* jupyter's `normalize.less` sets `[hidden] { display: none; }`
         but bootstrap.min.css set `[hidden] { display: none !important; }`
         so we also need the `!important` here to be able to override the
         default hidden behavior on the sphinx rendered scikit-learn.org.
         See: https://github.com/scikit-learn/scikit-learn/issues/21755 */
      display: inline-block !important;
      position: relative;
    }

    #sk-container-id-11 div.sk-text-repr-fallback {
      display: none;
    }

    div.sk-parallel-item,
    div.sk-serial,
    div.sk-item {
      /* draw centered vertical line to link estimators */
      background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));
      background-size: 2px 100%;
      background-repeat: no-repeat;
      background-position: center center;
    }

    /* Parallel-specific style estimator block */

    #sk-container-id-11 div.sk-parallel-item::after {
      content: "";
      width: 100%;
      border-bottom: 2px solid var(--sklearn-color-text-on-default-background);
      flex-grow: 1;
    }

    #sk-container-id-11 div.sk-parallel {
      display: flex;
      align-items: stretch;
      justify-content: center;
      background-color: var(--sklearn-color-background);
      position: relative;
    }

    #sk-container-id-11 div.sk-parallel-item {
      display: flex;
      flex-direction: column;
    }

    #sk-container-id-11 div.sk-parallel-item:first-child::after {
      align-self: flex-end;
      width: 50%;
    }

    #sk-container-id-11 div.sk-parallel-item:last-child::after {
      align-self: flex-start;
      width: 50%;
    }

    #sk-container-id-11 div.sk-parallel-item:only-child::after {
      width: 0;
    }

    /* Serial-specific style estimator block */

    #sk-container-id-11 div.sk-serial {
      display: flex;
      flex-direction: column;
      align-items: center;
      background-color: var(--sklearn-color-background);
      padding-right: 1em;
      padding-left: 1em;
    }


    /* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is
    clickable and can be expanded/collapsed.
    - Pipeline and ColumnTransformer use this feature and define the default style
    - Estimators will overwrite some part of the style using the `sk-estimator` class
    */

    /* Pipeline and ColumnTransformer style (default) */

    #sk-container-id-11 div.sk-toggleable {
      /* Default theme specific background. It is overwritten whether we have a
      specific estimator or a Pipeline/ColumnTransformer */
      background-color: var(--sklearn-color-background);
    }

    /* Toggleable label */
    #sk-container-id-11 label.sk-toggleable__label {
      cursor: pointer;
      display: flex;
      width: 100%;
      margin-bottom: 0;
      padding: 0.5em;
      box-sizing: border-box;
      text-align: center;
      align-items: center;
      justify-content: center;
      gap: 0.5em;
    }

    #sk-container-id-11 label.sk-toggleable__label .caption {
      font-size: 0.6rem;
      font-weight: lighter;
      color: var(--sklearn-color-text-muted);
    }

    #sk-container-id-11 label.sk-toggleable__label-arrow:before {
      /* Arrow on the left of the label */
      content: "▸";
      float: left;
      margin-right: 0.25em;
      color: var(--sklearn-color-icon);
    }

    #sk-container-id-11 label.sk-toggleable__label-arrow:hover:before {
      color: var(--sklearn-color-text);
    }

    /* Toggleable content - dropdown */

    #sk-container-id-11 div.sk-toggleable__content {
      display: none;
      text-align: left;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-11 div.sk-toggleable__content.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-11 div.sk-toggleable__content pre {
      margin: 0.2em;
      border-radius: 0.25em;
      color: var(--sklearn-color-text);
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-11 div.sk-toggleable__content.fitted pre {
      /* unfitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-11 input.sk-toggleable__control:checked~div.sk-toggleable__content {
      /* Expand drop-down */
      display: block;
      width: 100%;
      overflow: visible;
    }

    #sk-container-id-11 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {
      content: "▾";
    }

    /* Pipeline/ColumnTransformer-specific style */

    #sk-container-id-11 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-11 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator-specific style */

    /* Colorize estimator box */
    #sk-container-id-11 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-11 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    #sk-container-id-11 div.sk-label label.sk-toggleable__label,
    #sk-container-id-11 div.sk-label label {
      /* The background is the default theme color */
      color: var(--sklearn-color-text-on-default-background);
    }

    /* On hover, darken the color of the background */
    #sk-container-id-11 div.sk-label:hover label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    /* Label box, darken color on hover, fitted */
    #sk-container-id-11 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator label */

    #sk-container-id-11 div.sk-label label {
      font-family: monospace;
      font-weight: bold;
      line-height: 1.2em;
    }

    #sk-container-id-11 div.sk-label-container {
      text-align: center;
    }

    /* Estimator-specific */
    #sk-container-id-11 div.sk-estimator {
      font-family: monospace;
      border: 1px dotted var(--sklearn-color-border-box);
      border-radius: 0.25em;
      box-sizing: border-box;
      margin-bottom: 0.5em;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-11 div.sk-estimator.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    /* on hover */
    #sk-container-id-11 div.sk-estimator:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-11 div.sk-estimator.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Specification for estimator info (e.g. "i" and "?") */

    /* Common style for "i" and "?" */

    .sk-estimator-doc-link,
    a:link.sk-estimator-doc-link,
    a:visited.sk-estimator-doc-link {
      float: right;
      font-size: smaller;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1em;
      height: 1em;
      width: 1em;
      text-decoration: none !important;
      margin-left: 0.5em;
      text-align: center;
      /* unfitted */
      border: var(--sklearn-color-unfitted-level-3) 1pt solid;
      color: var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted,
    a:link.sk-estimator-doc-link.fitted,
    a:visited.sk-estimator-doc-link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3) 1pt solid;
      color: var(--sklearn-color-fitted-level-3);
    }

    /* On hover */
    div.sk-estimator:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover,
    div.sk-label-container:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-unfitted-level-0);
      text-decoration: none;
    }

    div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover,
    div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-fitted-level-0);
      text-decoration: none;
    }

    /* Span, style for the box shown on hovering the info icon */
    .sk-estimator-doc-link span {
      display: none;
      z-index: 9999;
      position: relative;
      font-weight: normal;
      right: .2ex;
      padding: .5ex;
      margin: .5ex;
      width: min-content;
      min-width: 20ex;
      max-width: 50ex;
      color: var(--sklearn-color-text);
      box-shadow: 2pt 2pt 4pt #999;
      /* unfitted */
      background: var(--sklearn-color-unfitted-level-0);
      border: .5pt solid var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted span {
      /* fitted */
      background: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3);
    }

    .sk-estimator-doc-link:hover span {
      display: block;
    }

    /* "?"-specific style due to the `<a>` HTML tag */

    #sk-container-id-11 a.estimator_doc_link {
      float: right;
      font-size: 1rem;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1rem;
      height: 1rem;
      width: 1rem;
      text-decoration: none;
      /* unfitted */
      color: var(--sklearn-color-unfitted-level-1);
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
    }

    #sk-container-id-11 a.estimator_doc_link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    #sk-container-id-11 a.estimator_doc_link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    #sk-container-id-11 a.estimator_doc_link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
    }

    .estimator-table {
        font-family: monospace;
    }

    .estimator-table summary {
        padding: .5rem;
        cursor: pointer;
    }

    .estimator-table summary::marker {
        font-size: 0.7rem;
    }

    .estimator-table details[open] {
        padding-left: 0.1rem;
        padding-right: 0.1rem;
        padding-bottom: 0.3rem;
    }

    .estimator-table .parameters-table {
        margin-left: auto !important;
        margin-right: auto !important;
        margin-top: 0;
    }

    .estimator-table .parameters-table tr:nth-child(odd) {
        background-color: #fff;
    }

    .estimator-table .parameters-table tr:nth-child(even) {
        background-color: #f6f6f6;
    }

    .estimator-table .parameters-table tr:hover {
        background-color: #e0e0e0;
    }

    .estimator-table table td {
        border: 1px solid rgba(106, 105, 104, 0.232);
    }

    /*
        `table td`is set in notebook with right text-align.
        We need to overwrite it.
    */
    .estimator-table table td.param {
        text-align: left;
        position: relative;
        padding: 0;
    }

    .user-set td {
        color:rgb(255, 94, 0);
        text-align: left !important;
    }

    .user-set td.value {
        color:rgb(255, 94, 0);
        background-color: transparent;
    }

    .default td {
        color: black;
        text-align: left !important;
    }

    .user-set td i,
    .default td i {
        color: black;
    }

    /*
        Styles for parameter documentation links
        We need styling for visited so jupyter doesn't overwrite it
    */
    a.param-doc-link,
    a.param-doc-link:link,
    a.param-doc-link:visited {
        text-decoration: underline dashed;
        text-underline-offset: .3em;
        color: inherit;
        display: block;
        padding: .5em;
    }

    /* "hack" to make the entire area of the cell containing the link clickable */
    a.param-doc-link::before {
        position: absolute;
        content: "";
        inset: 0;
    }

    .param-doc-description {
        display: none;
        position: absolute;
        z-index: 9999;
        left: 0;
        padding: .5ex;
        margin-left: 1.5em;
        color: var(--sklearn-color-text);
        box-shadow: .3em .3em .4em #999;
        width: max-content;
        text-align: left;
        max-height: 10em;
        overflow-y: auto;

        /* unfitted */
        background: var(--sklearn-color-unfitted-level-0);
        border: thin solid var(--sklearn-color-unfitted-level-3);
    }

    /* Fitted state for parameter tooltips */
    .fitted .param-doc-description {
        /* fitted */
        background: var(--sklearn-color-fitted-level-0);
        border: thin solid var(--sklearn-color-fitted-level-3);
    }

    .param-doc-link:hover .param-doc-description {
        display: block;
    }

    .copy-paste-icon {
        background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCA0NDggNTEyIj48IS0tIUZvbnQgQXdlc29tZSBGcmVlIDYuNy4yIGJ5IEBmb250YXdlc29tZSAtIGh0dHBzOi8vZm9udGF3ZXNvbWUuY29tIExpY2Vuc2UgLSBodHRwczovL2ZvbnRhd2Vzb21lLmNvbS9saWNlbnNlL2ZyZWUgQ29weXJpZ2h0IDIwMjUgRm9udGljb25zLCBJbmMuLS0+PHBhdGggZD0iTTIwOCAwTDMzMi4xIDBjMTIuNyAwIDI0LjkgNS4xIDMzLjkgMTQuMWw2Ny45IDY3LjljOSA5IDE0LjEgMjEuMiAxNC4xIDMzLjlMNDQ4IDMzNmMwIDI2LjUtMjEuNSA0OC00OCA0OGwtMTkyIDBjLTI2LjUgMC00OC0yMS41LTQ4LTQ4bDAtMjg4YzAtMjYuNSAyMS41LTQ4IDQ4LTQ4ek00OCAxMjhsODAgMCAwIDY0LTY0IDAgMCAyNTYgMTkyIDAgMC0zMiA2NCAwIDAgNDhjMCAyNi41LTIxLjUgNDgtNDggNDhMNDggNTEyYy0yNi41IDAtNDgtMjEuNS00OC00OEwwIDE3NmMwLTI2LjUgMjEuNS00OCA0OC00OHoiLz48L3N2Zz4=);
        background-repeat: no-repeat;
        background-size: 14px 14px;
        background-position: 0;
        display: inline-block;
        width: 14px;
        height: 14px;
        cursor: pointer;
    }
    </style><body><div id="sk-container-id-11" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>DummyRegressor()</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-21" type="checkbox" checked><label for="sk-estimator-id-21" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>DummyRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.dummy.DummyRegressor.html">?<span>Documentation for DummyRegressor</span></a><span class="sk-estimator-doc-link fitted">i<span>Fitted</span></span></div></label><div class="sk-toggleable__content fitted" data-param-prefix="">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('strategy',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.dummy.DummyRegressor.html#:~:text=strategy,-%7B%22mean%22%2C%20%22median%22%2C%20%22quantile%22%2C%20%22constant%22%7D%2C%20default%3D%22mean%22">
                strategy
                <span class="param-doc-description">strategy: {"mean", "median", "quantile", "constant"}, default="mean"<br><br>Strategy to use to generate predictions.<br><br>* "mean": always predicts the mean of the training set<br>* "median": always predicts the median of the training set<br>* "quantile": always predicts a specified quantile of the training set,<br>  provided with the quantile parameter.<br>* "constant": always predicts a constant value that is provided by<br>  the user.</span>
            </a>
        </td>
                <td class="value">&#x27;mean&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('constant',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.dummy.DummyRegressor.html#:~:text=constant,-int%20or%20float%20or%20array-like%20of%20shape%20%28n_outputs%2C%29%2C%20default%3DNone">
                constant
                <span class="param-doc-description">constant: int or float or array-like of shape (n_outputs,), default=None<br><br>The explicit constant as predicted by the "constant" strategy. This<br>parameter is useful only for the "constant" strategy.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('quantile',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.dummy.DummyRegressor.html#:~:text=quantile,-float%20in%20%5B0.0%2C%201.0%5D%2C%20default%3DNone">
                quantile
                <span class="param-doc-description">quantile: float in [0.0, 1.0], default=None<br><br>The quantile to predict using the "quantile" strategy. A quantile of<br>0.5 corresponds to the median, while 0.0 to the minimum and 1.0 to the<br>maximum.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div><script>function copyToClipboard(text, element) {
        // Get the parameter prefix from the closest toggleable content
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const fullParamName = paramPrefix ? `${paramPrefix}${text}` : text;

        const originalStyle = element.style;
        const computedStyle = window.getComputedStyle(element);
        const originalWidth = computedStyle.width;
        const originalHTML = element.innerHTML.replace('Copied!', '');

        navigator.clipboard.writeText(fullParamName)
            .then(() => {
                element.style.width = originalWidth;
                element.style.color = 'green';
                element.innerHTML = "Copied!";

                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            })
            .catch(err => {
                console.error('Failed to copy:', err);
                element.style.color = 'red';
                element.innerHTML = "Failed!";
                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            });
        return false;
    }

    document.querySelectorAll('.copy-paste-icon').forEach(function(element) {
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const paramName = element.parentElement.nextElementSibling
            .textContent.trim().split(' ')[0];
        const fullParamName = paramPrefix ? `${paramPrefix}${paramName}` : paramName;

        element.setAttribute('title', fullParamName);
    });


    /**
     * Adapted from Skrub
     * https://github.com/skrub-data/skrub/blob/403466d1d5d4dc76a7ef569b3f8228db59a31dc3/skrub/_reporting/_data/templates/report.js#L789
     * @returns "light" or "dark"
     */
    function detectTheme(element) {
        const body = document.querySelector('body');

        // Check VSCode theme
        const themeKindAttr = body.getAttribute('data-vscode-theme-kind');
        const themeNameAttr = body.getAttribute('data-vscode-theme-name');

        if (themeKindAttr && themeNameAttr) {
            const themeKind = themeKindAttr.toLowerCase();
            const themeName = themeNameAttr.toLowerCase();

            if (themeKind.includes("dark") || themeName.includes("dark")) {
                return "dark";
            }
            if (themeKind.includes("light") || themeName.includes("light")) {
                return "light";
            }
        }

        // Check Jupyter theme
        if (body.getAttribute('data-jp-theme-light') === 'false') {
            return 'dark';
        } else if (body.getAttribute('data-jp-theme-light') === 'true') {
            return 'light';
        }

        // Guess based on a parent element's color
        const color = window.getComputedStyle(element.parentNode, null).getPropertyValue('color');
        const match = color.match(/^rgb\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*$/i);
        if (match) {
            const [r, g, b] = [
                parseFloat(match[1]),
                parseFloat(match[2]),
                parseFloat(match[3])
            ];

            // https://en.wikipedia.org/wiki/HSL_and_HSV#Lightness
            const luma = 0.299 * r + 0.587 * g + 0.114 * b;

            if (luma > 180) {
                // If the text is very bright we have a dark theme
                return 'dark';
            }
            if (luma < 75) {
                // If the text is very dark we have a light theme
                return 'light';
            }
            // Otherwise fall back to the next heuristic.
        }

        // Fallback to system preference
        return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
    }


    function forceTheme(elementId) {
        const estimatorElement = document.querySelector(`#${elementId}`);
        if (estimatorElement === null) {
            console.error(`Element with id ${elementId} not found.`);
        } else {
            const theme = detectTheme(estimatorElement);
            estimatorElement.classList.add(theme);
        }
    }

    forceTheme('sk-container-id-11');</script></body>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 1147-1150

Note: Even though we pass ``df_train``, the
``DummyRegressor`` does not make use of its feature values. We could pass a
``df`` with entirely different values (but the same number of rows) and the
result would be the same.
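
To illustrate this note, here is a minimal sketch on synthetic data (the
arrays below are made up for illustration and are not the housing data). Two
completely different feature matrices lead to the same constant prediction,
namely the mean of the target:

.. code-block:: Python

    import numpy as np
    from sklearn.dummy import DummyRegressor

    rng = np.random.default_rng(0)
    y = rng.normal(size=50)
    X_a = rng.normal(size=(50, 3))  # random feature values
    X_b = np.zeros((50, 3))  # completely different feature values

    pred_a = DummyRegressor().fit(X_a, y).predict(X_a[:5])
    pred_b = DummyRegressor().fit(X_b, y).predict(X_b[:5])

    # Both models ignore the features and predict the mean of y
    print(np.allclose(pred_a, y.mean()), np.allclose(pred_a, pred_b))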

.. GENERATED FROM PYTHON SOURCE LINES 1153-1155

.. code-block:: Python

    -get_scorer("neg_root_mean_squared_error")(dummy, df_test, y_test)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    100528.43327419003


.. GENERATED FROM PYTHON SOURCE LINES 1156-1158

As a reminder, sklearn only provides the negated RMSE as a scorer (so that
a higher score is always better), so we negate the result to make it
positive again.
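
The sign convention can be checked with a small sketch. The data here are
synthetic and the variable names are chosen only for this illustration; the
RMSE is recomputed by hand from ``mean_squared_error`` for comparison:

.. code-block:: Python

    import numpy as np
    from sklearn.dummy import DummyRegressor
    from sklearn.metrics import get_scorer, mean_squared_error

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 2))
    y = rng.normal(loc=10.0, size=100)

    model = DummyRegressor().fit(X, y)
    scorer = get_scorer("neg_root_mean_squared_error")

    # The scorer returns the RMSE with a flipped sign
    neg_rmse = scorer(model, X, y)
    # RMSE computed manually for comparison
    rmse = np.sqrt(mean_squared_error(y, model.predict(X)))

    print(neg_rmse <= 0, np.isclose(-neg_rmse, rmse))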

.. GENERATED FROM PYTHON SOURCE LINES 1160-1162

Adding KNN predictions as features
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 1164-1171

Let’s get to the part where we want to add the predictions of the KNN
model as features for our final regression model. As we have discussed
above, we have some good evidence that the KNN predictions are well
suited to capture the spatial relationship between data points, but KNN
itself is not good at dealing with the other features, so we want to
use a model that can deal with those features on top of the KNN
predictions.

.. GENERATED FROM PYTHON SOURCE LINES 1173-1176

Does sklearn offer us a way of adding the predictions of one model as features
for another model? Indeed it does: this is achieved by using the
:class:`sklearn.ensemble.StackingRegressor`. To quote from the docs:

.. GENERATED FROM PYTHON SOURCE LINES 1178-1182

*Stacked generalization consists in stacking the output of individual
estimator [sic] and use a regressor to compute the final prediction.
Stacking allows to use the strength of each individual estimator by
using their output as input of a final estimator.*

.. GENERATED FROM PYTHON SOURCE LINES 1184-1187

In our case, that means that we use KNN for the "individual
estimator" part and then a different model, like an ensemble of
decision trees, for the "final estimator".

.. GENERATED FROM PYTHON SOURCE LINES 1189-1190

The documentation also states that:

.. GENERATED FROM PYTHON SOURCE LINES 1192-1195

*Note that estimators\_ are fitted on the full X while final_estimator\_
is trained using cross-validated predictions of the base estimators
using* ``cross_val_predict``.

.. GENERATED FROM PYTHON SOURCE LINES 1197-1208

This is important to know. It is crucial that the predictions from the
KNNs are calculated using ``cross_val_predict``, i.e. *out of
fold*; otherwise, we run the risk of overfitting. To understand
this point, let’s take an extreme example. Let’s say we use a KNN with
``n_neighbors=1``. If the predictions were calculated *in fold*,
this KNN would completely overfit on the training data and the
prediction would be perfect (okay, not quite, since we have duplicate
coordinates in the data). Then the final model would only rely on the
seemingly perfect KNN predictions, ignoring all other features. This
would be really bad. That’s why the "cross-validated
predictions" part is so crucial.
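
The extreme case can be reproduced with a short sketch. It assumes synthetic
data with unique points and a pure-noise target (not the housing data), so
there is nothing real to learn: in-fold predictions of a 1-nearest-neighbor
model still look perfect, whereas out-of-fold predictions reveal the actual
error.

.. code-block:: Python

    import numpy as np
    from sklearn.model_selection import cross_val_predict
    from sklearn.neighbors import KNeighborsRegressor

    rng = np.random.default_rng(0)
    X = rng.uniform(size=(200, 2))  # unique coordinates
    y = rng.normal(size=200)  # pure noise, unrelated to X

    knn = KNeighborsRegressor(n_neighbors=1)

    # In fold: each point is its own nearest neighbor
    pred_in_fold = knn.fit(X, y).predict(X)
    # Out of fold: each point is predicted by a model that never saw it
    pred_out_of_fold = cross_val_predict(knn, X, y, cv=5)

    err_in_fold = np.abs(pred_in_fold - y).mean()
    err_out_of_fold = np.abs(pred_out_of_fold - y).mean()
    print(err_in_fold, err_out_of_fold)

A final estimator trained on ``pred_in_fold`` would see a feature that
matches the target exactly, which is the overfitting trap described above.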

.. GENERATED FROM PYTHON SOURCE LINES 1211-1215

.. code-block:: Python


    # Note: We may be tempted to simply add the new feature to the ``DataFrame``
    # like this: ``df['pred_knn'] = knn.predict(df[['Longitude', 'Latitude']])``.


.. GENERATED FROM PYTHON SOURCE LINES 1216-1220

However, this is problematic. First of all, we need to make out-of-fold
predictions, as explained above. This code snippet would result in
overfitting. We would thus need to calculate the out-of-fold predictions
manually, which adds more (error-prone) custom code.

.. GENERATED FROM PYTHON SOURCE LINES 1222-1228

Second, let’s assume we want to deploy the final model. When we call it,
we need to pass all the features, i.e. we would need to generate the KNN
prediction before passing the features to the model. This requires even
more custom code to be added, making the whole application even more
error-prone. By sticking with the tools sklearn gives us, we avoid both
issues.

.. GENERATED FROM PYTHON SOURCE LINES 1230-1236

By default, ``StackingRegressor`` trains the final estimator only
on the predictions of the individual estimators. However, we want it to
be trained on all the other features too. This can be achieved by
setting ``StackingRegressor(..., passthrough=True)``, which will
*pass through* the original input and concatenate it with the
predictions from our KNN.
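
The effect of ``passthrough`` can be seen in a small sketch on synthetic
data. The helper ``make_stack`` and the choice of ``LinearRegression`` as the
final estimator are assumptions made only for this illustration. The
``transform`` method returns the inputs that the final estimator receives:

.. code-block:: Python

    import numpy as np
    from sklearn.ensemble import StackingRegressor
    from sklearn.linear_model import LinearRegression
    from sklearn.neighbors import KNeighborsRegressor

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 4))  # 4 original features
    y = X[:, 0] + rng.normal(scale=0.1, size=100)


    def make_stack(passthrough):
        # Hypothetical helper: a stack with a single KNN base estimator
        return StackingRegressor(
            estimators=[("knn", KNeighborsRegressor())],
            final_estimator=LinearRegression(),
            passthrough=passthrough,
        )


    # Without passthrough: only the KNN prediction reaches the final estimator
    n_without = make_stack(False).fit(X, y).transform(X).shape[1]
    # With passthrough: the KNN prediction plus the 4 original features
    n_with = make_stack(True).fit(X, y).transform(X).shape[1]
    print(n_without, n_with)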

.. GENERATED FROM PYTHON SOURCE LINES 1238-1245

Another issue we need to solve is that we want the final estimator to be
trained on all features, but the KNN is supposed to be only trained on
longitude and latitude. When we pass all features as ``X``, the
KNN would be trained on all these features, which, as we discussed,
wouldn’t be a good idea. If we only pass longitude and latitude as
``X``, the final estimator cannot make use of the other features.
What do we do?

.. GENERATED FROM PYTHON SOURCE LINES 1247-1258

The solution here is to pass all the features, but to put the KNN into a
``Pipeline`` with one step *selecting* only the longitude and
latitude, and the second step being the KNN itself. Unless we’re missing
something, sklearn does not directly provide a transformer that is only
used for selecting columns, but we can cobble one together ourselves.
For this, we choose the ``FunctionTransformer``, which is a
transformer that calls an arbitrary function. The function itself should
simply select the two columns ``["Longitude", "Latitude"]``. This can be
done using the ``itemgetter`` function from the built-in ``operator``
module. The resulting ``Pipeline`` looks like this:

.. GENERATED FROM PYTHON SOURCE LINES 1260-1267

.. code-block:: Python

    Pipeline(
        [
            ("select_cols", FunctionTransformer(itemgetter(["Longitude", "Latitude"]))),
            ("knn", KNeighborsRegressor()),
        ]
    )


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-container-id-12 {
      /* Definition of color scheme common for light and dark mode */
      --sklearn-color-text: #000;
      --sklearn-color-text-muted: #666;
      --sklearn-color-line: gray;
      /* Definition of color scheme for unfitted estimators */
      --sklearn-color-unfitted-level-0: #fff5e6;
      --sklearn-color-unfitted-level-1: #f6e4d2;
      --sklearn-color-unfitted-level-2: #ffe0b3;
      --sklearn-color-unfitted-level-3: chocolate;
      /* Definition of color scheme for fitted estimators */
      --sklearn-color-fitted-level-0: #f0f8ff;
      --sklearn-color-fitted-level-1: #d4ebff;
      --sklearn-color-fitted-level-2: #b3dbfd;
      --sklearn-color-fitted-level-3: cornflowerblue;
    }

    #sk-container-id-12.light {
      /* Specific color for light theme */
      --sklearn-color-text-on-default-background: black;
      --sklearn-color-background: white;
      --sklearn-color-border-box: black;
      --sklearn-color-icon: #696969;
    }

    #sk-container-id-12.dark {
      --sklearn-color-text-on-default-background: white;
      --sklearn-color-background: #111;
      --sklearn-color-border-box: white;
      --sklearn-color-icon: #878787;
    }

    #sk-container-id-12 {
      color: var(--sklearn-color-text);
    }

    #sk-container-id-12 pre {
      padding: 0;
    }

    #sk-container-id-12 input.sk-hidden--visually {
      border: 0;
      clip: rect(1px 1px 1px 1px);
      clip: rect(1px, 1px, 1px, 1px);
      height: 1px;
      margin: -1px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1px;
    }

    #sk-container-id-12 div.sk-dashed-wrapped {
      border: 1px dashed var(--sklearn-color-line);
      margin: 0 0.4em 0.5em 0.4em;
      box-sizing: border-box;
      padding-bottom: 0.4em;
      background-color: var(--sklearn-color-background);
    }

    #sk-container-id-12 div.sk-container {
      /* jupyter's `normalize.less` sets `[hidden] { display: none; }`
         but bootstrap.min.css set `[hidden] { display: none !important; }`
         so we also need the `!important` here to be able to override the
         default hidden behavior on the sphinx rendered scikit-learn.org.
         See: https://github.com/scikit-learn/scikit-learn/issues/21755 */
      display: inline-block !important;
      position: relative;
    }

    #sk-container-id-12 div.sk-text-repr-fallback {
      display: none;
    }

    div.sk-parallel-item,
    div.sk-serial,
    div.sk-item {
      /* draw centered vertical line to link estimators */
      background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));
      background-size: 2px 100%;
      background-repeat: no-repeat;
      background-position: center center;
    }

    /* Parallel-specific style estimator block */

    #sk-container-id-12 div.sk-parallel-item::after {
      content: "";
      width: 100%;
      border-bottom: 2px solid var(--sklearn-color-text-on-default-background);
      flex-grow: 1;
    }

    #sk-container-id-12 div.sk-parallel {
      display: flex;
      align-items: stretch;
      justify-content: center;
      background-color: var(--sklearn-color-background);
      position: relative;
    }

    #sk-container-id-12 div.sk-parallel-item {
      display: flex;
      flex-direction: column;
    }

    #sk-container-id-12 div.sk-parallel-item:first-child::after {
      align-self: flex-end;
      width: 50%;
    }

    #sk-container-id-12 div.sk-parallel-item:last-child::after {
      align-self: flex-start;
      width: 50%;
    }

    #sk-container-id-12 div.sk-parallel-item:only-child::after {
      width: 0;
    }

    /* Serial-specific style estimator block */

    #sk-container-id-12 div.sk-serial {
      display: flex;
      flex-direction: column;
      align-items: center;
      background-color: var(--sklearn-color-background);
      padding-right: 1em;
      padding-left: 1em;
    }


    /* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is
    clickable and can be expanded/collapsed.
    - Pipeline and ColumnTransformer use this feature and define the default style
    - Estimators will overwrite some part of the style using the `sk-estimator` class
    */

    /* Pipeline and ColumnTransformer style (default) */

    #sk-container-id-12 div.sk-toggleable {
      /* Default theme specific background. It is overwritten whether we have a
      specific estimator or a Pipeline/ColumnTransformer */
      background-color: var(--sklearn-color-background);
    }

    /* Toggleable label */
    #sk-container-id-12 label.sk-toggleable__label {
      cursor: pointer;
      display: flex;
      width: 100%;
      margin-bottom: 0;
      padding: 0.5em;
      box-sizing: border-box;
      text-align: center;
      align-items: center;
      justify-content: center;
      gap: 0.5em;
    }

    #sk-container-id-12 label.sk-toggleable__label .caption {
      font-size: 0.6rem;
      font-weight: lighter;
      color: var(--sklearn-color-text-muted);
    }

    #sk-container-id-12 label.sk-toggleable__label-arrow:before {
      /* Arrow on the left of the label */
      content: "▸";
      float: left;
      margin-right: 0.25em;
      color: var(--sklearn-color-icon);
    }

    #sk-container-id-12 label.sk-toggleable__label-arrow:hover:before {
      color: var(--sklearn-color-text);
    }

    /* Toggleable content - dropdown */

    #sk-container-id-12 div.sk-toggleable__content {
      display: none;
      text-align: left;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-12 div.sk-toggleable__content.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-12 div.sk-toggleable__content pre {
      margin: 0.2em;
      border-radius: 0.25em;
      color: var(--sklearn-color-text);
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-12 div.sk-toggleable__content.fitted pre {
      /* unfitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-12 input.sk-toggleable__control:checked~div.sk-toggleable__content {
      /* Expand drop-down */
      display: block;
      width: 100%;
      overflow: visible;
    }

    #sk-container-id-12 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {
      content: "▾";
    }

    /* Pipeline/ColumnTransformer-specific style */

    #sk-container-id-12 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-12 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator-specific style */

    /* Colorize estimator box */
    #sk-container-id-12 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-12 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    #sk-container-id-12 div.sk-label label.sk-toggleable__label,
    #sk-container-id-12 div.sk-label label {
      /* The background is the default theme color */
      color: var(--sklearn-color-text-on-default-background);
    }

    /* On hover, darken the color of the background */
    #sk-container-id-12 div.sk-label:hover label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    /* Label box, darken color on hover, fitted */
    #sk-container-id-12 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator label */

    #sk-container-id-12 div.sk-label label {
      font-family: monospace;
      font-weight: bold;
      line-height: 1.2em;
    }

    #sk-container-id-12 div.sk-label-container {
      text-align: center;
    }

    /* Estimator-specific */
    #sk-container-id-12 div.sk-estimator {
      font-family: monospace;
      border: 1px dotted var(--sklearn-color-border-box);
      border-radius: 0.25em;
      box-sizing: border-box;
      margin-bottom: 0.5em;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-12 div.sk-estimator.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    /* on hover */
    #sk-container-id-12 div.sk-estimator:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-12 div.sk-estimator.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Specification for estimator info (e.g. "i" and "?") */

    /* Common style for "i" and "?" */

    .sk-estimator-doc-link,
    a:link.sk-estimator-doc-link,
    a:visited.sk-estimator-doc-link {
      float: right;
      font-size: smaller;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1em;
      height: 1em;
      width: 1em;
      text-decoration: none !important;
      margin-left: 0.5em;
      text-align: center;
      /* unfitted */
      border: var(--sklearn-color-unfitted-level-3) 1pt solid;
      color: var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted,
    a:link.sk-estimator-doc-link.fitted,
    a:visited.sk-estimator-doc-link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3) 1pt solid;
      color: var(--sklearn-color-fitted-level-3);
    }

    /* On hover */
    div.sk-estimator:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover,
    div.sk-label-container:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-unfitted-level-0);
      text-decoration: none;
    }

    div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover,
    div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-fitted-level-0);
      text-decoration: none;
    }

    /* Span, style for the box shown on hovering the info icon */
    .sk-estimator-doc-link span {
      display: none;
      z-index: 9999;
      position: relative;
      font-weight: normal;
      right: .2ex;
      padding: .5ex;
      margin: .5ex;
      width: min-content;
      min-width: 20ex;
      max-width: 50ex;
      color: var(--sklearn-color-text);
      box-shadow: 2pt 2pt 4pt #999;
      /* unfitted */
      background: var(--sklearn-color-unfitted-level-0);
      border: .5pt solid var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted span {
      /* fitted */
      background: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3);
    }

    .sk-estimator-doc-link:hover span {
      display: block;
    }

    /* "?"-specific style due to the `<a>` HTML tag */

    #sk-container-id-12 a.estimator_doc_link {
      float: right;
      font-size: 1rem;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1rem;
      height: 1rem;
      width: 1rem;
      text-decoration: none;
      /* unfitted */
      color: var(--sklearn-color-unfitted-level-1);
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
    }

    #sk-container-id-12 a.estimator_doc_link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    #sk-container-id-12 a.estimator_doc_link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    #sk-container-id-12 a.estimator_doc_link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
    }

    .estimator-table {
        font-family: monospace;
    }

    .estimator-table summary {
        padding: .5rem;
        cursor: pointer;
    }

    .estimator-table summary::marker {
        font-size: 0.7rem;
    }

    .estimator-table details[open] {
        padding-left: 0.1rem;
        padding-right: 0.1rem;
        padding-bottom: 0.3rem;
    }

    .estimator-table .parameters-table {
        margin-left: auto !important;
        margin-right: auto !important;
        margin-top: 0;
    }

    .estimator-table .parameters-table tr:nth-child(odd) {
        background-color: #fff;
    }

    .estimator-table .parameters-table tr:nth-child(even) {
        background-color: #f6f6f6;
    }

    .estimator-table .parameters-table tr:hover {
        background-color: #e0e0e0;
    }

    .estimator-table table td {
        border: 1px solid rgba(106, 105, 104, 0.232);
    }

    /*
        `table td` is set in notebook with right text-align.
        We need to overwrite it.
    */
    .estimator-table table td.param {
        text-align: left;
        position: relative;
        padding: 0;
    }

    .user-set td {
        color:rgb(255, 94, 0);
        text-align: left !important;
    }

    .user-set td.value {
        color:rgb(255, 94, 0);
        background-color: transparent;
    }

    .default td {
        color: black;
        text-align: left !important;
    }

    .user-set td i,
    .default td i {
        color: black;
    }

    /*
        Styles for parameter documentation links
        We need styling for visited so jupyter doesn't overwrite it
    */
    a.param-doc-link,
    a.param-doc-link:link,
    a.param-doc-link:visited {
        text-decoration: underline dashed;
        text-underline-offset: .3em;
        color: inherit;
        display: block;
        padding: .5em;
    }

    /* "hack" to make the entire area of the cell containing the link clickable */
    a.param-doc-link::before {
        position: absolute;
        content: "";
        inset: 0;
    }

    .param-doc-description {
        display: none;
        position: absolute;
        z-index: 9999;
        left: 0;
        padding: .5ex;
        margin-left: 1.5em;
        color: var(--sklearn-color-text);
        box-shadow: .3em .3em .4em #999;
        width: max-content;
        text-align: left;
        max-height: 10em;
        overflow-y: auto;

        /* unfitted */
        background: var(--sklearn-color-unfitted-level-0);
        border: thin solid var(--sklearn-color-unfitted-level-3);
    }

    /* Fitted state for parameter tooltips */
    .fitted .param-doc-description {
        /* fitted */
        background: var(--sklearn-color-fitted-level-0);
        border: thin solid var(--sklearn-color-fitted-level-3);
    }

    .param-doc-link:hover .param-doc-description {
        display: block;
    }

    .copy-paste-icon {
        background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCA0NDggNTEyIj48IS0tIUZvbnQgQXdlc29tZSBGcmVlIDYuNy4yIGJ5IEBmb250YXdlc29tZSAtIGh0dHBzOi8vZm9udGF3ZXNvbWUuY29tIExpY2Vuc2UgLSBodHRwczovL2ZvbnRhd2Vzb21lLmNvbS9saWNlbnNlL2ZyZWUgQ29weXJpZ2h0IDIwMjUgRm9udGljb25zLCBJbmMuLS0+PHBhdGggZD0iTTIwOCAwTDMzMi4xIDBjMTIuNyAwIDI0LjkgNS4xIDMzLjkgMTQuMWw2Ny45IDY3LjljOSA5IDE0LjEgMjEuMiAxNC4xIDMzLjlMNDQ4IDMzNmMwIDI2LjUtMjEuNSA0OC00OCA0OGwtMTkyIDBjLTI2LjUgMC00OC0yMS41LTQ4LTQ4bDAtMjg4YzAtMjYuNSAyMS41LTQ4IDQ4LTQ4ek00OCAxMjhsODAgMCAwIDY0LTY0IDAgMCAyNTYgMTkyIDAgMC0zMiA2NCAwIDAgNDhjMCAyNi41LTIxLjUgNDgtNDggNDhMNDggNTEyYy0yNi41IDAtNDgtMjEuNS00OC00OEwwIDE3NmMwLTI2LjUgMjEuNS00OCA0OC00OHoiLz48L3N2Zz4=);
        background-repeat: no-repeat;
        background-size: 14px 14px;
        background-position: 0;
        display: inline-block;
        width: 14px;
        height: 14px;
        cursor: pointer;
    }
    </style><body><div id="sk-container-id-12" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>Pipeline(steps=[(&#x27;select_cols&#x27;,
                     FunctionTransformer(func=operator.itemgetter([&#x27;Longitude&#x27;, &#x27;Latitude&#x27;]))),
                    (&#x27;knn&#x27;, KNeighborsRegressor())])</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label  sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-22" type="checkbox" ><label for="sk-estimator-id-22" class="sk-toggleable__label  sk-toggleable__label-arrow"><div><div>Pipeline</div></div><div><a class="sk-estimator-doc-link " rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.pipeline.Pipeline.html">?<span>Documentation for Pipeline</span></a><span class="sk-estimator-doc-link ">i<span>Not fitted</span></span></div></label><div class="sk-toggleable__content " data-param-prefix="">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('steps',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.pipeline.Pipeline.html#:~:text=steps,-list%20of%20tuples">
                steps
                <span class="param-doc-description">steps: list of tuples<br><br>List of (name of step, estimator) tuples that are to be chained in<br>sequential order. To be compatible with the scikit-learn API, all steps<br>must define `fit`. All non-last steps must also define `transform`. See<br>:ref:`Combining Estimators <combining_estimators>` for more details.</span>
            </a>
        </td>
                <td class="value">[(&#x27;select_cols&#x27;, ...), (&#x27;knn&#x27;, ...)]</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('transform_input',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.pipeline.Pipeline.html#:~:text=transform_input,-list%20of%20str%2C%20default%3DNone">
                transform_input
                <span class="param-doc-description">transform_input: list of str, default=None<br><br>The names of the :term:`metadata` parameters that should be transformed by the<br>pipeline before passing it to the step consuming it.<br><br>This enables transforming some input arguments to ``fit`` (other than ``X``)<br>to be transformed by the steps of the pipeline up to the step which requires<br>them. Requirement is defined via :ref:`metadata routing <metadata_routing>`.<br>For instance, this can be used to pass a validation set through the pipeline.<br><br>You can only set this if metadata routing is enabled, which you<br>can enable using ``sklearn.set_config(enable_metadata_routing=True)``.<br><br>.. versionadded:: 1.6</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('memory',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.pipeline.Pipeline.html#:~:text=memory,-str%20or%20object%20with%20the%20joblib.Memory%20interface%2C%20default%3DNone">
                memory
                <span class="param-doc-description">memory: str or object with the joblib.Memory interface, default=None<br><br>Used to cache the fitted transformers of the pipeline. The last step<br>will never be cached, even if it is a transformer. By default, no<br>caching is performed. If a string is given, it is the path to the<br>caching directory. Enabling caching triggers a clone of the transformers<br>before fitting. Therefore, the transformer instance given to the<br>pipeline cannot be inspected directly. Use the attribute ``named_steps``<br>or ``steps`` to inspect estimators within the pipeline. Caching the<br>transformers is advantageous when fitting is time consuming. See<br>:ref:`sphx_glr_auto_examples_neighbors_plot_caching_nearest_neighbors.py`<br>for an example on how to enable caching.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.pipeline.Pipeline.html#:~:text=verbose,-bool%2C%20default%3DFalse">
                verbose
                <span class="param-doc-description">verbose: bool, default=False<br><br>If True, the time elapsed while fitting each step will be printed as it<br>is completed.</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator  sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-23" type="checkbox" ><label for="sk-estimator-id-23" class="sk-toggleable__label  sk-toggleable__label-arrow"><div><div>itemgetter(...)</div><div class="caption">FunctionTransformer</div></div><div><a class="sk-estimator-doc-link " rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.preprocessing.FunctionTransformer.html">?<span>Documentation for FunctionTransformer</span></a></div></label><div class="sk-toggleable__content " data-param-prefix="select_cols__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('func',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.preprocessing.FunctionTransformer.html#:~:text=func,-callable%2C%20default%3DNone">
                func
                <span class="param-doc-description">func: callable, default=None<br><br>The callable to use for the transformation. This will be passed<br>the same arguments as transform, with args and kwargs forwarded.<br>If func is None, then func will be the identity function.</span>
            </a>
        </td>
                <td class="value">operator.item..., &#x27;Latitude&#x27;])</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('inverse_func',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.preprocessing.FunctionTransformer.html#:~:text=inverse_func,-callable%2C%20default%3DNone">
                inverse_func
                <span class="param-doc-description">inverse_func: callable, default=None<br><br>The callable to use for the inverse transformation. This will be<br>passed the same arguments as inverse transform, with args and<br>kwargs forwarded. If inverse_func is None, then inverse_func<br>will be the identity function.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('validate',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.preprocessing.FunctionTransformer.html#:~:text=validate,-bool%2C%20default%3DFalse">
                validate
                <span class="param-doc-description">validate: bool, default=False<br><br>Indicate that the input X array should be checked before calling<br>``func``. The possibilities are:<br><br>- If False, there is no input validation.<br>- If True, then X will be converted to a 2-dimensional NumPy array or<br>  sparse matrix. If the conversion is not possible an exception is<br>  raised.<br><br>.. versionchanged:: 0.22<br>   The default of ``validate`` changed from True to False.</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('accept_sparse',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.preprocessing.FunctionTransformer.html#:~:text=accept_sparse,-bool%2C%20default%3DFalse">
                accept_sparse
                <span class="param-doc-description">accept_sparse: bool, default=False<br><br>Indicate that func accepts a sparse matrix as input. If validate is<br>False, this has no effect. Otherwise, if accept_sparse is false,<br>sparse matrix inputs will cause an exception to be raised.</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('check_inverse',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.preprocessing.FunctionTransformer.html#:~:text=check_inverse,-bool%2C%20default%3DTrue">
                check_inverse
                <span class="param-doc-description">check_inverse: bool, default=True<br><br>Whether to check that ``func`` followed by ``inverse_func`` leads to<br>the original inputs. It can be used for a sanity check, raising a<br>warning when the condition is not fulfilled.<br><br>.. versionadded:: 0.20</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('feature_names_out',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.preprocessing.FunctionTransformer.html#:~:text=feature_names_out,-callable%2C%20%27one-to-one%27%20or%20None%2C%20default%3DNone">
                feature_names_out
                <span class="param-doc-description">feature_names_out: callable, 'one-to-one' or None, default=None<br><br>Determines the list of feature names that will be returned by the<br>`get_feature_names_out` method. If it is 'one-to-one', then the output<br>feature names will be equal to the input feature names. If it is a<br>callable, then it must take two positional arguments: this<br>`FunctionTransformer` (`self`) and an array-like of input feature names<br>(`input_features`). It must return an array-like of output feature<br>names. The `get_feature_names_out` method is only defined if<br>`feature_names_out` is not None.<br><br>See ``get_feature_names_out`` for more details.<br><br>.. versionadded:: 1.1</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('kw_args',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.preprocessing.FunctionTransformer.html#:~:text=kw_args,-dict%2C%20default%3DNone">
                kw_args
                <span class="param-doc-description">kw_args: dict, default=None<br><br>Dictionary of additional keyword arguments to pass to func.<br><br>.. versionadded:: 0.18</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('inv_kw_args',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.preprocessing.FunctionTransformer.html#:~:text=inv_kw_args,-dict%2C%20default%3DNone">
                inv_kw_args
                <span class="param-doc-description">inv_kw_args: dict, default=None<br><br>Dictionary of additional keyword arguments to pass to inverse_func.<br><br>.. versionadded:: 0.18</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-item"><div class="sk-estimator  sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-24" type="checkbox" ><label for="sk-estimator-id-24" class="sk-toggleable__label  sk-toggleable__label-arrow"><div><div>KNeighborsRegressor</div></div><div><a class="sk-estimator-doc-link " rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html">?<span>Documentation for KNeighborsRegressor</span></a></div></label><div class="sk-toggleable__content " data-param-prefix="knn__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_neighbors',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_neighbors,-int%2C%20default%3D5">
                n_neighbors
                <span class="param-doc-description">n_neighbors: int, default=5<br><br>Number of neighbors to use by default for :meth:`kneighbors` queries.</span>
            </a>
        </td>
                <td class="value">5</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=weights,-%7B%27uniform%27%2C%20%27distance%27%7D%2C%20callable%20or%20None%2C%20default%3D%27uniform%27">
                weights
                <span class="param-doc-description">weights: {'uniform', 'distance'}, callable or None, default='uniform'<br><br>Weight function used in prediction.  Possible values:<br><br>- 'uniform' : uniform weights.  All points in each neighborhood<br>  are weighted equally.<br>- 'distance' : weight points by the inverse of their distance.<br>  in this case, closer neighbors of a query point will have a<br>  greater influence than neighbors which are further away.<br>- [callable] : a user-defined function which accepts an<br>  array of distances, and returns an array of the same shape<br>  containing the weights.<br><br>Uniform weights are used by default.<br><br>See the following example for a demonstration of the impact of<br>different weighting schemes on predictions:<br>:ref:`sphx_glr_auto_examples_neighbors_plot_regression.py`.</span>
            </a>
        </td>
                <td class="value">&#x27;uniform&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('algorithm',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=algorithm,-%7B%27auto%27%2C%20%27ball_tree%27%2C%20%27kd_tree%27%2C%20%27brute%27%7D%2C%20default%3D%27auto%27">
                algorithm
                <span class="param-doc-description">algorithm: {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'<br><br>Algorithm used to compute the nearest neighbors:<br><br>- 'ball_tree' will use :class:`BallTree`<br>- 'kd_tree' will use :class:`KDTree`<br>- 'brute' will use a brute-force search.<br>- 'auto' will attempt to decide the most appropriate algorithm<br>  based on the values passed to :meth:`fit` method.<br><br>Note: fitting on sparse input will override the setting of<br>this parameter, using brute force.</span>
            </a>
        </td>
                <td class="value">&#x27;auto&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('leaf_size',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=leaf_size,-int%2C%20default%3D30">
                leaf_size
                <span class="param-doc-description">leaf_size: int, default=30<br><br>Leaf size passed to BallTree or KDTree.  This can affect the<br>speed of the construction and query, as well as the memory<br>required to store the tree.  The optimal value depends on the<br>nature of the problem.</span>
            </a>
        </td>
                <td class="value">30</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('p',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=p,-float%2C%20default%3D2">
                p
                <span class="param-doc-description">p: float, default=2<br><br>Power parameter for the Minkowski metric. When p = 1, this is<br>equivalent to using manhattan_distance (l1), and euclidean_distance<br>(l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.</span>
            </a>
        </td>
                <td class="value">2</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric,-str%2C%20DistanceMetric%20object%20or%20callable%2C%20default%3D%27minkowski%27">
                metric
                <span class="param-doc-description">metric: str, DistanceMetric object or callable, default='minkowski'<br><br>Metric to use for distance computation. Default is "minkowski", which<br>results in the standard Euclidean distance when p = 2. See the<br>documentation of `scipy.spatial.distance<br><https://docs.scipy.org/doc/scipy/reference/spatial.distance.html>`_ and<br>the metrics listed in<br>:class:`~sklearn.metrics.pairwise.distance_metrics` for valid metric<br>values.<br><br>If metric is "precomputed", X is assumed to be a distance matrix and<br>must be square during fit. X may be a :term:`sparse graph`, in which<br>case only "nonzero" elements may be considered neighbors.<br><br>If metric is a callable function, it takes two arrays representing 1D<br>vectors as inputs and must return one value indicating the distance<br>between those vectors. This works for Scipy's metrics, but is less<br>efficient than passing the metric name as a string.<br><br>If metric is a DistanceMetric object, it will be passed directly to<br>the underlying computation routines.</span>
            </a>
        </td>
                <td class="value">&#x27;minkowski&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric_params',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric_params,-dict%2C%20default%3DNone">
                metric_params
                <span class="param-doc-description">metric_params: dict, default=None<br><br>Additional keyword arguments for the metric function.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of parallel jobs to run for neighbors search.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.<br>Doesn't affect :meth:`fit` method.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div></div></div><script>function copyToClipboard(text, element) {
        // Get the parameter prefix from the closest toggleable content
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const fullParamName = paramPrefix ? `${paramPrefix}${text}` : text;

        const originalStyle = element.style;
        const computedStyle = window.getComputedStyle(element);
        const originalWidth = computedStyle.width;
        const originalHTML = element.innerHTML.replace('Copied!', '');

        navigator.clipboard.writeText(fullParamName)
            .then(() => {
                element.style.width = originalWidth;
                element.style.color = 'green';
                element.innerHTML = "Copied!";

                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            })
            .catch(err => {
                console.error('Failed to copy:', err);
                element.style.color = 'red';
                element.innerHTML = "Failed!";
                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            });
        return false;
    }

    document.querySelectorAll('.copy-paste-icon').forEach(function(element) {
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const paramName = element.parentElement.nextElementSibling
            .textContent.trim().split(' ')[0];
        const fullParamName = paramPrefix ? `${paramPrefix}${paramName}` : paramName;

        element.setAttribute('title', fullParamName);
    });


    /**
     * Adapted from Skrub
     * https://github.com/skrub-data/skrub/blob/403466d1d5d4dc76a7ef569b3f8228db59a31dc3/skrub/_reporting/_data/templates/report.js#L789
     * @returns "light" or "dark"
     */
    function detectTheme(element) {
        const body = document.querySelector('body');

        // Check VSCode theme
        const themeKindAttr = body.getAttribute('data-vscode-theme-kind');
        const themeNameAttr = body.getAttribute('data-vscode-theme-name');

        if (themeKindAttr && themeNameAttr) {
            const themeKind = themeKindAttr.toLowerCase();
            const themeName = themeNameAttr.toLowerCase();

            if (themeKind.includes("dark") || themeName.includes("dark")) {
                return "dark";
            }
            if (themeKind.includes("light") || themeName.includes("light")) {
                return "light";
            }
        }

        // Check Jupyter theme
        if (body.getAttribute('data-jp-theme-light') === 'false') {
            return 'dark';
        } else if (body.getAttribute('data-jp-theme-light') === 'true') {
            return 'light';
        }

        // Guess based on a parent element's color
        const color = window.getComputedStyle(element.parentNode, null).getPropertyValue('color');
        const match = color.match(/^rgb\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*$/i);
        if (match) {
            const [r, g, b] = [
                parseFloat(match[1]),
                parseFloat(match[2]),
                parseFloat(match[3])
            ];

            // https://en.wikipedia.org/wiki/HSL_and_HSV#Lightness
            const luma = 0.299 * r + 0.587 * g + 0.114 * b;

            if (luma > 180) {
                // If the text is very bright we have a dark theme
                return 'dark';
            }
            if (luma < 75) {
                // If the text is very dark we have a light theme
                return 'light';
            }
            // Otherwise fall back to the next heuristic.
        }

        // Fallback to system preference
        return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
    }


    function forceTheme(elementId) {
        const estimatorElement = document.querySelector(`#${elementId}`);
        if (estimatorElement === null) {
            console.error(`Element with id ${elementId} not found.`);
        } else {
            const theme = detectTheme(estimatorElement);
            estimatorElement.classList.add(theme);
        }
    }

    forceTheme('sk-container-id-12');</script></body>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 1268-1274

Note: Alternatively, we could have used sklearn’s
``ColumnTransformer``, which has a built-in way of selecting
columns. It always applies a transformer to the selected columns,
which we don’t need here, but setting that transformer to
``"passthrough"`` forwards the columns unchanged. Since ``remainder``
defaults to ``"drop"``, all other columns are discarded. The end
result is:

.. GENERATED FROM PYTHON SOURCE LINES 1276-1288

.. code-block:: Python

    Pipeline(
        [
            (
                "select_cols",
                ColumnTransformer(
                    [("long_and_lat", "passthrough", ["Longitude", "Latitude"])]
                ),
            ),
            ("knn", KNeighborsRegressor()),
        ]
    )

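To convince ourselves that this step only selects columns, here is a
minimal sketch on a tiny, made-up DataFrame. The values and the names
``df_toy``, ``selector`` and ``selected`` are purely illustrative and
not part of the exercise; only the column names mirror the dataset.

.. code-block:: Python

    import pandas as pd
    from sklearn.compose import ColumnTransformer

    # Toy data with invented values; only the column names match the dataset.
    df_toy = pd.DataFrame(
        {
            "MedInc": [8.3, 7.2, 5.6],
            "Longitude": [-122.23, -122.22, -122.24],
            "Latitude": [37.88, 37.86, 37.85],
        }
    )

    # "passthrough" forwards the listed columns unchanged; the default
    # remainder="drop" discards every column that is not listed.
    selector = ColumnTransformer(
        [("long_and_lat", "passthrough", ["Longitude", "Latitude"])]
    )
    selected = selector.fit_transform(df_toy)

    # Three rows are kept, but only the two selected columns remain.
    print(selected.shape)

The ``MedInc`` column is gone from the result, which is the same
effect we want for the input to the ``KNeighborsRegressor``.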

.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-container-id-13 {
      /* Definition of color scheme common for light and dark mode */
      --sklearn-color-text: #000;
      --sklearn-color-text-muted: #666;
      --sklearn-color-line: gray;
      /* Definition of color scheme for unfitted estimators */
      --sklearn-color-unfitted-level-0: #fff5e6;
      --sklearn-color-unfitted-level-1: #f6e4d2;
      --sklearn-color-unfitted-level-2: #ffe0b3;
      --sklearn-color-unfitted-level-3: chocolate;
      /* Definition of color scheme for fitted estimators */
      --sklearn-color-fitted-level-0: #f0f8ff;
      --sklearn-color-fitted-level-1: #d4ebff;
      --sklearn-color-fitted-level-2: #b3dbfd;
      --sklearn-color-fitted-level-3: cornflowerblue;
    }

    #sk-container-id-13.light {
      /* Specific color for light theme */
      --sklearn-color-text-on-default-background: black;
      --sklearn-color-background: white;
      --sklearn-color-border-box: black;
      --sklearn-color-icon: #696969;
    }

    #sk-container-id-13.dark {
      --sklearn-color-text-on-default-background: white;
      --sklearn-color-background: #111;
      --sklearn-color-border-box: white;
      --sklearn-color-icon: #878787;
    }

    #sk-container-id-13 {
      color: var(--sklearn-color-text);
    }

    #sk-container-id-13 pre {
      padding: 0;
    }

    #sk-container-id-13 input.sk-hidden--visually {
      border: 0;
      clip: rect(1px 1px 1px 1px);
      clip: rect(1px, 1px, 1px, 1px);
      height: 1px;
      margin: -1px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1px;
    }

    #sk-container-id-13 div.sk-dashed-wrapped {
      border: 1px dashed var(--sklearn-color-line);
      margin: 0 0.4em 0.5em 0.4em;
      box-sizing: border-box;
      padding-bottom: 0.4em;
      background-color: var(--sklearn-color-background);
    }

    #sk-container-id-13 div.sk-container {
      /* jupyter's `normalize.less` sets `[hidden] { display: none; }`
         but bootstrap.min.css set `[hidden] { display: none !important; }`
         so we also need the `!important` here to be able to override the
         default hidden behavior on the sphinx rendered scikit-learn.org.
         See: https://github.com/scikit-learn/scikit-learn/issues/21755 */
      display: inline-block !important;
      position: relative;
    }

    #sk-container-id-13 div.sk-text-repr-fallback {
      display: none;
    }

    div.sk-parallel-item,
    div.sk-serial,
    div.sk-item {
      /* draw centered vertical line to link estimators */
      background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));
      background-size: 2px 100%;
      background-repeat: no-repeat;
      background-position: center center;
    }

    /* Parallel-specific style estimator block */

    #sk-container-id-13 div.sk-parallel-item::after {
      content: "";
      width: 100%;
      border-bottom: 2px solid var(--sklearn-color-text-on-default-background);
      flex-grow: 1;
    }

    #sk-container-id-13 div.sk-parallel {
      display: flex;
      align-items: stretch;
      justify-content: center;
      background-color: var(--sklearn-color-background);
      position: relative;
    }

    #sk-container-id-13 div.sk-parallel-item {
      display: flex;
      flex-direction: column;
    }

    #sk-container-id-13 div.sk-parallel-item:first-child::after {
      align-self: flex-end;
      width: 50%;
    }

    #sk-container-id-13 div.sk-parallel-item:last-child::after {
      align-self: flex-start;
      width: 50%;
    }

    #sk-container-id-13 div.sk-parallel-item:only-child::after {
      width: 0;
    }

    /* Serial-specific style estimator block */

    #sk-container-id-13 div.sk-serial {
      display: flex;
      flex-direction: column;
      align-items: center;
      background-color: var(--sklearn-color-background);
      padding-right: 1em;
      padding-left: 1em;
    }


    /* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is
    clickable and can be expanded/collapsed.
    - Pipeline and ColumnTransformer use this feature and define the default style
    - Estimators will overwrite some part of the style using the `sk-estimator` class
    */

    /* Pipeline and ColumnTransformer style (default) */

    #sk-container-id-13 div.sk-toggleable {
      /* Default theme specific background. It is overwritten whether we have a
      specific estimator or a Pipeline/ColumnTransformer */
      background-color: var(--sklearn-color-background);
    }

    /* Toggleable label */
    #sk-container-id-13 label.sk-toggleable__label {
      cursor: pointer;
      display: flex;
      width: 100%;
      margin-bottom: 0;
      padding: 0.5em;
      box-sizing: border-box;
      text-align: center;
      align-items: center;
      justify-content: center;
      gap: 0.5em;
    }

    #sk-container-id-13 label.sk-toggleable__label .caption {
      font-size: 0.6rem;
      font-weight: lighter;
      color: var(--sklearn-color-text-muted);
    }

    #sk-container-id-13 label.sk-toggleable__label-arrow:before {
      /* Arrow on the left of the label */
      content: "▸";
      float: left;
      margin-right: 0.25em;
      color: var(--sklearn-color-icon);
    }

    #sk-container-id-13 label.sk-toggleable__label-arrow:hover:before {
      color: var(--sklearn-color-text);
    }

    /* Toggleable content - dropdown */

    #sk-container-id-13 div.sk-toggleable__content {
      display: none;
      text-align: left;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-13 div.sk-toggleable__content.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-13 div.sk-toggleable__content pre {
      margin: 0.2em;
      border-radius: 0.25em;
      color: var(--sklearn-color-text);
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-13 div.sk-toggleable__content.fitted pre {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-13 input.sk-toggleable__control:checked~div.sk-toggleable__content {
      /* Expand drop-down */
      display: block;
      width: 100%;
      overflow: visible;
    }

    #sk-container-id-13 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {
      content: "▾";
    }

    /* Pipeline/ColumnTransformer-specific style */

    #sk-container-id-13 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-13 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator-specific style */

    /* Colorize estimator box */
    #sk-container-id-13 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-13 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    #sk-container-id-13 div.sk-label label.sk-toggleable__label,
    #sk-container-id-13 div.sk-label label {
      /* The background is the default theme color */
      color: var(--sklearn-color-text-on-default-background);
    }

    /* On hover, darken the color of the background */
    #sk-container-id-13 div.sk-label:hover label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    /* Label box, darken color on hover, fitted */
    #sk-container-id-13 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator label */

    #sk-container-id-13 div.sk-label label {
      font-family: monospace;
      font-weight: bold;
      line-height: 1.2em;
    }

    #sk-container-id-13 div.sk-label-container {
      text-align: center;
    }

    /* Estimator-specific */
    #sk-container-id-13 div.sk-estimator {
      font-family: monospace;
      border: 1px dotted var(--sklearn-color-border-box);
      border-radius: 0.25em;
      box-sizing: border-box;
      margin-bottom: 0.5em;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-13 div.sk-estimator.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    /* on hover */
    #sk-container-id-13 div.sk-estimator:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-13 div.sk-estimator.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Specification for estimator info (e.g. "i" and "?") */

    /* Common style for "i" and "?" */

    .sk-estimator-doc-link,
    a:link.sk-estimator-doc-link,
    a:visited.sk-estimator-doc-link {
      float: right;
      font-size: smaller;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1em;
      height: 1em;
      width: 1em;
      text-decoration: none !important;
      margin-left: 0.5em;
      text-align: center;
      /* unfitted */
      border: var(--sklearn-color-unfitted-level-3) 1pt solid;
      color: var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted,
    a:link.sk-estimator-doc-link.fitted,
    a:visited.sk-estimator-doc-link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3) 1pt solid;
      color: var(--sklearn-color-fitted-level-3);
    }

    /* On hover */
    div.sk-estimator:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover,
    div.sk-label-container:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-unfitted-level-0);
      text-decoration: none;
    }

    div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover,
    div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-fitted-level-0);
      text-decoration: none;
    }

    /* Span, style for the box shown on hovering the info icon */
    .sk-estimator-doc-link span {
      display: none;
      z-index: 9999;
      position: relative;
      font-weight: normal;
      right: .2ex;
      padding: .5ex;
      margin: .5ex;
      width: min-content;
      min-width: 20ex;
      max-width: 50ex;
      color: var(--sklearn-color-text);
      box-shadow: 2pt 2pt 4pt #999;
      /* unfitted */
      background: var(--sklearn-color-unfitted-level-0);
      border: .5pt solid var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted span {
      /* fitted */
      background: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3);
    }

    .sk-estimator-doc-link:hover span {
      display: block;
    }

    /* "?"-specific style due to the `<a>` HTML tag */

    #sk-container-id-13 a.estimator_doc_link {
      float: right;
      font-size: 1rem;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1rem;
      height: 1rem;
      width: 1rem;
      text-decoration: none;
      /* unfitted */
      color: var(--sklearn-color-unfitted-level-1);
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
    }

    #sk-container-id-13 a.estimator_doc_link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    #sk-container-id-13 a.estimator_doc_link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    #sk-container-id-13 a.estimator_doc_link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
    }

    .estimator-table {
        font-family: monospace;
    }

    .estimator-table summary {
        padding: .5rem;
        cursor: pointer;
    }

    .estimator-table summary::marker {
        font-size: 0.7rem;
    }

    .estimator-table details[open] {
        padding-left: 0.1rem;
        padding-right: 0.1rem;
        padding-bottom: 0.3rem;
    }

    .estimator-table .parameters-table {
        margin-left: auto !important;
        margin-right: auto !important;
        margin-top: 0;
    }

    .estimator-table .parameters-table tr:nth-child(odd) {
        background-color: #fff;
    }

    .estimator-table .parameters-table tr:nth-child(even) {
        background-color: #f6f6f6;
    }

    .estimator-table .parameters-table tr:hover {
        background-color: #e0e0e0;
    }

    .estimator-table table td {
        border: 1px solid rgba(106, 105, 104, 0.232);
    }

    /*
        `table td` is set in notebook with right text-align.
        We need to overwrite it.
    */
    .estimator-table table td.param {
        text-align: left;
        position: relative;
        padding: 0;
    }

    .user-set td {
        color:rgb(255, 94, 0);
        text-align: left !important;
    }

    .user-set td.value {
        color:rgb(255, 94, 0);
        background-color: transparent;
    }

    .default td {
        color: black;
        text-align: left !important;
    }

    .user-set td i,
    .default td i {
        color: black;
    }

    /*
        Styles for parameter documentation links
        We need styling for visited so jupyter doesn't overwrite it
    */
    a.param-doc-link,
    a.param-doc-link:link,
    a.param-doc-link:visited {
        text-decoration: underline dashed;
        text-underline-offset: .3em;
        color: inherit;
        display: block;
        padding: .5em;
    }

    /* "hack" to make the entire area of the cell containing the link clickable */
    a.param-doc-link::before {
        position: absolute;
        content: "";
        inset: 0;
    }

    .param-doc-description {
        display: none;
        position: absolute;
        z-index: 9999;
        left: 0;
        padding: .5ex;
        margin-left: 1.5em;
        color: var(--sklearn-color-text);
        box-shadow: .3em .3em .4em #999;
        width: max-content;
        text-align: left;
        max-height: 10em;
        overflow-y: auto;

        /* unfitted */
        background: var(--sklearn-color-unfitted-level-0);
        border: thin solid var(--sklearn-color-unfitted-level-3);
    }

    /* Fitted state for parameter tooltips */
    .fitted .param-doc-description {
        /* fitted */
        background: var(--sklearn-color-fitted-level-0);
        border: thin solid var(--sklearn-color-fitted-level-3);
    }

    .param-doc-link:hover .param-doc-description {
        display: block;
    }

    .copy-paste-icon {
        background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCA0NDggNTEyIj48IS0tIUZvbnQgQXdlc29tZSBGcmVlIDYuNy4yIGJ5IEBmb250YXdlc29tZSAtIGh0dHBzOi8vZm9udGF3ZXNvbWUuY29tIExpY2Vuc2UgLSBodHRwczovL2ZvbnRhd2Vzb21lLmNvbS9saWNlbnNlL2ZyZWUgQ29weXJpZ2h0IDIwMjUgRm9udGljb25zLCBJbmMuLS0+PHBhdGggZD0iTTIwOCAwTDMzMi4xIDBjMTIuNyAwIDI0LjkgNS4xIDMzLjkgMTQuMWw2Ny45IDY3LjljOSA5IDE0LjEgMjEuMiAxNC4xIDMzLjlMNDQ4IDMzNmMwIDI2LjUtMjEuNSA0OC00OCA0OGwtMTkyIDBjLTI2LjUgMC00OC0yMS41LTQ4LTQ4bDAtMjg4YzAtMjYuNSAyMS41LTQ4IDQ4LTQ4ek00OCAxMjhsODAgMCAwIDY0LTY0IDAgMCAyNTYgMTkyIDAgMC0zMiA2NCAwIDAgNDhjMCAyNi41LTIxLjUgNDgtNDggNDhMNDggNTEyYy0yNi41IDAtNDgtMjEuNS00OC00OEwwIDE3NmMwLTI2LjUgMjEuNS00OCA0OC00OHoiLz48L3N2Zz4=);
        background-repeat: no-repeat;
        background-size: 14px 14px;
        background-position: 0;
        display: inline-block;
        width: 14px;
        height: 14px;
        cursor: pointer;
    }
    </style><body><div id="sk-container-id-13" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>Pipeline(steps=[(&#x27;select_cols&#x27;,
                     ColumnTransformer(transformers=[(&#x27;long_and_lat&#x27;, &#x27;passthrough&#x27;,
                                                      [&#x27;Longitude&#x27;, &#x27;Latitude&#x27;])])),
                    (&#x27;knn&#x27;, KNeighborsRegressor())])</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label  sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-25" type="checkbox" ><label for="sk-estimator-id-25" class="sk-toggleable__label  sk-toggleable__label-arrow"><div><div>Pipeline</div></div><div><a class="sk-estimator-doc-link " rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.pipeline.Pipeline.html">?<span>Documentation for Pipeline</span></a><span class="sk-estimator-doc-link ">i<span>Not fitted</span></span></div></label><div class="sk-toggleable__content " data-param-prefix="">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('steps',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.pipeline.Pipeline.html#:~:text=steps,-list%20of%20tuples">
                steps
                <span class="param-doc-description">steps: list of tuples<br><br>List of (name of step, estimator) tuples that are to be chained in<br>sequential order. To be compatible with the scikit-learn API, all steps<br>must define `fit`. All non-last steps must also define `transform`. See<br>:ref:`Combining Estimators <combining_estimators>` for more details.</span>
            </a>
        </td>
                <td class="value">[(&#x27;select_cols&#x27;, ...), (&#x27;knn&#x27;, ...)]</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('transform_input',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.pipeline.Pipeline.html#:~:text=transform_input,-list%20of%20str%2C%20default%3DNone">
                transform_input
                <span class="param-doc-description">transform_input: list of str, default=None<br><br>The names of the :term:`metadata` parameters that should be transformed by the<br>pipeline before passing it to the step consuming it.<br><br>This enables transforming some input arguments to ``fit`` (other than ``X``)<br>to be transformed by the steps of the pipeline up to the step which requires<br>them. Requirement is defined via :ref:`metadata routing <metadata_routing>`.<br>For instance, this can be used to pass a validation set through the pipeline.<br><br>You can only set this if metadata routing is enabled, which you<br>can enable using ``sklearn.set_config(enable_metadata_routing=True)``.<br><br>.. versionadded:: 1.6</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('memory',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.pipeline.Pipeline.html#:~:text=memory,-str%20or%20object%20with%20the%20joblib.Memory%20interface%2C%20default%3DNone">
                memory
                <span class="param-doc-description">memory: str or object with the joblib.Memory interface, default=None<br><br>Used to cache the fitted transformers of the pipeline. The last step<br>will never be cached, even if it is a transformer. By default, no<br>caching is performed. If a string is given, it is the path to the<br>caching directory. Enabling caching triggers a clone of the transformers<br>before fitting. Therefore, the transformer instance given to the<br>pipeline cannot be inspected directly. Use the attribute ``named_steps``<br>or ``steps`` to inspect estimators within the pipeline. Caching the<br>transformers is advantageous when fitting is time consuming. See<br>:ref:`sphx_glr_auto_examples_neighbors_plot_caching_nearest_neighbors.py`<br>for an example on how to enable caching.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.pipeline.Pipeline.html#:~:text=verbose,-bool%2C%20default%3DFalse">
                verbose
                <span class="param-doc-description">verbose: bool, default=False<br><br>If True, the time elapsed while fitting each step will be printed as it<br>is completed.</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-serial"><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label  sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-26" type="checkbox" ><label for="sk-estimator-id-26" class="sk-toggleable__label  sk-toggleable__label-arrow"><div><div>select_cols: ColumnTransformer</div></div><div><a class="sk-estimator-doc-link " rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html">?<span>Documentation for select_cols: ColumnTransformer</span></a></div></label><div class="sk-toggleable__content " data-param-prefix="select_cols__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('transformers',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=transformers,-list%20of%20tuples">
                transformers
                <span class="param-doc-description">transformers: list of tuples<br><br>List of (name, transformer, columns) tuples specifying the<br>transformer objects to be applied to subsets of the data.<br><br>name : str<br>    Like in Pipeline and FeatureUnion, this allows the transformer and<br>    its parameters to be set using ``set_params`` and searched in grid<br>    search.<br>transformer : {'drop', 'passthrough'} or estimator<br>    Estimator must support :term:`fit` and :term:`transform`.<br>    Special-cased strings 'drop' and 'passthrough' are accepted as<br>    well, to indicate to drop the columns or to pass them through<br>    untransformed, respectively.<br>columns :  str, array-like of str, int, array-like of int,                 array-like of bool, slice or callable<br>    Indexes the data on its second axis. Integers are interpreted as<br>    positional columns, while strings can reference DataFrame columns<br>    by name.  A scalar string or int should be used where<br>    ``transformer`` expects X to be a 1d array-like (vector),<br>    otherwise a 2d array will be passed to the transformer.<br>    A callable is passed the input data `X` and can return any of the<br>    above. To select multiple columns by name or dtype, you can use<br>    :obj:`make_column_selector`.</span>
            </a>
        </td>
                <td class="value">[(&#x27;long_and_lat&#x27;, ...)]</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('remainder',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=remainder,-%7B%27drop%27%2C%20%27passthrough%27%7D%20or%20estimator%2C%20default%3D%27drop%27">
                remainder
                <span class="param-doc-description">remainder: {'drop', 'passthrough'} or estimator, default='drop'<br><br>By default, only the specified columns in `transformers` are<br>transformed and combined in the output, and the non-specified<br>columns are dropped. (default of ``'drop'``).<br>By specifying ``remainder='passthrough'``, all remaining columns that<br>were not specified in `transformers`, but present in the data passed<br>to `fit` will be automatically passed through. This subset of columns<br>is concatenated with the output of the transformers. For dataframes,<br>extra columns not seen during `fit` will be excluded from the output<br>of `transform`.<br>By setting ``remainder`` to be an estimator, the remaining<br>non-specified columns will use the ``remainder`` estimator. The<br>estimator must support :term:`fit` and :term:`transform`.<br>Note that using this feature requires that the DataFrame columns<br>input at :term:`fit` and :term:`transform` have identical order.</span>
            </a>
        </td>
                <td class="value">&#x27;drop&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('sparse_threshold',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=sparse_threshold,-float%2C%20default%3D0.3">
                sparse_threshold
                <span class="param-doc-description">sparse_threshold: float, default=0.3<br><br>If the output of the different transformers contains sparse matrices,<br>these will be stacked as a sparse matrix if the overall density is<br>lower than this value. Use ``sparse_threshold=0`` to always return<br>dense.  When the transformed output consists of all dense data, the<br>stacked result will be dense, and this keyword will be ignored.</span>
            </a>
        </td>
                <td class="value">0.3</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>Number of jobs to run in parallel.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('transformer_weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=transformer_weights,-dict%2C%20default%3DNone">
                transformer_weights
                <span class="param-doc-description">transformer_weights: dict, default=None<br><br>Multiplicative weights for features per transformer. The output of the<br>transformer is multiplied by these weights. Keys are transformer names,<br>values the weights.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=verbose,-bool%2C%20default%3DFalse">
                verbose
                <span class="param-doc-description">verbose: bool, default=False<br><br>If True, the time elapsed while fitting each transformer will be<br>printed as it is completed.</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose_feature_names_out',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=verbose_feature_names_out,-bool%2C%20str%20or%20Callable%5B%5Bstr%2C%20str%5D%2C%20str%5D%2C%20default%3DTrue">
                verbose_feature_names_out
                <span class="param-doc-description">verbose_feature_names_out: bool, str or Callable[[str, str], str], default=True<br><br>- If True, :meth:`ColumnTransformer.get_feature_names_out` will prefix<br>  all feature names with the name of the transformer that generated that<br>  feature. It is equivalent to setting<br>  `verbose_feature_names_out="{transformer_name}__{feature_name}"`.<br>- If False, :meth:`ColumnTransformer.get_feature_names_out` will not<br>  prefix any feature names and will error if feature names are not<br>  unique.<br>- If ``Callable[[str, str], str]``,<br>  :meth:`ColumnTransformer.get_feature_names_out` will rename all the features<br>  using the name of the transformer. The first argument of the callable is the<br>  transformer name and the second argument is the feature name. The returned<br>  string will be the new feature name.<br>- If ``str``, it must be a string ready for formatting. The given string will<br>  be formatted using two field names: ``transformer_name`` and ``feature_name``.<br>  e.g. ``"{feature_name}__{transformer_name}"``. See :meth:`str.format` method<br>  from the standard library for more info.<br><br>.. versionadded:: 1.0<br><br>.. versionchanged:: 1.6<br>    `verbose_feature_names_out` can be a callable or a string to be formatted.</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('force_int_remainder_cols',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=force_int_remainder_cols,-bool%2C%20default%3DFalse">
                force_int_remainder_cols
                <span class="param-doc-description">force_int_remainder_cols: bool, default=False<br><br>This parameter has no effect.<br><br>.. note::<br>    If you do not access the list of columns for the remainder columns<br>    in the `transformers_` fitted attribute, you do not need to set<br>    this parameter.<br><br>.. versionadded:: 1.5<br><br>.. versionchanged:: 1.7<br>   The default value for `force_int_remainder_cols` will change from<br>   `True` to `False` in version 1.7.<br><br>.. deprecated:: 1.7<br>   `force_int_remainder_cols` is deprecated and will be removed in 1.9.</span>
            </a>
        </td>
                <td class="value">&#x27;deprecated&#x27;</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label  sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-27" type="checkbox" ><label for="sk-estimator-id-27" class="sk-toggleable__label  sk-toggleable__label-arrow"><div><div>long_and_lat</div></div></label><div class="sk-toggleable__content " data-param-prefix="select_cols__long_and_lat__"><pre>[&#x27;Longitude&#x27;, &#x27;Latitude&#x27;]</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator  sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-28" type="checkbox" ><label for="sk-estimator-id-28" class="sk-toggleable__label  sk-toggleable__label-arrow"><div><div>passthrough</div></div></label><div class="sk-toggleable__content " data-param-prefix="select_cols__long_and_lat__"><pre>passthrough</pre></div></div></div></div></div></div></div></div><div class="sk-item"><div class="sk-estimator  sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-29" type="checkbox" ><label for="sk-estimator-id-29" class="sk-toggleable__label  sk-toggleable__label-arrow"><div><div>KNeighborsRegressor</div></div><div><a class="sk-estimator-doc-link " rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html">?<span>Documentation for KNeighborsRegressor</span></a></div></label><div class="sk-toggleable__content " data-param-prefix="knn__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_neighbors',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_neighbors,-int%2C%20default%3D5">
                n_neighbors
                <span class="param-doc-description">n_neighbors: int, default=5<br><br>Number of neighbors to use by default for :meth:`kneighbors` queries.</span>
            </a>
        </td>
                <td class="value">5</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=weights,-%7B%27uniform%27%2C%20%27distance%27%7D%2C%20callable%20or%20None%2C%20default%3D%27uniform%27">
                weights
                <span class="param-doc-description">weights: {'uniform', 'distance'}, callable or None, default='uniform'<br><br>Weight function used in prediction.  Possible values:<br><br>- 'uniform' : uniform weights.  All points in each neighborhood<br>  are weighted equally.<br>- 'distance' : weight points by the inverse of their distance.<br>  in this case, closer neighbors of a query point will have a<br>  greater influence than neighbors which are further away.<br>- [callable] : a user-defined function which accepts an<br>  array of distances, and returns an array of the same shape<br>  containing the weights.<br><br>Uniform weights are used by default.<br><br>See the following example for a demonstration of the impact of<br>different weighting schemes on predictions:<br>:ref:`sphx_glr_auto_examples_neighbors_plot_regression.py`.</span>
            </a>
        </td>
                <td class="value">&#x27;uniform&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('algorithm',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=algorithm,-%7B%27auto%27%2C%20%27ball_tree%27%2C%20%27kd_tree%27%2C%20%27brute%27%7D%2C%20default%3D%27auto%27">
                algorithm
                <span class="param-doc-description">algorithm: {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'<br><br>Algorithm used to compute the nearest neighbors:<br><br>- 'ball_tree' will use :class:`BallTree`<br>- 'kd_tree' will use :class:`KDTree`<br>- 'brute' will use a brute-force search.<br>- 'auto' will attempt to decide the most appropriate algorithm<br>  based on the values passed to :meth:`fit` method.<br><br>Note: fitting on sparse input will override the setting of<br>this parameter, using brute force.</span>
            </a>
        </td>
                <td class="value">&#x27;auto&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('leaf_size',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=leaf_size,-int%2C%20default%3D30">
                leaf_size
                <span class="param-doc-description">leaf_size: int, default=30<br><br>Leaf size passed to BallTree or KDTree.  This can affect the<br>speed of the construction and query, as well as the memory<br>required to store the tree.  The optimal value depends on the<br>nature of the problem.</span>
            </a>
        </td>
                <td class="value">30</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('p',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=p,-float%2C%20default%3D2">
                p
                <span class="param-doc-description">p: float, default=2<br><br>Power parameter for the Minkowski metric. When p = 1, this is<br>equivalent to using manhattan_distance (l1), and euclidean_distance<br>(l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.</span>
            </a>
        </td>
                <td class="value">2</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric,-str%2C%20DistanceMetric%20object%20or%20callable%2C%20default%3D%27minkowski%27">
                metric
                <span class="param-doc-description">metric: str, DistanceMetric object or callable, default='minkowski'<br><br>Metric to use for distance computation. Default is "minkowski", which<br>results in the standard Euclidean distance when p = 2. See the<br>documentation of `scipy.spatial.distance<br><https://docs.scipy.org/doc/scipy/reference/spatial.distance.html>`_ and<br>the metrics listed in<br>:class:`~sklearn.metrics.pairwise.distance_metrics` for valid metric<br>values.<br><br>If metric is "precomputed", X is assumed to be a distance matrix and<br>must be square during fit. X may be a :term:`sparse graph`, in which<br>case only "nonzero" elements may be considered neighbors.<br><br>If metric is a callable function, it takes two arrays representing 1D<br>vectors as inputs and must return one value indicating the distance<br>between those vectors. This works for Scipy's metrics, but is less<br>efficient than passing the metric name as a string.<br><br>If metric is a DistanceMetric object, it will be passed directly to<br>the underlying computation routines.</span>
            </a>
        </td>
                <td class="value">&#x27;minkowski&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric_params',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric_params,-dict%2C%20default%3DNone">
                metric_params
                <span class="param-doc-description">metric_params: dict, default=None<br><br>Additional keyword arguments for the metric function.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of parallel jobs to run for neighbors search.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.<br>Doesn't affect :meth:`fit` method.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div></div></div><script>function copyToClipboard(text, element) {
        // Get the parameter prefix from the closest toggleable content
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const fullParamName = paramPrefix ? `${paramPrefix}${text}` : text;

        const originalStyle = element.style;
        const computedStyle = window.getComputedStyle(element);
        const originalWidth = computedStyle.width;
        const originalHTML = element.innerHTML.replace('Copied!', '');

        navigator.clipboard.writeText(fullParamName)
            .then(() => {
                element.style.width = originalWidth;
                element.style.color = 'green';
                element.innerHTML = "Copied!";

                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            })
            .catch(err => {
                console.error('Failed to copy:', err);
                element.style.color = 'red';
                element.innerHTML = "Failed!";
                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            });
        return false;
    }

    document.querySelectorAll('.copy-paste-icon').forEach(function(element) {
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const paramName = element.parentElement.nextElementSibling
            .textContent.trim().split(' ')[0];
        const fullParamName = paramPrefix ? `${paramPrefix}${paramName}` : paramName;

        element.setAttribute('title', fullParamName);
    });


    /**
     * Adapted from Skrub
     * https://github.com/skrub-data/skrub/blob/403466d1d5d4dc76a7ef569b3f8228db59a31dc3/skrub/_reporting/_data/templates/report.js#L789
     * @returns "light" or "dark"
     */
    function detectTheme(element) {
        const body = document.querySelector('body');

        // Check VSCode theme
        const themeKindAttr = body.getAttribute('data-vscode-theme-kind');
        const themeNameAttr = body.getAttribute('data-vscode-theme-name');

        if (themeKindAttr && themeNameAttr) {
            const themeKind = themeKindAttr.toLowerCase();
            const themeName = themeNameAttr.toLowerCase();

            if (themeKind.includes("dark") || themeName.includes("dark")) {
                return "dark";
            }
            if (themeKind.includes("light") || themeName.includes("light")) {
                return "light";
            }
        }

        // Check Jupyter theme
        if (body.getAttribute('data-jp-theme-light') === 'false') {
            return 'dark';
        } else if (body.getAttribute('data-jp-theme-light') === 'true') {
            return 'light';
        }

        // Guess based on a parent element's color
        const color = window.getComputedStyle(element.parentNode, null).getPropertyValue('color');
        const match = color.match(/^rgb\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*$/i);
        if (match) {
            const [r, g, b] = [
                parseFloat(match[1]),
                parseFloat(match[2]),
                parseFloat(match[3])
            ];

            // https://en.wikipedia.org/wiki/HSL_and_HSV#Lightness
            const luma = 0.299 * r + 0.587 * g + 0.114 * b;

            if (luma > 180) {
                // If the text is very bright we have a dark theme
                return 'dark';
            }
            if (luma < 75) {
                // If the text is very dark we have a light theme
                return 'light';
            }
            // Otherwise fall back to the next heuristic.
        }

        // Fallback to system preference
        return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
    }


    function forceTheme(elementId) {
        const estimatorElement = document.querySelector(`#${elementId}`);
        if (estimatorElement === null) {
            console.error(`Element with id ${elementId} not found.`);
        } else {
            const theme = detectTheme(estimatorElement);
            estimatorElement.classList.add(theme);
        }
    }

    forceTheme('sk-container-id-13');</script></body>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 1289-1291

In the end, it doesn’t really matter which variant we use.
Let’s go with the second approach.

.. GENERATED FROM PYTHON SOURCE LINES 1293-1300

With this out of the way, what hyper-parameters do we want to use for
our KNN? In our grid search of the KNN regressor, we found that small
values of ``n_neighbors`` work best. As to the other
hyper-parameters, we already saw that there is no point in changing the
``p`` parameter from the default value of 2 to 1. For the
``weights`` parameter, we saw that low values of ``n_neighbors`` work
better with ``uniform``, the default, so let’s use that here.

.. GENERATED FROM PYTHON SOURCE LINES 1302-1306

By the way, ``StackingRegressor`` can also take multiple
estimators for the initial prediction. Therefore, we could pass a list
of multiple KNNs with different hyper-parameters. This would be worth
testing if we want to improve the model further.
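
As a rough sketch of what passing several KNNs could look like, the snippet
below stacks three of them on a small synthetic dataset. The synthetic data,
the ``make_knn`` helper, the ``demo_`` names and the chosen values of
``n_neighbors`` are assumptions made for illustration only. They are not part
of this exercise and have not been tuned.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the housing data (made-up values, not the real dataset).
rng = np.random.default_rng(0)
demo_df = pd.DataFrame(
    {
        "Longitude": rng.uniform(-124, -114, size=200),
        "Latitude": rng.uniform(32, 42, size=200),
        "MedInc": rng.uniform(0.5, 15, size=200),
    }
)
demo_y = (
    0.3 * demo_df["MedInc"]
    + np.sin(demo_df["Latitude"])
    + rng.normal(0, 0.1, size=200)
)


def make_knn(n_neighbors):
    """Hypothetical helper: a KNN pipeline that only sees longitude and latitude."""
    return Pipeline(
        [
            (
                "select_cols",
                ColumnTransformer(
                    [("long_and_lat", "passthrough", ["Longitude", "Latitude"])]
                ),
            ),
            ("knn", KNeighborsRegressor(n_neighbors=n_neighbors)),
        ]
    )


# One (name, estimator) tuple per KNN variant; the values of k are arbitrary.
multi_knn = [(f"knn@{k}", make_knn(k)) for k in (3, 5, 10)]

demo_stack = StackingRegressor(
    estimators=multi_knn,
    final_estimator=LinearRegression(),
    passthrough=True,
)
demo_stack.fit(demo_df, demo_y)

# The final estimator sees one prediction per KNN plus the original columns.
n_final_features = demo_stack.final_estimator_.n_features_in_
```

Each KNN contributes one extra feature to the final estimator, so the
variants can be compared by looking at the coefficients the final estimator
assigns to them.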

.. GENERATED FROM PYTHON SOURCE LINES 1308-1310

Linear regression
^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 1312-1321

Now let’s finally get started with training and evaluating our first ML model
on the whole dataset. As a start, we try out a simple linear regression, which
often provides a good benchmark for regression tasks. Usually, with linear
models, we would like to normalize the data first. sklearn’s
``LinearRegression`` does not do this for us, but since it fits an ordinary
least squares model without regularization, its predictions do not depend on
the scale of the features, so we don’t need to bother with that here. (Given
what we found out about the distributions of some of the features, as shown in
the earlier plot with the feature histograms, it would, however, be worth
spending some time thinking about whether additional preprocessing could
improve the model.)
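
To illustrate the point about feature scaling, here is a minimal sketch that
fits an unregularized linear regression with and without standardizing the
features first and then compares the predictions. The data is synthetic and
the names ``X_demo``, ``y_demo``, ``plain`` and ``scaled`` are made up for
this illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Made-up features that live on very different scales.
X_demo = rng.normal(size=(100, 3)) * np.array([1.0, 100.0, 0.01])
y_demo = X_demo @ np.array([2.0, -0.03, 50.0]) + rng.normal(scale=0.1, size=100)

# Fit once on the raw features and once on standardized features.
plain = LinearRegression().fit(X_demo, y_demo)
scaled = make_pipeline(StandardScaler(), LinearRegression()).fit(X_demo, y_demo)

# Without regularization, both fits should yield the same predictions
# up to floating point error; only the coefficients differ.
same_predictions = np.allclose(plain.predict(X_demo), scaled.predict(X_demo))
```

This equivalence only concerns the predictions of an unregularized model.
With a regularized linear model such as ``Ridge``, the scale of the features
does influence the fit.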

.. GENERATED FROM PYTHON SOURCE LINES 1323-1326

The ``StackingRegressor`` expects a list of tuples, where the
first element of the tuple is a name and the second element is the
estimator. Plugging our different parts together, we get:

.. GENERATED FROM PYTHON SOURCE LINES 1329-1346

.. code-block:: Python

    knn_regressor = [
        (
            "knn@5",
            Pipeline(
                [
                    (
                        "select_cols",
                        ColumnTransformer(
                            [("long_and_lat", "passthrough", ["Longitude", "Latitude"])]
                        ),
                    ),
                    ("knn", KNeighborsRegressor()),
                ]
            ),
        ),
    ]


.. GENERATED FROM PYTHON SOURCE LINES 1347-1353

.. code-block:: Python

    lin = StackingRegressor(
        estimators=knn_regressor,
        final_estimator=LinearRegression(n_jobs=N_JOBS),
        passthrough=True,
    )


.. GENERATED FROM PYTHON SOURCE LINES 1354-1356

.. code-block:: Python

    lin.fit(df_train, y_train)


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-container-id-14 {
      /* Definition of color scheme common for light and dark mode */
      --sklearn-color-text: #000;
      --sklearn-color-text-muted: #666;
      --sklearn-color-line: gray;
      /* Definition of color scheme for unfitted estimators */
      --sklearn-color-unfitted-level-0: #fff5e6;
      --sklearn-color-unfitted-level-1: #f6e4d2;
      --sklearn-color-unfitted-level-2: #ffe0b3;
      --sklearn-color-unfitted-level-3: chocolate;
      /* Definition of color scheme for fitted estimators */
      --sklearn-color-fitted-level-0: #f0f8ff;
      --sklearn-color-fitted-level-1: #d4ebff;
      --sklearn-color-fitted-level-2: #b3dbfd;
      --sklearn-color-fitted-level-3: cornflowerblue;
    }

    #sk-container-id-14.light {
      /* Specific color for light theme */
      --sklearn-color-text-on-default-background: black;
      --sklearn-color-background: white;
      --sklearn-color-border-box: black;
      --sklearn-color-icon: #696969;
    }

    #sk-container-id-14.dark {
      --sklearn-color-text-on-default-background: white;
      --sklearn-color-background: #111;
      --sklearn-color-border-box: white;
      --sklearn-color-icon: #878787;
    }

    #sk-container-id-14 {
      color: var(--sklearn-color-text);
    }

    #sk-container-id-14 pre {
      padding: 0;
    }

    #sk-container-id-14 input.sk-hidden--visually {
      border: 0;
      clip: rect(1px 1px 1px 1px);
      clip: rect(1px, 1px, 1px, 1px);
      height: 1px;
      margin: -1px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1px;
    }

    #sk-container-id-14 div.sk-dashed-wrapped {
      border: 1px dashed var(--sklearn-color-line);
      margin: 0 0.4em 0.5em 0.4em;
      box-sizing: border-box;
      padding-bottom: 0.4em;
      background-color: var(--sklearn-color-background);
    }

    #sk-container-id-14 div.sk-container {
      /* jupyter's `normalize.less` sets `[hidden] { display: none; }`
         but bootstrap.min.css set `[hidden] { display: none !important; }`
         so we also need the `!important` here to be able to override the
         default hidden behavior on the sphinx rendered scikit-learn.org.
         See: https://github.com/scikit-learn/scikit-learn/issues/21755 */
      display: inline-block !important;
      position: relative;
    }

    #sk-container-id-14 div.sk-text-repr-fallback {
      display: none;
    }

    div.sk-parallel-item,
    div.sk-serial,
    div.sk-item {
      /* draw centered vertical line to link estimators */
      background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));
      background-size: 2px 100%;
      background-repeat: no-repeat;
      background-position: center center;
    }

    /* Parallel-specific style estimator block */

    #sk-container-id-14 div.sk-parallel-item::after {
      content: "";
      width: 100%;
      border-bottom: 2px solid var(--sklearn-color-text-on-default-background);
      flex-grow: 1;
    }

    #sk-container-id-14 div.sk-parallel {
      display: flex;
      align-items: stretch;
      justify-content: center;
      background-color: var(--sklearn-color-background);
      position: relative;
    }

    #sk-container-id-14 div.sk-parallel-item {
      display: flex;
      flex-direction: column;
    }

    #sk-container-id-14 div.sk-parallel-item:first-child::after {
      align-self: flex-end;
      width: 50%;
    }

    #sk-container-id-14 div.sk-parallel-item:last-child::after {
      align-self: flex-start;
      width: 50%;
    }

    #sk-container-id-14 div.sk-parallel-item:only-child::after {
      width: 0;
    }

    /* Serial-specific style estimator block */

    #sk-container-id-14 div.sk-serial {
      display: flex;
      flex-direction: column;
      align-items: center;
      background-color: var(--sklearn-color-background);
      padding-right: 1em;
      padding-left: 1em;
    }


    /* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is
    clickable and can be expanded/collapsed.
    - Pipeline and ColumnTransformer use this feature and define the default style
    - Estimators will overwrite some part of the style using the `sk-estimator` class
    */

    /* Pipeline and ColumnTransformer style (default) */

    #sk-container-id-14 div.sk-toggleable {
      /* Default theme specific background. It is overwritten whether we have a
      specific estimator or a Pipeline/ColumnTransformer */
      background-color: var(--sklearn-color-background);
    }

    /* Toggleable label */
    #sk-container-id-14 label.sk-toggleable__label {
      cursor: pointer;
      display: flex;
      width: 100%;
      margin-bottom: 0;
      padding: 0.5em;
      box-sizing: border-box;
      text-align: center;
      align-items: center;
      justify-content: center;
      gap: 0.5em;
    }

    #sk-container-id-14 label.sk-toggleable__label .caption {
      font-size: 0.6rem;
      font-weight: lighter;
      color: var(--sklearn-color-text-muted);
    }

    #sk-container-id-14 label.sk-toggleable__label-arrow:before {
      /* Arrow on the left of the label */
      content: "▸";
      float: left;
      margin-right: 0.25em;
      color: var(--sklearn-color-icon);
    }

    #sk-container-id-14 label.sk-toggleable__label-arrow:hover:before {
      color: var(--sklearn-color-text);
    }

    /* Toggleable content - dropdown */

    #sk-container-id-14 div.sk-toggleable__content {
      display: none;
      text-align: left;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-14 div.sk-toggleable__content.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-14 div.sk-toggleable__content pre {
      margin: 0.2em;
      border-radius: 0.25em;
      color: var(--sklearn-color-text);
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-14 div.sk-toggleable__content.fitted pre {
      /* unfitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-14 input.sk-toggleable__control:checked~div.sk-toggleable__content {
      /* Expand drop-down */
      display: block;
      width: 100%;
      overflow: visible;
    }

    #sk-container-id-14 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {
      content: "▾";
    }

    /* Pipeline/ColumnTransformer-specific style */

    #sk-container-id-14 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-14 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator-specific style */

    /* Colorize estimator box */
    #sk-container-id-14 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-14 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    #sk-container-id-14 div.sk-label label.sk-toggleable__label,
    #sk-container-id-14 div.sk-label label {
      /* The background is the default theme color */
      color: var(--sklearn-color-text-on-default-background);
    }

    /* On hover, darken the color of the background */
    #sk-container-id-14 div.sk-label:hover label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    /* Label box, darken color on hover, fitted */
    #sk-container-id-14 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator label */

    #sk-container-id-14 div.sk-label label {
      font-family: monospace;
      font-weight: bold;
      line-height: 1.2em;
    }

    #sk-container-id-14 div.sk-label-container {
      text-align: center;
    }

    /* Estimator-specific */
    #sk-container-id-14 div.sk-estimator {
      font-family: monospace;
      border: 1px dotted var(--sklearn-color-border-box);
      border-radius: 0.25em;
      box-sizing: border-box;
      margin-bottom: 0.5em;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-14 div.sk-estimator.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    /* on hover */
    #sk-container-id-14 div.sk-estimator:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-14 div.sk-estimator.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Specification for estimator info (e.g. "i" and "?") */

    /* Common style for "i" and "?" */

    .sk-estimator-doc-link,
    a:link.sk-estimator-doc-link,
    a:visited.sk-estimator-doc-link {
      float: right;
      font-size: smaller;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1em;
      height: 1em;
      width: 1em;
      text-decoration: none !important;
      margin-left: 0.5em;
      text-align: center;
      /* unfitted */
      border: var(--sklearn-color-unfitted-level-3) 1pt solid;
      color: var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted,
    a:link.sk-estimator-doc-link.fitted,
    a:visited.sk-estimator-doc-link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3) 1pt solid;
      color: var(--sklearn-color-fitted-level-3);
    }

    /* On hover */
    div.sk-estimator:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover,
    div.sk-label-container:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-unfitted-level-0);
      text-decoration: none;
    }

    div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover,
    div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-fitted-level-0);
      text-decoration: none;
    }

    /* Span, style for the box shown on hovering the info icon */
    .sk-estimator-doc-link span {
      display: none;
      z-index: 9999;
      position: relative;
      font-weight: normal;
      right: .2ex;
      padding: .5ex;
      margin: .5ex;
      width: min-content;
      min-width: 20ex;
      max-width: 50ex;
      color: var(--sklearn-color-text);
      box-shadow: 2pt 2pt 4pt #999;
      /* unfitted */
      background: var(--sklearn-color-unfitted-level-0);
      border: .5pt solid var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted span {
      /* fitted */
      background: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3);
    }

    .sk-estimator-doc-link:hover span {
      display: block;
    }

    /* "?"-specific style due to the `<a>` HTML tag */

    #sk-container-id-14 a.estimator_doc_link {
      float: right;
      font-size: 1rem;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1rem;
      height: 1rem;
      width: 1rem;
      text-decoration: none;
      /* unfitted */
      color: var(--sklearn-color-unfitted-level-1);
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
    }

    #sk-container-id-14 a.estimator_doc_link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    #sk-container-id-14 a.estimator_doc_link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    #sk-container-id-14 a.estimator_doc_link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
    }

    .estimator-table {
        font-family: monospace;
    }

    .estimator-table summary {
        padding: .5rem;
        cursor: pointer;
    }

    .estimator-table summary::marker {
        font-size: 0.7rem;
    }

    .estimator-table details[open] {
        padding-left: 0.1rem;
        padding-right: 0.1rem;
        padding-bottom: 0.3rem;
    }

    .estimator-table .parameters-table {
        margin-left: auto !important;
        margin-right: auto !important;
        margin-top: 0;
    }

    .estimator-table .parameters-table tr:nth-child(odd) {
        background-color: #fff;
    }

    .estimator-table .parameters-table tr:nth-child(even) {
        background-color: #f6f6f6;
    }

    .estimator-table .parameters-table tr:hover {
        background-color: #e0e0e0;
    }

    .estimator-table table td {
        border: 1px solid rgba(106, 105, 104, 0.232);
    }

    /*
        `table td` is set in notebook with right text-align.
        We need to overwrite it.
    */
    .estimator-table table td.param {
        text-align: left;
        position: relative;
        padding: 0;
    }

    .user-set td {
        color:rgb(255, 94, 0);
        text-align: left !important;
    }

    .user-set td.value {
        color:rgb(255, 94, 0);
        background-color: transparent;
    }

    .default td {
        color: black;
        text-align: left !important;
    }

    .user-set td i,
    .default td i {
        color: black;
    }

    /*
        Styles for parameter documentation links
        We need styling for visited so jupyter doesn't overwrite it
    */
    a.param-doc-link,
    a.param-doc-link:link,
    a.param-doc-link:visited {
        text-decoration: underline dashed;
        text-underline-offset: .3em;
        color: inherit;
        display: block;
        padding: .5em;
    }

    /* "hack" to make the entire area of the cell containing the link clickable */
    a.param-doc-link::before {
        position: absolute;
        content: "";
        inset: 0;
    }

    .param-doc-description {
        display: none;
        position: absolute;
        z-index: 9999;
        left: 0;
        padding: .5ex;
        margin-left: 1.5em;
        color: var(--sklearn-color-text);
        box-shadow: .3em .3em .4em #999;
        width: max-content;
        text-align: left;
        max-height: 10em;
        overflow-y: auto;

        /* unfitted */
        background: var(--sklearn-color-unfitted-level-0);
        border: thin solid var(--sklearn-color-unfitted-level-3);
    }

    /* Fitted state for parameter tooltips */
    .fitted .param-doc-description {
        /* fitted */
        background: var(--sklearn-color-fitted-level-0);
        border: thin solid var(--sklearn-color-fitted-level-3);
    }

    .param-doc-link:hover .param-doc-description {
        display: block;
    }

    .copy-paste-icon {
        background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCA0NDggNTEyIj48IS0tIUZvbnQgQXdlc29tZSBGcmVlIDYuNy4yIGJ5IEBmb250YXdlc29tZSAtIGh0dHBzOi8vZm9udGF3ZXNvbWUuY29tIExpY2Vuc2UgLSBodHRwczovL2ZvbnRhd2Vzb21lLmNvbS9saWNlbnNlL2ZyZWUgQ29weXJpZ2h0IDIwMjUgRm9udGljb25zLCBJbmMuLS0+PHBhdGggZD0iTTIwOCAwTDMzMi4xIDBjMTIuNyAwIDI0LjkgNS4xIDMzLjkgMTQuMWw2Ny45IDY3LjljOSA5IDE0LjEgMjEuMiAxNC4xIDMzLjlMNDQ4IDMzNmMwIDI2LjUtMjEuNSA0OC00OCA0OGwtMTkyIDBjLTI2LjUgMC00OC0yMS41LTQ4LTQ4bDAtMjg4YzAtMjYuNSAyMS41LTQ4IDQ4LTQ4ek00OCAxMjhsODAgMCAwIDY0LTY0IDAgMCAyNTYgMTkyIDAgMC0zMiA2NCAwIDAgNDhjMCAyNi41LTIxLjUgNDgtNDggNDhMNDggNTEyYy0yNi41IDAtNDgtMjEuNS00OC00OEwwIDE3NmMwLTI2LjUgMjEuNS00OCA0OC00OHoiLz48L3N2Zz4=);
        background-repeat: no-repeat;
        background-size: 14px 14px;
        background-position: 0;
        display: inline-block;
        width: 14px;
        height: 14px;
        cursor: pointer;
    }
    </style><body><div id="sk-container-id-14" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>StackingRegressor(estimators=[(&#x27;knn@5&#x27;,
                                   Pipeline(steps=[(&#x27;select_cols&#x27;,
                                                    ColumnTransformer(transformers=[(&#x27;long_and_lat&#x27;,
                                                                                     &#x27;passthrough&#x27;,
                                                                                     [&#x27;Longitude&#x27;,
                                                                                      &#x27;Latitude&#x27;])])),
                                                   (&#x27;knn&#x27;,
                                                    KNeighborsRegressor())]))],
                      final_estimator=LinearRegression(n_jobs=1), passthrough=True)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-30" type="checkbox" ><label for="sk-estimator-id-30" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>StackingRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html">?<span>Documentation for StackingRegressor</span></a><span class="sk-estimator-doc-link fitted">i<span>Fitted</span></span></div></label><div class="sk-toggleable__content fitted" data-param-prefix="">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('estimators',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=estimators,-list%20of%20%28str%2C%20estimator%29">
                estimators
                <span class="param-doc-description">estimators: list of (str, estimator)<br><br>Base estimators which will be stacked together. Each element of the<br>list is defined as a tuple of string (i.e. name) and an estimator<br>instance. An estimator can be set to 'drop' using `set_params`.</span>
            </a>
        </td>
                <td class="value">[(&#x27;knn@5&#x27;, ...)]</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('final_estimator',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=final_estimator,-estimator%2C%20default%3DNone">
                final_estimator
                <span class="param-doc-description">final_estimator: estimator, default=None<br><br>A regressor which will be used to combine the base estimators.<br>The default regressor is a :class:`~sklearn.linear_model.RidgeCV`.</span>
            </a>
        </td>
                <td class="value">LinearRegression(n_jobs=1)</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('cv',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=cv,-int%2C%20cross-validation%20generator%2C%20iterable%2C%20or%20%22prefit%22%2C%20default%3DNone">
                cv
                <span class="param-doc-description">cv: int, cross-validation generator, iterable, or "prefit", default=None<br><br>Determines the cross-validation splitting strategy used in<br>`cross_val_predict` to train `final_estimator`. Possible inputs for<br>cv are:<br><br>* None, to use the default 5-fold cross validation,<br>* integer, to specify the number of folds in a (Stratified) KFold,<br>* An object to be used as a cross-validation generator,<br>* An iterable yielding train, test splits,<br>* `"prefit"`, to assume the `estimators` are prefit. In this case, the<br>  estimators will not be refitted.<br><br>For integer/None inputs, if the estimator is a classifier and y is<br>either binary or multiclass,<br>:class:`~sklearn.model_selection.StratifiedKFold` is used.<br>In all other cases, :class:`~sklearn.model_selection.KFold` is used.<br>These splitters are instantiated with `shuffle=False` so the splits<br>will be the same across calls.<br><br>Refer to the :ref:`User Guide <cross_validation>` for the various<br>cross-validation strategies that can be used here.<br><br>If "prefit" is passed, it is assumed that all `estimators` have<br>been fitted already. The `final_estimator_` is trained on the `estimators`<br>predictions on the full training set and are **not** cross validated<br>predictions. Please note that if the models have been trained on the same<br>data to train the stacking model, there is a very high risk of overfitting.<br><br>.. versionadded:: 1.1<br>    The 'prefit' option was added in 1.1<br><br>.. note::<br>   A larger number of splits will provide no benefits if the number<br>   of training samples is large enough. Indeed, the training time<br>   will increase. ``cv`` is not used for model evaluation but for<br>   prediction.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of jobs to run in parallel for `fit` of all `estimators`.<br>`None` means 1 unless in a `joblib.parallel_backend` context. -1 means<br>using all processors. See :term:`Glossary <n_jobs>` for more details.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('passthrough',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=passthrough,-bool%2C%20default%3DFalse">
                passthrough
                <span class="param-doc-description">passthrough: bool, default=False<br><br>When False, only the predictions of estimators will be used as<br>training data for `final_estimator`. When True, the<br>`final_estimator` is trained on the predictions as well as the<br>original training data.</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=verbose,-int%2C%20default%3D0">
                verbose
                <span class="param-doc-description">verbose: int, default=0<br><br>Verbosity level.</span>
            </a>
        </td>
                <td class="value">0</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><label>knn@5</label></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-serial"><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-31" type="checkbox" ><label for="sk-estimator-id-31" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>select_cols: ColumnTransformer</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html">?<span>Documentation for select_cols: ColumnTransformer</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__select_cols__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('transformers',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=transformers,-list%20of%20tuples">
                transformers
                <span class="param-doc-description">transformers: list of tuples<br><br>List of (name, transformer, columns) tuples specifying the<br>transformer objects to be applied to subsets of the data.<br><br>name : str<br>    Like in Pipeline and FeatureUnion, this allows the transformer and<br>    its parameters to be set using ``set_params`` and searched in grid<br>    search.<br>transformer : {'drop', 'passthrough'} or estimator<br>    Estimator must support :term:`fit` and :term:`transform`.<br>    Special-cased strings 'drop' and 'passthrough' are accepted as<br>    well, to indicate to drop the columns or to pass them through<br>    untransformed, respectively.<br>columns :  str, array-like of str, int, array-like of int,                 array-like of bool, slice or callable<br>    Indexes the data on its second axis. Integers are interpreted as<br>    positional columns, while strings can reference DataFrame columns<br>    by name.  A scalar string or int should be used where<br>    ``transformer`` expects X to be a 1d array-like (vector),<br>    otherwise a 2d array will be passed to the transformer.<br>    A callable is passed the input data `X` and can return any of the<br>    above. To select multiple columns by name or dtype, you can use<br>    :obj:`make_column_selector`.</span>
            </a>
        </td>
                <td class="value">[(&#x27;long_and_lat&#x27;, ...)]</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('remainder',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=remainder,-%7B%27drop%27%2C%20%27passthrough%27%7D%20or%20estimator%2C%20default%3D%27drop%27">
                remainder
                <span class="param-doc-description">remainder: {'drop', 'passthrough'} or estimator, default='drop'<br><br>By default, only the specified columns in `transformers` are<br>transformed and combined in the output, and the non-specified<br>columns are dropped. (default of ``'drop'``).<br>By specifying ``remainder='passthrough'``, all remaining columns that<br>were not specified in `transformers`, but present in the data passed<br>to `fit` will be automatically passed through. This subset of columns<br>is concatenated with the output of the transformers. For dataframes,<br>extra columns not seen during `fit` will be excluded from the output<br>of `transform`.<br>By setting ``remainder`` to be an estimator, the remaining<br>non-specified columns will use the ``remainder`` estimator. The<br>estimator must support :term:`fit` and :term:`transform`.<br>Note that using this feature requires that the DataFrame columns<br>input at :term:`fit` and :term:`transform` have identical order.</span>
            </a>
        </td>
                <td class="value">&#x27;drop&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('sparse_threshold',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=sparse_threshold,-float%2C%20default%3D0.3">
                sparse_threshold
                <span class="param-doc-description">sparse_threshold: float, default=0.3<br><br>If the output of the different transformers contains sparse matrices,<br>these will be stacked as a sparse matrix if the overall density is<br>lower than this value. Use ``sparse_threshold=0`` to always return<br>dense.  When the transformed output consists of all dense data, the<br>stacked result will be dense, and this keyword will be ignored.</span>
            </a>
        </td>
                <td class="value">0.3</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>Number of jobs to run in parallel.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('transformer_weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=transformer_weights,-dict%2C%20default%3DNone">
                transformer_weights
                <span class="param-doc-description">transformer_weights: dict, default=None<br><br>Multiplicative weights for features per transformer. The output of the<br>transformer is multiplied by these weights. Keys are transformer names,<br>values the weights.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=verbose,-bool%2C%20default%3DFalse">
                verbose
                <span class="param-doc-description">verbose: bool, default=False<br><br>If True, the time elapsed while fitting each transformer will be<br>printed as it is completed.</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose_feature_names_out',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=verbose_feature_names_out,-bool%2C%20str%20or%20Callable%5B%5Bstr%2C%20str%5D%2C%20str%5D%2C%20default%3DTrue">
                verbose_feature_names_out
                <span class="param-doc-description">verbose_feature_names_out: bool, str or Callable[[str, str], str], default=True<br><br>- If True, :meth:`ColumnTransformer.get_feature_names_out` will prefix<br>  all feature names with the name of the transformer that generated that<br>  feature. It is equivalent to setting<br>  `verbose_feature_names_out="{transformer_name}__{feature_name}"`.<br>- If False, :meth:`ColumnTransformer.get_feature_names_out` will not<br>  prefix any feature names and will error if feature names are not<br>  unique.<br>- If ``Callable[[str, str], str]``,<br>  :meth:`ColumnTransformer.get_feature_names_out` will rename all the features<br>  using the name of the transformer. The first argument of the callable is the<br>  transformer name and the second argument is the feature name. The returned<br>  string will be the new feature name.<br>- If ``str``, it must be a string ready for formatting. The given string will<br>  be formatted using two field names: ``transformer_name`` and ``feature_name``.<br>  e.g. ``"{feature_name}__{transformer_name}"``. See :meth:`str.format` method<br>  from the standard library for more info.<br><br>.. versionadded:: 1.0<br><br>.. versionchanged:: 1.6<br>    `verbose_feature_names_out` can be a callable or a string to be formatted.</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('force_int_remainder_cols',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=force_int_remainder_cols,-bool%2C%20default%3DFalse">
                force_int_remainder_cols
                <span class="param-doc-description">force_int_remainder_cols: bool, default=False<br><br>This parameter has no effect.<br><br>.. note::<br>    If you do not access the list of columns for the remainder columns<br>    in the `transformers_` fitted attribute, you do not need to set<br>    this parameter.<br><br>.. versionadded:: 1.5<br><br>.. versionchanged:: 1.7<br>   The default value for `force_int_remainder_cols` will change from<br>   `True` to `False` in version 1.7.<br><br>.. deprecated:: 1.7<br>   `force_int_remainder_cols` is deprecated and will be removed in 1.9.</span>
            </a>
        </td>
                <td class="value">&#x27;deprecated&#x27;</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-32" type="checkbox" ><label for="sk-estimator-id-32" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>long_and_lat</div></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__select_cols__long_and_lat__"><pre>[&#x27;Longitude&#x27;, &#x27;Latitude&#x27;]</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-33" type="checkbox" ><label for="sk-estimator-id-33" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>passthrough</div></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__select_cols__long_and_lat__"><pre>passthrough</pre></div></div></div></div></div></div></div></div><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-34" type="checkbox" ><label for="sk-estimator-id-34" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>KNeighborsRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html">?<span>Documentation for KNeighborsRegressor</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__knn__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_neighbors',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_neighbors,-int%2C%20default%3D5">
                n_neighbors
                <span class="param-doc-description">n_neighbors: int, default=5<br><br>Number of neighbors to use by default for :meth:`kneighbors` queries.</span>
            </a>
        </td>
                <td class="value">5</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=weights,-%7B%27uniform%27%2C%20%27distance%27%7D%2C%20callable%20or%20None%2C%20default%3D%27uniform%27">
                weights
                <span class="param-doc-description">weights: {'uniform', 'distance'}, callable or None, default='uniform'<br><br>Weight function used in prediction.  Possible values:<br><br>- 'uniform' : uniform weights.  All points in each neighborhood<br>  are weighted equally.<br>- 'distance' : weight points by the inverse of their distance.<br>  in this case, closer neighbors of a query point will have a<br>  greater influence than neighbors which are further away.<br>- [callable] : a user-defined function which accepts an<br>  array of distances, and returns an array of the same shape<br>  containing the weights.<br><br>Uniform weights are used by default.<br><br>See the following example for a demonstration of the impact of<br>different weighting schemes on predictions:<br>:ref:`sphx_glr_auto_examples_neighbors_plot_regression.py`.</span>
            </a>
        </td>
                <td class="value">&#x27;uniform&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('algorithm',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=algorithm,-%7B%27auto%27%2C%20%27ball_tree%27%2C%20%27kd_tree%27%2C%20%27brute%27%7D%2C%20default%3D%27auto%27">
                algorithm
                <span class="param-doc-description">algorithm: {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'<br><br>Algorithm used to compute the nearest neighbors:<br><br>- 'ball_tree' will use :class:`BallTree`<br>- 'kd_tree' will use :class:`KDTree`<br>- 'brute' will use a brute-force search.<br>- 'auto' will attempt to decide the most appropriate algorithm<br>  based on the values passed to :meth:`fit` method.<br><br>Note: fitting on sparse input will override the setting of<br>this parameter, using brute force.</span>
            </a>
        </td>
                <td class="value">&#x27;auto&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('leaf_size',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=leaf_size,-int%2C%20default%3D30">
                leaf_size
                <span class="param-doc-description">leaf_size: int, default=30<br><br>Leaf size passed to BallTree or KDTree.  This can affect the<br>speed of the construction and query, as well as the memory<br>required to store the tree.  The optimal value depends on the<br>nature of the problem.</span>
            </a>
        </td>
                <td class="value">30</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('p',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=p,-float%2C%20default%3D2">
                p
                <span class="param-doc-description">p: float, default=2<br><br>Power parameter for the Minkowski metric. When p = 1, this is<br>equivalent to using manhattan_distance (l1), and euclidean_distance<br>(l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.</span>
            </a>
        </td>
                <td class="value">2</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric,-str%2C%20DistanceMetric%20object%20or%20callable%2C%20default%3D%27minkowski%27">
                metric
                <span class="param-doc-description">metric: str, DistanceMetric object or callable, default='minkowski'<br><br>Metric to use for distance computation. Default is "minkowski", which<br>results in the standard Euclidean distance when p = 2. See the<br>documentation of `scipy.spatial.distance<br><https://docs.scipy.org/doc/scipy/reference/spatial.distance.html>`_ and<br>the metrics listed in<br>:class:`~sklearn.metrics.pairwise.distance_metrics` for valid metric<br>values.<br><br>If metric is "precomputed", X is assumed to be a distance matrix and<br>must be square during fit. X may be a :term:`sparse graph`, in which<br>case only "nonzero" elements may be considered neighbors.<br><br>If metric is a callable function, it takes two arrays representing 1D<br>vectors as inputs and must return one value indicating the distance<br>between those vectors. This works for Scipy's metrics, but is less<br>efficient than passing the metric name as a string.<br><br>If metric is a DistanceMetric object, it will be passed directly to<br>the underlying computation routines.</span>
            </a>
        </td>
                <td class="value">&#x27;minkowski&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric_params',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric_params,-dict%2C%20default%3DNone">
                metric_params
                <span class="param-doc-description">metric_params: dict, default=None<br><br>Additional keyword arguments for the metric function.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of parallel jobs to run for neighbors search.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.<br>Doesn't affect :meth:`fit` method.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div></div></div></div></div></div><div class="sk-item"><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><label>final_estimator</label></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-35" type="checkbox" ><label for="sk-estimator-id-35" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>LinearRegression</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.linear_model.LinearRegression.html">?<span>Documentation for LinearRegression</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="final_estimator__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('fit_intercept',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.linear_model.LinearRegression.html#:~:text=fit_intercept,-bool%2C%20default%3DTrue">
                fit_intercept
                <span class="param-doc-description">fit_intercept: bool, default=True<br><br>Whether to calculate the intercept for this model. If set<br>to False, no intercept will be used in calculations<br>(i.e. data is expected to be centered).</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('copy_X',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.linear_model.LinearRegression.html#:~:text=copy_X,-bool%2C%20default%3DTrue">
                copy_X
                <span class="param-doc-description">copy_X: bool, default=True<br><br>If True, X will be copied; else, it may be overwritten.</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('tol',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.linear_model.LinearRegression.html#:~:text=tol,-float%2C%20default%3D1e-6">
                tol
                <span class="param-doc-description">tol: float, default=1e-6<br><br>The precision of the solution (`coef_`) is determined by `tol` which<br>specifies a different convergence criterion for the `lsqr` solver.<br>`tol` is set as `atol` and `btol` of :func:`scipy.sparse.linalg.lsqr` when<br>fitting on sparse training data. This parameter has no effect when fitting<br>on dense data.<br><br>.. versionadded:: 1.7</span>
            </a>
        </td>
                <td class="value">1e-06</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.linear_model.LinearRegression.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of jobs to use for the computation. This will only provide<br>speedup in case of sufficiently large problems, that is if firstly<br>`n_targets > 1` and secondly `X` is sparse or if `positive` is set<br>to `True`. ``None`` means 1 unless in a<br>:obj:`joblib.parallel_backend` context. ``-1`` means using all<br>processors. See :term:`Glossary <n_jobs>` for more details.</span>
            </a>
        </td>
                <td class="value">1</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('positive',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.linear_model.LinearRegression.html#:~:text=positive,-bool%2C%20default%3DFalse">
                positive
                <span class="param-doc-description">positive: bool, default=False<br><br>When set to ``True``, forces the coefficients to be positive. This<br>option is only supported for dense arrays.<br><br>For a comparison between a linear regression model with positive constraints<br>on the regression coefficients and a linear regression without such constraints,<br>see :ref:`sphx_glr_auto_examples_linear_model_plot_nnls.py`.<br><br>.. versionadded:: 0.24</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div></div></div></div></div></div></div></div><script>function copyToClipboard(text, element) {
        // Get the parameter prefix from the closest toggleable content
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const fullParamName = paramPrefix ? `${paramPrefix}${text}` : text;

        const originalStyle = element.style;
        const computedStyle = window.getComputedStyle(element);
        const originalWidth = computedStyle.width;
        const originalHTML = element.innerHTML.replace('Copied!', '');

        navigator.clipboard.writeText(fullParamName)
            .then(() => {
                element.style.width = originalWidth;
                element.style.color = 'green';
                element.innerHTML = "Copied!";

                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            })
            .catch(err => {
                console.error('Failed to copy:', err);
                element.style.color = 'red';
                element.innerHTML = "Failed!";
                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            });
        return false;
    }

    document.querySelectorAll('.copy-paste-icon').forEach(function(element) {
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const paramName = element.parentElement.nextElementSibling
            .textContent.trim().split(' ')[0];
        const fullParamName = paramPrefix ? `${paramPrefix}${paramName}` : paramName;

        element.setAttribute('title', fullParamName);
    });


    /**
     * Adapted from Skrub
     * https://github.com/skrub-data/skrub/blob/403466d1d5d4dc76a7ef569b3f8228db59a31dc3/skrub/_reporting/_data/templates/report.js#L789
     * @returns "light" or "dark"
     */
    function detectTheme(element) {
        const body = document.querySelector('body');

        // Check VSCode theme
        const themeKindAttr = body.getAttribute('data-vscode-theme-kind');
        const themeNameAttr = body.getAttribute('data-vscode-theme-name');

        if (themeKindAttr && themeNameAttr) {
            const themeKind = themeKindAttr.toLowerCase();
            const themeName = themeNameAttr.toLowerCase();

            if (themeKind.includes("dark") || themeName.includes("dark")) {
                return "dark";
            }
            if (themeKind.includes("light") || themeName.includes("light")) {
                return "light";
            }
        }

        // Check Jupyter theme
        if (body.getAttribute('data-jp-theme-light') === 'false') {
            return 'dark';
        } else if (body.getAttribute('data-jp-theme-light') === 'true') {
            return 'light';
        }

        // Guess based on a parent element's color
        const color = window.getComputedStyle(element.parentNode, null).getPropertyValue('color');
        const match = color.match(/^rgb\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*$/i);
        if (match) {
            const [r, g, b] = [
                parseFloat(match[1]),
                parseFloat(match[2]),
                parseFloat(match[3])
            ];

            // https://en.wikipedia.org/wiki/HSL_and_HSV#Lightness
            const luma = 0.299 * r + 0.587 * g + 0.114 * b;

            if (luma > 180) {
                // If the text is very bright we have a dark theme
                return 'dark';
            }
            if (luma < 75) {
                // If the text is very dark we have a light theme
                return 'light';
            }
            // Otherwise fall back to the next heuristic.
        }

        // Fallback to system preference
        return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
    }


    function forceTheme(elementId) {
        const estimatorElement = document.querySelector(`#${elementId}`);
        if (estimatorElement === null) {
            console.error(`Element with id ${elementId} not found.`);
        } else {
            const theme = detectTheme(estimatorElement);
            estimatorElement.classList.add(theme);
        }
    }

    forceTheme('sk-container-id-14');</script></body>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 1357-1359

.. code-block:: Python

    -get_scorer("neg_root_mean_squared_error")(lin, df_test, y_test)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    44316.26978449051
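
The leading minus sign is needed because scikit-learn scorers follow a
"greater is better" convention, so ``neg_root_mean_squared_error`` returns the
negated RMSE. The following is a minimal sketch of that convention on
synthetic data; the arrays and the model are made up for illustration and are
not the ``lin`` model or the housing data from this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import get_scorer, mean_squared_error

# Synthetic regression problem (illustrative only, not the housing data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)

# The scorer is called as scorer(estimator, X, y) and returns the *negated* RMSE.
scorer = get_scorer("neg_root_mean_squared_error")
neg_rmse = scorer(model, X, y)

# Flipping the sign recovers the usual, non-negative RMSE.
rmse = -neg_rmse

# Cross-check against a manual computation from the predictions.
manual_rmse = np.sqrt(mean_squared_error(y, model.predict(X)))
print(rmse, manual_rmse)
```

Because the sign is flipped back, a lower value is better when reading the
score shown above.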


.. GENERATED FROM PYTHON SOURCE LINES 1360-1365

The final score is already quite an improvement over the result from
the dummy model, so we can be happy about that. Moreover, if we compare
this score to the score we got when we trained a KNN purely on longitude
and latitude, it’s also much better, which confirms that using the other
features is helpful for the model.
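
To make this kind of comparison concrete, the sketch below scores three models
with the same scorer: a dummy baseline, a KNN that only sees the coordinates,
and a stacked model that feeds the KNN prediction plus the original features
into a linear regression. It runs on synthetic data with a hypothetical
``Income`` column, and ``passthrough=True`` is an assumption about how the
stacked model is configured, so the numbers will not match the ones above.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import get_scorer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline

# Synthetic data: a strong spatial effect plus one non-spatial feature.
# "Income" is a hypothetical column used only for this illustration.
rng = np.random.default_rng(0)
n = 1000
X = pd.DataFrame(
    {
        "Longitude": rng.uniform(0, 2 * np.pi, n),
        "Latitude": rng.uniform(0, 2 * np.pi, n),
        "Income": rng.normal(size=n),
    }
)
y = (
    10 * np.sin(X["Longitude"])
    + 10 * np.cos(X["Latitude"])
    + 2 * X["Income"]
    + rng.normal(scale=0.5, size=n)
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# KNN that only looks at the two coordinate columns.
knn_on_coords = Pipeline(
    [
        (
            "select_cols",
            ColumnTransformer(
                [("long_and_lat", "passthrough", ["Longitude", "Latitude"])]
            ),
        ),
        ("knn", KNeighborsRegressor(n_neighbors=5)),
    ]
)

models = {
    "dummy": DummyRegressor(),
    "knn_coords_only": knn_on_coords,
    # passthrough=True (an assumption) also hands the raw features to the
    # final linear regression, next to the KNN prediction.
    "stacked": StackingRegressor(
        estimators=[("knn@5", knn_on_coords)],
        final_estimator=LinearRegression(),
        passthrough=True,
    ),
}

# Score every model on the held-out split and flip the sign to get RMSE.
scorer = get_scorer("neg_root_mean_squared_error")
rmse = {
    name: -scorer(model.fit(X_train, y_train), X_test, y_test)
    for name, model in models.items()
}
for name, value in rmse.items():
    print(f"{name}: {value:.3f}")
```

Evaluating all candidates with one scorer on the same held-out split keeps the
comparison fair; only the model changes between runs.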

.. GENERATED FROM PYTHON SOURCE LINES 1367-1370

Compared with the results reported by other people, our score also
doesn’t look too bad, but we have to keep in mind that the datasets and
preprocessing steps are not identical, so differences should be expected.

.. GENERATED FROM PYTHON SOURCE LINES 1372-1374

Just out of curiosity, let’s check the score without using the KNN
predictions as features:

.. GENERATED FROM PYTHON SOURCE LINES 1376-1378

.. code-block:: Python

    lin_raw = LinearRegression(n_jobs=N_JOBS)


.. GENERATED FROM PYTHON SOURCE LINES 1379-1381

.. code-block:: Python

    lin_raw.fit(df_train, y_train)


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-container-id-15 {
      /* Definition of color scheme common for light and dark mode */
      --sklearn-color-text: #000;
      --sklearn-color-text-muted: #666;
      --sklearn-color-line: gray;
      /* Definition of color scheme for unfitted estimators */
      --sklearn-color-unfitted-level-0: #fff5e6;
      --sklearn-color-unfitted-level-1: #f6e4d2;
      --sklearn-color-unfitted-level-2: #ffe0b3;
      --sklearn-color-unfitted-level-3: chocolate;
      /* Definition of color scheme for fitted estimators */
      --sklearn-color-fitted-level-0: #f0f8ff;
      --sklearn-color-fitted-level-1: #d4ebff;
      --sklearn-color-fitted-level-2: #b3dbfd;
      --sklearn-color-fitted-level-3: cornflowerblue;
    }

    #sk-container-id-15.light {
      /* Specific color for light theme */
      --sklearn-color-text-on-default-background: black;
      --sklearn-color-background: white;
      --sklearn-color-border-box: black;
      --sklearn-color-icon: #696969;
    }

    #sk-container-id-15.dark {
      --sklearn-color-text-on-default-background: white;
      --sklearn-color-background: #111;
      --sklearn-color-border-box: white;
      --sklearn-color-icon: #878787;
    }

    #sk-container-id-15 {
      color: var(--sklearn-color-text);
    }

    #sk-container-id-15 pre {
      padding: 0;
    }

    #sk-container-id-15 input.sk-hidden--visually {
      border: 0;
      clip: rect(1px 1px 1px 1px);
      clip: rect(1px, 1px, 1px, 1px);
      height: 1px;
      margin: -1px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1px;
    }

    #sk-container-id-15 div.sk-dashed-wrapped {
      border: 1px dashed var(--sklearn-color-line);
      margin: 0 0.4em 0.5em 0.4em;
      box-sizing: border-box;
      padding-bottom: 0.4em;
      background-color: var(--sklearn-color-background);
    }

    #sk-container-id-15 div.sk-container {
      /* jupyter's `normalize.less` sets `[hidden] { display: none; }`
         but bootstrap.min.css set `[hidden] { display: none !important; }`
         so we also need the `!important` here to be able to override the
         default hidden behavior on the sphinx rendered scikit-learn.org.
         See: https://github.com/scikit-learn/scikit-learn/issues/21755 */
      display: inline-block !important;
      position: relative;
    }

    #sk-container-id-15 div.sk-text-repr-fallback {
      display: none;
    }

    div.sk-parallel-item,
    div.sk-serial,
    div.sk-item {
      /* draw centered vertical line to link estimators */
      background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));
      background-size: 2px 100%;
      background-repeat: no-repeat;
      background-position: center center;
    }

    /* Parallel-specific style estimator block */

    #sk-container-id-15 div.sk-parallel-item::after {
      content: "";
      width: 100%;
      border-bottom: 2px solid var(--sklearn-color-text-on-default-background);
      flex-grow: 1;
    }

    #sk-container-id-15 div.sk-parallel {
      display: flex;
      align-items: stretch;
      justify-content: center;
      background-color: var(--sklearn-color-background);
      position: relative;
    }

    #sk-container-id-15 div.sk-parallel-item {
      display: flex;
      flex-direction: column;
    }

    #sk-container-id-15 div.sk-parallel-item:first-child::after {
      align-self: flex-end;
      width: 50%;
    }

    #sk-container-id-15 div.sk-parallel-item:last-child::after {
      align-self: flex-start;
      width: 50%;
    }

    #sk-container-id-15 div.sk-parallel-item:only-child::after {
      width: 0;
    }

    /* Serial-specific style estimator block */

    #sk-container-id-15 div.sk-serial {
      display: flex;
      flex-direction: column;
      align-items: center;
      background-color: var(--sklearn-color-background);
      padding-right: 1em;
      padding-left: 1em;
    }


    /* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is
    clickable and can be expanded/collapsed.
    - Pipeline and ColumnTransformer use this feature and define the default style
    - Estimators will overwrite some part of the style using the `sk-estimator` class
    */

    /* Pipeline and ColumnTransformer style (default) */

    #sk-container-id-15 div.sk-toggleable {
      /* Default theme specific background. It is overwritten whether we have a
      specific estimator or a Pipeline/ColumnTransformer */
      background-color: var(--sklearn-color-background);
    }

    /* Toggleable label */
    #sk-container-id-15 label.sk-toggleable__label {
      cursor: pointer;
      display: flex;
      width: 100%;
      margin-bottom: 0;
      padding: 0.5em;
      box-sizing: border-box;
      text-align: center;
      align-items: center;
      justify-content: center;
      gap: 0.5em;
    }

    #sk-container-id-15 label.sk-toggleable__label .caption {
      font-size: 0.6rem;
      font-weight: lighter;
      color: var(--sklearn-color-text-muted);
    }

    #sk-container-id-15 label.sk-toggleable__label-arrow:before {
      /* Arrow on the left of the label */
      content: "▸";
      float: left;
      margin-right: 0.25em;
      color: var(--sklearn-color-icon);
    }

    #sk-container-id-15 label.sk-toggleable__label-arrow:hover:before {
      color: var(--sklearn-color-text);
    }

    /* Toggleable content - dropdown */

    #sk-container-id-15 div.sk-toggleable__content {
      display: none;
      text-align: left;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-15 div.sk-toggleable__content.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-15 div.sk-toggleable__content pre {
      margin: 0.2em;
      border-radius: 0.25em;
      color: var(--sklearn-color-text);
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-15 div.sk-toggleable__content.fitted pre {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-15 input.sk-toggleable__control:checked~div.sk-toggleable__content {
      /* Expand drop-down */
      display: block;
      width: 100%;
      overflow: visible;
    }

    #sk-container-id-15 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {
      content: "▾";
    }

    /* Pipeline/ColumnTransformer-specific style */

    #sk-container-id-15 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-15 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator-specific style */

    /* Colorize estimator box */
    #sk-container-id-15 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-15 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    #sk-container-id-15 div.sk-label label.sk-toggleable__label,
    #sk-container-id-15 div.sk-label label {
      /* The background is the default theme color */
      color: var(--sklearn-color-text-on-default-background);
    }

    /* On hover, darken the color of the background */
    #sk-container-id-15 div.sk-label:hover label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    /* Label box, darken color on hover, fitted */
    #sk-container-id-15 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator label */

    #sk-container-id-15 div.sk-label label {
      font-family: monospace;
      font-weight: bold;
      line-height: 1.2em;
    }

    #sk-container-id-15 div.sk-label-container {
      text-align: center;
    }

    /* Estimator-specific */
    #sk-container-id-15 div.sk-estimator {
      font-family: monospace;
      border: 1px dotted var(--sklearn-color-border-box);
      border-radius: 0.25em;
      box-sizing: border-box;
      margin-bottom: 0.5em;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-15 div.sk-estimator.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    /* on hover */
    #sk-container-id-15 div.sk-estimator:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-15 div.sk-estimator.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Specification for estimator info (e.g. "i" and "?") */

    /* Common style for "i" and "?" */

    .sk-estimator-doc-link,
    a:link.sk-estimator-doc-link,
    a:visited.sk-estimator-doc-link {
      float: right;
      font-size: smaller;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1em;
      height: 1em;
      width: 1em;
      text-decoration: none !important;
      margin-left: 0.5em;
      text-align: center;
      /* unfitted */
      border: var(--sklearn-color-unfitted-level-3) 1pt solid;
      color: var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted,
    a:link.sk-estimator-doc-link.fitted,
    a:visited.sk-estimator-doc-link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3) 1pt solid;
      color: var(--sklearn-color-fitted-level-3);
    }

    /* On hover */
    div.sk-estimator:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover,
    div.sk-label-container:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-unfitted-level-0);
      text-decoration: none;
    }

    div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover,
    div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-fitted-level-0);
      text-decoration: none;
    }

    /* Span, style for the box shown on hovering the info icon */
    .sk-estimator-doc-link span {
      display: none;
      z-index: 9999;
      position: relative;
      font-weight: normal;
      right: .2ex;
      padding: .5ex;
      margin: .5ex;
      width: min-content;
      min-width: 20ex;
      max-width: 50ex;
      color: var(--sklearn-color-text);
      box-shadow: 2pt 2pt 4pt #999;
      /* unfitted */
      background: var(--sklearn-color-unfitted-level-0);
      border: .5pt solid var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted span {
      /* fitted */
      background: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3);
    }

    .sk-estimator-doc-link:hover span {
      display: block;
    }

    /* "?"-specific style due to the `<a>` HTML tag */

    #sk-container-id-15 a.estimator_doc_link {
      float: right;
      font-size: 1rem;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1rem;
      height: 1rem;
      width: 1rem;
      text-decoration: none;
      /* unfitted */
      color: var(--sklearn-color-unfitted-level-1);
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
    }

    #sk-container-id-15 a.estimator_doc_link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    #sk-container-id-15 a.estimator_doc_link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    #sk-container-id-15 a.estimator_doc_link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
    }

    .estimator-table {
        font-family: monospace;
    }

    .estimator-table summary {
        padding: .5rem;
        cursor: pointer;
    }

    .estimator-table summary::marker {
        font-size: 0.7rem;
    }

    .estimator-table details[open] {
        padding-left: 0.1rem;
        padding-right: 0.1rem;
        padding-bottom: 0.3rem;
    }

    .estimator-table .parameters-table {
        margin-left: auto !important;
        margin-right: auto !important;
        margin-top: 0;
    }

    .estimator-table .parameters-table tr:nth-child(odd) {
        background-color: #fff;
    }

    .estimator-table .parameters-table tr:nth-child(even) {
        background-color: #f6f6f6;
    }

    .estimator-table .parameters-table tr:hover {
        background-color: #e0e0e0;
    }

    .estimator-table table td {
        border: 1px solid rgba(106, 105, 104, 0.232);
    }

    /*
        `table td` is set in notebook with right text-align.
        We need to overwrite it.
    */
    .estimator-table table td.param {
        text-align: left;
        position: relative;
        padding: 0;
    }

    .user-set td {
        color:rgb(255, 94, 0);
        text-align: left !important;
    }

    .user-set td.value {
        color:rgb(255, 94, 0);
        background-color: transparent;
    }

    .default td {
        color: black;
        text-align: left !important;
    }

    .user-set td i,
    .default td i {
        color: black;
    }

    /*
        Styles for parameter documentation links
        We need styling for visited so jupyter doesn't overwrite it
    */
    a.param-doc-link,
    a.param-doc-link:link,
    a.param-doc-link:visited {
        text-decoration: underline dashed;
        text-underline-offset: .3em;
        color: inherit;
        display: block;
        padding: .5em;
    }

    /* "hack" to make the entire area of the cell containing the link clickable */
    a.param-doc-link::before {
        position: absolute;
        content: "";
        inset: 0;
    }

    .param-doc-description {
        display: none;
        position: absolute;
        z-index: 9999;
        left: 0;
        padding: .5ex;
        margin-left: 1.5em;
        color: var(--sklearn-color-text);
        box-shadow: .3em .3em .4em #999;
        width: max-content;
        text-align: left;
        max-height: 10em;
        overflow-y: auto;

        /* unfitted */
        background: var(--sklearn-color-unfitted-level-0);
        border: thin solid var(--sklearn-color-unfitted-level-3);
    }

    /* Fitted state for parameter tooltips */
    .fitted .param-doc-description {
        /* fitted */
        background: var(--sklearn-color-fitted-level-0);
        border: thin solid var(--sklearn-color-fitted-level-3);
    }

    .param-doc-link:hover .param-doc-description {
        display: block;
    }

    .copy-paste-icon {
        background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCA0NDggNTEyIj48IS0tIUZvbnQgQXdlc29tZSBGcmVlIDYuNy4yIGJ5IEBmb250YXdlc29tZSAtIGh0dHBzOi8vZm9udGF3ZXNvbWUuY29tIExpY2Vuc2UgLSBodHRwczovL2ZvbnRhd2Vzb21lLmNvbS9saWNlbnNlL2ZyZWUgQ29weXJpZ2h0IDIwMjUgRm9udGljb25zLCBJbmMuLS0+PHBhdGggZD0iTTIwOCAwTDMzMi4xIDBjMTIuNyAwIDI0LjkgNS4xIDMzLjkgMTQuMWw2Ny45IDY3LjljOSA5IDE0LjEgMjEuMiAxNC4xIDMzLjlMNDQ4IDMzNmMwIDI2LjUtMjEuNSA0OC00OCA0OGwtMTkyIDBjLTI2LjUgMC00OC0yMS41LTQ4LTQ4bDAtMjg4YzAtMjYuNSAyMS41LTQ4IDQ4LTQ4ek00OCAxMjhsODAgMCAwIDY0LTY0IDAgMCAyNTYgMTkyIDAgMC0zMiA2NCAwIDAgNDhjMCAyNi41LTIxLjUgNDgtNDggNDhMNDggNTEyYy0yNi41IDAtNDgtMjEuNS00OC00OEwwIDE3NmMwLTI2LjUgMjEuNS00OCA0OC00OHoiLz48L3N2Zz4=);
        background-repeat: no-repeat;
        background-size: 14px 14px;
        background-position: 0;
        display: inline-block;
        width: 14px;
        height: 14px;
        cursor: pointer;
    }
    </style><body><div id="sk-container-id-15" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>LinearRegression(n_jobs=1)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-36" type="checkbox" checked><label for="sk-estimator-id-36" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>LinearRegression</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.linear_model.LinearRegression.html">?<span>Documentation for LinearRegression</span></a><span class="sk-estimator-doc-link fitted">i<span>Fitted</span></span></div></label><div class="sk-toggleable__content fitted" data-param-prefix="">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('fit_intercept',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.linear_model.LinearRegression.html#:~:text=fit_intercept,-bool%2C%20default%3DTrue">
                fit_intercept
                <span class="param-doc-description">fit_intercept: bool, default=True<br><br>Whether to calculate the intercept for this model. If set<br>to False, no intercept will be used in calculations<br>(i.e. data is expected to be centered).</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('copy_X',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.linear_model.LinearRegression.html#:~:text=copy_X,-bool%2C%20default%3DTrue">
                copy_X
                <span class="param-doc-description">copy_X: bool, default=True<br><br>If True, X will be copied; else, it may be overwritten.</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('tol',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.linear_model.LinearRegression.html#:~:text=tol,-float%2C%20default%3D1e-6">
                tol
                <span class="param-doc-description">tol: float, default=1e-6<br><br>The precision of the solution (`coef_`) is determined by `tol` which<br>specifies a different convergence criterion for the `lsqr` solver.<br>`tol` is set as `atol` and `btol` of :func:`scipy.sparse.linalg.lsqr` when<br>fitting on sparse training data. This parameter has no effect when fitting<br>on dense data.<br><br>.. versionadded:: 1.7</span>
            </a>
        </td>
                <td class="value">1e-06</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.linear_model.LinearRegression.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of jobs to use for the computation. This will only provide<br>speedup in case of sufficiently large problems, that is if firstly<br>`n_targets > 1` and secondly `X` is sparse or if `positive` is set<br>to `True`. ``None`` means 1 unless in a<br>:obj:`joblib.parallel_backend` context. ``-1`` means using all<br>processors. See :term:`Glossary <n_jobs>` for more details.</span>
            </a>
        </td>
                <td class="value">1</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('positive',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.linear_model.LinearRegression.html#:~:text=positive,-bool%2C%20default%3DFalse">
                positive
                <span class="param-doc-description">positive: bool, default=False<br><br>When set to ``True``, forces the coefficients to be positive. This<br>option is only supported for dense arrays.<br><br>For a comparison between a linear regression model with positive constraints<br>on the regression coefficients and a linear regression without such constraints,<br>see :ref:`sphx_glr_auto_examples_linear_model_plot_nnls.py`.<br><br>.. versionadded:: 0.24</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div><script>function copyToClipboard(text, element) {
        // Get the parameter prefix from the closest toggleable content
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const fullParamName = paramPrefix ? `${paramPrefix}${text}` : text;

        const originalStyle = element.style;
        const computedStyle = window.getComputedStyle(element);
        const originalWidth = computedStyle.width;
        const originalHTML = element.innerHTML.replace('Copied!', '');

        navigator.clipboard.writeText(fullParamName)
            .then(() => {
                element.style.width = originalWidth;
                element.style.color = 'green';
                element.innerHTML = "Copied!";

                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            })
            .catch(err => {
                console.error('Failed to copy:', err);
                element.style.color = 'red';
                element.innerHTML = "Failed!";
                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            });
        return false;
    }

    document.querySelectorAll('.copy-paste-icon').forEach(function(element) {
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const paramName = element.parentElement.nextElementSibling
            .textContent.trim().split(' ')[0];
        const fullParamName = paramPrefix ? `${paramPrefix}${paramName}` : paramName;

        element.setAttribute('title', fullParamName);
    });


    /**
     * Adapted from Skrub
     * https://github.com/skrub-data/skrub/blob/403466d1d5d4dc76a7ef569b3f8228db59a31dc3/skrub/_reporting/_data/templates/report.js#L789
     * @returns "light" or "dark"
     */
    function detectTheme(element) {
        const body = document.querySelector('body');

        // Check VSCode theme
        const themeKindAttr = body.getAttribute('data-vscode-theme-kind');
        const themeNameAttr = body.getAttribute('data-vscode-theme-name');

        if (themeKindAttr && themeNameAttr) {
            const themeKind = themeKindAttr.toLowerCase();
            const themeName = themeNameAttr.toLowerCase();

            if (themeKind.includes("dark") || themeName.includes("dark")) {
                return "dark";
            }
            if (themeKind.includes("light") || themeName.includes("light")) {
                return "light";
            }
        }

        // Check Jupyter theme
        if (body.getAttribute('data-jp-theme-light') === 'false') {
            return 'dark';
        } else if (body.getAttribute('data-jp-theme-light') === 'true') {
            return 'light';
        }

        // Guess based on a parent element's color
        const color = window.getComputedStyle(element.parentNode, null).getPropertyValue('color');
        const match = color.match(/^rgb\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*$/i);
        if (match) {
            const [r, g, b] = [
                parseFloat(match[1]),
                parseFloat(match[2]),
                parseFloat(match[3])
            ];

            // https://en.wikipedia.org/wiki/HSL_and_HSV#Lightness
            const luma = 0.299 * r + 0.587 * g + 0.114 * b;

            if (luma > 180) {
                // If the text is very bright we have a dark theme
                return 'dark';
            }
            if (luma < 75) {
                // If the text is very dark we have a light theme
                return 'light';
            }
            // Otherwise fall back to the next heuristic.
        }

        // Fallback to system preference
        return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
    }


    function forceTheme(elementId) {
        const estimatorElement = document.querySelector(`#${elementId}`);
        if (estimatorElement === null) {
            console.error(`Element with id ${elementId} not found.`);
        } else {
            const theme = detectTheme(estimatorElement);
            estimatorElement.classList.add(theme);
        }
    }

    forceTheme('sk-container-id-15');</script></body>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 1382-1384

.. code-block:: Python

    -get_scorer("neg_root_mean_squared_error")(lin_raw, df_test, y_test)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    66647.75686249179


.. GENERATED FROM PYTHON SOURCE LINES 1385-1390

The prediction error increases substantially, which shouldn’t be too
surprising. When the linear regressor is given raw longitude and
latitude, all it can do is fit a plane over them, which is far too
simple to capture the geospatial patterns we observed.
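
To make this limitation concrete, here is a minimal sketch on synthetic
data (not the housing dataset). It assumes a hypothetical target that
peaks at a single "city center" and decays with distance, and compares a
plane fitted by ``LinearRegression`` with a local model,
``KNeighborsRegressor``. The coordinates, the target function, and the
``rmse`` helper are made up for illustration only.

.. code-block:: Python

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.neighbors import KNeighborsRegressor

    rng = np.random.default_rng(0)

    # Hypothetical coordinates drawn uniformly from the unit square
    coords = rng.uniform(0, 1, size=(2000, 2))

    # Hypothetical target: highest at the center (0.5, 0.5), decaying with distance
    dist = np.linalg.norm(coords - 0.5, axis=1)
    target = np.exp(-10 * dist**2)

    train, test = slice(0, 1500), slice(1500, None)


    def rmse(model):
        """Fit on the training part and return the RMSE on the held-out part."""
        model.fit(coords[train], target[train])
        pred = model.predict(coords[test])
        return float(np.sqrt(np.mean((target[test] - pred) ** 2)))


    rmse_linear = rmse(LinearRegression())
    rmse_knn = rmse(KNeighborsRegressor(n_neighbors=10))

    print(f"plane (LinearRegression):    {rmse_linear:.3f}")
    print(f"local (KNeighborsRegressor): {rmse_knn:.3f}")

Because the hotspot is symmetric around the center, no tilt of the plane
can follow it, so the linear model should end up with a much larger
error than the neighbor-based model.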

.. GENERATED FROM PYTHON SOURCE LINES 1392-1394

Random forest
^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 1396-1399

Next, let’s use our first decision tree-based model, the
``RandomForestRegressor``. We use the same approach as above and only
need to swap the ``final_estimator``:

.. GENERATED FROM PYTHON SOURCE LINES 1401-1409

.. code-block:: Python

    rf = StackingRegressor(
        estimators=knn_regressor,
        final_estimator=RandomForestRegressor(
            n_estimators=100, random_state=0, n_jobs=N_JOBS
        ),
        passthrough=True,
    )
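
As a side note on ``passthrough=True``, the following minimal sketch
shows what the final estimator receives as input. It uses synthetic data
and a hypothetical single-entry ``estimators`` list rather than the
``knn_regressor`` defined earlier, and it inspects the output of
``StackingRegressor.transform``, which is the feature matrix handed to
the ``final_estimator``.

.. code-block:: Python

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor, StackingRegressor
    from sklearn.neighbors import KNeighborsRegressor

    rng = np.random.default_rng(0)

    # Synthetic data with 3 features, for illustration only
    X = rng.normal(size=(200, 3))
    y = X[:, 0] + rng.normal(scale=0.1, size=200)

    shapes = {}
    for passthrough in (False, True):
        stack = StackingRegressor(
            # Hypothetical base estimator list with a single KNN model
            estimators=[("knn", KNeighborsRegressor())],
            final_estimator=RandomForestRegressor(n_estimators=10, random_state=0),
            passthrough=passthrough,
        )
        stack.fit(X, y)
        # transform() returns the inputs of the final estimator
        shapes[passthrough] = stack.transform(X).shape

    print(shapes)

Without passthrough, the final estimator only sees one column per base
estimator (its predictions). With passthrough, the original features are
appended to those predictions, so the random forest can use both.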


.. GENERATED FROM PYTHON SOURCE LINES 1410-1412

.. code-block:: Python

    rf.fit(df_train, y_train)


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-container-id-16 {
      /* Definition of color scheme common for light and dark mode */
      --sklearn-color-text: #000;
      --sklearn-color-text-muted: #666;
      --sklearn-color-line: gray;
      /* Definition of color scheme for unfitted estimators */
      --sklearn-color-unfitted-level-0: #fff5e6;
      --sklearn-color-unfitted-level-1: #f6e4d2;
      --sklearn-color-unfitted-level-2: #ffe0b3;
      --sklearn-color-unfitted-level-3: chocolate;
      /* Definition of color scheme for fitted estimators */
      --sklearn-color-fitted-level-0: #f0f8ff;
      --sklearn-color-fitted-level-1: #d4ebff;
      --sklearn-color-fitted-level-2: #b3dbfd;
      --sklearn-color-fitted-level-3: cornflowerblue;
    }

    #sk-container-id-16.light {
      /* Specific color for light theme */
      --sklearn-color-text-on-default-background: black;
      --sklearn-color-background: white;
      --sklearn-color-border-box: black;
      --sklearn-color-icon: #696969;
    }

    #sk-container-id-16.dark {
      --sklearn-color-text-on-default-background: white;
      --sklearn-color-background: #111;
      --sklearn-color-border-box: white;
      --sklearn-color-icon: #878787;
    }

    #sk-container-id-16 {
      color: var(--sklearn-color-text);
    }

    #sk-container-id-16 pre {
      padding: 0;
    }

    #sk-container-id-16 input.sk-hidden--visually {
      border: 0;
      clip: rect(1px 1px 1px 1px);
      clip: rect(1px, 1px, 1px, 1px);
      height: 1px;
      margin: -1px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1px;
    }

    #sk-container-id-16 div.sk-dashed-wrapped {
      border: 1px dashed var(--sklearn-color-line);
      margin: 0 0.4em 0.5em 0.4em;
      box-sizing: border-box;
      padding-bottom: 0.4em;
      background-color: var(--sklearn-color-background);
    }

    #sk-container-id-16 div.sk-container {
      /* jupyter's `normalize.less` sets `[hidden] { display: none; }`
         but bootstrap.min.css set `[hidden] { display: none !important; }`
         so we also need the `!important` here to be able to override the
         default hidden behavior on the sphinx rendered scikit-learn.org.
         See: https://github.com/scikit-learn/scikit-learn/issues/21755 */
      display: inline-block !important;
      position: relative;
    }

    #sk-container-id-16 div.sk-text-repr-fallback {
      display: none;
    }

    div.sk-parallel-item,
    div.sk-serial,
    div.sk-item {
      /* draw centered vertical line to link estimators */
      background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));
      background-size: 2px 100%;
      background-repeat: no-repeat;
      background-position: center center;
    }

    /* Parallel-specific style estimator block */

    #sk-container-id-16 div.sk-parallel-item::after {
      content: "";
      width: 100%;
      border-bottom: 2px solid var(--sklearn-color-text-on-default-background);
      flex-grow: 1;
    }

    #sk-container-id-16 div.sk-parallel {
      display: flex;
      align-items: stretch;
      justify-content: center;
      background-color: var(--sklearn-color-background);
      position: relative;
    }

    #sk-container-id-16 div.sk-parallel-item {
      display: flex;
      flex-direction: column;
    }

    #sk-container-id-16 div.sk-parallel-item:first-child::after {
      align-self: flex-end;
      width: 50%;
    }

    #sk-container-id-16 div.sk-parallel-item:last-child::after {
      align-self: flex-start;
      width: 50%;
    }

    #sk-container-id-16 div.sk-parallel-item:only-child::after {
      width: 0;
    }

    /* Serial-specific style estimator block */

    #sk-container-id-16 div.sk-serial {
      display: flex;
      flex-direction: column;
      align-items: center;
      background-color: var(--sklearn-color-background);
      padding-right: 1em;
      padding-left: 1em;
    }


    /* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is
    clickable and can be expanded/collapsed.
    - Pipeline and ColumnTransformer use this feature and define the default style
    - Estimators will overwrite some part of the style using the `sk-estimator` class
    */

    /* Pipeline and ColumnTransformer style (default) */

    #sk-container-id-16 div.sk-toggleable {
      /* Default theme specific background. It is overwritten whether we have a
      specific estimator or a Pipeline/ColumnTransformer */
      background-color: var(--sklearn-color-background);
    }

    /* Toggleable label */
    #sk-container-id-16 label.sk-toggleable__label {
      cursor: pointer;
      display: flex;
      width: 100%;
      margin-bottom: 0;
      padding: 0.5em;
      box-sizing: border-box;
      text-align: center;
      align-items: center;
      justify-content: center;
      gap: 0.5em;
    }

    #sk-container-id-16 label.sk-toggleable__label .caption {
      font-size: 0.6rem;
      font-weight: lighter;
      color: var(--sklearn-color-text-muted);
    }

    #sk-container-id-16 label.sk-toggleable__label-arrow:before {
      /* Arrow on the left of the label */
      content: "▸";
      float: left;
      margin-right: 0.25em;
      color: var(--sklearn-color-icon);
    }

    #sk-container-id-16 label.sk-toggleable__label-arrow:hover:before {
      color: var(--sklearn-color-text);
    }

    /* Toggleable content - dropdown */

    #sk-container-id-16 div.sk-toggleable__content {
      display: none;
      text-align: left;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-16 div.sk-toggleable__content.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-16 div.sk-toggleable__content pre {
      margin: 0.2em;
      border-radius: 0.25em;
      color: var(--sklearn-color-text);
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-16 div.sk-toggleable__content.fitted pre {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-16 input.sk-toggleable__control:checked~div.sk-toggleable__content {
      /* Expand drop-down */
      display: block;
      width: 100%;
      overflow: visible;
    }

    #sk-container-id-16 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {
      content: "▾";
    }

    /* Pipeline/ColumnTransformer-specific style */

    #sk-container-id-16 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-16 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator-specific style */

    /* Colorize estimator box */
    #sk-container-id-16 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-16 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    #sk-container-id-16 div.sk-label label.sk-toggleable__label,
    #sk-container-id-16 div.sk-label label {
      /* The background is the default theme color */
      color: var(--sklearn-color-text-on-default-background);
    }

    /* On hover, darken the color of the background */
    #sk-container-id-16 div.sk-label:hover label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    /* Label box, darken color on hover, fitted */
    #sk-container-id-16 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator label */

    #sk-container-id-16 div.sk-label label {
      font-family: monospace;
      font-weight: bold;
      line-height: 1.2em;
    }

    #sk-container-id-16 div.sk-label-container {
      text-align: center;
    }

    /* Estimator-specific */
    #sk-container-id-16 div.sk-estimator {
      font-family: monospace;
      border: 1px dotted var(--sklearn-color-border-box);
      border-radius: 0.25em;
      box-sizing: border-box;
      margin-bottom: 0.5em;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-16 div.sk-estimator.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    /* on hover */
    #sk-container-id-16 div.sk-estimator:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-16 div.sk-estimator.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Specification for estimator info (e.g. "i" and "?") */

    /* Common style for "i" and "?" */

    .sk-estimator-doc-link,
    a:link.sk-estimator-doc-link,
    a:visited.sk-estimator-doc-link {
      float: right;
      font-size: smaller;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1em;
      height: 1em;
      width: 1em;
      text-decoration: none !important;
      margin-left: 0.5em;
      text-align: center;
      /* unfitted */
      border: var(--sklearn-color-unfitted-level-3) 1pt solid;
      color: var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted,
    a:link.sk-estimator-doc-link.fitted,
    a:visited.sk-estimator-doc-link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3) 1pt solid;
      color: var(--sklearn-color-fitted-level-3);
    }

    /* On hover */
    div.sk-estimator:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover,
    div.sk-label-container:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-unfitted-level-0);
      text-decoration: none;
    }

    div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover,
    div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-fitted-level-0);
      text-decoration: none;
    }

    /* Span, style for the box shown on hovering the info icon */
    .sk-estimator-doc-link span {
      display: none;
      z-index: 9999;
      position: relative;
      font-weight: normal;
      right: .2ex;
      padding: .5ex;
      margin: .5ex;
      width: min-content;
      min-width: 20ex;
      max-width: 50ex;
      color: var(--sklearn-color-text);
      box-shadow: 2pt 2pt 4pt #999;
      /* unfitted */
      background: var(--sklearn-color-unfitted-level-0);
      border: .5pt solid var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted span {
      /* fitted */
      background: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3);
    }

    .sk-estimator-doc-link:hover span {
      display: block;
    }

    /* "?"-specific style due to the `<a>` HTML tag */

    #sk-container-id-16 a.estimator_doc_link {
      float: right;
      font-size: 1rem;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1rem;
      height: 1rem;
      width: 1rem;
      text-decoration: none;
      /* unfitted */
      color: var(--sklearn-color-unfitted-level-1);
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
    }

    #sk-container-id-16 a.estimator_doc_link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    #sk-container-id-16 a.estimator_doc_link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    #sk-container-id-16 a.estimator_doc_link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
    }

    .estimator-table {
        font-family: monospace;
    }

    .estimator-table summary {
        padding: .5rem;
        cursor: pointer;
    }

    .estimator-table summary::marker {
        font-size: 0.7rem;
    }

    .estimator-table details[open] {
        padding-left: 0.1rem;
        padding-right: 0.1rem;
        padding-bottom: 0.3rem;
    }

    .estimator-table .parameters-table {
        margin-left: auto !important;
        margin-right: auto !important;
        margin-top: 0;
    }

    .estimator-table .parameters-table tr:nth-child(odd) {
        background-color: #fff;
    }

    .estimator-table .parameters-table tr:nth-child(even) {
        background-color: #f6f6f6;
    }

    .estimator-table .parameters-table tr:hover {
        background-color: #e0e0e0;
    }

    .estimator-table table td {
        border: 1px solid rgba(106, 105, 104, 0.232);
    }

    /*
        `table td` is set in the notebook with right text-align.
        We need to overwrite it.
    */
    .estimator-table table td.param {
        text-align: left;
        position: relative;
        padding: 0;
    }

    .user-set td {
        color:rgb(255, 94, 0);
        text-align: left !important;
    }

    .user-set td.value {
        color:rgb(255, 94, 0);
        background-color: transparent;
    }

    .default td {
        color: black;
        text-align: left !important;
    }

    .user-set td i,
    .default td i {
        color: black;
    }

    /*
        Styles for parameter documentation links
        We need styling for visited so jupyter doesn't overwrite it
    */
    a.param-doc-link,
    a.param-doc-link:link,
    a.param-doc-link:visited {
        text-decoration: underline dashed;
        text-underline-offset: .3em;
        color: inherit;
        display: block;
        padding: .5em;
    }

    /* "hack" to make the entire area of the cell containing the link clickable */
    a.param-doc-link::before {
        position: absolute;
        content: "";
        inset: 0;
    }

    .param-doc-description {
        display: none;
        position: absolute;
        z-index: 9999;
        left: 0;
        padding: .5ex;
        margin-left: 1.5em;
        color: var(--sklearn-color-text);
        box-shadow: .3em .3em .4em #999;
        width: max-content;
        text-align: left;
        max-height: 10em;
        overflow-y: auto;

        /* unfitted */
        background: var(--sklearn-color-unfitted-level-0);
        border: thin solid var(--sklearn-color-unfitted-level-3);
    }

    /* Fitted state for parameter tooltips */
    .fitted .param-doc-description {
        /* fitted */
        background: var(--sklearn-color-fitted-level-0);
        border: thin solid var(--sklearn-color-fitted-level-3);
    }

    .param-doc-link:hover .param-doc-description {
        display: block;
    }

    .copy-paste-icon {
        background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCA0NDggNTEyIj48IS0tIUZvbnQgQXdlc29tZSBGcmVlIDYuNy4yIGJ5IEBmb250YXdlc29tZSAtIGh0dHBzOi8vZm9udGF3ZXNvbWUuY29tIExpY2Vuc2UgLSBodHRwczovL2ZvbnRhd2Vzb21lLmNvbS9saWNlbnNlL2ZyZWUgQ29weXJpZ2h0IDIwMjUgRm9udGljb25zLCBJbmMuLS0+PHBhdGggZD0iTTIwOCAwTDMzMi4xIDBjMTIuNyAwIDI0LjkgNS4xIDMzLjkgMTQuMWw2Ny45IDY3LjljOSA5IDE0LjEgMjEuMiAxNC4xIDMzLjlMNDQ4IDMzNmMwIDI2LjUtMjEuNSA0OC00OCA0OGwtMTkyIDBjLTI2LjUgMC00OC0yMS41LTQ4LTQ4bDAtMjg4YzAtMjYuNSAyMS41LTQ4IDQ4LTQ4ek00OCAxMjhsODAgMCAwIDY0LTY0IDAgMCAyNTYgMTkyIDAgMC0zMiA2NCAwIDAgNDhjMCAyNi41LTIxLjUgNDgtNDggNDhMNDggNTEyYy0yNi41IDAtNDgtMjEuNS00OC00OEwwIDE3NmMwLTI2LjUgMjEuNS00OCA0OC00OHoiLz48L3N2Zz4=);
        background-repeat: no-repeat;
        background-size: 14px 14px;
        background-position: 0;
        display: inline-block;
        width: 14px;
        height: 14px;
        cursor: pointer;
    }
    </style><body><div id="sk-container-id-16" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>StackingRegressor(estimators=[(&#x27;knn@5&#x27;,
                                   Pipeline(steps=[(&#x27;select_cols&#x27;,
                                                    ColumnTransformer(transformers=[(&#x27;long_and_lat&#x27;,
                                                                                     &#x27;passthrough&#x27;,
                                                                                     [&#x27;Longitude&#x27;,
                                                                                      &#x27;Latitude&#x27;])])),
                                                   (&#x27;knn&#x27;,
                                                    KNeighborsRegressor())]))],
                      final_estimator=RandomForestRegressor(n_jobs=1,
                                                            random_state=0),
                      passthrough=True)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-37" type="checkbox" ><label for="sk-estimator-id-37" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>StackingRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html">?<span>Documentation for StackingRegressor</span></a><span class="sk-estimator-doc-link fitted">i<span>Fitted</span></span></div></label><div class="sk-toggleable__content fitted" data-param-prefix="">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('estimators',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=estimators,-list%20of%20%28str%2C%20estimator%29">
                estimators
                <span class="param-doc-description">estimators: list of (str, estimator)<br><br>Base estimators which will be stacked together. Each element of the<br>list is defined as a tuple of string (i.e. name) and an estimator<br>instance. An estimator can be set to 'drop' using `set_params`.</span>
            </a>
        </td>
                <td class="value">[(&#x27;knn@5&#x27;, ...)]</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('final_estimator',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=final_estimator,-estimator%2C%20default%3DNone">
                final_estimator
                <span class="param-doc-description">final_estimator: estimator, default=None<br><br>A regressor which will be used to combine the base estimators.<br>The default regressor is a :class:`~sklearn.linear_model.RidgeCV`.</span>
            </a>
        </td>
                <td class="value">RandomForestR...andom_state=0)</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('cv',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=cv,-int%2C%20cross-validation%20generator%2C%20iterable%2C%20or%20%22prefit%22%2C%20default%3DNone">
                cv
                <span class="param-doc-description">cv: int, cross-validation generator, iterable, or "prefit", default=None<br><br>Determines the cross-validation splitting strategy used in<br>`cross_val_predict` to train `final_estimator`. Possible inputs for<br>cv are:<br><br>* None, to use the default 5-fold cross validation,<br>* integer, to specify the number of folds in a (Stratified) KFold,<br>* An object to be used as a cross-validation generator,<br>* An iterable yielding train, test splits,<br>* `"prefit"`, to assume the `estimators` are prefit. In this case, the<br>  estimators will not be refitted.<br><br>For integer/None inputs, if the estimator is a classifier and y is<br>either binary or multiclass,<br>:class:`~sklearn.model_selection.StratifiedKFold` is used.<br>In all other cases, :class:`~sklearn.model_selection.KFold` is used.<br>These splitters are instantiated with `shuffle=False` so the splits<br>will be the same across calls.<br><br>Refer to the :ref:`User Guide <cross_validation>` for the various<br>cross-validation strategies that can be used here.<br><br>If "prefit" is passed, it is assumed that all `estimators` have<br>been fitted already. The `final_estimator_` is trained on the `estimators`<br>predictions on the full training set and are **not** cross validated<br>predictions. Please note that if the models have been trained on the same<br>data to train the stacking model, there is a very high risk of overfitting.<br><br>.. versionadded:: 1.1<br>    The 'prefit' option was added in 1.1<br><br>.. note::<br>   A larger number of splits will provide no benefits if the number<br>   of training samples is large enough. Indeed, the training time<br>   will increase. ``cv`` is not used for model evaluation but for<br>   prediction.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of jobs to run in parallel for `fit` of all `estimators`.<br>`None` means 1 unless in a `joblib.parallel_backend` context. -1 means<br>using all processors. See :term:`Glossary <n_jobs>` for more details.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('passthrough',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=passthrough,-bool%2C%20default%3DFalse">
                passthrough
                <span class="param-doc-description">passthrough: bool, default=False<br><br>When False, only the predictions of estimators will be used as<br>training data for `final_estimator`. When True, the<br>`final_estimator` is trained on the predictions as well as the<br>original training data.</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=verbose,-int%2C%20default%3D0">
                verbose
                <span class="param-doc-description">verbose: int, default=0<br><br>Verbosity level.</span>
            </a>
        </td>
                <td class="value">0</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><label>knn@5</label></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-serial"><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-38" type="checkbox" ><label for="sk-estimator-id-38" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>select_cols: ColumnTransformer</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html">?<span>Documentation for select_cols: ColumnTransformer</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__select_cols__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('transformers',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=transformers,-list%20of%20tuples">
                transformers
                <span class="param-doc-description">transformers: list of tuples<br><br>List of (name, transformer, columns) tuples specifying the<br>transformer objects to be applied to subsets of the data.<br><br>name : str<br>    Like in Pipeline and FeatureUnion, this allows the transformer and<br>    its parameters to be set using ``set_params`` and searched in grid<br>    search.<br>transformer : {'drop', 'passthrough'} or estimator<br>    Estimator must support :term:`fit` and :term:`transform`.<br>    Special-cased strings 'drop' and 'passthrough' are accepted as<br>    well, to indicate to drop the columns or to pass them through<br>    untransformed, respectively.<br>columns :  str, array-like of str, int, array-like of int,                 array-like of bool, slice or callable<br>    Indexes the data on its second axis. Integers are interpreted as<br>    positional columns, while strings can reference DataFrame columns<br>    by name.  A scalar string or int should be used where<br>    ``transformer`` expects X to be a 1d array-like (vector),<br>    otherwise a 2d array will be passed to the transformer.<br>    A callable is passed the input data `X` and can return any of the<br>    above. To select multiple columns by name or dtype, you can use<br>    :obj:`make_column_selector`.</span>
            </a>
        </td>
                <td class="value">[(&#x27;long_and_lat&#x27;, ...)]</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('remainder',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=remainder,-%7B%27drop%27%2C%20%27passthrough%27%7D%20or%20estimator%2C%20default%3D%27drop%27">
                remainder
                <span class="param-doc-description">remainder: {'drop', 'passthrough'} or estimator, default='drop'<br><br>By default, only the specified columns in `transformers` are<br>transformed and combined in the output, and the non-specified<br>columns are dropped. (default of ``'drop'``).<br>By specifying ``remainder='passthrough'``, all remaining columns that<br>were not specified in `transformers`, but present in the data passed<br>to `fit` will be automatically passed through. This subset of columns<br>is concatenated with the output of the transformers. For dataframes,<br>extra columns not seen during `fit` will be excluded from the output<br>of `transform`.<br>By setting ``remainder`` to be an estimator, the remaining<br>non-specified columns will use the ``remainder`` estimator. The<br>estimator must support :term:`fit` and :term:`transform`.<br>Note that using this feature requires that the DataFrame columns<br>input at :term:`fit` and :term:`transform` have identical order.</span>
            </a>
        </td>
                <td class="value">&#x27;drop&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('sparse_threshold',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=sparse_threshold,-float%2C%20default%3D0.3">
                sparse_threshold
                <span class="param-doc-description">sparse_threshold: float, default=0.3<br><br>If the output of the different transformers contains sparse matrices,<br>these will be stacked as a sparse matrix if the overall density is<br>lower than this value. Use ``sparse_threshold=0`` to always return<br>dense.  When the transformed output consists of all dense data, the<br>stacked result will be dense, and this keyword will be ignored.</span>
            </a>
        </td>
                <td class="value">0.3</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>Number of jobs to run in parallel.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('transformer_weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=transformer_weights,-dict%2C%20default%3DNone">
                transformer_weights
                <span class="param-doc-description">transformer_weights: dict, default=None<br><br>Multiplicative weights for features per transformer. The output of the<br>transformer is multiplied by these weights. Keys are transformer names,<br>values the weights.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=verbose,-bool%2C%20default%3DFalse">
                verbose
                <span class="param-doc-description">verbose: bool, default=False<br><br>If True, the time elapsed while fitting each transformer will be<br>printed as it is completed.</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose_feature_names_out',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=verbose_feature_names_out,-bool%2C%20str%20or%20Callable%5B%5Bstr%2C%20str%5D%2C%20str%5D%2C%20default%3DTrue">
                verbose_feature_names_out
                <span class="param-doc-description">verbose_feature_names_out: bool, str or Callable[[str, str], str], default=True<br><br>- If True, :meth:`ColumnTransformer.get_feature_names_out` will prefix<br>  all feature names with the name of the transformer that generated that<br>  feature. It is equivalent to setting<br>  `verbose_feature_names_out="{transformer_name}__{feature_name}"`.<br>- If False, :meth:`ColumnTransformer.get_feature_names_out` will not<br>  prefix any feature names and will error if feature names are not<br>  unique.<br>- If ``Callable[[str, str], str]``,<br>  :meth:`ColumnTransformer.get_feature_names_out` will rename all the features<br>  using the name of the transformer. The first argument of the callable is the<br>  transformer name and the second argument is the feature name. The returned<br>  string will be the new feature name.<br>- If ``str``, it must be a string ready for formatting. The given string will<br>  be formatted using two field names: ``transformer_name`` and ``feature_name``.<br>  e.g. ``"{feature_name}__{transformer_name}"``. See :meth:`str.format` method<br>  from the standard library for more info.<br><br>.. versionadded:: 1.0<br><br>.. versionchanged:: 1.6<br>    `verbose_feature_names_out` can be a callable or a string to be formatted.</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('force_int_remainder_cols',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=force_int_remainder_cols,-bool%2C%20default%3DFalse">
                force_int_remainder_cols
                <span class="param-doc-description">force_int_remainder_cols: bool, default=False<br><br>This parameter has no effect.<br><br>.. note::<br>    If you do not access the list of columns for the remainder columns<br>    in the `transformers_` fitted attribute, you do not need to set<br>    this parameter.<br><br>.. versionadded:: 1.5<br><br>.. versionchanged:: 1.7<br>   The default value for `force_int_remainder_cols` will change from<br>   `True` to `False` in version 1.7.<br><br>.. deprecated:: 1.7<br>   `force_int_remainder_cols` is deprecated and will be removed in 1.9.</span>
            </a>
        </td>
                <td class="value">&#x27;deprecated&#x27;</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-39" type="checkbox" ><label for="sk-estimator-id-39" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>long_and_lat</div></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__select_cols__long_and_lat__"><pre>[&#x27;Longitude&#x27;, &#x27;Latitude&#x27;]</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-40" type="checkbox" ><label for="sk-estimator-id-40" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>passthrough</div></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__select_cols__long_and_lat__"><pre>passthrough</pre></div></div></div></div></div></div></div></div><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-41" type="checkbox" ><label for="sk-estimator-id-41" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>KNeighborsRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html">?<span>Documentation for KNeighborsRegressor</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__knn__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_neighbors',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_neighbors,-int%2C%20default%3D5">
                n_neighbors
                <span class="param-doc-description">n_neighbors: int, default=5<br><br>Number of neighbors to use by default for :meth:`kneighbors` queries.</span>
            </a>
        </td>
                <td class="value">5</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=weights,-%7B%27uniform%27%2C%20%27distance%27%7D%2C%20callable%20or%20None%2C%20default%3D%27uniform%27">
                weights
                <span class="param-doc-description">weights: {'uniform', 'distance'}, callable or None, default='uniform'<br><br>Weight function used in prediction.  Possible values:<br><br>- 'uniform' : uniform weights.  All points in each neighborhood<br>  are weighted equally.<br>- 'distance' : weight points by the inverse of their distance.<br>  in this case, closer neighbors of a query point will have a<br>  greater influence than neighbors which are further away.<br>- [callable] : a user-defined function which accepts an<br>  array of distances, and returns an array of the same shape<br>  containing the weights.<br><br>Uniform weights are used by default.<br><br>See the following example for a demonstration of the impact of<br>different weighting schemes on predictions:<br>:ref:`sphx_glr_auto_examples_neighbors_plot_regression.py`.</span>
            </a>
        </td>
                <td class="value">&#x27;uniform&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('algorithm',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=algorithm,-%7B%27auto%27%2C%20%27ball_tree%27%2C%20%27kd_tree%27%2C%20%27brute%27%7D%2C%20default%3D%27auto%27">
                algorithm
                <span class="param-doc-description">algorithm: {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'<br><br>Algorithm used to compute the nearest neighbors:<br><br>- 'ball_tree' will use :class:`BallTree`<br>- 'kd_tree' will use :class:`KDTree`<br>- 'brute' will use a brute-force search.<br>- 'auto' will attempt to decide the most appropriate algorithm<br>  based on the values passed to :meth:`fit` method.<br><br>Note: fitting on sparse input will override the setting of<br>this parameter, using brute force.</span>
            </a>
        </td>
                <td class="value">&#x27;auto&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('leaf_size',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=leaf_size,-int%2C%20default%3D30">
                leaf_size
                <span class="param-doc-description">leaf_size: int, default=30<br><br>Leaf size passed to BallTree or KDTree.  This can affect the<br>speed of the construction and query, as well as the memory<br>required to store the tree.  The optimal value depends on the<br>nature of the problem.</span>
            </a>
        </td>
                <td class="value">30</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('p',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=p,-float%2C%20default%3D2">
                p
                <span class="param-doc-description">p: float, default=2<br><br>Power parameter for the Minkowski metric. When p = 1, this is<br>equivalent to using manhattan_distance (l1), and euclidean_distance<br>(l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.</span>
            </a>
        </td>
                <td class="value">2</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric,-str%2C%20DistanceMetric%20object%20or%20callable%2C%20default%3D%27minkowski%27">
                metric
                <span class="param-doc-description">metric: str, DistanceMetric object or callable, default='minkowski'<br><br>Metric to use for distance computation. Default is "minkowski", which<br>results in the standard Euclidean distance when p = 2. See the<br>documentation of `scipy.spatial.distance<br><https://docs.scipy.org/doc/scipy/reference/spatial.distance.html>`_ and<br>the metrics listed in<br>:class:`~sklearn.metrics.pairwise.distance_metrics` for valid metric<br>values.<br><br>If metric is "precomputed", X is assumed to be a distance matrix and<br>must be square during fit. X may be a :term:`sparse graph`, in which<br>case only "nonzero" elements may be considered neighbors.<br><br>If metric is a callable function, it takes two arrays representing 1D<br>vectors as inputs and must return one value indicating the distance<br>between those vectors. This works for Scipy's metrics, but is less<br>efficient than passing the metric name as a string.<br><br>If metric is a DistanceMetric object, it will be passed directly to<br>the underlying computation routines.</span>
            </a>
        </td>
                <td class="value">&#x27;minkowski&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric_params',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric_params,-dict%2C%20default%3DNone">
                metric_params
                <span class="param-doc-description">metric_params: dict, default=None<br><br>Additional keyword arguments for the metric function.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of parallel jobs to run for neighbors search.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.<br>Doesn't affect :meth:`fit` method.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div></div></div></div></div></div><div class="sk-item"><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><label>final_estimator</label></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-42" type="checkbox" ><label for="sk-estimator-id-42" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>RandomForestRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html">?<span>Documentation for RandomForestRegressor</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="final_estimator__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_estimators',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=n_estimators,-int%2C%20default%3D100">
                n_estimators
                <span class="param-doc-description">n_estimators: int, default=100<br><br>The number of trees in the forest.<br><br>.. versionchanged:: 0.22<br>   The default value of ``n_estimators`` changed from 10 to 100<br>   in 0.22.</span>
            </a>
        </td>
                <td class="value">100</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('criterion',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=criterion,-%7B%22squared_error%22%2C%20%22absolute_error%22%2C%20%22friedman_mse%22%2C%20%22poisson%22%7D%2C%20%20%20%20%20%20%20%20%20%20%20%20%20default%3D%22squared_error%22">
                criterion
                <span class="param-doc-description">criterion: {"squared_error", "absolute_error", "friedman_mse", "poisson"},             default="squared_error"<br><br>The function to measure the quality of a split. Supported criteria<br>are "squared_error" for the mean squared error, which is equal to<br>variance reduction as feature selection criterion and minimizes the L2<br>loss using the mean of each terminal node, "friedman_mse", which uses<br>mean squared error with Friedman's improvement score for potential<br>splits, "absolute_error" for the mean absolute error, which minimizes<br>the L1 loss using the median of each terminal node, and "poisson" which<br>uses reduction in Poisson deviance to find splits.<br>Training using "absolute_error" is significantly slower<br>than when using "squared_error".<br><br>.. versionadded:: 0.18<br>   Mean Absolute Error (MAE) criterion.<br><br>.. versionadded:: 1.0<br>   Poisson criterion.</span>
            </a>
        </td>
                <td class="value">&#x27;squared_error&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('max_depth',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=max_depth,-int%2C%20default%3DNone">
                max_depth
                <span class="param-doc-description">max_depth: int, default=None<br><br>The maximum depth of the tree. If None, then nodes are expanded until<br>all leaves are pure or until all leaves contain less than<br>min_samples_split samples.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('min_samples_split',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=min_samples_split,-int%20or%20float%2C%20default%3D2">
                min_samples_split
                <span class="param-doc-description">min_samples_split: int or float, default=2<br><br>The minimum number of samples required to split an internal node:<br><br>- If int, then consider `min_samples_split` as the minimum number.<br>- If float, then `min_samples_split` is a fraction and<br>  `ceil(min_samples_split * n_samples)` are the minimum<br>  number of samples for each split.<br><br>.. versionchanged:: 0.18<br>   Added float values for fractions.</span>
            </a>
        </td>
                <td class="value">2</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('min_samples_leaf',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=min_samples_leaf,-int%20or%20float%2C%20default%3D1">
                min_samples_leaf
                <span class="param-doc-description">min_samples_leaf: int or float, default=1<br><br>The minimum number of samples required to be at a leaf node.<br>A split point at any depth will only be considered if it leaves at<br>least ``min_samples_leaf`` training samples in each of the left and<br>right branches.  This may have the effect of smoothing the model,<br>especially in regression.<br><br>- If int, then consider `min_samples_leaf` as the minimum number.<br>- If float, then `min_samples_leaf` is a fraction and<br>  `ceil(min_samples_leaf * n_samples)` are the minimum<br>  number of samples for each node.<br><br>.. versionchanged:: 0.18<br>   Added float values for fractions.</span>
            </a>
        </td>
                <td class="value">1</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('min_weight_fraction_leaf',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=min_weight_fraction_leaf,-float%2C%20default%3D0.0">
                min_weight_fraction_leaf
                <span class="param-doc-description">min_weight_fraction_leaf: float, default=0.0<br><br>The minimum weighted fraction of the sum total of weights (of all<br>the input samples) required to be at a leaf node. Samples have<br>equal weight when sample_weight is not provided.</span>
            </a>
        </td>
                <td class="value">0.0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('max_features',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=max_features,-%7B%22sqrt%22%2C%20%22log2%22%2C%20None%7D%2C%20int%20or%20float%2C%20default%3D1.0">
                max_features
                <span class="param-doc-description">max_features: {"sqrt", "log2", None}, int or float, default=1.0<br><br>The number of features to consider when looking for the best split:<br><br>- If int, then consider `max_features` features at each split.<br>- If float, then `max_features` is a fraction and<br>  `max(1, int(max_features * n_features_in_))` features are considered at each<br>  split.<br>- If "sqrt", then `max_features=sqrt(n_features)`.<br>- If "log2", then `max_features=log2(n_features)`.<br>- If None or 1.0, then `max_features=n_features`.<br><br>.. note::<br>    The default of 1.0 is equivalent to bagged trees and more<br>    randomness can be achieved by setting smaller values, e.g. 0.3.<br><br>.. versionchanged:: 1.1<br>    The default of `max_features` changed from `"auto"` to 1.0.<br><br>Note: the search for a split does not stop until at least one<br>valid partition of the node samples is found, even if it requires to<br>effectively inspect more than ``max_features`` features.</span>
            </a>
        </td>
                <td class="value">1.0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('max_leaf_nodes',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=max_leaf_nodes,-int%2C%20default%3DNone">
                max_leaf_nodes
                <span class="param-doc-description">max_leaf_nodes: int, default=None<br><br>Grow trees with ``max_leaf_nodes`` in best-first fashion.<br>Best nodes are defined as relative reduction in impurity.<br>If None then unlimited number of leaf nodes.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('min_impurity_decrease',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=min_impurity_decrease,-float%2C%20default%3D0.0">
                min_impurity_decrease
                <span class="param-doc-description">min_impurity_decrease: float, default=0.0<br><br>A node will be split if this split induces a decrease of the impurity<br>greater than or equal to this value.<br><br>The weighted impurity decrease equation is the following::<br><br>    N_t / N * (impurity - N_t_R / N_t * right_impurity<br>                        - N_t_L / N_t * left_impurity)<br><br>where ``N`` is the total number of samples, ``N_t`` is the number of<br>samples at the current node, ``N_t_L`` is the number of samples in the<br>left child, and ``N_t_R`` is the number of samples in the right child.<br><br>``N``, ``N_t``, ``N_t_R`` and ``N_t_L`` all refer to the weighted sum,<br>if ``sample_weight`` is passed.<br><br>.. versionadded:: 0.19</span>
            </a>
        </td>
                <td class="value">0.0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('bootstrap',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=bootstrap,-bool%2C%20default%3DTrue">
                bootstrap
                <span class="param-doc-description">bootstrap: bool, default=True<br><br>Whether bootstrap samples are used when building trees. If False, the<br>whole dataset is used to build each tree.</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('oob_score',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=oob_score,-bool%20or%20callable%2C%20default%3DFalse">
                oob_score
                <span class="param-doc-description">oob_score: bool or callable, default=False<br><br>Whether to use out-of-bag samples to estimate the generalization score.<br>By default, :func:`~sklearn.metrics.r2_score` is used.<br>Provide a callable with signature `metric(y_true, y_pred)` to use a<br>custom metric. Only available if `bootstrap=True`.<br><br>For an illustration of out-of-bag (OOB) error estimation, see the example<br>:ref:`sphx_glr_auto_examples_ensemble_plot_ensemble_oob.py`.</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of jobs to run in parallel. :meth:`fit`, :meth:`predict`,<br>:meth:`decision_path` and :meth:`apply` are all parallelized over the<br>trees. ``None`` means 1 unless in a :obj:`joblib.parallel_backend`<br>context. ``-1`` means using all processors. See :term:`Glossary<br><n_jobs>` for more details.</span>
            </a>
        </td>
                <td class="value">1</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('random_state',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=random_state,-int%2C%20RandomState%20instance%20or%20None%2C%20default%3DNone">
                random_state
                <span class="param-doc-description">random_state: int, RandomState instance or None, default=None<br><br>Controls both the randomness of the bootstrapping of the samples used<br>when building trees (if ``bootstrap=True``) and the sampling of the<br>features to consider when looking for the best split at each node<br>(if ``max_features < n_features``).<br>See :term:`Glossary <random_state>` for details.</span>
            </a>
        </td>
                <td class="value">0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=verbose,-int%2C%20default%3D0">
                verbose
                <span class="param-doc-description">verbose: int, default=0<br><br>Controls the verbosity when fitting and predicting.</span>
            </a>
        </td>
                <td class="value">0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('warm_start',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=warm_start,-bool%2C%20default%3DFalse">
                warm_start
                <span class="param-doc-description">warm_start: bool, default=False<br><br>When set to ``True``, reuse the solution of the previous call to fit<br>and add more estimators to the ensemble, otherwise, just fit a whole<br>new forest. See :term:`Glossary <warm_start>` and<br>:ref:`tree_ensemble_warm_start` for details.</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('ccp_alpha',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=ccp_alpha,-non-negative%20float%2C%20default%3D0.0">
                ccp_alpha
                <span class="param-doc-description">ccp_alpha: non-negative float, default=0.0<br><br>Complexity parameter used for Minimal Cost-Complexity Pruning. The<br>subtree with the largest cost complexity that is smaller than<br>``ccp_alpha`` will be chosen. By default, no pruning is performed. See<br>:ref:`minimal_cost_complexity_pruning` for details. See<br>:ref:`sphx_glr_auto_examples_tree_plot_cost_complexity_pruning.py`<br>for an example of such pruning.<br><br>.. versionadded:: 0.22</span>
            </a>
        </td>
                <td class="value">0.0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('max_samples',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=max_samples,-int%20or%20float%2C%20default%3DNone">
                max_samples
                <span class="param-doc-description">max_samples: int or float, default=None<br><br>If bootstrap is True, the number of samples to draw from X<br>to train each base estimator.<br><br>- If None (default), then draw `X.shape[0]` samples.<br>- If int, then draw `max_samples` samples.<br>- If float, then draw `max(round(n_samples * max_samples), 1)` samples. Thus,<br>  `max_samples` should be in the interval `(0.0, 1.0]`.<br><br>.. versionadded:: 0.22</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('monotonic_cst',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.RandomForestRegressor.html#:~:text=monotonic_cst,-array-like%20of%20int%20of%20shape%20%28n_features%29%2C%20default%3DNone">
                monotonic_cst
                <span class="param-doc-description">monotonic_cst: array-like of int of shape (n_features), default=None<br><br>Indicates the monotonicity constraint to enforce on each feature.<br>  - 1: monotonically increasing<br>  - 0: no constraint<br>  - -1: monotonically decreasing<br><br>If monotonic_cst is None, no constraints are applied.<br><br>Monotonicity constraints are not supported for:<br>  - multioutput regressions (i.e. when `n_outputs_ > 1`),<br>  - regressions trained on data with missing values.<br><br>Read more in the :ref:`User Guide <monotonic_cst_gbdt>`.<br><br>.. versionadded:: 1.4</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div></div></div></div></div></div></div></div><script>function copyToClipboard(text, element) {
        // Get the parameter prefix from the closest toggleable content
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const fullParamName = paramPrefix ? `${paramPrefix}${text}` : text;

        const originalStyle = element.style;
        const computedStyle = window.getComputedStyle(element);
        const originalWidth = computedStyle.width;
        const originalHTML = element.innerHTML.replace('Copied!', '');

        navigator.clipboard.writeText(fullParamName)
            .then(() => {
                element.style.width = originalWidth;
                element.style.color = 'green';
                element.innerHTML = "Copied!";

                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            })
            .catch(err => {
                console.error('Failed to copy:', err);
                element.style.color = 'red';
                element.innerHTML = "Failed!";
                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            });
        return false;
    }

    document.querySelectorAll('.copy-paste-icon').forEach(function(element) {
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const paramName = element.parentElement.nextElementSibling
            .textContent.trim().split(' ')[0];
        const fullParamName = paramPrefix ? `${paramPrefix}${paramName}` : paramName;

        element.setAttribute('title', fullParamName);
    });


    /**
     * Adapted from Skrub
     * https://github.com/skrub-data/skrub/blob/403466d1d5d4dc76a7ef569b3f8228db59a31dc3/skrub/_reporting/_data/templates/report.js#L789
     * @returns "light" or "dark"
     */
    function detectTheme(element) {
        const body = document.querySelector('body');

        // Check VSCode theme
        const themeKindAttr = body.getAttribute('data-vscode-theme-kind');
        const themeNameAttr = body.getAttribute('data-vscode-theme-name');

        if (themeKindAttr && themeNameAttr) {
            const themeKind = themeKindAttr.toLowerCase();
            const themeName = themeNameAttr.toLowerCase();

            if (themeKind.includes("dark") || themeName.includes("dark")) {
                return "dark";
            }
            if (themeKind.includes("light") || themeName.includes("light")) {
                return "light";
            }
        }

        // Check Jupyter theme
        if (body.getAttribute('data-jp-theme-light') === 'false') {
            return 'dark';
        } else if (body.getAttribute('data-jp-theme-light') === 'true') {
            return 'light';
        }

        // Guess based on a parent element's color
        const color = window.getComputedStyle(element.parentNode, null).getPropertyValue('color');
        const match = color.match(/^rgb\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*$/i);
        if (match) {
            const [r, g, b] = [
                parseFloat(match[1]),
                parseFloat(match[2]),
                parseFloat(match[3])
            ];

            // https://en.wikipedia.org/wiki/HSL_and_HSV#Lightness
            const luma = 0.299 * r + 0.587 * g + 0.114 * b;

            if (luma > 180) {
                // If the text is very bright we have a dark theme
                return 'dark';
            }
            if (luma < 75) {
                // If the text is very dark we have a light theme
                return 'light';
            }
            // Otherwise fall back to the next heuristic.
        }

        // Fallback to system preference
        return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
    }


    function forceTheme(elementId) {
        const estimatorElement = document.querySelector(`#${elementId}`);
        if (estimatorElement === null) {
            console.error(`Element with id ${elementId} not found.`);
        } else {
            const theme = detectTheme(estimatorElement);
            estimatorElement.classList.add(theme);
        }
    }

    forceTheme('sk-container-id-16');</script></body>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 1413-1415

.. code-block:: Python

    -get_scorer("neg_root_mean_squared_error")(rf, df_test, y_test)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    42136.85505534634


.. GENERATED FROM PYTHON SOURCE LINES 1416-1418

We can see a nice, though not huge, improvement over using
``LinearRegression``.
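
For reference, the value above is the root mean squared error (RMSE) on the
test set. Scikit-learn scorers follow a "greater is better" convention, so the
``neg_root_mean_squared_error`` scorer returns the negated RMSE and we flip the
sign back. The following minimal sketch illustrates this on synthetic data; the
``*_demo`` names and the ``make_regression`` data are assumptions made for
illustration and are not part of the exercise:

.. code-block:: Python

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import get_scorer, mean_squared_error
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in data (hypothetical, not the housing dataset)
    X_demo, y_demo = make_regression(
        n_samples=200, n_features=5, noise=10.0, random_state=0
    )
    X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=0)
    model_demo = LinearRegression().fit(X_tr, y_tr)

    # The scorer returns the negated RMSE, so negate it to get the RMSE
    scorer = get_scorer("neg_root_mean_squared_error")
    rmse_from_scorer = -scorer(model_demo, X_te, y_te)

    # The same quantity computed by hand from the predictions
    rmse_manual = np.sqrt(mean_squared_error(y_te, model_demo.predict(X_te)))

    print(np.isclose(rmse_from_scorer, rmse_manual))

Both computations should agree, which is why negating the scorer output yields
an error in the same units as the target.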

.. GENERATED FROM PYTHON SOURCE LINES 1420-1422

Gradient boosted decision trees
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 1424-1425

Finally, let’s use scikit-learn’s ``GradientBoostingRegressor`` as the final
estimator:

.. GENERATED FROM PYTHON SOURCE LINES 1427-1433

.. code-block:: Python

    gb = StackingRegressor(
        estimators=knn_regressor,
        final_estimator=GradientBoostingRegressor(n_estimators=100, random_state=0),
        passthrough=True,
    )
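
To make the role of ``passthrough=True`` more concrete: with this option, the
final estimator is trained on the predictions of the base estimators together
with the original input features. Below is a small sketch on synthetic data,
where the ``*_demo`` names, the data, and the hyperparameters are illustrative
assumptions rather than the setup of this exercise:

.. code-block:: Python

    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
    from sklearn.neighbors import KNeighborsRegressor

    # Synthetic stand-in data with 5 features (hypothetical)
    X_demo, y_demo = make_regression(
        n_samples=200, n_features=5, noise=10.0, random_state=0
    )

    stack_demo = StackingRegressor(
        estimators=[("knn", KNeighborsRegressor(n_neighbors=5))],
        final_estimator=GradientBoostingRegressor(n_estimators=20, random_state=0),
        passthrough=True,
    )
    stack_demo.fit(X_demo, y_demo)

    # transform() returns the inputs of the final estimator: one column of
    # KNN predictions plus the 5 original features
    meta_features = stack_demo.transform(X_demo)
    print(meta_features.shape)

With ``passthrough=False``, the final estimator would only receive the single
column of KNN predictions.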


.. GENERATED FROM PYTHON SOURCE LINES 1434-1436

.. code-block:: Python

    gb.fit(df_train, y_train)


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-container-id-17 {
      /* Definition of color scheme common for light and dark mode */
      --sklearn-color-text: #000;
      --sklearn-color-text-muted: #666;
      --sklearn-color-line: gray;
      /* Definition of color scheme for unfitted estimators */
      --sklearn-color-unfitted-level-0: #fff5e6;
      --sklearn-color-unfitted-level-1: #f6e4d2;
      --sklearn-color-unfitted-level-2: #ffe0b3;
      --sklearn-color-unfitted-level-3: chocolate;
      /* Definition of color scheme for fitted estimators */
      --sklearn-color-fitted-level-0: #f0f8ff;
      --sklearn-color-fitted-level-1: #d4ebff;
      --sklearn-color-fitted-level-2: #b3dbfd;
      --sklearn-color-fitted-level-3: cornflowerblue;
    }

    #sk-container-id-17.light {
      /* Specific color for light theme */
      --sklearn-color-text-on-default-background: black;
      --sklearn-color-background: white;
      --sklearn-color-border-box: black;
      --sklearn-color-icon: #696969;
    }

    #sk-container-id-17.dark {
      --sklearn-color-text-on-default-background: white;
      --sklearn-color-background: #111;
      --sklearn-color-border-box: white;
      --sklearn-color-icon: #878787;
    }

    #sk-container-id-17 {
      color: var(--sklearn-color-text);
    }

    #sk-container-id-17 pre {
      padding: 0;
    }

    #sk-container-id-17 input.sk-hidden--visually {
      border: 0;
      clip: rect(1px 1px 1px 1px);
      clip: rect(1px, 1px, 1px, 1px);
      height: 1px;
      margin: -1px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1px;
    }

    #sk-container-id-17 div.sk-dashed-wrapped {
      border: 1px dashed var(--sklearn-color-line);
      margin: 0 0.4em 0.5em 0.4em;
      box-sizing: border-box;
      padding-bottom: 0.4em;
      background-color: var(--sklearn-color-background);
    }

    #sk-container-id-17 div.sk-container {
      /* jupyter's `normalize.less` sets `[hidden] { display: none; }`
         but bootstrap.min.css set `[hidden] { display: none !important; }`
         so we also need the `!important` here to be able to override the
         default hidden behavior on the sphinx rendered scikit-learn.org.
         See: https://github.com/scikit-learn/scikit-learn/issues/21755 */
      display: inline-block !important;
      position: relative;
    }

    #sk-container-id-17 div.sk-text-repr-fallback {
      display: none;
    }

    div.sk-parallel-item,
    div.sk-serial,
    div.sk-item {
      /* draw centered vertical line to link estimators */
      background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));
      background-size: 2px 100%;
      background-repeat: no-repeat;
      background-position: center center;
    }

    /* Parallel-specific style estimator block */

    #sk-container-id-17 div.sk-parallel-item::after {
      content: "";
      width: 100%;
      border-bottom: 2px solid var(--sklearn-color-text-on-default-background);
      flex-grow: 1;
    }

    #sk-container-id-17 div.sk-parallel {
      display: flex;
      align-items: stretch;
      justify-content: center;
      background-color: var(--sklearn-color-background);
      position: relative;
    }

    #sk-container-id-17 div.sk-parallel-item {
      display: flex;
      flex-direction: column;
    }

    #sk-container-id-17 div.sk-parallel-item:first-child::after {
      align-self: flex-end;
      width: 50%;
    }

    #sk-container-id-17 div.sk-parallel-item:last-child::after {
      align-self: flex-start;
      width: 50%;
    }

    #sk-container-id-17 div.sk-parallel-item:only-child::after {
      width: 0;
    }

    /* Serial-specific style estimator block */

    #sk-container-id-17 div.sk-serial {
      display: flex;
      flex-direction: column;
      align-items: center;
      background-color: var(--sklearn-color-background);
      padding-right: 1em;
      padding-left: 1em;
    }


    /* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is
    clickable and can be expanded/collapsed.
    - Pipeline and ColumnTransformer use this feature and define the default style
    - Estimators will overwrite some part of the style using the `sk-estimator` class
    */

    /* Pipeline and ColumnTransformer style (default) */

    #sk-container-id-17 div.sk-toggleable {
      /* Default theme specific background. It is overwritten whether we have a
      specific estimator or a Pipeline/ColumnTransformer */
      background-color: var(--sklearn-color-background);
    }

    /* Toggleable label */
    #sk-container-id-17 label.sk-toggleable__label {
      cursor: pointer;
      display: flex;
      width: 100%;
      margin-bottom: 0;
      padding: 0.5em;
      box-sizing: border-box;
      text-align: center;
      align-items: center;
      justify-content: center;
      gap: 0.5em;
    }

    #sk-container-id-17 label.sk-toggleable__label .caption {
      font-size: 0.6rem;
      font-weight: lighter;
      color: var(--sklearn-color-text-muted);
    }

    #sk-container-id-17 label.sk-toggleable__label-arrow:before {
      /* Arrow on the left of the label */
      content: "▸";
      float: left;
      margin-right: 0.25em;
      color: var(--sklearn-color-icon);
    }

    #sk-container-id-17 label.sk-toggleable__label-arrow:hover:before {
      color: var(--sklearn-color-text);
    }

    /* Toggleable content - dropdown */

    #sk-container-id-17 div.sk-toggleable__content {
      display: none;
      text-align: left;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-17 div.sk-toggleable__content.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-17 div.sk-toggleable__content pre {
      margin: 0.2em;
      border-radius: 0.25em;
      color: var(--sklearn-color-text);
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-17 div.sk-toggleable__content.fitted pre {
      /* unfitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-17 input.sk-toggleable__control:checked~div.sk-toggleable__content {
      /* Expand drop-down */
      display: block;
      width: 100%;
      overflow: visible;
    }

    #sk-container-id-17 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {
      content: "▾";
    }

    /* Pipeline/ColumnTransformer-specific style */

    #sk-container-id-17 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-17 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator-specific style */

    /* Colorize estimator box */
    #sk-container-id-17 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-17 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    #sk-container-id-17 div.sk-label label.sk-toggleable__label,
    #sk-container-id-17 div.sk-label label {
      /* The background is the default theme color */
      color: var(--sklearn-color-text-on-default-background);
    }

    /* On hover, darken the color of the background */
    #sk-container-id-17 div.sk-label:hover label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    /* Label box, darken color on hover, fitted */
    #sk-container-id-17 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator label */

    #sk-container-id-17 div.sk-label label {
      font-family: monospace;
      font-weight: bold;
      line-height: 1.2em;
    }

    #sk-container-id-17 div.sk-label-container {
      text-align: center;
    }

    /* Estimator-specific */
    #sk-container-id-17 div.sk-estimator {
      font-family: monospace;
      border: 1px dotted var(--sklearn-color-border-box);
      border-radius: 0.25em;
      box-sizing: border-box;
      margin-bottom: 0.5em;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-17 div.sk-estimator.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    /* on hover */
    #sk-container-id-17 div.sk-estimator:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-17 div.sk-estimator.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Specification for estimator info (e.g. "i" and "?") */

    /* Common style for "i" and "?" */

    .sk-estimator-doc-link,
    a:link.sk-estimator-doc-link,
    a:visited.sk-estimator-doc-link {
      float: right;
      font-size: smaller;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1em;
      height: 1em;
      width: 1em;
      text-decoration: none !important;
      margin-left: 0.5em;
      text-align: center;
      /* unfitted */
      border: var(--sklearn-color-unfitted-level-3) 1pt solid;
      color: var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted,
    a:link.sk-estimator-doc-link.fitted,
    a:visited.sk-estimator-doc-link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3) 1pt solid;
      color: var(--sklearn-color-fitted-level-3);
    }

    /* On hover */
    div.sk-estimator:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover,
    div.sk-label-container:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-unfitted-level-0);
      text-decoration: none;
    }

    div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover,
    div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-fitted-level-0);
      text-decoration: none;
    }

    /* Span, style for the box shown on hovering the info icon */
    .sk-estimator-doc-link span {
      display: none;
      z-index: 9999;
      position: relative;
      font-weight: normal;
      right: .2ex;
      padding: .5ex;
      margin: .5ex;
      width: min-content;
      min-width: 20ex;
      max-width: 50ex;
      color: var(--sklearn-color-text);
      box-shadow: 2pt 2pt 4pt #999;
      /* unfitted */
      background: var(--sklearn-color-unfitted-level-0);
      border: .5pt solid var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted span {
      /* fitted */
      background: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3);
    }

    .sk-estimator-doc-link:hover span {
      display: block;
    }

    /* "?"-specific style due to the `<a>` HTML tag */

    #sk-container-id-17 a.estimator_doc_link {
      float: right;
      font-size: 1rem;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1rem;
      height: 1rem;
      width: 1rem;
      text-decoration: none;
      /* unfitted */
      color: var(--sklearn-color-unfitted-level-1);
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
    }

    #sk-container-id-17 a.estimator_doc_link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    #sk-container-id-17 a.estimator_doc_link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    #sk-container-id-17 a.estimator_doc_link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
    }

    .estimator-table {
        font-family: monospace;
    }

    .estimator-table summary {
        padding: .5rem;
        cursor: pointer;
    }

    .estimator-table summary::marker {
        font-size: 0.7rem;
    }

    .estimator-table details[open] {
        padding-left: 0.1rem;
        padding-right: 0.1rem;
        padding-bottom: 0.3rem;
    }

    .estimator-table .parameters-table {
        margin-left: auto !important;
        margin-right: auto !important;
        margin-top: 0;
    }

    .estimator-table .parameters-table tr:nth-child(odd) {
        background-color: #fff;
    }

    .estimator-table .parameters-table tr:nth-child(even) {
        background-color: #f6f6f6;
    }

    .estimator-table .parameters-table tr:hover {
        background-color: #e0e0e0;
    }

    .estimator-table table td {
        border: 1px solid rgba(106, 105, 104, 0.232);
    }

    /*
        `table td`is set in notebook with right text-align.
        We need to overwrite it.
    */
    .estimator-table table td.param {
        text-align: left;
        position: relative;
        padding: 0;
    }

    .user-set td {
        color:rgb(255, 94, 0);
        text-align: left !important;
    }

    .user-set td.value {
        color:rgb(255, 94, 0);
        background-color: transparent;
    }

    .default td {
        color: black;
        text-align: left !important;
    }

    .user-set td i,
    .default td i {
        color: black;
    }

    /*
        Styles for parameter documentation links
        We need styling for visited so jupyter doesn't overwrite it
    */
    a.param-doc-link,
    a.param-doc-link:link,
    a.param-doc-link:visited {
        text-decoration: underline dashed;
        text-underline-offset: .3em;
        color: inherit;
        display: block;
        padding: .5em;
    }

    /* "hack" to make the entire area of the cell containing the link clickable */
    a.param-doc-link::before {
        position: absolute;
        content: "";
        inset: 0;
    }

    .param-doc-description {
        display: none;
        position: absolute;
        z-index: 9999;
        left: 0;
        padding: .5ex;
        margin-left: 1.5em;
        color: var(--sklearn-color-text);
        box-shadow: .3em .3em .4em #999;
        width: max-content;
        text-align: left;
        max-height: 10em;
        overflow-y: auto;

        /* unfitted */
        background: var(--sklearn-color-unfitted-level-0);
        border: thin solid var(--sklearn-color-unfitted-level-3);
    }

    /* Fitted state for parameter tooltips */
    .fitted .param-doc-description {
        /* fitted */
        background: var(--sklearn-color-fitted-level-0);
        border: thin solid var(--sklearn-color-fitted-level-3);
    }

    .param-doc-link:hover .param-doc-description {
        display: block;
    }

    .copy-paste-icon {
        background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCA0NDggNTEyIj48IS0tIUZvbnQgQXdlc29tZSBGcmVlIDYuNy4yIGJ5IEBmb250YXdlc29tZSAtIGh0dHBzOi8vZm9udGF3ZXNvbWUuY29tIExpY2Vuc2UgLSBodHRwczovL2ZvbnRhd2Vzb21lLmNvbS9saWNlbnNlL2ZyZWUgQ29weXJpZ2h0IDIwMjUgRm9udGljb25zLCBJbmMuLS0+PHBhdGggZD0iTTIwOCAwTDMzMi4xIDBjMTIuNyAwIDI0LjkgNS4xIDMzLjkgMTQuMWw2Ny45IDY3LjljOSA5IDE0LjEgMjEuMiAxNC4xIDMzLjlMNDQ4IDMzNmMwIDI2LjUtMjEuNSA0OC00OCA0OGwtMTkyIDBjLTI2LjUgMC00OC0yMS41LTQ4LTQ4bDAtMjg4YzAtMjYuNSAyMS41LTQ4IDQ4LTQ4ek00OCAxMjhsODAgMCAwIDY0LTY0IDAgMCAyNTYgMTkyIDAgMC0zMiA2NCAwIDAgNDhjMCAyNi41LTIxLjUgNDgtNDggNDhMNDggNTEyYy0yNi41IDAtNDgtMjEuNS00OC00OEwwIDE3NmMwLTI2LjUgMjEuNS00OCA0OC00OHoiLz48L3N2Zz4=);
        background-repeat: no-repeat;
        background-size: 14px 14px;
        background-position: 0;
        display: inline-block;
        width: 14px;
        height: 14px;
        cursor: pointer;
    }
    </style><body><div id="sk-container-id-17" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>StackingRegressor(estimators=[(&#x27;knn@5&#x27;,
                                   Pipeline(steps=[(&#x27;select_cols&#x27;,
                                                    ColumnTransformer(transformers=[(&#x27;long_and_lat&#x27;,
                                                                                     &#x27;passthrough&#x27;,
                                                                                     [&#x27;Longitude&#x27;,
                                                                                      &#x27;Latitude&#x27;])])),
                                                   (&#x27;knn&#x27;,
                                                    KNeighborsRegressor())]))],
                      final_estimator=GradientBoostingRegressor(random_state=0),
                      passthrough=True)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-43" type="checkbox" ><label for="sk-estimator-id-43" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>StackingRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html">?<span>Documentation for StackingRegressor</span></a><span class="sk-estimator-doc-link fitted">i<span>Fitted</span></span></div></label><div class="sk-toggleable__content fitted" data-param-prefix="">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('estimators',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=estimators,-list%20of%20%28str%2C%20estimator%29">
                estimators
                <span class="param-doc-description">estimators: list of (str, estimator)<br><br>Base estimators which will be stacked together. Each element of the<br>list is defined as a tuple of string (i.e. name) and an estimator<br>instance. An estimator can be set to 'drop' using `set_params`.</span>
            </a>
        </td>
                <td class="value">[(&#x27;knn@5&#x27;, ...)]</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('final_estimator',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=final_estimator,-estimator%2C%20default%3DNone">
                final_estimator
                <span class="param-doc-description">final_estimator: estimator, default=None<br><br>A regressor which will be used to combine the base estimators.<br>The default regressor is a :class:`~sklearn.linear_model.RidgeCV`.</span>
            </a>
        </td>
                <td class="value">GradientBoost...andom_state=0)</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('cv',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=cv,-int%2C%20cross-validation%20generator%2C%20iterable%2C%20or%20%22prefit%22%2C%20default%3DNone">
                cv
                <span class="param-doc-description">cv: int, cross-validation generator, iterable, or "prefit", default=None<br><br>Determines the cross-validation splitting strategy used in<br>`cross_val_predict` to train `final_estimator`. Possible inputs for<br>cv are:<br><br>* None, to use the default 5-fold cross validation,<br>* integer, to specify the number of folds in a (Stratified) KFold,<br>* An object to be used as a cross-validation generator,<br>* An iterable yielding train, test splits,<br>* `"prefit"`, to assume the `estimators` are prefit. In this case, the<br>  estimators will not be refitted.<br><br>For integer/None inputs, if the estimator is a classifier and y is<br>either binary or multiclass,<br>:class:`~sklearn.model_selection.StratifiedKFold` is used.<br>In all other cases, :class:`~sklearn.model_selection.KFold` is used.<br>These splitters are instantiated with `shuffle=False` so the splits<br>will be the same across calls.<br><br>Refer :ref:`User Guide <cross_validation>` for the various<br>cross-validation strategies that can be used here.<br><br>If "prefit" is passed, it is assumed that all `estimators` have<br>been fitted already. The `final_estimator_` is trained on the `estimators`<br>predictions on the full training set and are **not** cross validated<br>predictions. Please note that if the models have been trained on the same<br>data to train the stacking model, there is a very high risk of overfitting.<br><br>.. versionadded:: 1.1<br>    The 'prefit' option was added in 1.1<br><br>.. note::<br>   A larger number of split will provide no benefits if the number<br>   of training samples is large enough. Indeed, the training time<br>   will increase. ``cv`` is not used for model evaluation but for<br>   prediction.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of jobs to run in parallel for `fit` of all `estimators`.<br>`None` means 1 unless in a `joblib.parallel_backend` context. -1 means<br>using all processors. See :term:`Glossary <n_jobs>` for more details.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('passthrough',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=passthrough,-bool%2C%20default%3DFalse">
                passthrough
                <span class="param-doc-description">passthrough: bool, default=False<br><br>When False, only the predictions of estimators will be used as<br>training data for `final_estimator`. When True, the<br>`final_estimator` is trained on the predictions as well as the<br>original training data.</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=verbose,-int%2C%20default%3D0">
                verbose
                <span class="param-doc-description">verbose: int, default=0<br><br>Verbosity level.</span>
            </a>
        </td>
                <td class="value">0</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><label>knn@5</label></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-serial"><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-44" type="checkbox" ><label for="sk-estimator-id-44" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>select_cols: ColumnTransformer</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html">?<span>Documentation for select_cols: ColumnTransformer</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__select_cols__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('transformers',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=transformers,-list%20of%20tuples">
                transformers
                <span class="param-doc-description">transformers: list of tuples<br><br>List of (name, transformer, columns) tuples specifying the<br>transformer objects to be applied to subsets of the data.<br><br>name : str<br>    Like in Pipeline and FeatureUnion, this allows the transformer and<br>    its parameters to be set using ``set_params`` and searched in grid<br>    search.<br>transformer : {'drop', 'passthrough'} or estimator<br>    Estimator must support :term:`fit` and :term:`transform`.<br>    Special-cased strings 'drop' and 'passthrough' are accepted as<br>    well, to indicate to drop the columns or to pass them through<br>    untransformed, respectively.<br>columns :  str, array-like of str, int, array-like of int,                 array-like of bool, slice or callable<br>    Indexes the data on its second axis. Integers are interpreted as<br>    positional columns, while strings can reference DataFrame columns<br>    by name.  A scalar string or int should be used where<br>    ``transformer`` expects X to be a 1d array-like (vector),<br>    otherwise a 2d array will be passed to the transformer.<br>    A callable is passed the input data `X` and can return any of the<br>    above. To select multiple columns by name or dtype, you can use<br>    :obj:`make_column_selector`.</span>
            </a>
        </td>
                <td class="value">[(&#x27;long_and_lat&#x27;, ...)]</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('remainder',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=remainder,-%7B%27drop%27%2C%20%27passthrough%27%7D%20or%20estimator%2C%20default%3D%27drop%27">
                remainder
                <span class="param-doc-description">remainder: {'drop', 'passthrough'} or estimator, default='drop'<br><br>By default, only the specified columns in `transformers` are<br>transformed and combined in the output, and the non-specified<br>columns are dropped. (default of ``'drop'``).<br>By specifying ``remainder='passthrough'``, all remaining columns that<br>were not specified in `transformers`, but present in the data passed<br>to `fit` will be automatically passed through. This subset of columns<br>is concatenated with the output of the transformers. For dataframes,<br>extra columns not seen during `fit` will be excluded from the output<br>of `transform`.<br>By setting ``remainder`` to be an estimator, the remaining<br>non-specified columns will use the ``remainder`` estimator. The<br>estimator must support :term:`fit` and :term:`transform`.<br>Note that using this feature requires that the DataFrame columns<br>input at :term:`fit` and :term:`transform` have identical order.</span>
            </a>
        </td>
                <td class="value">&#x27;drop&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('sparse_threshold',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=sparse_threshold,-float%2C%20default%3D0.3">
                sparse_threshold
                <span class="param-doc-description">sparse_threshold: float, default=0.3<br><br>If the output of the different transformers contains sparse matrices,<br>these will be stacked as a sparse matrix if the overall density is<br>lower than this value. Use ``sparse_threshold=0`` to always return<br>dense.  When the transformed output consists of all dense data, the<br>stacked result will be dense, and this keyword will be ignored.</span>
            </a>
        </td>
                <td class="value">0.3</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>Number of jobs to run in parallel.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('transformer_weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=transformer_weights,-dict%2C%20default%3DNone">
                transformer_weights
                <span class="param-doc-description">transformer_weights: dict, default=None<br><br>Multiplicative weights for features per transformer. The output of the<br>transformer is multiplied by these weights. Keys are transformer names,<br>values the weights.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=verbose,-bool%2C%20default%3DFalse">
                verbose
                <span class="param-doc-description">verbose: bool, default=False<br><br>If True, the time elapsed while fitting each transformer will be<br>printed as it is completed.</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose_feature_names_out',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=verbose_feature_names_out,-bool%2C%20str%20or%20Callable%5B%5Bstr%2C%20str%5D%2C%20str%5D%2C%20default%3DTrue">
                verbose_feature_names_out
                <span class="param-doc-description">verbose_feature_names_out: bool, str or Callable[[str, str], str], default=True<br><br>- If True, :meth:`ColumnTransformer.get_feature_names_out` will prefix<br>  all feature names with the name of the transformer that generated that<br>  feature. It is equivalent to setting<br>  `verbose_feature_names_out="{transformer_name}__{feature_name}"`.<br>- If False, :meth:`ColumnTransformer.get_feature_names_out` will not<br>  prefix any feature names and will error if feature names are not<br>  unique.<br>- If ``Callable[[str, str], str]``,<br>  :meth:`ColumnTransformer.get_feature_names_out` will rename all the features<br>  using the name of the transformer. The first argument of the callable is the<br>  transformer name and the second argument is the feature name. The returned<br>  string will be the new feature name.<br>- If ``str``, it must be a string ready for formatting. The given string will<br>  be formatted using two field names: ``transformer_name`` and ``feature_name``.<br>  e.g. ``"{feature_name}__{transformer_name}"``. See :meth:`str.format` method<br>  from the standard library for more info.<br><br>.. versionadded:: 1.0<br><br>.. versionchanged:: 1.6<br>    `verbose_feature_names_out` can be a callable or a string to be formatted.</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('force_int_remainder_cols',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=force_int_remainder_cols,-bool%2C%20default%3DFalse">
                force_int_remainder_cols
                <span class="param-doc-description">force_int_remainder_cols: bool, default=False<br><br>This parameter has no effect.<br><br>.. note::<br>    If you do not access the list of columns for the remainder columns<br>    in the `transformers_` fitted attribute, you do not need to set<br>    this parameter.<br><br>.. versionadded:: 1.5<br><br>.. versionchanged:: 1.7<br>   The default value for `force_int_remainder_cols` changed from<br>   `True` to `False` in version 1.7.<br><br>.. deprecated:: 1.7<br>   `force_int_remainder_cols` is deprecated and will be removed in 1.9.</span>
            </a>
        </td>
                <td class="value">&#x27;deprecated&#x27;</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-45" type="checkbox" ><label for="sk-estimator-id-45" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>long_and_lat</div></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__select_cols__long_and_lat__"><pre>[&#x27;Longitude&#x27;, &#x27;Latitude&#x27;]</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-46" type="checkbox" ><label for="sk-estimator-id-46" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>passthrough</div></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__select_cols__long_and_lat__"><pre>passthrough</pre></div></div></div></div></div></div></div></div><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-47" type="checkbox" ><label for="sk-estimator-id-47" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>KNeighborsRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html">?<span>Documentation for KNeighborsRegressor</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__knn__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_neighbors',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_neighbors,-int%2C%20default%3D5">
                n_neighbors
                <span class="param-doc-description">n_neighbors: int, default=5<br><br>Number of neighbors to use by default for :meth:`kneighbors` queries.</span>
            </a>
        </td>
                <td class="value">5</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=weights,-%7B%27uniform%27%2C%20%27distance%27%7D%2C%20callable%20or%20None%2C%20default%3D%27uniform%27">
                weights
                <span class="param-doc-description">weights: {'uniform', 'distance'}, callable or None, default='uniform'<br><br>Weight function used in prediction.  Possible values:<br><br>- 'uniform' : uniform weights.  All points in each neighborhood<br>  are weighted equally.<br>- 'distance' : weight points by the inverse of their distance.<br>  In this case, closer neighbors of a query point will have a<br>  greater influence than neighbors which are further away.<br>- [callable] : a user-defined function which accepts an<br>  array of distances, and returns an array of the same shape<br>  containing the weights.<br><br>Uniform weights are used by default.<br><br>See the following example for a demonstration of the impact of<br>different weighting schemes on predictions:<br>:ref:`sphx_glr_auto_examples_neighbors_plot_regression.py`.</span>
            </a>
        </td>
                <td class="value">&#x27;uniform&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('algorithm',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=algorithm,-%7B%27auto%27%2C%20%27ball_tree%27%2C%20%27kd_tree%27%2C%20%27brute%27%7D%2C%20default%3D%27auto%27">
                algorithm
                <span class="param-doc-description">algorithm: {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'<br><br>Algorithm used to compute the nearest neighbors:<br><br>- 'ball_tree' will use :class:`BallTree`<br>- 'kd_tree' will use :class:`KDTree`<br>- 'brute' will use a brute-force search.<br>- 'auto' will attempt to decide the most appropriate algorithm<br>  based on the values passed to :meth:`fit` method.<br><br>Note: fitting on sparse input will override the setting of<br>this parameter, using brute force.</span>
            </a>
        </td>
                <td class="value">&#x27;auto&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('leaf_size',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=leaf_size,-int%2C%20default%3D30">
                leaf_size
                <span class="param-doc-description">leaf_size: int, default=30<br><br>Leaf size passed to BallTree or KDTree.  This can affect the<br>speed of the construction and query, as well as the memory<br>required to store the tree.  The optimal value depends on the<br>nature of the problem.</span>
            </a>
        </td>
                <td class="value">30</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('p',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=p,-float%2C%20default%3D2">
                p
                <span class="param-doc-description">p: float, default=2<br><br>Power parameter for the Minkowski metric. When p = 1, this is<br>equivalent to using manhattan_distance (l1), and euclidean_distance<br>(l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.</span>
            </a>
        </td>
                <td class="value">2</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric,-str%2C%20DistanceMetric%20object%20or%20callable%2C%20default%3D%27minkowski%27">
                metric
                <span class="param-doc-description">metric: str, DistanceMetric object or callable, default='minkowski'<br><br>Metric to use for distance computation. Default is "minkowski", which<br>results in the standard Euclidean distance when p = 2. See the<br>documentation of `scipy.spatial.distance<br><https://docs.scipy.org/doc/scipy/reference/spatial.distance.html>`_ and<br>the metrics listed in<br>:class:`~sklearn.metrics.pairwise.distance_metrics` for valid metric<br>values.<br><br>If metric is "precomputed", X is assumed to be a distance matrix and<br>must be square during fit. X may be a :term:`sparse graph`, in which<br>case only "nonzero" elements may be considered neighbors.<br><br>If metric is a callable function, it takes two arrays representing 1D<br>vectors as inputs and must return one value indicating the distance<br>between those vectors. This works for Scipy's metrics, but is less<br>efficient than passing the metric name as a string.<br><br>If metric is a DistanceMetric object, it will be passed directly to<br>the underlying computation routines.</span>
            </a>
        </td>
                <td class="value">&#x27;minkowski&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric_params',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric_params,-dict%2C%20default%3DNone">
                metric_params
                <span class="param-doc-description">metric_params: dict, default=None<br><br>Additional keyword arguments for the metric function.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of parallel jobs to run for neighbors search.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.<br>Doesn't affect :meth:`fit` method.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div></div></div></div></div></div><div class="sk-item"><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><label>final_estimator</label></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-48" type="checkbox" ><label for="sk-estimator-id-48" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>GradientBoostingRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html">?<span>Documentation for GradientBoostingRegressor</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="final_estimator__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('loss',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=loss,-%7B%27squared_error%27%2C%20%27absolute_error%27%2C%20%27huber%27%2C%20%27quantile%27%7D%2C%20%20%20%20%20%20%20%20%20%20%20%20%20default%3D%27squared_error%27">
                loss
                <span class="param-doc-description">loss: {'squared_error', 'absolute_error', 'huber', 'quantile'}, default='squared_error'<br><br>Loss function to be optimized. 'squared_error' refers to the squared<br>error for regression. 'absolute_error' refers to the absolute error of<br>regression and is a robust loss function. 'huber' is a<br>combination of the two. 'quantile' allows quantile regression (use<br>`alpha` to specify the quantile).<br>See<br>:ref:`sphx_glr_auto_examples_ensemble_plot_gradient_boosting_quantile.py`<br>for an example that demonstrates quantile regression for creating<br>prediction intervals with `loss='quantile'`.</span>
            </a>
        </td>
                <td class="value">&#x27;squared_error&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('learning_rate',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=learning_rate,-float%2C%20default%3D0.1">
                learning_rate
                <span class="param-doc-description">learning_rate: float, default=0.1<br><br>Learning rate shrinks the contribution of each tree by `learning_rate`.<br>There is a trade-off between learning_rate and n_estimators.<br>Values must be in the range `[0.0, inf)`.</span>
            </a>
        </td>
                <td class="value">0.1</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_estimators',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=n_estimators,-int%2C%20default%3D100">
                n_estimators
                <span class="param-doc-description">n_estimators: int, default=100<br><br>The number of boosting stages to perform. Gradient boosting<br>is fairly robust to over-fitting so a large number usually<br>results in better performance.<br>Values must be in the range `[1, inf)`.</span>
            </a>
        </td>
                <td class="value">100</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('subsample',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=subsample,-float%2C%20default%3D1.0">
                subsample
                <span class="param-doc-description">subsample: float, default=1.0<br><br>The fraction of samples to be used for fitting the individual base<br>learners. If smaller than 1.0 this results in Stochastic Gradient<br>Boosting. `subsample` interacts with the parameter `n_estimators`.<br>Choosing `subsample < 1.0` leads to a reduction of variance<br>and an increase in bias.<br>Values must be in the range `(0.0, 1.0]`.</span>
            </a>
        </td>
                <td class="value">1.0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('criterion',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=criterion,-%7B%27friedman_mse%27%2C%20%27squared_error%27%7D%2C%20default%3D%27friedman_mse%27">
                criterion
                <span class="param-doc-description">criterion: {'friedman_mse', 'squared_error'}, default='friedman_mse'<br><br>The function to measure the quality of a split. Supported criteria are<br>"friedman_mse" for the mean squared error with improvement score by<br>Friedman, "squared_error" for mean squared error. The default value of<br>"friedman_mse" is generally the best as it can provide a better<br>approximation in some cases.<br><br>.. versionadded:: 0.18</span>
            </a>
        </td>
                <td class="value">&#x27;friedman_mse&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('min_samples_split',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=min_samples_split,-int%20or%20float%2C%20default%3D2">
                min_samples_split
                <span class="param-doc-description">min_samples_split: int or float, default=2<br><br>The minimum number of samples required to split an internal node:<br><br>- If int, values must be in the range `[2, inf)`.<br>- If float, values must be in the range `(0.0, 1.0]` and `min_samples_split`<br>  will be `ceil(min_samples_split * n_samples)`.<br><br>.. versionchanged:: 0.18<br>   Added float values for fractions.</span>
            </a>
        </td>
                <td class="value">2</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('min_samples_leaf',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=min_samples_leaf,-int%20or%20float%2C%20default%3D1">
                min_samples_leaf
                <span class="param-doc-description">min_samples_leaf: int or float, default=1<br><br>The minimum number of samples required to be at a leaf node.<br>A split point at any depth will only be considered if it leaves at<br>least ``min_samples_leaf`` training samples in each of the left and<br>right branches.  This may have the effect of smoothing the model,<br>especially in regression.<br><br>- If int, values must be in the range `[1, inf)`.<br>- If float, values must be in the range `(0.0, 1.0)` and `min_samples_leaf`<br>  will be `ceil(min_samples_leaf * n_samples)`.<br><br>.. versionchanged:: 0.18<br>   Added float values for fractions.</span>
            </a>
        </td>
                <td class="value">1</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('min_weight_fraction_leaf',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=min_weight_fraction_leaf,-float%2C%20default%3D0.0">
                min_weight_fraction_leaf
                <span class="param-doc-description">min_weight_fraction_leaf: float, default=0.0<br><br>The minimum weighted fraction of the sum total of weights (of all<br>the input samples) required to be at a leaf node. Samples have<br>equal weight when sample_weight is not provided.<br>Values must be in the range `[0.0, 0.5]`.</span>
            </a>
        </td>
                <td class="value">0.0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('max_depth',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=max_depth,-int%20or%20None%2C%20default%3D3">
                max_depth
                <span class="param-doc-description">max_depth: int or None, default=3<br><br>Maximum depth of the individual regression estimators. The maximum<br>depth limits the number of nodes in the tree. Tune this parameter<br>for best performance; the best value depends on the interaction<br>of the input variables. If None, then nodes are expanded until<br>all leaves are pure or until all leaves contain fewer than<br>min_samples_split samples.<br>If int, values must be in the range `[1, inf)`.</span>
            </a>
        </td>
                <td class="value">3</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('min_impurity_decrease',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=min_impurity_decrease,-float%2C%20default%3D0.0">
                min_impurity_decrease
                <span class="param-doc-description">min_impurity_decrease: float, default=0.0<br><br>A node will be split if this split induces a decrease of the impurity<br>greater than or equal to this value.<br>Values must be in the range `[0.0, inf)`.<br><br>The weighted impurity decrease equation is the following::<br><br>    N_t / N * (impurity - N_t_R / N_t * right_impurity<br>                        - N_t_L / N_t * left_impurity)<br><br>where ``N`` is the total number of samples, ``N_t`` is the number of<br>samples at the current node, ``N_t_L`` is the number of samples in the<br>left child, and ``N_t_R`` is the number of samples in the right child.<br><br>``N``, ``N_t``, ``N_t_R`` and ``N_t_L`` all refer to the weighted sum,<br>if ``sample_weight`` is passed.<br><br>.. versionadded:: 0.19</span>
            </a>
        </td>
                <td class="value">0.0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('init',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=init,-estimator%20or%20%27zero%27%2C%20default%3DNone">
                init
                <span class="param-doc-description">init: estimator or 'zero', default=None<br><br>An estimator object that is used to compute the initial predictions.<br>``init`` has to provide :term:`fit` and :term:`predict`. If 'zero', the<br>initial raw predictions are set to zero. By default a<br>``DummyRegressor`` is used, predicting either the average target value<br>(for loss='squared_error'), or a quantile for the other losses.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('random_state',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=random_state,-int%2C%20RandomState%20instance%20or%20None%2C%20default%3DNone">
                random_state
                <span class="param-doc-description">random_state: int, RandomState instance or None, default=None<br><br>Controls the random seed given to each Tree estimator at each<br>boosting iteration.<br>In addition, it controls the random permutation of the features at<br>each split (see Notes for more details).<br>It also controls the random splitting of the training data to obtain a<br>validation set if `n_iter_no_change` is not None.<br>Pass an int for reproducible output across multiple function calls.<br>See :term:`Glossary <random_state>`.</span>
            </a>
        </td>
                <td class="value">0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('max_features',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=max_features,-%7B%27sqrt%27%2C%20%27log2%27%7D%2C%20int%20or%20float%2C%20default%3DNone">
                max_features
                <span class="param-doc-description">max_features: {'sqrt', 'log2'}, int or float, default=None<br><br>The number of features to consider when looking for the best split:<br><br>- If int, values must be in the range `[1, inf)`.<br>- If float, values must be in the range `(0.0, 1.0]` and the features<br>  considered at each split will be `max(1, int(max_features * n_features_in_))`.<br>- If "sqrt", then `max_features=sqrt(n_features)`.<br>- If "log2", then `max_features=log2(n_features)`.<br>- If None, then `max_features=n_features`.<br><br>Choosing `max_features < n_features` leads to a reduction of variance<br>and an increase in bias.<br><br>Note: the search for a split does not stop until at least one<br>valid partition of the node samples is found, even if it requires<br>effectively inspecting more than ``max_features`` features.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('alpha',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=alpha,-float%2C%20default%3D0.9">
                alpha
                <span class="param-doc-description">alpha: float, default=0.9<br><br>The alpha-quantile of the huber loss function and the quantile<br>loss function. Only if ``loss='huber'`` or ``loss='quantile'``.<br>Values must be in the range `(0.0, 1.0)`.</span>
            </a>
        </td>
                <td class="value">0.9</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=verbose,-int%2C%20default%3D0">
                verbose
                <span class="param-doc-description">verbose: int, default=0<br><br>Enable verbose output. If 1 then it prints progress and performance<br>once in a while (the more trees the lower the frequency). If greater<br>than 1 then it prints progress and performance for every tree.<br>Values must be in the range `[0, inf)`.</span>
            </a>
        </td>
                <td class="value">0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('max_leaf_nodes',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=max_leaf_nodes,-int%2C%20default%3DNone">
                max_leaf_nodes
                <span class="param-doc-description">max_leaf_nodes: int, default=None<br><br>Grow trees with ``max_leaf_nodes`` in best-first fashion.<br>Best nodes are defined as relative reduction in impurity.<br>Values must be in the range `[2, inf)`.<br>If None, then unlimited number of leaf nodes.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('warm_start',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=warm_start,-bool%2C%20default%3DFalse">
                warm_start
                <span class="param-doc-description">warm_start: bool, default=False<br><br>When set to ``True``, reuse the solution of the previous call to fit<br>and add more estimators to the ensemble, otherwise, just erase the<br>previous solution. See :term:`the Glossary <warm_start>`.</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('validation_fraction',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=validation_fraction,-float%2C%20default%3D0.1">
                validation_fraction
                <span class="param-doc-description">validation_fraction: float, default=0.1<br><br>The proportion of training data to set aside as validation set for<br>early stopping. Values must be in the range `(0.0, 1.0)`.<br>Only used if ``n_iter_no_change`` is set to an integer.<br><br>.. versionadded:: 0.20</span>
            </a>
        </td>
                <td class="value">0.1</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_iter_no_change',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=n_iter_no_change,-int%2C%20default%3DNone">
                n_iter_no_change
                <span class="param-doc-description">n_iter_no_change: int, default=None<br><br>``n_iter_no_change`` is used to decide if early stopping will be used<br>to terminate training when validation score is not improving. By<br>default it is set to None to disable early stopping. If set to a<br>number, it will set aside ``validation_fraction`` size of the training<br>data as validation and terminate training when validation score is not<br>improving in all of the previous ``n_iter_no_change`` numbers of<br>iterations.<br>Values must be in the range `[1, inf)`.<br>See<br>:ref:`sphx_glr_auto_examples_ensemble_plot_gradient_boosting_early_stopping.py`.<br><br>.. versionadded:: 0.20</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('tol',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=tol,-float%2C%20default%3D1e-4">
                tol
                <span class="param-doc-description">tol: float, default=1e-4<br><br>Tolerance for the early stopping. When the loss is not improving<br>by at least tol for ``n_iter_no_change`` iterations (if set to a<br>number), the training stops.<br>Values must be in the range `[0.0, inf)`.<br><br>.. versionadded:: 0.20</span>
            </a>
        </td>
                <td class="value">0.0001</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('ccp_alpha',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=ccp_alpha,-non-negative%20float%2C%20default%3D0.0">
                ccp_alpha
                <span class="param-doc-description">ccp_alpha: non-negative float, default=0.0<br><br>Complexity parameter used for Minimal Cost-Complexity Pruning. The<br>subtree with the largest cost complexity that is smaller than<br>``ccp_alpha`` will be chosen. By default, no pruning is performed.<br>Values must be in the range `[0.0, inf)`.<br>See :ref:`minimal_cost_complexity_pruning` for details. See<br>:ref:`sphx_glr_auto_examples_tree_plot_cost_complexity_pruning.py`<br>for an example of such pruning.<br><br>.. versionadded:: 0.22</span>
            </a>
        </td>
                <td class="value">0.0</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div></div></div></div></div></div></div></div><script>function copyToClipboard(text, element) {
        // Get the parameter prefix from the closest toggleable content
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const fullParamName = paramPrefix ? `${paramPrefix}${text}` : text;

        const originalStyle = element.style;
        const computedStyle = window.getComputedStyle(element);
        const originalWidth = computedStyle.width;
        const originalHTML = element.innerHTML.replace('Copied!', '');

        navigator.clipboard.writeText(fullParamName)
            .then(() => {
                element.style.width = originalWidth;
                element.style.color = 'green';
                element.innerHTML = "Copied!";

                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            })
            .catch(err => {
                console.error('Failed to copy:', err);
                element.style.color = 'red';
                element.innerHTML = "Failed!";
                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            });
        return false;
    }

    document.querySelectorAll('.copy-paste-icon').forEach(function(element) {
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const paramName = element.parentElement.nextElementSibling
            .textContent.trim().split(' ')[0];
        const fullParamName = paramPrefix ? `${paramPrefix}${paramName}` : paramName;

        element.setAttribute('title', fullParamName);
    });


    /**
     * Adapted from Skrub
     * https://github.com/skrub-data/skrub/blob/403466d1d5d4dc76a7ef569b3f8228db59a31dc3/skrub/_reporting/_data/templates/report.js#L789
     * @returns "light" or "dark"
     */
    function detectTheme(element) {
        const body = document.querySelector('body');

        // Check VSCode theme
        const themeKindAttr = body.getAttribute('data-vscode-theme-kind');
        const themeNameAttr = body.getAttribute('data-vscode-theme-name');

        if (themeKindAttr && themeNameAttr) {
            const themeKind = themeKindAttr.toLowerCase();
            const themeName = themeNameAttr.toLowerCase();

            if (themeKind.includes("dark") || themeName.includes("dark")) {
                return "dark";
            }
            if (themeKind.includes("light") || themeName.includes("light")) {
                return "light";
            }
        }

        // Check Jupyter theme
        if (body.getAttribute('data-jp-theme-light') === 'false') {
            return 'dark';
        } else if (body.getAttribute('data-jp-theme-light') === 'true') {
            return 'light';
        }

        // Guess based on a parent element's color
        const color = window.getComputedStyle(element.parentNode, null).getPropertyValue('color');
        const match = color.match(/^rgb\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*$/i);
        if (match) {
            const [r, g, b] = [
                parseFloat(match[1]),
                parseFloat(match[2]),
                parseFloat(match[3])
            ];

            // https://en.wikipedia.org/wiki/HSL_and_HSV#Lightness
            const luma = 0.299 * r + 0.587 * g + 0.114 * b;

            if (luma > 180) {
                // If the text is very bright we have a dark theme
                return 'dark';
            }
            if (luma < 75) {
                // If the text is very dark we have a light theme
                return 'light';
            }
            // Otherwise fall back to the next heuristic.
        }

        // Fallback to system preference
        return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
    }


    function forceTheme(elementId) {
        const estimatorElement = document.querySelector(`#${elementId}`);
        if (estimatorElement === null) {
            console.error(`Element with id ${elementId} not found.`);
        } else {
            const theme = detectTheme(estimatorElement);
            estimatorElement.classList.add(theme);
        }
    }

    forceTheme('sk-container-id-17');</script></body>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 1437-1439

.. code-block:: Python

    -get_scorer("neg_root_mean_squared_error")(gb, df_test, y_test)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    41826.816445595665


.. GENERATED FROM PYTHON SOURCE LINES 1440-1445

The score is almost identical to that of the ``RandomForestRegressor``, as
is the training time. Other hyper-parameter choices would certainly change
this, but as is, it doesn’t really matter which model we choose. A good
exercise at this point would be to run a hyper-parameter search to determine
which model is truly better.
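
As a rough idea of what such a search could look like, here is a minimal
sketch using ``RandomizedSearchCV``. It is not part of the original exercise:
the data is a synthetic stand-in generated with ``make_regression`` so that
the snippet runs on its own, and the search spaces are illustrative
assumptions rather than tuned recommendations.

.. code-block:: Python

    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
    from sklearn.model_selection import RandomizedSearchCV, train_test_split

    # Synthetic stand-in data; in the exercise, df_train and y_train would be
    # used instead.
    X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Illustrative (assumed) search spaces, kept small so the sketch runs fast.
    candidates = {
        "random_forest": (
            RandomForestRegressor(random_state=0),
            {"n_estimators": [25, 50], "min_samples_leaf": [1, 5, 10]},
        ),
        "gradient_boosting": (
            GradientBoostingRegressor(random_state=0),
            {
                "n_estimators": [25, 50],
                "max_depth": [2, 3, 4],
                "learning_rate": [0.05, 0.1, 0.2],
            },
        ),
    }

    search_results = {}
    for name, (estimator, param_distributions) in candidates.items():
        search = RandomizedSearchCV(
            estimator,
            param_distributions,
            n_iter=4,
            scoring="neg_root_mean_squared_error",
            cv=3,
            random_state=0,
        )
        search.fit(X_train, y_train)
        # best_score_ is the negated RMSE, so flip the sign to report the RMSE
        search_results[name] = (-search.best_score_, search.best_params_)

    for name, (rmse, params) in search_results.items():
        print(f"{name}: cross-validated RMSE = {rmse:.1f} with {params}")

Comparing the cross-validated scores of the best candidates, rather than a
single fit with default settings, gives a fairer picture of which model family
suits the data better.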

.. GENERATED FROM PYTHON SOURCE LINES 1447-1449

Aside: Checking the importance of longitude and latitude as predictive features
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 1451-1456

Remember that earlier on, we formed the hypothesis that tree-based
models would have difficulties making use of the longitude and latitude
features because they require a large number of splits to really be
useful? Even if it’s not strictly necessary, let’s take some time to
check this hypothesis.

.. GENERATED FROM PYTHON SOURCE LINES 1458-1464

To do this, we want to check how much the tree-based model relies on these
features as we change the depth of the trees. To determine the importance of
each feature, we will use the
:func:`sklearn.inspection.permutation_importance` function from scikit-learn.
We can then check whether longitude and latitude become more important when we
increase the depth of the decision trees.

.. GENERATED FROM PYTHON SOURCE LINES 1466-1470

Be aware that we don’t want to use the ``StackingRegressor`` with
the KNN predictions here, because for that model, the longitude and
latitude influence the KNN prediction feature, obfuscating the result.
So let’s use the pure ``GradientBoostingRegressor`` here.

.. GENERATED FROM PYTHON SOURCE LINES 1472-1477

Our approach will be to train 3 models with different values for
``max_depth``. We choose relatively small values here because
gradient boosting typically uses very shallow trees (the default depth
is 3). After training, we calculate the permutation importances of each
feature and create a bar plot of the results:

.. GENERATED FROM PYTHON SOURCE LINES 1479-1495

.. code-block:: Python

    max_depths = [2, 4, 6]
    fig, axes = plt.subplots(1, 3, figsize=(12, 8))
    for md, ax in zip(max_depths, axes):
        gb = GradientBoostingRegressor(max_depth=md, random_state=0)
        gb.fit(df_train, y_train)
        pi = permutation_importance(gb, df_train, y_train, random_state=0)
        score = -get_scorer("neg_root_mean_squared_error")(gb, df_test, y_test)

        ax.barh(df_train.columns, pi["importances_mean"])
        ax.set_xlim([0, 1.5])
        title = f"permutation importances for max_depths={md}\n(test RMSE: {score:.0f})"
        ax.set_title(title)
        if md > max_depths[0]:
            ax.set_yticklabels([])
    plt.tight_layout()


.. image-sg:: /auto_examples/images/sphx_glr_plot_california_housing_011.png
   :alt: permutation importances for max_depths=2 (test RMSE: 54063), permutation importances for max_depths=4 (test RMSE: 47078), permutation importances for max_depths=6 (test RMSE: 44274)
   :srcset: /auto_examples/images/sphx_glr_plot_california_housing_011.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 1496-1504

As we can see, longitude and latitude are not very important for
``max_depth=2``, even being below the importance of
"MedInc". In contrast, they are more important for
``max_depth=4``, with the importance growing even more for
``max_depth=6``. This confirms our hypothesis, though we should be
aware that besides ``max_depth``, other hyper-parameters influence
the effective number of splits we have, so in reality it’s not that
simple.
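
A complementary way to look at this is to count how many splits the fitted
trees actually spend on each feature. The following is only a sketch under
stated assumptions: the data is synthetic (two hypothetical "coordinate"
features with a local hot spot plus one "income"-like feature), and
``count_splits`` is a helper introduced here for illustration, not a
scikit-learn function. It relies on the ``estimators_`` attribute of the
fitted ``GradientBoostingRegressor`` and on ``tree_.feature``, which holds a
negative value for leaf nodes.

.. code-block:: Python

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    rng = np.random.RandomState(0)
    n_samples = 500

    # Hypothetical stand-in data: the target depends linearly on "income" and
    # on a local hot spot in the two coordinate features.
    coords = rng.uniform(0, 1, size=(n_samples, 2))
    income = rng.uniform(0, 1, size=n_samples)
    X = np.column_stack([coords, income])
    hot_spot = np.exp(-20 * ((coords[:, 0] - 0.3) ** 2 + (coords[:, 1] - 0.7) ** 2))
    y = 2.0 * income + hot_spot + rng.normal(scale=0.05, size=n_samples)
    feature_names = ["coord_x", "coord_y", "income"]


    def count_splits(model, n_features):
        """Count how often each feature is used as a split across all trees."""
        counts = np.zeros(n_features, dtype=np.int64)
        for tree in model.estimators_.ravel():
            features = tree.tree_.feature
            used = features[features >= 0]  # negative entries mark leaf nodes
            counts += np.bincount(used, minlength=n_features)
        return counts


    n_estimators = 50
    split_counts = {}
    for max_depth in [2, 4, 6]:
        model = GradientBoostingRegressor(
            max_depth=max_depth, n_estimators=n_estimators, random_state=0
        )
        model.fit(X, y)
        split_counts[max_depth] = count_splits(model, X.shape[1])
        print(max_depth, dict(zip(feature_names, split_counts[max_depth].tolist())))

A tree of depth ``d`` has at most ``2**d - 1`` splits, so the split budget
grows quickly with depth. Note that split counts are not the same thing as
permutation importances; they only show where the model spends its capacity.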

.. GENERATED FROM PYTHON SOURCE LINES 1506-1511

Out of curiosity, we also show the RMSE on the test set for the
individual models. Interestingly, we find that it’s considerably worse
than the scores we got earlier when we included the KNN predictions, not
even beating the linear regression! This is a nice validation that our
KNN feature really helps a lot.

.. GENERATED FROM PYTHON SOURCE LINES 1513-1515

Final model
^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 1517-1523

Let’s settle on a final model for now. We will use gradient boosting
again, only this time using more estimators. In general, gradient
boosting tends to improve as more trees are added. The tradeoff is mostly that
the resulting model will be bigger and slower. We go with 500 trees here
(the default is 100), but ideally we should run a hyper-parameter search
to get the best results.
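
To get a feeling for how the number of trees affects the score without
refitting the model many times, one option is the ``staged_predict`` method,
which yields the predictions of the ensemble after each boosting iteration.
The sketch below is an illustration on synthetic stand-in data (generated with
``make_regression``), not on the housing data, and the name ``gb_demo`` is
introduced here only for this example.

.. code-block:: Python

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in data so that the sketch runs on its own.
    X, y = make_regression(n_samples=500, n_features=8, noise=20.0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    gb_demo = GradientBoostingRegressor(n_estimators=500, random_state=0)
    gb_demo.fit(X_train, y_train)

    # staged_predict yields predictions after 1, 2, ..., n_estimators trees,
    # so a single fit gives the whole RMSE curve.
    test_rmse = np.array(
        [
            np.sqrt(mean_squared_error(y_test, y_pred))
            for y_pred in gb_demo.staged_predict(X_test)
        ]
    )

    for n_trees in [10, 100, 500]:
        print(f"{n_trees:>3} trees: test RMSE = {test_rmse[n_trees - 1]:.1f}")
    print("lowest test RMSE reached at", int(test_rmse.argmin()) + 1, "trees")

If the curve flattens out well before the last tree, a smaller and faster
model may be good enough. Choosing the number of trees this way should be done
on a validation set, not on the final test set.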

.. GENERATED FROM PYTHON SOURCE LINES 1525-1531

.. code-block:: Python

    gb_final = StackingRegressor(
        estimators=knn_regressor,
        final_estimator=GradientBoostingRegressor(n_estimators=500, random_state=0),
        passthrough=True,
    )


.. GENERATED FROM PYTHON SOURCE LINES 1532-1534

.. code-block:: Python

    gb_final.fit(df_train, y_train)


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <style>#sk-container-id-18 {
      /* Definition of color scheme common for light and dark mode */
      --sklearn-color-text: #000;
      --sklearn-color-text-muted: #666;
      --sklearn-color-line: gray;
      /* Definition of color scheme for unfitted estimators */
      --sklearn-color-unfitted-level-0: #fff5e6;
      --sklearn-color-unfitted-level-1: #f6e4d2;
      --sklearn-color-unfitted-level-2: #ffe0b3;
      --sklearn-color-unfitted-level-3: chocolate;
      /* Definition of color scheme for fitted estimators */
      --sklearn-color-fitted-level-0: #f0f8ff;
      --sklearn-color-fitted-level-1: #d4ebff;
      --sklearn-color-fitted-level-2: #b3dbfd;
      --sklearn-color-fitted-level-3: cornflowerblue;
    }

    #sk-container-id-18.light {
      /* Specific color for light theme */
      --sklearn-color-text-on-default-background: black;
      --sklearn-color-background: white;
      --sklearn-color-border-box: black;
      --sklearn-color-icon: #696969;
    }

    #sk-container-id-18.dark {
      --sklearn-color-text-on-default-background: white;
      --sklearn-color-background: #111;
      --sklearn-color-border-box: white;
      --sklearn-color-icon: #878787;
    }

    #sk-container-id-18 {
      color: var(--sklearn-color-text);
    }

    #sk-container-id-18 pre {
      padding: 0;
    }

    #sk-container-id-18 input.sk-hidden--visually {
      border: 0;
      clip: rect(1px 1px 1px 1px);
      clip: rect(1px, 1px, 1px, 1px);
      height: 1px;
      margin: -1px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1px;
    }

    #sk-container-id-18 div.sk-dashed-wrapped {
      border: 1px dashed var(--sklearn-color-line);
      margin: 0 0.4em 0.5em 0.4em;
      box-sizing: border-box;
      padding-bottom: 0.4em;
      background-color: var(--sklearn-color-background);
    }

    #sk-container-id-18 div.sk-container {
      /* jupyter's `normalize.less` sets `[hidden] { display: none; }`
         but bootstrap.min.css set `[hidden] { display: none !important; }`
         so we also need the `!important` here to be able to override the
         default hidden behavior on the sphinx rendered scikit-learn.org.
         See: https://github.com/scikit-learn/scikit-learn/issues/21755 */
      display: inline-block !important;
      position: relative;
    }

    #sk-container-id-18 div.sk-text-repr-fallback {
      display: none;
    }

    div.sk-parallel-item,
    div.sk-serial,
    div.sk-item {
      /* draw centered vertical line to link estimators */
      background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));
      background-size: 2px 100%;
      background-repeat: no-repeat;
      background-position: center center;
    }

    /* Parallel-specific style estimator block */

    #sk-container-id-18 div.sk-parallel-item::after {
      content: "";
      width: 100%;
      border-bottom: 2px solid var(--sklearn-color-text-on-default-background);
      flex-grow: 1;
    }

    #sk-container-id-18 div.sk-parallel {
      display: flex;
      align-items: stretch;
      justify-content: center;
      background-color: var(--sklearn-color-background);
      position: relative;
    }

    #sk-container-id-18 div.sk-parallel-item {
      display: flex;
      flex-direction: column;
    }

    #sk-container-id-18 div.sk-parallel-item:first-child::after {
      align-self: flex-end;
      width: 50%;
    }

    #sk-container-id-18 div.sk-parallel-item:last-child::after {
      align-self: flex-start;
      width: 50%;
    }

    #sk-container-id-18 div.sk-parallel-item:only-child::after {
      width: 0;
    }

    /* Serial-specific style estimator block */

    #sk-container-id-18 div.sk-serial {
      display: flex;
      flex-direction: column;
      align-items: center;
      background-color: var(--sklearn-color-background);
      padding-right: 1em;
      padding-left: 1em;
    }


    /* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is
    clickable and can be expanded/collapsed.
    - Pipeline and ColumnTransformer use this feature and define the default style
    - Estimators will overwrite some part of the style using the `sk-estimator` class
    */

    /* Pipeline and ColumnTransformer style (default) */

    #sk-container-id-18 div.sk-toggleable {
      /* Default theme specific background. It is overwritten whether we have a
      specific estimator or a Pipeline/ColumnTransformer */
      background-color: var(--sklearn-color-background);
    }

    /* Toggleable label */
    #sk-container-id-18 label.sk-toggleable__label {
      cursor: pointer;
      display: flex;
      width: 100%;
      margin-bottom: 0;
      padding: 0.5em;
      box-sizing: border-box;
      text-align: center;
      align-items: center;
      justify-content: center;
      gap: 0.5em;
    }

    #sk-container-id-18 label.sk-toggleable__label .caption {
      font-size: 0.6rem;
      font-weight: lighter;
      color: var(--sklearn-color-text-muted);
    }

    #sk-container-id-18 label.sk-toggleable__label-arrow:before {
      /* Arrow on the left of the label */
      content: "▸";
      float: left;
      margin-right: 0.25em;
      color: var(--sklearn-color-icon);
    }

    #sk-container-id-18 label.sk-toggleable__label-arrow:hover:before {
      color: var(--sklearn-color-text);
    }

    /* Toggleable content - dropdown */

    #sk-container-id-18 div.sk-toggleable__content {
      display: none;
      text-align: left;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-18 div.sk-toggleable__content.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-18 div.sk-toggleable__content pre {
      margin: 0.2em;
      border-radius: 0.25em;
      color: var(--sklearn-color-text);
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-18 div.sk-toggleable__content.fitted pre {
      /* unfitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    #sk-container-id-18 input.sk-toggleable__control:checked~div.sk-toggleable__content {
      /* Expand drop-down */
      display: block;
      width: 100%;
      overflow: visible;
    }

    #sk-container-id-18 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {
      content: "▾";
    }

    /* Pipeline/ColumnTransformer-specific style */

    #sk-container-id-18 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-18 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator-specific style */

    /* Colorize estimator box */
    #sk-container-id-18 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-18 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    #sk-container-id-18 div.sk-label label.sk-toggleable__label,
    #sk-container-id-18 div.sk-label label {
      /* The background is the default theme color */
      color: var(--sklearn-color-text-on-default-background);
    }

    /* On hover, darken the color of the background */
    #sk-container-id-18 div.sk-label:hover label.sk-toggleable__label {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    /* Label box, darken color on hover, fitted */
    #sk-container-id-18 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {
      color: var(--sklearn-color-text);
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Estimator label */

    #sk-container-id-18 div.sk-label label {
      font-family: monospace;
      font-weight: bold;
      line-height: 1.2em;
    }

    #sk-container-id-18 div.sk-label-container {
      text-align: center;
    }

    /* Estimator-specific */
    #sk-container-id-18 div.sk-estimator {
      font-family: monospace;
      border: 1px dotted var(--sklearn-color-border-box);
      border-radius: 0.25em;
      box-sizing: border-box;
      margin-bottom: 0.5em;
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-0);
    }

    #sk-container-id-18 div.sk-estimator.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
    }

    /* on hover */
    #sk-container-id-18 div.sk-estimator:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-2);
    }

    #sk-container-id-18 div.sk-estimator.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-2);
    }

    /* Specification for estimator info (e.g. "i" and "?") */

    /* Common style for "i" and "?" */

    .sk-estimator-doc-link,
    a:link.sk-estimator-doc-link,
    a:visited.sk-estimator-doc-link {
      float: right;
      font-size: smaller;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1em;
      height: 1em;
      width: 1em;
      text-decoration: none !important;
      margin-left: 0.5em;
      text-align: center;
      /* unfitted */
      border: var(--sklearn-color-unfitted-level-3) 1pt solid;
      color: var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted,
    a:link.sk-estimator-doc-link.fitted,
    a:visited.sk-estimator-doc-link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3) 1pt solid;
      color: var(--sklearn-color-fitted-level-3);
    }

    /* On hover */
    div.sk-estimator:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover,
    div.sk-label-container:hover .sk-estimator-doc-link:hover,
    .sk-estimator-doc-link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-unfitted-level-0);
      text-decoration: none;
    }

    div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover,
    div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,
    .sk-estimator-doc-link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
      border: var(--sklearn-color-fitted-level-0) 1pt solid;
      color: var(--sklearn-color-fitted-level-0);
      text-decoration: none;
    }

    /* Span, style for the box shown on hovering the info icon */
    .sk-estimator-doc-link span {
      display: none;
      z-index: 9999;
      position: relative;
      font-weight: normal;
      right: .2ex;
      padding: .5ex;
      margin: .5ex;
      width: min-content;
      min-width: 20ex;
      max-width: 50ex;
      color: var(--sklearn-color-text);
      box-shadow: 2pt 2pt 4pt #999;
      /* unfitted */
      background: var(--sklearn-color-unfitted-level-0);
      border: .5pt solid var(--sklearn-color-unfitted-level-3);
    }

    .sk-estimator-doc-link.fitted span {
      /* fitted */
      background: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-3);
    }

    .sk-estimator-doc-link:hover span {
      display: block;
    }

    /* "?"-specific style due to the `<a>` HTML tag */

    #sk-container-id-18 a.estimator_doc_link {
      float: right;
      font-size: 1rem;
      line-height: 1em;
      font-family: monospace;
      background-color: var(--sklearn-color-unfitted-level-0);
      border-radius: 1rem;
      height: 1rem;
      width: 1rem;
      text-decoration: none;
      /* unfitted */
      color: var(--sklearn-color-unfitted-level-1);
      border: var(--sklearn-color-unfitted-level-1) 1pt solid;
    }

    #sk-container-id-18 a.estimator_doc_link.fitted {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-0);
      border: var(--sklearn-color-fitted-level-1) 1pt solid;
      color: var(--sklearn-color-fitted-level-1);
    }

    /* On hover */
    #sk-container-id-18 a.estimator_doc_link:hover {
      /* unfitted */
      background-color: var(--sklearn-color-unfitted-level-3);
      color: var(--sklearn-color-background);
      text-decoration: none;
    }

    #sk-container-id-18 a.estimator_doc_link.fitted:hover {
      /* fitted */
      background-color: var(--sklearn-color-fitted-level-3);
    }

    .estimator-table {
        font-family: monospace;
    }

    .estimator-table summary {
        padding: .5rem;
        cursor: pointer;
    }

    .estimator-table summary::marker {
        font-size: 0.7rem;
    }

    .estimator-table details[open] {
        padding-left: 0.1rem;
        padding-right: 0.1rem;
        padding-bottom: 0.3rem;
    }

    .estimator-table .parameters-table {
        margin-left: auto !important;
        margin-right: auto !important;
        margin-top: 0;
    }

    .estimator-table .parameters-table tr:nth-child(odd) {
        background-color: #fff;
    }

    .estimator-table .parameters-table tr:nth-child(even) {
        background-color: #f6f6f6;
    }

    .estimator-table .parameters-table tr:hover {
        background-color: #e0e0e0;
    }

    .estimator-table table td {
        border: 1px solid rgba(106, 105, 104, 0.232);
    }

    /*
        `table td` is set in the notebook with right text-align.
        We need to override it.
    */
    .estimator-table table td.param {
        text-align: left;
        position: relative;
        padding: 0;
    }

    .user-set td {
        color:rgb(255, 94, 0);
        text-align: left !important;
    }

    .user-set td.value {
        color:rgb(255, 94, 0);
        background-color: transparent;
    }

    .default td {
        color: black;
        text-align: left !important;
    }

    .user-set td i,
    .default td i {
        color: black;
    }

    /*
        Styles for parameter documentation links
        We need styling for visited links so Jupyter doesn't overwrite it
    */
    a.param-doc-link,
    a.param-doc-link:link,
    a.param-doc-link:visited {
        text-decoration: underline dashed;
        text-underline-offset: .3em;
        color: inherit;
        display: block;
        padding: .5em;
    }

    /* "hack" to make the entire area of the cell containing the link clickable */
    a.param-doc-link::before {
        position: absolute;
        content: "";
        inset: 0;
    }

    .param-doc-description {
        display: none;
        position: absolute;
        z-index: 9999;
        left: 0;
        padding: .5ex;
        margin-left: 1.5em;
        color: var(--sklearn-color-text);
        box-shadow: .3em .3em .4em #999;
        width: max-content;
        text-align: left;
        max-height: 10em;
        overflow-y: auto;

        /* unfitted */
        background: var(--sklearn-color-unfitted-level-0);
        border: thin solid var(--sklearn-color-unfitted-level-3);
    }

    /* Fitted state for parameter tooltips */
    .fitted .param-doc-description {
        /* fitted */
        background: var(--sklearn-color-fitted-level-0);
        border: thin solid var(--sklearn-color-fitted-level-3);
    }

    .param-doc-link:hover .param-doc-description {
        display: block;
    }

    .copy-paste-icon {
        background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCA0NDggNTEyIj48IS0tIUZvbnQgQXdlc29tZSBGcmVlIDYuNy4yIGJ5IEBmb250YXdlc29tZSAtIGh0dHBzOi8vZm9udGF3ZXNvbWUuY29tIExpY2Vuc2UgLSBodHRwczovL2ZvbnRhd2Vzb21lLmNvbS9saWNlbnNlL2ZyZWUgQ29weXJpZ2h0IDIwMjUgRm9udGljb25zLCBJbmMuLS0+PHBhdGggZD0iTTIwOCAwTDMzMi4xIDBjMTIuNyAwIDI0LjkgNS4xIDMzLjkgMTQuMWw2Ny45IDY3LjljOSA5IDE0LjEgMjEuMiAxNC4xIDMzLjlMNDQ4IDMzNmMwIDI2LjUtMjEuNSA0OC00OCA0OGwtMTkyIDBjLTI2LjUgMC00OC0yMS41LTQ4LTQ4bDAtMjg4YzAtMjYuNSAyMS41LTQ4IDQ4LTQ4ek00OCAxMjhsODAgMCAwIDY0LTY0IDAgMCAyNTYgMTkyIDAgMC0zMiA2NCAwIDAgNDhjMCAyNi41LTIxLjUgNDgtNDggNDhMNDggNTEyYy0yNi41IDAtNDgtMjEuNS00OC00OEwwIDE3NmMwLTI2LjUgMjEuNS00OCA0OC00OHoiLz48L3N2Zz4=);
        background-repeat: no-repeat;
        background-size: 14px 14px;
        background-position: 0;
        display: inline-block;
        width: 14px;
        height: 14px;
        cursor: pointer;
    }
    </style><body><div id="sk-container-id-18" class="sk-top-container"><div class="sk-text-repr-fallback"><pre>StackingRegressor(estimators=[(&#x27;knn@5&#x27;,
                                   Pipeline(steps=[(&#x27;select_cols&#x27;,
                                                    ColumnTransformer(transformers=[(&#x27;long_and_lat&#x27;,
                                                                                     &#x27;passthrough&#x27;,
                                                                                     [&#x27;Longitude&#x27;,
                                                                                      &#x27;Latitude&#x27;])])),
                                                   (&#x27;knn&#x27;,
                                                    KNeighborsRegressor())]))],
                      final_estimator=GradientBoostingRegressor(n_estimators=500,
                                                                random_state=0),
                      passthrough=True)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-49" type="checkbox" ><label for="sk-estimator-id-49" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>StackingRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html">?<span>Documentation for StackingRegressor</span></a><span class="sk-estimator-doc-link fitted">i<span>Fitted</span></span></div></label><div class="sk-toggleable__content fitted" data-param-prefix="">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('estimators',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=estimators,-list%20of%20%28str%2C%20estimator%29">
                estimators
                <span class="param-doc-description">estimators: list of (str, estimator)<br><br>Base estimators which will be stacked together. Each element of the<br>list is defined as a tuple of string (i.e. name) and an estimator<br>instance. An estimator can be set to 'drop' using `set_params`.</span>
            </a>
        </td>
                <td class="value">[(&#x27;knn@5&#x27;, ...)]</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('final_estimator',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=final_estimator,-estimator%2C%20default%3DNone">
                final_estimator
                <span class="param-doc-description">final_estimator: estimator, default=None<br><br>A regressor which will be used to combine the base estimators.<br>The default regressor is a :class:`~sklearn.linear_model.RidgeCV`.</span>
            </a>
        </td>
                <td class="value">GradientBoost...andom_state=0)</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('cv',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=cv,-int%2C%20cross-validation%20generator%2C%20iterable%2C%20or%20%22prefit%22%2C%20default%3DNone">
                cv
                <span class="param-doc-description">cv: int, cross-validation generator, iterable, or "prefit", default=None<br><br>Determines the cross-validation splitting strategy used in<br>`cross_val_predict` to train `final_estimator`. Possible inputs for<br>cv are:<br><br>* None, to use the default 5-fold cross validation,<br>* integer, to specify the number of folds in a (Stratified) KFold,<br>* An object to be used as a cross-validation generator,<br>* An iterable yielding train, test splits,<br>* `"prefit"`, to assume the `estimators` are prefit. In this case, the<br>  estimators will not be refitted.<br><br>For integer/None inputs, if the estimator is a classifier and y is<br>either binary or multiclass,<br>:class:`~sklearn.model_selection.StratifiedKFold` is used.<br>In all other cases, :class:`~sklearn.model_selection.KFold` is used.<br>These splitters are instantiated with `shuffle=False` so the splits<br>will be the same across calls.<br><br>Refer :ref:`User Guide <cross_validation>` for the various<br>cross-validation strategies that can be used here.<br><br>If "prefit" is passed, it is assumed that all `estimators` have<br>been fitted already. The `final_estimator_` is trained on the `estimators`<br>predictions on the full training set and are **not** cross validated<br>predictions. Please note that if the models have been trained on the same<br>data to train the stacking model, there is a very high risk of overfitting.<br><br>.. versionadded:: 1.1<br>    The 'prefit' option was added in 1.1<br><br>.. note::<br>   A larger number of split will provide no benefits if the number<br>   of training samples is large enough. Indeed, the training time<br>   will increase. ``cv`` is not used for model evaluation but for<br>   prediction.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of jobs to run in parallel for `fit` of all `estimators`.<br>`None` means 1 unless in a `joblib.parallel_backend` context. -1 means<br>using all processors. See :term:`Glossary <n_jobs>` for more details.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('passthrough',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=passthrough,-bool%2C%20default%3DFalse">
                passthrough
                <span class="param-doc-description">passthrough: bool, default=False<br><br>When False, only the predictions of estimators will be used as<br>training data for `final_estimator`. When True, the<br>`final_estimator` is trained on the predictions as well as the<br>original training data.</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.StackingRegressor.html#:~:text=verbose,-int%2C%20default%3D0">
                verbose
                <span class="param-doc-description">verbose: int, default=0<br><br>Verbosity level.</span>
            </a>
        </td>
                <td class="value">0</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><label>knn@5</label></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-serial"><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-50" type="checkbox" ><label for="sk-estimator-id-50" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>select_cols: ColumnTransformer</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html">?<span>Documentation for select_cols: ColumnTransformer</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__select_cols__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('transformers',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=transformers,-list%20of%20tuples">
                transformers
                <span class="param-doc-description">transformers: list of tuples<br><br>List of (name, transformer, columns) tuples specifying the<br>transformer objects to be applied to subsets of the data.<br><br>name : str<br>    Like in Pipeline and FeatureUnion, this allows the transformer and<br>    its parameters to be set using ``set_params`` and searched in grid<br>    search.<br>transformer : {'drop', 'passthrough'} or estimator<br>    Estimator must support :term:`fit` and :term:`transform`.<br>    Special-cased strings 'drop' and 'passthrough' are accepted as<br>    well, to indicate to drop the columns or to pass them through<br>    untransformed, respectively.<br>columns :  str, array-like of str, int, array-like of int,                 array-like of bool, slice or callable<br>    Indexes the data on its second axis. Integers are interpreted as<br>    positional columns, while strings can reference DataFrame columns<br>    by name.  A scalar string or int should be used where<br>    ``transformer`` expects X to be a 1d array-like (vector),<br>    otherwise a 2d array will be passed to the transformer.<br>    A callable is passed the input data `X` and can return any of the<br>    above. To select multiple columns by name or dtype, you can use<br>    :obj:`make_column_selector`.</span>
            </a>
        </td>
                <td class="value">[(&#x27;long_and_lat&#x27;, ...)]</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('remainder',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=remainder,-%7B%27drop%27%2C%20%27passthrough%27%7D%20or%20estimator%2C%20default%3D%27drop%27">
                remainder
                <span class="param-doc-description">remainder: {'drop', 'passthrough'} or estimator, default='drop'<br><br>By default, only the specified columns in `transformers` are<br>transformed and combined in the output, and the non-specified<br>columns are dropped. (default of ``'drop'``).<br>By specifying ``remainder='passthrough'``, all remaining columns that<br>were not specified in `transformers`, but present in the data passed<br>to `fit` will be automatically passed through. This subset of columns<br>is concatenated with the output of the transformers. For dataframes,<br>extra columns not seen during `fit` will be excluded from the output<br>of `transform`.<br>By setting ``remainder`` to be an estimator, the remaining<br>non-specified columns will use the ``remainder`` estimator. The<br>estimator must support :term:`fit` and :term:`transform`.<br>Note that using this feature requires that the DataFrame columns<br>input at :term:`fit` and :term:`transform` have identical order.</span>
            </a>
        </td>
                <td class="value">&#x27;drop&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('sparse_threshold',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=sparse_threshold,-float%2C%20default%3D0.3">
                sparse_threshold
                <span class="param-doc-description">sparse_threshold: float, default=0.3<br><br>If the output of the different transformers contains sparse matrices,<br>these will be stacked as a sparse matrix if the overall density is<br>lower than this value. Use ``sparse_threshold=0`` to always return<br>dense.  When the transformed output consists of all dense data, the<br>stacked result will be dense, and this keyword will be ignored.</span>
            </a>
        </td>
                <td class="value">0.3</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>Number of jobs to run in parallel.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('transformer_weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=transformer_weights,-dict%2C%20default%3DNone">
                transformer_weights
                <span class="param-doc-description">transformer_weights: dict, default=None<br><br>Multiplicative weights for features per transformer. The output of the<br>transformer is multiplied by these weights. Keys are transformer names,<br>values the weights.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=verbose,-bool%2C%20default%3DFalse">
                verbose
                <span class="param-doc-description">verbose: bool, default=False<br><br>If True, the time elapsed while fitting each transformer will be<br>printed as it is completed.</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose_feature_names_out',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=verbose_feature_names_out,-bool%2C%20str%20or%20Callable%5B%5Bstr%2C%20str%5D%2C%20str%5D%2C%20default%3DTrue">
                verbose_feature_names_out
                <span class="param-doc-description">verbose_feature_names_out: bool, str or Callable[[str, str], str], default=True<br><br>- If True, :meth:`ColumnTransformer.get_feature_names_out` will prefix<br>  all feature names with the name of the transformer that generated that<br>  feature. It is equivalent to setting<br>  `verbose_feature_names_out="{transformer_name}__{feature_name}"`.<br>- If False, :meth:`ColumnTransformer.get_feature_names_out` will not<br>  prefix any feature names and will error if feature names are not<br>  unique.<br>- If ``Callable[[str, str], str]``,<br>  :meth:`ColumnTransformer.get_feature_names_out` will rename all the features<br>  using the name of the transformer. The first argument of the callable is the<br>  transformer name and the second argument is the feature name. The returned<br>  string will be the new feature name.<br>- If ``str``, it must be a string ready for formatting. The given string will<br>  be formatted using two field names: ``transformer_name`` and ``feature_name``.<br>  e.g. ``"{feature_name}__{transformer_name}"``. See :meth:`str.format` method<br>  from the standard library for more info.<br><br>.. versionadded:: 1.0<br><br>.. versionchanged:: 1.6<br>    `verbose_feature_names_out` can be a callable or a string to be formatted.</span>
            </a>
        </td>
                <td class="value">True</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('force_int_remainder_cols',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.compose.ColumnTransformer.html#:~:text=force_int_remainder_cols,-bool%2C%20default%3DFalse">
                force_int_remainder_cols
                <span class="param-doc-description">force_int_remainder_cols: bool, default=False<br><br>This parameter has no effect.<br><br>.. note::<br>    If you do not access the list of columns for the remainder columns<br>    in the `transformers_` fitted attribute, you do not need to set<br>    this parameter.<br><br>.. versionadded:: 1.5<br><br>.. versionchanged:: 1.7<br>   The default value for `force_int_remainder_cols` will change from<br>   `True` to `False` in version 1.7.<br><br>.. deprecated:: 1.7<br>   `force_int_remainder_cols` is deprecated and will be removed in 1.9.</span>
            </a>
        </td>
                <td class="value">&#x27;deprecated&#x27;</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-51" type="checkbox" ><label for="sk-estimator-id-51" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>long_and_lat</div></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__select_cols__long_and_lat__"><pre>[&#x27;Longitude&#x27;, &#x27;Latitude&#x27;]</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-52" type="checkbox" ><label for="sk-estimator-id-52" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>passthrough</div></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__select_cols__long_and_lat__"><pre>passthrough</pre></div></div></div></div></div></div></div></div><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-53" type="checkbox" ><label for="sk-estimator-id-53" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>KNeighborsRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html">?<span>Documentation for KNeighborsRegressor</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="knn@5__knn__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_neighbors',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_neighbors,-int%2C%20default%3D5">
                n_neighbors
                <span class="param-doc-description">n_neighbors: int, default=5<br><br>Number of neighbors to use by default for :meth:`kneighbors` queries.</span>
            </a>
        </td>
                <td class="value">5</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('weights',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=weights,-%7B%27uniform%27%2C%20%27distance%27%7D%2C%20callable%20or%20None%2C%20default%3D%27uniform%27">
                weights
                <span class="param-doc-description">weights: {'uniform', 'distance'}, callable or None, default='uniform'<br><br>Weight function used in prediction.  Possible values:<br><br>- 'uniform' : uniform weights.  All points in each neighborhood<br>  are weighted equally.<br>- 'distance' : weight points by the inverse of their distance.<br>  in this case, closer neighbors of a query point will have a<br>  greater influence than neighbors which are further away.<br>- [callable] : a user-defined function which accepts an<br>  array of distances, and returns an array of the same shape<br>  containing the weights.<br><br>Uniform weights are used by default.<br><br>See the following example for a demonstration of the impact of<br>different weighting schemes on predictions:<br>:ref:`sphx_glr_auto_examples_neighbors_plot_regression.py`.</span>
            </a>
        </td>
                <td class="value">&#x27;uniform&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('algorithm',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=algorithm,-%7B%27auto%27%2C%20%27ball_tree%27%2C%20%27kd_tree%27%2C%20%27brute%27%7D%2C%20default%3D%27auto%27">
                algorithm
                <span class="param-doc-description">algorithm: {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'<br><br>Algorithm used to compute the nearest neighbors:<br><br>- 'ball_tree' will use :class:`BallTree`<br>- 'kd_tree' will use :class:`KDTree`<br>- 'brute' will use a brute-force search.<br>- 'auto' will attempt to decide the most appropriate algorithm<br>  based on the values passed to :meth:`fit` method.<br><br>Note: fitting on sparse input will override the setting of<br>this parameter, using brute force.</span>
            </a>
        </td>
                <td class="value">&#x27;auto&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('leaf_size',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=leaf_size,-int%2C%20default%3D30">
                leaf_size
                <span class="param-doc-description">leaf_size: int, default=30<br><br>Leaf size passed to BallTree or KDTree.  This can affect the<br>speed of the construction and query, as well as the memory<br>required to store the tree.  The optimal value depends on the<br>nature of the problem.</span>
            </a>
        </td>
                <td class="value">30</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('p',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=p,-float%2C%20default%3D2">
                p
                <span class="param-doc-description">p: float, default=2<br><br>Power parameter for the Minkowski metric. When p = 1, this is<br>equivalent to using manhattan_distance (l1), and euclidean_distance<br>(l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.</span>
            </a>
        </td>
                <td class="value">2</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric,-str%2C%20DistanceMetric%20object%20or%20callable%2C%20default%3D%27minkowski%27">
                metric
                <span class="param-doc-description">metric: str, DistanceMetric object or callable, default='minkowski'<br><br>Metric to use for distance computation. Default is "minkowski", which<br>results in the standard Euclidean distance when p = 2. See the<br>documentation of `scipy.spatial.distance<br><https://docs.scipy.org/doc/scipy/reference/spatial.distance.html>`_ and<br>the metrics listed in<br>:class:`~sklearn.metrics.pairwise.distance_metrics` for valid metric<br>values.<br><br>If metric is "precomputed", X is assumed to be a distance matrix and<br>must be square during fit. X may be a :term:`sparse graph`, in which<br>case only "nonzero" elements may be considered neighbors.<br><br>If metric is a callable function, it takes two arrays representing 1D<br>vectors as inputs and must return one value indicating the distance<br>between those vectors. This works for Scipy's metrics, but is less<br>efficient than passing the metric name as a string.<br><br>If metric is a DistanceMetric object, it will be passed directly to<br>the underlying computation routines.</span>
            </a>
        </td>
                <td class="value">&#x27;minkowski&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('metric_params',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=metric_params,-dict%2C%20default%3DNone">
                metric_params
                <span class="param-doc-description">metric_params: dict, default=None<br><br>Additional keyword arguments for the metric function.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_jobs',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#:~:text=n_jobs,-int%2C%20default%3DNone">
                n_jobs
                <span class="param-doc-description">n_jobs: int, default=None<br><br>The number of parallel jobs to run for neighbors search.<br>``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.<br>``-1`` means using all processors. See :term:`Glossary <n_jobs>`<br>for more details.<br>Doesn't affect :meth:`fit` method.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div></div></div></div></div></div><div class="sk-item"><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label fitted sk-toggleable"><label>final_estimator</label></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator fitted sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-54" type="checkbox" ><label for="sk-estimator-id-54" class="sk-toggleable__label fitted sk-toggleable__label-arrow"><div><div>GradientBoostingRegressor</div></div><div><a class="sk-estimator-doc-link fitted" rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html">?<span>Documentation for GradientBoostingRegressor</span></a></div></label><div class="sk-toggleable__content fitted" data-param-prefix="final_estimator__">
            <div class="estimator-table">
                <details>
                    <summary>Parameters</summary>
                    <table class="parameters-table">
                      <tbody>
                    
            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('loss',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=loss,-%7B%27squared_error%27%2C%20%27absolute_error%27%2C%20%27huber%27%2C%20%27quantile%27%7D%2C%20%20%20%20%20%20%20%20%20%20%20%20%20default%3D%27squared_error%27">
                loss
                <span class="param-doc-description">loss: {'squared_error', 'absolute_error', 'huber', 'quantile'},             default='squared_error'<br><br>Loss function to be optimized. 'squared_error' refers to the squared<br>error for regression. 'absolute_error' refers to the absolute error of<br>regression and is a robust loss function. 'huber' is a<br>combination of the two. 'quantile' allows quantile regression (use<br>`alpha` to specify the quantile).<br>See<br>:ref:`sphx_glr_auto_examples_ensemble_plot_gradient_boosting_quantile.py`<br>for an example that demonstrates quantile regression for creating<br>prediction intervals with `loss='quantile'`.</span>
            </a>
        </td>
                <td class="value">&#x27;squared_error&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('learning_rate',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=learning_rate,-float%2C%20default%3D0.1">
                learning_rate
                <span class="param-doc-description">learning_rate: float, default=0.1<br><br>Learning rate shrinks the contribution of each tree by `learning_rate`.<br>There is a trade-off between learning_rate and n_estimators.<br>Values must be in the range `[0.0, inf)`.</span>
            </a>
        </td>
                <td class="value">0.1</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_estimators',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=n_estimators,-int%2C%20default%3D100">
                n_estimators
                <span class="param-doc-description">n_estimators: int, default=100<br><br>The number of boosting stages to perform. Gradient boosting<br>is fairly robust to over-fitting so a large number usually<br>results in better performance.<br>Values must be in the range `[1, inf)`.</span>
            </a>
        </td>
                <td class="value">500</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('subsample',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=subsample,-float%2C%20default%3D1.0">
                subsample
                <span class="param-doc-description">subsample: float, default=1.0<br><br>The fraction of samples to be used for fitting the individual base<br>learners. If smaller than 1.0 this results in Stochastic Gradient<br>Boosting. `subsample` interacts with the parameter `n_estimators`.<br>Choosing `subsample < 1.0` leads to a reduction of variance<br>and an increase in bias.<br>Values must be in the range `(0.0, 1.0]`.</span>
            </a>
        </td>
                <td class="value">1.0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('criterion',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=criterion,-%7B%27friedman_mse%27%2C%20%27squared_error%27%7D%2C%20default%3D%27friedman_mse%27">
                criterion
                <span class="param-doc-description">criterion: {'friedman_mse', 'squared_error'}, default='friedman_mse'<br><br>The function to measure the quality of a split. Supported criteria are<br>"friedman_mse" for the mean squared error with improvement score by<br>Friedman, "squared_error" for mean squared error. The default value of<br>"friedman_mse" is generally the best as it can provide a better<br>approximation in some cases.<br><br>.. versionadded:: 0.18</span>
            </a>
        </td>
                <td class="value">&#x27;friedman_mse&#x27;</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('min_samples_split',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=min_samples_split,-int%20or%20float%2C%20default%3D2">
                min_samples_split
                <span class="param-doc-description">min_samples_split: int or float, default=2<br><br>The minimum number of samples required to split an internal node:<br><br>- If int, values must be in the range `[2, inf)`.<br>- If float, values must be in the range `(0.0, 1.0]` and `min_samples_split`<br>  will be `ceil(min_samples_split * n_samples)`.<br><br>.. versionchanged:: 0.18<br>   Added float values for fractions.</span>
            </a>
        </td>
                <td class="value">2</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('min_samples_leaf',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=min_samples_leaf,-int%20or%20float%2C%20default%3D1">
                min_samples_leaf
                <span class="param-doc-description">min_samples_leaf: int or float, default=1<br><br>The minimum number of samples required to be at a leaf node.<br>A split point at any depth will only be considered if it leaves at<br>least ``min_samples_leaf`` training samples in each of the left and<br>right branches.  This may have the effect of smoothing the model,<br>especially in regression.<br><br>- If int, values must be in the range `[1, inf)`.<br>- If float, values must be in the range `(0.0, 1.0)` and `min_samples_leaf`<br>  will be `ceil(min_samples_leaf * n_samples)`.<br><br>.. versionchanged:: 0.18<br>   Added float values for fractions.</span>
            </a>
        </td>
                <td class="value">1</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('min_weight_fraction_leaf',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=min_weight_fraction_leaf,-float%2C%20default%3D0.0">
                min_weight_fraction_leaf
                <span class="param-doc-description">min_weight_fraction_leaf: float, default=0.0<br><br>The minimum weighted fraction of the sum total of weights (of all<br>the input samples) required to be at a leaf node. Samples have<br>equal weight when sample_weight is not provided.<br>Values must be in the range `[0.0, 0.5]`.</span>
            </a>
        </td>
                <td class="value">0.0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('max_depth',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=max_depth,-int%20or%20None%2C%20default%3D3">
                max_depth
                <span class="param-doc-description">max_depth: int or None, default=3<br><br>Maximum depth of the individual regression estimators. The maximum<br>depth limits the number of nodes in the tree. Tune this parameter<br>for best performance; the best value depends on the interaction<br>of the input variables. If None, then nodes are expanded until<br>all leaves are pure or until all leaves contain less than<br>min_samples_split samples.<br>If int, values must be in the range `[1, inf)`.</span>
            </a>
        </td>
                <td class="value">3</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('min_impurity_decrease',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=min_impurity_decrease,-float%2C%20default%3D0.0">
                min_impurity_decrease
                <span class="param-doc-description">min_impurity_decrease: float, default=0.0<br><br>A node will be split if this split induces a decrease of the impurity<br>greater than or equal to this value.<br>Values must be in the range `[0.0, inf)`.<br><br>The weighted impurity decrease equation is the following::<br><br>    N_t / N * (impurity - N_t_R / N_t * right_impurity<br>                        - N_t_L / N_t * left_impurity)<br><br>where ``N`` is the total number of samples, ``N_t`` is the number of<br>samples at the current node, ``N_t_L`` is the number of samples in the<br>left child, and ``N_t_R`` is the number of samples in the right child.<br><br>``N``, ``N_t``, ``N_t_R`` and ``N_t_L`` all refer to the weighted sum,<br>if ``sample_weight`` is passed.<br><br>.. versionadded:: 0.19</span>
            </a>
        </td>
                <td class="value">0.0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('init',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=init,-estimator%20or%20%27zero%27%2C%20default%3DNone">
                init
                <span class="param-doc-description">init: estimator or 'zero', default=None<br><br>An estimator object that is used to compute the initial predictions.<br>``init`` has to provide :term:`fit` and :term:`predict`. If 'zero', the<br>initial raw predictions are set to zero. By default a<br>``DummyEstimator`` is used, predicting either the average target value<br>(for loss='squared_error'), or a quantile for the other losses.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="user-set">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('random_state',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=random_state,-int%2C%20RandomState%20instance%20or%20None%2C%20default%3DNone">
                random_state
                <span class="param-doc-description">random_state: int, RandomState instance or None, default=None<br><br>Controls the random seed given to each Tree estimator at each<br>boosting iteration.<br>In addition, it controls the random permutation of the features at<br>each split (see Notes for more details).<br>It also controls the random splitting of the training data to obtain a<br>validation set if `n_iter_no_change` is not None.<br>Pass an int for reproducible output across multiple function calls.<br>See :term:`Glossary <random_state>`.</span>
            </a>
        </td>
                <td class="value">0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('max_features',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=max_features,-%7B%27sqrt%27%2C%20%27log2%27%7D%2C%20int%20or%20float%2C%20default%3DNone">
                max_features
                <span class="param-doc-description">max_features: {'sqrt', 'log2'}, int or float, default=None<br><br>The number of features to consider when looking for the best split:<br><br>- If int, values must be in the range `[1, inf)`.<br>- If float, values must be in the range `(0.0, 1.0]` and the features<br>  considered at each split will be `max(1, int(max_features * n_features_in_))`.<br>- If "sqrt", then `max_features=sqrt(n_features)`.<br>- If "log2", then `max_features=log2(n_features)`.<br>- If None, then `max_features=n_features`.<br><br>Choosing `max_features < n_features` leads to a reduction of variance<br>and an increase in bias.<br><br>Note: the search for a split does not stop until at least one<br>valid partition of the node samples is found, even if it requires to<br>effectively inspect more than ``max_features`` features.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('alpha',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=alpha,-float%2C%20default%3D0.9">
                alpha
                <span class="param-doc-description">alpha: float, default=0.9<br><br>The alpha-quantile of the huber loss function and the quantile<br>loss function. Only if ``loss='huber'`` or ``loss='quantile'``.<br>Values must be in the range `(0.0, 1.0)`.</span>
            </a>
        </td>
                <td class="value">0.9</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('verbose',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=verbose,-int%2C%20default%3D0">
                verbose
                <span class="param-doc-description">verbose: int, default=0<br><br>Enable verbose output. If 1 then it prints progress and performance<br>once in a while (the more trees the lower the frequency). If greater<br>than 1 then it prints progress and performance for every tree.<br>Values must be in the range `[0, inf)`.</span>
            </a>
        </td>
                <td class="value">0</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('max_leaf_nodes',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=max_leaf_nodes,-int%2C%20default%3DNone">
                max_leaf_nodes
                <span class="param-doc-description">max_leaf_nodes: int, default=None<br><br>Grow trees with ``max_leaf_nodes`` in best-first fashion.<br>Best nodes are defined as relative reduction in impurity.<br>Values must be in the range `[2, inf)`.<br>If None, then unlimited number of leaf nodes.</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('warm_start',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=warm_start,-bool%2C%20default%3DFalse">
                warm_start
                <span class="param-doc-description">warm_start: bool, default=False<br><br>When set to ``True``, reuse the solution of the previous call to fit<br>and add more estimators to the ensemble, otherwise, just erase the<br>previous solution. See :term:`the Glossary <warm_start>`.</span>
            </a>
        </td>
                <td class="value">False</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('validation_fraction',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=validation_fraction,-float%2C%20default%3D0.1">
                validation_fraction
                <span class="param-doc-description">validation_fraction: float, default=0.1<br><br>The proportion of training data to set aside as validation set for<br>early stopping. Values must be in the range `(0.0, 1.0)`.<br>Only used if ``n_iter_no_change`` is set to an integer.<br><br>.. versionadded:: 0.20</span>
            </a>
        </td>
                <td class="value">0.1</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('n_iter_no_change',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=n_iter_no_change,-int%2C%20default%3DNone">
                n_iter_no_change
                <span class="param-doc-description">n_iter_no_change: int, default=None<br><br>``n_iter_no_change`` is used to decide if early stopping will be used<br>to terminate training when validation score is not improving. By<br>default it is set to None to disable early stopping. If set to a<br>number, it will set aside ``validation_fraction`` size of the training<br>data as validation and terminate training when validation score is not<br>improving in all of the previous ``n_iter_no_change`` numbers of<br>iterations.<br>Values must be in the range `[1, inf)`.<br>See<br>:ref:`sphx_glr_auto_examples_ensemble_plot_gradient_boosting_early_stopping.py`.<br><br>.. versionadded:: 0.20</span>
            </a>
        </td>
                <td class="value">None</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('tol',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=tol,-float%2C%20default%3D1e-4">
                tol
                <span class="param-doc-description">tol: float, default=1e-4<br><br>Tolerance for the early stopping. When the loss is not improving<br>by at least tol for ``n_iter_no_change`` iterations (if set to a<br>number), the training stops.<br>Values must be in the range `[0.0, inf)`.<br><br>.. versionadded:: 0.20</span>
            </a>
        </td>
                <td class="value">0.0001</td>
            </tr>
    

            <tr class="default">
                <td><i class="copy-paste-icon"
                     onclick="copyToClipboard('ccp_alpha',
                              this.parentElement.nextElementSibling)"
                ></i></td>
                <td class="param">
            <a class="param-doc-link"
                rel="noreferrer" target="_blank" href="https://scikit-learn.org/1.8/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html#:~:text=ccp_alpha,-non-negative%20float%2C%20default%3D0.0">
                ccp_alpha
                <span class="param-doc-description">ccp_alpha: non-negative float, default=0.0<br><br>Complexity parameter used for Minimal Cost-Complexity Pruning. The<br>subtree with the largest cost complexity that is smaller than<br>``ccp_alpha`` will be chosen. By default, no pruning is performed.<br>Values must be in the range `[0.0, inf)`.<br>See :ref:`minimal_cost_complexity_pruning` for details. See<br>:ref:`sphx_glr_auto_examples_tree_plot_cost_complexity_pruning.py`<br>for an example of such pruning.<br><br>.. versionadded:: 0.22</span>
            </a>
        </td>
                <td class="value">0.0</td>
            </tr>
    
                      </tbody>
                    </table>
                </details>
            </div>
        </div></div></div></div></div></div></div></div></div></div></div></div><script>function copyToClipboard(text, element) {
        // Get the parameter prefix from the closest toggleable content
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const fullParamName = paramPrefix ? `${paramPrefix}${text}` : text;

        const originalStyle = element.style;
        const computedStyle = window.getComputedStyle(element);
        const originalWidth = computedStyle.width;
        const originalHTML = element.innerHTML.replace('Copied!', '');

        navigator.clipboard.writeText(fullParamName)
            .then(() => {
                element.style.width = originalWidth;
                element.style.color = 'green';
                element.innerHTML = "Copied!";

                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            })
            .catch(err => {
                console.error('Failed to copy:', err);
                element.style.color = 'red';
                element.innerHTML = "Failed!";
                setTimeout(() => {
                    element.innerHTML = originalHTML;
                    element.style = originalStyle;
                }, 2000);
            });
        return false;
    }

    document.querySelectorAll('.copy-paste-icon').forEach(function(element) {
        const toggleableContent = element.closest('.sk-toggleable__content');
        const paramPrefix = toggleableContent ? toggleableContent.dataset.paramPrefix : '';
        const paramName = element.parentElement.nextElementSibling
            .textContent.trim().split(' ')[0];
        const fullParamName = paramPrefix ? `${paramPrefix}${paramName}` : paramName;

        element.setAttribute('title', fullParamName);
    });


    /**
     * Adapted from Skrub
     * https://github.com/skrub-data/skrub/blob/403466d1d5d4dc76a7ef569b3f8228db59a31dc3/skrub/_reporting/_data/templates/report.js#L789
     * @returns "light" or "dark"
     */
    function detectTheme(element) {
        const body = document.querySelector('body');

        // Check VSCode theme
        const themeKindAttr = body.getAttribute('data-vscode-theme-kind');
        const themeNameAttr = body.getAttribute('data-vscode-theme-name');

        if (themeKindAttr && themeNameAttr) {
            const themeKind = themeKindAttr.toLowerCase();
            const themeName = themeNameAttr.toLowerCase();

            if (themeKind.includes("dark") || themeName.includes("dark")) {
                return "dark";
            }
            if (themeKind.includes("light") || themeName.includes("light")) {
                return "light";
            }
        }

        // Check Jupyter theme
        if (body.getAttribute('data-jp-theme-light') === 'false') {
            return 'dark';
        } else if (body.getAttribute('data-jp-theme-light') === 'true') {
            return 'light';
        }

        // Guess based on a parent element's color
        const color = window.getComputedStyle(element.parentNode, null).getPropertyValue('color');
        const match = color.match(/^rgb\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*$/i);
        if (match) {
            const [r, g, b] = [
                parseFloat(match[1]),
                parseFloat(match[2]),
                parseFloat(match[3])
            ];

            // https://en.wikipedia.org/wiki/HSL_and_HSV#Lightness
            const luma = 0.299 * r + 0.587 * g + 0.114 * b;

            if (luma > 180) {
                // If the text is very bright we have a dark theme
                return 'dark';
            }
            if (luma < 75) {
                // If the text is very dark we have a light theme
                return 'light';
            }
            // Otherwise fall back to the next heuristic.
        }

        // Fallback to system preference
        return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
    }


    function forceTheme(elementId) {
        const estimatorElement = document.querySelector(`#${elementId}`);
        if (estimatorElement === null) {
            console.error(`Element with id ${elementId} not found.`);
        } else {
            const theme = detectTheme(estimatorElement);
            estimatorElement.classList.add(theme);
        }
    }

    forceTheme('sk-container-id-18');</script></body>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 1535-1537

.. code-block:: Python

    -get_scorer("neg_root_mean_squared_error")(gb_final, df_test, y_test)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    40709.24898984372


.. GENERATED FROM PYTHON SOURCE LINES 1538-1541

The final RMSE is a bit better than we got earlier when using 100 trees,
while the training time is still reasonably fast, so we can be happy
with the outcome.

.. GENERATED FROM PYTHON SOURCE LINES 1543-1545

Sharing the model
-----------------

.. GENERATED FROM PYTHON SOURCE LINES 1547-1549

Saving the model artifact
~~~~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 1551-1557

Now that we have trained the final model, we should save it for later use.
We could use pickle (or joblib) to do this, and there’s nothing wrong
with that. However, the resulting file is insecure because loading it
can execute arbitrary code. Thus, if we want to
share this model with other people, they might be reluctant to unpickle
the file if they don’t trust us completely.
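
To make the risk concrete, here is a minimal sketch using only the standard
library. The ``Malicious`` class is hypothetical and deliberately harmless: it
merely evaluates an arithmetic expression, but the same mechanism could run
any code when the file is loaded.

```python
import pickle


class Malicious:
    # pickle calls __reduce__ to learn how to rebuild the object. It may
    # return any callable plus its arguments, and pickle.loads will call it.
    def __reduce__(self):
        return (eval, ("sum(range(5))",))


payload = pickle.dumps(Malicious())

# Loading does not give back a Malicious instance. Instead, the expression
# chosen by whoever created the file is evaluated on our machine.
result = pickle.loads(payload)
print(type(result), result)
```

Nothing in ``payload`` warns the recipient that this happens, which is why
unpickling files from untrusted sources is discouraged.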

.. GENERATED FROM PYTHON SOURCE LINES 1559-1564

An alternative to the pickle format is the skops format. It is built
with security in mind, so other people can open our skops file
without worries, even if they don’t trust us. More on the skops format
can be found
`here <https://skops.readthedocs.io/en/stable/persistence.html>`__.

.. GENERATED FROM PYTHON SOURCE LINES 1566-1570

For the purpose of this exercise, let’s use skops by calling
``skops.io.dump`` (``skops.io`` was imported as
``sio``) and store the model in a temporary directory, as shown
below:

.. GENERATED FROM PYTHON SOURCE LINES 1573-1576

.. code-block:: Python

    temp_dir = Path(mkdtemp())
    file_name = temp_dir / "model.skops"


.. GENERATED FROM PYTHON SOURCE LINES 1577-1579

.. code-block:: Python

    sio.dump(gb_final, file_name)


.. GENERATED FROM PYTHON SOURCE LINES 1580-1582

Creating a model card
~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 1584-1586

When we want to share the model with others, it’s good practice to add a
model card. That way, interested users can quickly learn what to expect.

.. GENERATED FROM PYTHON SOURCE LINES 1588-1595

As a first approximation, a model card is a text document, often written
in markdown, that contains sections talking about what kind of problem
we’re dealing with, what kind of model is used, what the intended
purpose is, how to contact the authors, etc. It may also contain some
metadata, which is targeted at machines and contains, say, tags that
indicate what type of model is used or which task it addresses. For now,
we start without metadata.

.. GENERATED FROM PYTHON SOURCE LINES 1597-1602

To help with getting started, we can use the ``skops.card.Card`` class.
It comes with a few default sections and provides some convenient
methods for adding figures etc. The
`documentation <https://skops.readthedocs.io/en/stable/model_card.html>`__
goes into more detail.

.. GENERATED FROM PYTHON SOURCE LINES 1604-1608

For now, let’s start by creating a new model card and adding a few bits
of information. We pass our final model as an argument to the
``Card`` class, which uses it to create a table of
hyperparameters and a diagram of the model.

.. GENERATED FROM PYTHON SOURCE LINES 1611-1614

.. code-block:: Python

    model_card = card.Card(model=gb_final)
    model_card


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Card(
      model=StackingRegressor(estimators=[('kn... random_state=0), passthrough=True),
      Model description/Training Procedure/Hyperparameters=TableSection(51x2),
      Model description/Training Procedure/Model Plot=<style>#sk-co...script></body>,
    )


.. GENERATED FROM PYTHON SOURCE LINES 1615-1622

Next, let’s add some prose to the model card. We add a short description
of the model, the intended use, the data, and the preprocessing steps.
Those are just plain strings, which we add to the card using the
``model_card.add`` method. That method takes ``**kwargs`` as
input, where the key corresponds to the name of the section and the
value corresponds to the content, i.e. the aforementioned strings. This
way, we can add multiple new sections with a single method call.

.. GENERATED FROM PYTHON SOURCE LINES 1624-1647

.. code-block:: Python

    description = """Gradient boosting regressor trained on California Housing dataset

    The model is a gradient boosting regressor from sklearn. On top of the standard
    features, it contains predictions from a KNN model. These predictions are calculated
    out of fold, then added on top of the existing features. These features are really
    helpful for decision tree-based models, since those cannot easily learn from geospatial
    data."""
    intended_uses = "This model is meant for demonstration purposes"
    dataset_description = data.DESCR.split("\n", 1)[1].strip()
    preproc_description = (
        "Rows where the target was clipped are excluded. Train/test split is random."
    )

    model_card.add(
        **{
            "Model description": description,
            "Model description/Dataset description": dataset_description,
            "Model description/Intended uses & limitations": intended_uses,
            "Model Card Authors": "Benjamin Bossan",
            "Model Card Contact": "benjamin@huggingface.co",
        }
    )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Card(
      model=StackingRegressor(estimators=[('kn... random_state=0), passthrough=True),
      Model description=Gradient boosting regressor ...y learn from geospatial data.,
      Model description/Intended uses & limitations=This model is ...ration purposes,
      Model description/Training Procedure/Hyperparameters=TableSection(51x2),
      Model description/Training Procedure/Model Plot=<style>#sk-co...script></body>,
      Model description/Dataset description=California Housing..., 33:291-297, 1997.,
      Model Card Authors=Benjamin Bossan,
      Model Card Contact=benjamin@huggingface.co,
    )


.. GENERATED FROM PYTHON SOURCE LINES 1648-1657

You might wonder why we call ``model_card.add(**{…})``
like this. Normally, Python
``**kwargs`` are passed like this: ``foo(key=val)``. But
we cannot use that syntax here, because the ``key`` would have to
be a valid Python identifier. That means it cannot contain any spaces, start
with a number, etc. But what if our section name contains spaces, like
``"Model description"``? We can still pass it as
``kwargs``, but we need to put it into a dict first and unpack that
dict with ``**``. This is why we use the notation shown above.
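
The following sketch shows the same mechanism in isolation. The ``add``
function here is a hypothetical stand-in that only collects its keyword
arguments; it is not the skops implementation.

```python
def add(**kwargs):
    # Collect section names and contents, similar in spirit to model_card.add.
    return dict(kwargs)


# Works with the usual syntax, because the key is a valid identifier.
sections = add(Description="some text")

# add(Model description="some text") would be a SyntaxError. Unpacking a
# dict sidesteps this, so keys may contain spaces or slashes.
sections = add(
    **{
        "Model description": "some text",
        "Model description/Intended uses": "demo",
    }
)
print(list(sections))
```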

.. GENERATED FROM PYTHON SOURCE LINES 1659-1662

By the way, if we wanted to change the content of a section, we could
just add the same section name again and the value would be overwritten
by the new content.

.. GENERATED FROM PYTHON SOURCE LINES 1664-1668

Another convenience method we should make use of is the
``model_card.add_metrics`` method. This will store the metrics
inside a table for better readability. Again, we pass multiple inputs
using ``**kwargs``, and the ``description`` is optional.

.. GENERATED FROM PYTHON SOURCE LINES 1670-1683

.. code-block:: Python

    model_card.add_metrics(
        description="Metrics are calculated on the test set",
        **{
            "Root mean squared error": -get_scorer("neg_root_mean_squared_error")(
                gb_final, df_test, y_test
            ),
            "Mean absolute error": -get_scorer("neg_mean_absolute_error")(
                gb_final, df_test, y_test
            ),
            "R²": get_scorer("r2")(gb_final, df_test, y_test),
        },
    )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Card(
      model=StackingRegressor(estimators=[('kn... random_state=0), passthrough=True),
      Model description=Gradient boosting regressor ...y learn from geospatial data.,
      Model description/Intended uses & limitations=This model is ...ration purposes,
      Model description/Training Procedure/Hyperparameters=TableSection(51x2),
      Model description/Training Procedure/Model Plot=<style>#sk-co...script></body>,
      Model description/Evaluation Results=TableSection(3x2),
      Model description/Dataset description=California Housing..., 33:291-297, 1997.,
      Model Card Authors=Benjamin Bossan,
      Model Card Contact=benjamin@huggingface.co,
    )
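
As a reminder of what the three metrics measure, here is a small sketch that
computes them by hand on made-up numbers (the values of ``y_true`` and
``y_pred`` are purely illustrative). It also hints at why the sign is flipped
above: scikit-learn scorers whose names start with ``neg_`` return the negated
error, so that a higher score is always better.

```python
import math

# Made-up targets and predictions, for illustration only.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

n = len(y_true)
errors = [t - p for t, p in zip(y_true, y_pred)]

# Root mean squared error: penalizes large errors more strongly.
rmse = math.sqrt(sum(e**2 for e in errors) / n)

# Mean absolute error: average size of the error, in the target's units.
mae = sum(abs(e) for e in errors) / n

# R²: share of the target's variance that the predictions explain.
mean_true = sum(y_true) / n
ss_res = sum(e**2 for e in errors)
ss_tot = sum((t - mean_true) ** 2 for t in y_true)
r2 = 1 - ss_res / ss_tot

print(rmse, mae, r2)
```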


.. GENERATED FROM PYTHON SOURCE LINES 1684-1690

How about we also add a plot to our model card? For this, let’s use the plot
that shows the target as a function of longitude and latitude that we created
above. We will just re-use the code from there to generate the plot. We will
store it for now inside the same temporary directory as the model, then call
the ``model_card.add_plot`` method. Since the plot is quite large, let’s
collapse it in the model card by passing ``folded=True``.

.. GENERATED FROM PYTHON SOURCE LINES 1692-1711

.. code-block:: Python

    fig, ax = plt.subplots(figsize=(10, 8))
    df.plot(
        kind="scatter",
        x="Longitude",
        y="Latitude",
        c=target_col,
        title="House value by location",
        cmap="coolwarm",
        s=1.5,
        ax=ax,
    )
    fig.savefig(temp_dir / "geographic.png")
    model_card.add_plot(
        folded=True,
        **{
            "Model description/Dataset description/Data distribution": "geographic.png",
        },
    )


.. image-sg:: /auto_examples/images/sphx_glr_plot_california_housing_012.png
   :alt: House value by location
   :srcset: /auto_examples/images/sphx_glr_plot_california_housing_012.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Card(
      model=StackingRegressor(estimators=[('kn... random_state=0), passthrough=True),
      Model description=Gradient boosting regressor ...y learn from geospatial data.,
      Model description/Intended uses & limitations=This model is ...ration purposes,
      Model description/Training Procedure/Hyperparameters=TableSection(51x2),
      Model description/Training Procedure/Model Plot=<style>#sk-co...script></body>,
      Model description/Evaluation Results=TableSection(3x2),
      Model description/Dataset description=California Housing..., 33:291-297, 1997.,
      Model description/Dataset description/Data distribution=PlotSecti...aphic.png),
      Model Card Authors=Benjamin Bossan,
      Model Card Contact=benjamin@huggingface.co,
    )


.. GENERATED FROM PYTHON SOURCE LINES 1712-1716

Similar to the getting started code, we make sure that the file name we
use for adding is just the plain ``"geographic.png"``,
excluding the temporary directory, or else the file cannot be found
later on.
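
The reasoning can be sketched with ``pathlib`` alone. The markdown image line
written below is an assumption about how such a reference might look; the
point is only that it is resolved relative to the location of the README.

```python
from pathlib import Path
from tempfile import mkdtemp

temp_dir = Path(mkdtemp())

# The figure and the README end up side by side in the same directory.
(temp_dir / "geographic.png").write_bytes(b"placeholder image bytes")
readme = temp_dir / "README.md"
readme.write_text("![Data distribution](geographic.png)\n")

# A viewer resolves the reference relative to the README, so the plain file
# name stays valid even after the whole directory is moved or uploaded.
resolved = readme.parent / "geographic.png"
print(resolved.exists())
```

Had we stored the full temporary path instead, the reference would point to a
directory that only exists on our machine.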

.. GENERATED FROM PYTHON SOURCE LINES 1718-1720

The model card class also provides a convenient method to add a plot
that visualizes permutation importances. Let’s use it:

.. GENERATED FROM PYTHON SOURCE LINES 1722-1729

.. code-block:: Python

    pi = permutation_importance(
        gb_final, df_test, y_test, scoring="neg_root_mean_squared_error", random_state=0
    )
    model_card.add_permutation_importances(
        pi, columns=df_test.columns, plot_file="permutation-importances.png", overwrite=True
    )


.. image-sg:: /auto_examples/images/sphx_glr_plot_california_housing_013.png
   :alt: Permutation Importances
   :srcset: /auto_examples/images/sphx_glr_plot_california_housing_013.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Card(
      model=StackingRegressor(estimators=[('kn... random_state=0), passthrough=True),
      Model description=Gradient boosting regressor ...y learn from geospatial data.,
      Model description/Intended uses & limitations=This model is ...ration purposes,
      Model description/Training Procedure/Hyperparameters=TableSection(51x2),
      Model description/Training Procedure/Model Plot=<style>#sk-co...script></body>,
      Model description/Evaluation Results=TableSection(3x2),
      Model description/Dataset description=California Housing..., 33:291-297, 1997.,
      Model description/Dataset description/Data distribution=PlotSecti...aphic.png),
      Model Card Authors=Benjamin Bossan,
      Model Card Contact=benjamin@huggingface.co,
      Permutation Importances=PlotSection(permutation-importances.png),
    )
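
To give an intuition for what permutation importance measures, here is a
simplified pure-Python sketch. It is not scikit-learn's implementation: the
``predict`` function is a hypothetical fixed model and the data are made up.
The idea is to shuffle one feature at a time and record how much the error
grows.

```python
import random


def rmse(y_true, y_pred):
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def predict(rows):
    # Hypothetical fixed "model": uses feature 0 only and ignores feature 1.
    return [2.0 * row[0] for row in rows]


def permutation_importance_sketch(rows, y, n_repeats=5, seed=0):
    rng = random.Random(seed)
    baseline = rmse(y, predict(rows))
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle a single column, breaking its link to the target.
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            permuted = [
                row[:col] + [value] + row[col + 1 :]
                for row, value in zip(rows, shuffled)
            ]
            increases.append(rmse(y, predict(permuted)) - baseline)
        # Average increase in RMSE over the repeats.
        importances.append(sum(increases) / n_repeats)
    return importances


rows = [[float(i), float(i % 3)] for i in range(20)]
y = [2.0 * row[0] for row in rows]
importances = permutation_importance_sketch(rows, y)
print(importances)
```

The feature that the model ignores gets an importance of zero, while
shuffling the feature it relies on makes the error grow.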


.. GENERATED FROM PYTHON SOURCE LINES 1730-1734

For this particular model card, the predefined section
``"Citation"`` is not required. Therefore, we delete it
using ``model_card.delete``. Be careful: If there were subsections
inside this section, they would be deleted too.

.. GENERATED FROM PYTHON SOURCE LINES 1737-1739

.. code-block:: Python

    model_card.delete("Citation")


.. GENERATED FROM PYTHON SOURCE LINES 1740-1742

Finally, we save the model card in the temporary directory as
``README.md``.

.. GENERATED FROM PYTHON SOURCE LINES 1744-1746

.. code-block:: Python

    model_card.save(temp_dir / "README.md")


.. GENERATED FROM PYTHON SOURCE LINES 1747-1750

Now the model card is saved as a markdown file in the temporary
directory, together with the gradient boosting model and the figures we
added earlier.

.. GENERATED FROM PYTHON SOURCE LINES 1752-1754

Conclusion
----------

.. GENERATED FROM PYTHON SOURCE LINES 1756-1761

Hopefully, this has been a useful exercise. We took a deep dive into the
task of working with the California Housing dataset, gained a good
understanding of the data, used some of the more advanced and
lesser-known features of scikit-learn, and trained a machine learning model
that performs well.

.. GENERATED FROM PYTHON SOURCE LINES 1763-1766

If you have any feedback or suggestions for improvement, feel free to
reach out to the skops team, e.g. by visiting our `Discord
channel <https://skops.readthedocs.io/en/stable/community.html#discord>`__.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (1 minutes 5.097 seconds)


.. _sphx_glr_download_auto_examples_plot_california_housing.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_california_housing.ipynb <plot_california_housing.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_california_housing.py <plot_california_housing.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_california_housing.zip <plot_california_housing.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_