Help with Sea Level Predictor

I don’t know what to do with this error.
This is my code:

import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import linregress

def draw_plot():
    # Read data from file
    df = pd.read_csv("epa-sea-level.csv")

    # Create scatter plot
    plt.scatter(df["Year"], df["CSIRO Adjusted Sea Level"])

    # Create first line of best fit
    slope, intercept, r_value, p_value, std_err = linregress(x=df["Year"], y=df["CSIRO Adjusted Sea Level"])
    year_extended = list(range(1880, 2050, 1))
    line = [intercept + slope * j for j in year_extended]
    plt.plot(year_extended, line, linewidth=2, color="r")

    # Create second line of best fit
    mod_df = df.loc[df["Year"] >= 2000]
    slope2, intercept2, r_value2, p_value2, std_err2 = linregress(x=mod_df["Year"], y=mod_df["CSIRO Adjusted Sea Level"])
    year2 = list(range(2000, 2050, 1))
    line2 = [intercept2 + slope2 * j for j in year2]
    plt.plot(year2, line2, linewidth=3, color="k")

    # Add labels and title
    plt.xlabel("Year")
    plt.ylabel("Sea Level (inches)")
    plt.title("Rise in Sea Level")
    
    # Save plot and return data for testing (DO NOT MODIFY)
    plt.savefig('sea_level_plot.png')
    return plt.gca()

This is the error:

 python main.py
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-suvt1xh9 because the default path (/config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
F.F.

FAIL: test_plot_data_points (test_module.LinePlotTestCase)

Traceback (most recent call last):
File "/home/runner/boilerplate-sea-level-predictor/test_module.py", line 30, in test_plot_data_points
self.assertEqual(actual, expected, "Expected different data points in scatter plot.")
AssertionError: Lists differ: [[188[26 chars]72441], [1882.0, -0.440944881], [1883.0, -0.23[2982 chars]951]] != [[188[26 chars]7244100000002], [1882.0, -0.440944881], [1883.[3226 chars]951]]

First differing element 1:
[1881.0, 0.220472441]
[1881.0, 0.22047244100000002]

Diff is 6114 characters long. Set self.maxDiff to None to see it. : Expected different data points in scatter plot.

======================================================================
FAIL: test_plot_lines (test_module.LinePlotTestCase)

Traceback (most recent call last):
File "/home/runner/boilerplate-sea-level-predictor/test_module.py", line 37, in test_plot_lines
self.assertEqual(actual, expected, "Expected different line for second line of best fit.")
AssertionError: Lists differ: [7.06[42 chars]04435186, 7.560361677767105, 7.726788951098968[873 chars]3011] != [7.06[42 chars]04435242, 7.560361677767105, 7.726788951098968[873 chars]3011]

First differing element 2:
7.393934404435186
7.393934404435242

Diff is 1253 characters long. Set self.maxDiff to None to see it. : Expected different line for second line of best fit.

This is caused by newer pandas versions changing the precision with which float numbers are parsed by read_csv. There are two ways to work around that in your own code. One is adding the float_precision='legacy' keyword argument to the pd.read_csv call. The other is pinning pandas to version 1.1.5 in the pyproject.toml file and updating the dependencies.
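A minimal sketch of the first option, assuming pandas 1.2.0 or later. The in-memory `sample_csv` is only a stand-in for epa-sea-level.csv (its two rows are copied from the test output above); in the project itself you would pass the file name instead.

```python
import io

import pandas as pd

# Stand-in for epa-sea-level.csv: two rows copied from the test output above.
sample_csv = io.StringIO(
    "Year,CSIRO Adjusted Sea Level\n"
    "1881,0.220472441\n"
    "1882,-0.440944881\n"
)

# float_precision="legacy" asks the C parser to use the float conversion
# that was the default before pandas 1.2.0.
df = pd.read_csv(sample_csv, float_precision="legacy")
print(df)
```

In draw_plot() this would correspond to changing only the read_csv line to pd.read_csv("epa-sea-level.csv", float_precision="legacy").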

@sanity Thank you, it worked. I used float_precision='legacy', although I still don’t get why it worked.

The expected results in the tests were written some time ago. At that time, pandas by default used a different (less precise) conversion of float numbers when reading with the read_csv method. pandas 1.2.0 changed that default, but the optional float_precision='legacy' argument lets you use the same conversion as in older versions.
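To see the difference yourself, here is a small sketch (again assuming pandas 1.2.0 or later) that parses one value from the failing test with each float_precision setting. The helper name `parse` and the column name `value` are made up for this illustration. The exact digits printed can depend on your pandas version and platform, so no expected output is shown.

```python
import io

import pandas as pd

# One value from the failing test, in a one-column CSV.
csv_text = "value\n0.220472441\n"


def parse(precision):
    """Parse the single value using the given float_precision setting."""
    df = pd.read_csv(io.StringIO(csv_text), float_precision=precision)
    return float(df["value"].iloc[0])


legacy = parse("legacy")          # default converter before pandas 1.2.0
high = parse("high")              # default converter since pandas 1.2.0
round_trip = parse("round_trip")  # uses Python's own string-to-float conversion

for name, value in [("legacy", legacy), ("high", high), ("round_trip", round_trip)]:
    print(f"{name:>10}: {value!r}")
```

Any difference shows up only in the last digits, which is enough to make assertEqual on lists of floats fail.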