Sea Level Predictor final check

@sanity @kitanikita Yes, I was able to plot Best Fit Line 1 following your guidance. I used the same logic for Best Fit Line 2 and plotted it from 2000 to 2050. The instructions say to use the data from 2000 onwards, so have I plotted “Best Fit Line 2” incorrectly? Please correct me if I’m wrong.
Also, when I run this program it never raises an error, but it never stops running either.

import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import linregress
# import numpy as np

def draw_plot():
    # Read data from file
    df = pd.read_csv('epa-sea-level.csv')

    # Create scatter plot
    # plt.figure(figsize=(12,8))
    # ax = plt.subplot(1,2,1)
    # ax.scatter(x='Year', y ='CSIRO Adjusted Sea Level')    
    x = df['Year']
    y = df['CSIRO Adjusted Sea Level']

    plt.xlabel('Year')
    plt.ylabel('CSIRO Adjusted Sea Level')
    plt.scatter(x,y)

    # Create first line of best fit

    x = df['Year']
    y = df['CSIRO Adjusted Sea Level']
    
    slope, intercept, r_value, p_value, std_err = linregress(x,y)

    # Dummy variable to calculate First best fit line
    x2 = list(range(1880, 2050))
    y2 = []

    for year in x2:
      y2.append(intercept + slope * year)
        
    plt.plot(x2, y2, label = 'Best Fit Line 1', color='green')
    
    # Create second line of best fit
    x3 = list(range(2000, 2020))
    y3 = []

    for year in x3:
      y3.append(intercept + slope * year)

    plt.plot(x3, y3, 'r', label = 'Best Fit Line 2')
    plt.legend()
    plt.show()

    # Add labels and title
    plt.xlabel('Year')
    plt.ylabel('Sea Level (inches)')
    plt.title('Rise in Sea Level')
    
    # Save plot and return data for testing (DO NOT MODIFY)
    plt.savefig('sea_level_plot.png')
    return plt.gca()

The data from the year 2000 onwards follows a different trend. So you need to create a separate fit for the data with year >= 2000 and plot it with the new slope and intercept values:

xx = df[ df['Year'] >= 2000 ]['Year']
yy = df[ df['Year'] >= 2000 ]['CSIRO Adjusted Sea Level']

fit2 = linregress(xx, yy)
new_slope = fit2.slope
new_intercept = fit2.intercept

Then you can create dummy lists xx2 and yy2, as you did before, and plot them.
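To make the steps above concrete, here is a minimal sketch of the second fit from start to finish. The DataFrame is a made-up stand-in for epa-sea-level.csv (the values are not real measurements), and the names recent, xx2 and yy2 are only illustrative. It also assumes the line should reach the year 2050 itself, which is why the range stops at 2051 (range excludes its stop value).

```python
import pandas as pd
from scipy.stats import linregress

# Hypothetical stand-in for epa-sea-level.csv: made-up values, not real data.
df = pd.DataFrame({
    "Year": list(range(1990, 2014)),
    "CSIRO Adjusted Sea Level": [5.0 + 0.1 * i for i in range(24)],
})

# Fit only the rows from 2000 onwards.
recent = df[df["Year"] >= 2000]
fit2 = linregress(recent["Year"], recent["CSIRO Adjusted Sea Level"])

# Dummy x values from 2000 through 2050 inclusive, and the fitted y values.
xx2 = list(range(2000, 2051))
yy2 = [fit2.intercept + fit2.slope * year for year in xx2]
```

The lists xx2 and yy2 can then be passed to plt.plot() in the same way as the first line.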


@kitanikita @SupremeSadat @sanity

In one of my earlier topics I raised this issue: I don’t get any error, but the program never stops running. So I rewrote the whole thing in a newly forked project/assignment, thinking this would go away. But after resolving the “dependencies”, “description is required”, etc. messages, the issue remains.
Also, I can’t see “Best Fit Line 2” plotted successfully. Please help!

import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import linregress

def draw_plot():
    # Read data from file
    df = pd.read_csv('epa-sea-level.csv')

    # Create Scatter plot
    # Previously I tried the following code:
    # plt.figure(figsize=(12,8))
    # ax = plt.scatter(x='Year', y='CSIRO Adjusted Sea Level')

    # Start off with X-axis and Y-axis as below
    x = df['Year']
    y = df['CSIRO Adjusted Sea Level']

    plt.xlabel('Year')
    plt.ylabel('CSIRO Adjusted Sea Level')
    plt.scatter(x,y)


    # Create FIRST line of best fit
    x = df['Year']
    y = df['CSIRO Adjusted Sea Level']

    # First grab the current Slope & Intercept values
    slope, intercept, r_value, p_value, std_err = linregress(x, y)

    # Using the above Slope/ Intercept values now plot Best Fit 1
    x1 = list(range(1880, 2050)) 
    y1 = []

    for year in x1:
      y1.append(intercept + slope * year)

    plt.plot(x1, y1, 'r', label = 'Best Line Fit 1')
    plt.legend()
    plt.show()

    
    # Create Second line of Best fit
    # SECOND best fit for year 2000 onwards so shall have new slope and intercept

    xfuture = df[['Year'] >= 2000] ['Year']
    yfuture = df[['Year'] >= 2000] ['CSIRO Adjusted Sea Level']

    newfit = linregress(xfuture, yfuture)
    newslope = newfit.slope
    newintercept = newfit.intercept

    for year in xfuture:
      yfuture.append(newintercept + newslope * year)
        
    plt.plot(xfuture, yfuture, label = 'Best Fit Line 2', color='green')
    plt.legend()
    plt.show()


    # Add labels and title
    plt.xlabel('Year')
    plt.ylabel('Sea Level (inches)')
    plt.title('Rise in Sea Level')

    
    # Save plot and return data for testing (DO NOT MODIFY)
    plt.savefig('sea_level_plot.png')
    return plt.gca()

Remove the first plt.show(): it splits your plot into two separate figures.

When creating xfuture and yfuture:

  • Your conditional statement is not ['Year'] >=2000 (because it is not a variable) but df['Year'] >=2000.
  • So you need to call df[conditional_statement] to restrict df and then call the 'Year' column.
  • All together this will be:
xfuture = df[df['Year'] >= 2000]['Year']
yfuture = df[df['Year'] >= 2000]['CSIRO Adjusted Sea Level']
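As a small illustration of the points above, the sketch below uses a made-up four-row table (not the real data) to show why ['Year'] >= 2000 fails and what df['Year'] >= 2000 produces instead. The names mask and list_comparison_failed are only illustrative.

```python
import pandas as pd

# Hypothetical mini-table (made-up values) to show the filtering step.
df = pd.DataFrame({
    "Year": [1998, 1999, 2000, 2001],
    "CSIRO Adjusted Sea Level": [6.0, 6.1, 6.2, 6.3],
})

# ['Year'] is a plain Python list holding one string, so comparing it
# with an integer raises TypeError before pandas is ever involved.
try:
    ["Year"] >= 2000
    list_comparison_failed = False
except TypeError:
    list_comparison_failed = True

# df['Year'] >= 2000 is a boolean Series, one True/False per row.
mask = df["Year"] >= 2000

# Indexing df with the mask keeps only the rows where it is True.
xfuture = df[mask]["Year"]
yfuture = df[mask]["CSIRO Adjusted Sea Level"]
```

Storing the mask in a variable is optional; it just avoids writing the condition twice.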

After you create newfit, you still need to create a new set of dummy variables to plot its function. So something like

xfuture2 = list(range(2000, 2050))

for the x coordinate, and a loop similar to the one you wrote before to fill a new empty list, yfuture2 (the dummy y variable).

@kitanikita @SupremeSadat @sanity Thank you very much for your assistance; it looks like I’m almost there. Once I get the two lines, Best Fit Line 1 and Best Fit Line 2, is there anything else I need to do?

# Create FIRST line of best fit
    x = df['Year']
    y = df['CSIRO Adjusted Sea Level']

    # First grab the current Slope & Intercept values
    slope, intercept, r_value, p_value, std_err = linregress(x, y)

    # Using the above Slope/ Intercept values now plot Best Fit 1
    x1 = list(range(1880, 2050)) 
    y1 = []

    for year in x1:
      y1.append(intercept + slope * year)

    plt.plot(x1, y1, 'r', label = 'Best Fit Line 1')
    plt.legend()
    # plt.show()
    
    # Create Second line of Best fit
    # SECOND best fit for year 2000 onwards so shall have new slope and intercept

    xfuture = df[df['Year'] >= 2000] ['Year']
    yfuture = df[df['Year'] >= 2000] ['CSIRO Adjusted Sea Level']

    newfit = linregress(xfuture, yfuture)
    newslope = newfit.slope
    newintercept = newfit.intercept

    xfuture2 = list(range(2000, 2050))
    yfuture2 = []

    # Use a separate loop variable so the xfuture Series is not overwritten
    for year in xfuture2:
      yfuture2.append(newintercept + newslope * year)
        
    plt.plot(xfuture2, yfuture2, label = 'Best Fit Line 2', color='green')
    plt.legend()
    plt.show()

    # Add labels and title
    plt.xlabel('Year')
    plt.ylabel('Sea Level (inches)')
    plt.title('Rise in Sea Level')

What I’m worried about is that my program never stops running. Is it because I’m unable to pull the list of sea level values?

I think plt.show() pauses the script until the figure window is closed. So removing plt.show() from your script should fix it.
Closing Figure 1 should also resume the script. That works fine while you are testing a specific plot, but I would remove the call once you’re done with that section of the code.
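Here is a minimal sketch of a plotting function that saves and returns without ever calling plt.show(). The points are made up, and draw_demo and out_path are hypothetical names. Selecting the non-interactive "Agg" backend is an extra assumption for running without a display; it is not something the assignment asks for.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no window is opened
import matplotlib.pyplot as plt


def draw_demo(path):
    # Made-up points, only to have something to draw.
    fig, ax = plt.subplots()
    ax.scatter([2000, 2001, 2002], [6.0, 6.1, 6.2])
    ax.plot([2000, 2050], [6.0, 11.0], "r", label="Best Fit Line")
    ax.legend()

    # Labels and title are set before saving so they appear in the file.
    ax.set_xlabel("Year")
    ax.set_ylabel("Sea Level (inches)")
    ax.set_title("Rise in Sea Level")

    # No plt.show(): the figure is written to disk and the function returns.
    fig.savefig(path)
    return ax


out_path = os.path.join(tempfile.mkdtemp(), "demo_plot.png")
axes = draw_demo(out_path)
```

Setting the labels and title before saving also matters for the posted code, where they come after plt.show() and so are applied only once the window has been closed.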