@sanity @kitanikita Yes, I was able to plot Best Fit Line 1 following your guidance. I used the same logic for Best Fit Line 2 and plotted it from 2000 to 2050. Please correct me if I'm wrong: the instructions say to use the data from 2000 onward. Have I plotted "Best Fit Line 2" incorrectly?
Also, when I run this program it never errors out, but it never stops running either.
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import linregress
# import numpy as np

def draw_plot():
    # Read data from file
    df = pd.read_csv('epa-sea-level.csv')

    # Create scatter plot
    # plt.figure(figsize=(12,8))
    # ax = plt.subplot(1,2,1)
    # ax.scatter(x='Year', y ='CSIRO Adjusted Sea Level')
    x = df['Year']
    y = df['CSIRO Adjusted Sea Level']
    plt.xlabel('Year')
    plt.ylabel('CSIRO Adjusted Sea Level')
    plt.scatter(x,y)

    # Create first line of best fit
    x = df['Year']
    y = df['CSIRO Adjusted Sea Level']
    slope, intercept, r_value, p_value, std_err = linregress(x,y)
    # Dummy variable to calculate First best fit line
    x2 = list(range(1880, 2050))
    y2 = []
    for year in x2:
        y2.append(intercept + slope * year)
    plt.plot(x2, y2, 'r', label = 'Best Fit Line 1', color='green')

    # Create second line of best fit
    x3 = list(range(2000, 2020))
    y3 = []
    for year in x3:
        y3.append(intercept + slope * year)
    plt.plot(x3, y3, 'r', label = 'Best Fit Line 2')
    plt.legend()
    plt.show()

    # Add labels and title
    plt.xlabel('Year')
    plt.ylabel('Sea Level (inches)')
    plt.title('Rise in Sea Level')

    # Save plot and return data for testing (DO NOT MODIFY)
    plt.savefig('sea_level_plot.png')
    return plt.gca()
The data from 2000 onward follows a different trend, so you need to create a separate fit for the rows with year >= 2000 and plot it with the new slope and intercept values:
xx = df[ df['Year'] >= 2000 ]['Year']
yy = df[ df['Year'] >= 2000 ]['CSIRO Adjusted Sea Level']
fit2 = linregress(xx, yy)
new_slope = fit2.slope
new_intercept = fit2.intercept
Then you can create dummy lists xx2 and yy2, as you did before, and plot them.
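To make the steps above concrete, here is a minimal sketch of the second fit from start to finish. It assumes a small made-up DataFrame in place of epa-sea-level.csv (the numbers are invented so the example runs on its own), and the name recent is only for illustration. It stops before plotting; plt.plot(xx2, yy2) would draw the line.

```python
import pandas as pd
from scipy.stats import linregress

# Hypothetical stand-in for epa-sea-level.csv: a slow trend before 2000
# and a steeper one from 2000 onward (made-up numbers, not real data).
years = list(range(1990, 2014))
levels = [0.1 * (yr - 1990) if yr < 2000 else 1.0 + 0.5 * (yr - 2000)
          for yr in years]
df = pd.DataFrame({'Year': years, 'CSIRO Adjusted Sea Level': levels})

# Fit only the rows from 2000 onward.
recent = df[df['Year'] >= 2000]
fit2 = linregress(recent['Year'], recent['CSIRO Adjusted Sea Level'])

# Dummy x values for the line, and the matching y values from the new fit.
xx2 = list(range(2000, 2050))
yy2 = [fit2.intercept + fit2.slope * year for year in xx2]

print(len(xx2), len(yy2))
```

Note that range(2000, 2050) stops at 2049, because the end value of range is exclusive. If the line has to reach 2050 itself, range(2000, 2051) would be needed.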
@kitanikita @SupremeSadat @sanity
In one of my earlier topics I raised this problem: I don't get any error, but the program never stops running. So I rewrote the whole thing in a newly forked project/assignment, thinking the problem would go away. But after resolving the "dependencies", "description is required", etc. messages, the issue is still there.
Also, I can't see "Best Fit Line 2" plotted at all. Please help!
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import linregress

def draw_plot():
    # Read data from file
    df = pd.read_csv('epa-sea-level.csv')

    # Create Scatter plot
    # Previously I tried the following code:
    # plt.figure(figsize=(12,8))
    # ax = plt.scatter(x='Year', y='CSIRO Adjusted Sea Level')
    # Start off with X-axis and Y-axis as below
    x = df['Year']
    y = df['CSIRO Adjusted Sea Level']
    plt.xlabel('Year')
    plt.ylabel('CSIRO Adjusted Sea Level')
    plt.scatter(x,y)

    # Create FIRST line of best fit
    x = df['Year']
    y = df['CSIRO Adjusted Sea Level']
    # First grab the current Slope & Intercept values
    slope, intercept, r_value, p_value, std_err = linregress(x, y)
    # Using the above Slope/ Intercept values now plot Best Fit 1
    x1 = list(range(1880, 2050))
    y1 = []
    for year in x1:
        y1.append(intercept + slope * year)
    plt.plot(x1, y1, 'r', label = 'Best Line Fit 1')
    plt.legend()
    plt.show()

    # Create Second line of Best fit
    # SECOND best fit for year 2000 onwards so shall have new slope and intercept
    xfuture = df[['Year'] >= 2000] ['Year']
    yfuture = df[['Year'] >= 2000] ['CSIRO Adjusted Sea Level']
    newfit = linregress(xfuture, yfuture)
    newslope = newfit.slope
    newintercept = newfit.intercept
    for year in xfuture:
        yfuture.append(newintercept + newslope * year)
    plt.plot(xfuture, yfuture, 'r', label = 'Best Fit Line 2', color='green')
    plt.legend()
    plt.show()

    # Add labels and title
    plt.xlabel('Year')
    plt.ylabel('Sea Level (inches)')
    plt.title('Rise in Sea Level')

    # Save plot and return data for testing (DO NOT MODIFY)
    plt.savefig('sea_level_plot.png')
    return plt.gca()
Remove the first plt.show(); it splits your plot into two sections.
When creating xfuture and yfuture:
- Your conditional statement is not ['Year'] >= 2000 (because ['Year'] on its own is not a variable holding your data) but df['Year'] >= 2000.
- So you need to call df[conditional_statement] to restrict df, and then select the 'Year' column.
- All together this will be:
xfuture = df[df['Year'] >= 2000]['Year']
yfuture = df[df['Year'] >= 2000]['CSIRO Adjusted Sea Level']
After you create newfit, you still need to create a new set of dummy variables to plot its function. So something like
xfuture2 = list(range(2000, 2050))
for the x coordinate, and a loop similar to what you did before to fill a new empty list yfuture2 (the dummy y variable).
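As a small illustration of the filtering step described above, the sketch below shows the difference between ['Year'] >= 2000 and df['Year'] >= 2000. The four-row table is made up for the example and only stands in for epa-sea-level.csv; the names mask and list_comparison_failed are illustrative.

```python
import pandas as pd

# Hypothetical mini table standing in for epa-sea-level.csv (made-up values).
df = pd.DataFrame({
    'Year': [1998, 1999, 2000, 2001],
    'CSIRO Adjusted Sea Level': [6.9, 7.0, 7.1, 7.2],
})

# ['Year'] on its own is just a Python list, so comparing it to a number
# raises a TypeError instead of looking at the data.
try:
    ['Year'] >= 2000
    list_comparison_failed = False
except TypeError:
    list_comparison_failed = True

# df['Year'] >= 2000 compares every row and yields a boolean Series (a mask).
mask = df['Year'] >= 2000
print(mask.tolist())      # [False, False, True, True]

# df[mask] keeps only the rows where the mask is True; then pick a column.
xfuture = df[mask]['Year']
yfuture = df[mask]['CSIRO Adjusted Sea Level']
print(xfuture.tolist())   # [2000, 2001]
```

Storing the mask in a variable is only a readability choice here; writing df[df['Year'] >= 2000] inline, as in the post above, does the same thing.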
@kitanikita @SupremeSadat @sanity Thank you very much for your assistance; it looks like I'm almost there. Can you please confirm: now that I get the two lines, Best Fit Line 1 and Best Fit Line 2, is there anything else I need to do?
# Create FIRST line of best fit
x = df['Year']
y = df['CSIRO Adjusted Sea Level']
# First grab the current Slope & Intercept values
slope, intercept, r_value, p_value, std_err = linregress(x, y)
# Using the above Slope/ Intercept values now plot Best Fit 1
x1 = list(range(1880, 2050))
y1 = []
for year in x1:
    y1.append(intercept + slope * year)
plt.plot(x1, y1, 'r', label = 'Best Fit Line 1')
plt.legend()
# plt.show()

# Create Second line of Best fit
# SECOND best fit for year 2000 onwards so shall have new slope and intercept
xfuture = df[df['Year'] >= 2000] ['Year']
yfuture = df[df['Year'] >= 2000] ['CSIRO Adjusted Sea Level']
newfit = linregress(xfuture, yfuture)
newslope = newfit.slope
newintercept = newfit.intercept
xfuture2 = list(range(2000, 2050))
yfuture2 = []
for xfuture in xfuture2:
    yfuture2.append(newintercept + newslope * xfuture)
plt.plot(xfuture2, yfuture2, 'r', label = 'Best Fit Line 2', color='green')
plt.legend()
plt.show()

# Add labels and title
plt.xlabel('Year')
plt.ylabel('Sea Level (inches)')
plt.title('Rise in Sea Level')
What I'm worried about is that my program never stops running. Is it because I'm unable to pull the list of sea level values?
I think plt.show() pauses the script in general. So if you remove plt.show() from your script, that should fix it.
If you close Figure 1, that should also resume the script. This works fine when you are testing the specific plot, but I would remove it once you're done with that section of the code.
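Here is a minimal sketch of a plotting function that finishes on its own because it never calls plt.show(). The function name draw_demo_plot, the four data points, and the output path are all made up for the example, and selecting the non-interactive 'Agg' backend is an assumption so that no window can open; the real script reads epa-sea-level.csv and saves sea_level_plot.png.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, no window opens
import matplotlib.pyplot as plt


def draw_demo_plot(path):
    # Hypothetical stand-in data; the real script reads epa-sea-level.csv.
    years = [2000, 2001, 2002, 2003]
    levels = [7.0, 7.1, 7.3, 7.4]

    fig, ax = plt.subplots()
    ax.scatter(years, levels)
    ax.plot(years, levels, color='green', label='Demo line')

    # Labels and title go on the same axes before saving.
    ax.set_xlabel('Year')
    ax.set_ylabel('Sea Level (inches)')
    ax.set_title('Rise in Sea Level')
    ax.legend()

    # No plt.show() here: savefig writes the file and the function returns.
    fig.savefig(path)
    return ax


out_path = os.path.join(tempfile.gettempdir(), 'demo_plot.png')
ax = draw_demo_plot(out_path)
print(ax.get_title())
```

While developing, a single plt.show() at the very end, after the labels and title are set, is still a reasonable way to preview the figure; the script resumes once the window is closed.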