Page View Time Series: x-axis in first plot

Hi there,

I’m working on this cool exercise. My problem is the x-axis of the first plot. Can someone give me a hint on how to make it display the dates as in the example?

Also, I don’t understand how I should ‘save’ the plot as a figure. Anybody?

Thanks in advance!

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

# Import data (Make sure to parse dates. Consider setting index column to 'date'.)
df = pd.read_csv("/Users/jimsmithuis/Desktop/fcc-forum-pageviews.csv")
df.index = df['date']
df.drop(['date'], axis=1, inplace=True)

# print(f"The mean value is: {df['value'].mean().round()}")
# print(f"The lower quantile is: {df['value'].quantile(0.025)}")
# print(f"The upper quantile is: {df['value'].quantile(0.975).round()}")

# Clean data
mask = (df['value'] > df['value'].quantile(0.025)) & (df['value'] < df['value'].quantile(0.975))
df = df[mask]

def draw_line_plot():
    # Draw line plot
    plt.figure(figsize=(12, 6))
    plt.plot(df, color='red')
    plt.title('Daily freeCodeCamp Forum Page Views 5/2016-12/2019')
    plt.xlabel('Date')
    plt.ylabel('Page Views')
    plt.show()

    # Save image and return fig (don't change this part)
    fig.savefig('line_plot.png')
    return fig

draw_line_plot()

First, try parsing the dates as you read the data with pd.read_csv(). Right now your code treats the dates as strings because nothing tells it to do otherwise, and it looks like most of those strings end up as tick labels. I believe the example uses matplotlib's default date tick format, so parsing the date strings should be enough to solve the problem.
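To make the date-parsing advice concrete, here is a minimal sketch. The CSV text is made-up sample data standing in for the real file (only the column names 'date' and 'value' come from the thread), and using index_col is one possible way to follow the exercise's hint about the index.

```python
import io

import pandas as pd

# Made-up rows standing in for fcc-forum-pageviews.csv (illustrative values only).
csv_text = """date,value
2016-05-09,100
2016-05-10,200
2016-05-11,300
"""

# parse_dates turns the 'date' strings into datetimes;
# index_col makes that column the index in the same step.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], index_col="date")

# The index is now a DatetimeIndex instead of plain strings.
print(type(df.index).__name__)
```

With a datetime index, matplotlib can pick date-aware ticks itself instead of drawing one label per string.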

Second, you don’t have a fig because you never created one: you are using the implicit plt.* interface to matplotlib rather than the explicit one, which typically uses fig and ax as variables. Consult the matplotlib quick start guide for details.
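A rough sketch of the explicit interface, assuming a small made-up DataFrame in place of the real page-view data. The Agg backend line is only there so the sketch runs without a display and is not part of the exercise.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data with a datetime index (illustrative values only).
df = pd.DataFrame(
    {"value": [100, 200, 150]},
    index=pd.to_datetime(["2016-05-09", "2016-05-10", "2016-05-11"]),
)

def draw_line_plot():
    # plt.subplots() returns the Figure and Axes objects explicitly.
    fig, ax = plt.subplots(figsize=(12, 6))
    ax.plot(df.index, df["value"], color="red")
    ax.set_title("Daily freeCodeCamp Forum Page Views 5/2016-12/2019")
    ax.set_xlabel("Date")
    ax.set_ylabel("Page Views")

    # fig now exists, so it can be saved and returned.
    fig.savefig("line_plot.png")
    return fig

fig = draw_line_plot()
```

Because the function holds on to fig, the "save image and return fig" lines from the exercise template work unchanged.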

Thank you for your help Jeremy.
Now I understand regarding parsing the dates and using OOP for making the plots.

However, the problem with the correct order of the months has not yet been solved. This is my code now:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

# Import data (Make sure to parse dates. Consider setting index column to 'date'.)
df = pd.read_csv("/Users/jimsmithuis/Desktop/fcc-forum-pageviews.csv", parse_dates=['date'])
df.index = df['date']
df.drop(['date'], axis=1, inplace=True)

# print(f"The mean value is: {df['value'].mean().round()}")
# print(f"The lower quantile is: {df['value'].quantile(0.025)}")
# print(f"The upper quantile is: {df['value'].quantile(0.975).round()}")

# Clean data
mask = (df['value'] > df['value'].quantile(0.025)) & (df['value'] < df['value'].quantile(0.975))
df = df[mask]
df.reset_index(inplace=True)
print(df.head())

# make additional year column
df['year'] = df['date'].dt.year

# make additional month column
df['month'] = df['date'].dt.month_name()

# groupby
df_grouped = df.groupby(['year','month'], as_index=False, sort=False).mean()
print(df_grouped)

fig, ax = plt.subplots(figsize=(12,7))
sns.barplot(df_grouped, x='year', y='value', hue='month')
ax.set_xlabel('Years')
ax.set_ylabel('Average Page Views')
ax.legend(title='Months')
plt.show()

Here is an example of the plot.


As you can see, for the years 2017-2019 the months are not plotted starting from January. Do you have any idea how I can approach this?

This is what df_grouped looks like:

Thanks in advance.

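By default, seaborn plots the data in the order in which it appears. You can either order the data yourself (for example, by making the month column an ordered categorical) or pass sns.barplot() an explicit order through its order or hue_order parameters.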
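One possible way to order the data yourself is sketched below with pandas only. The grouped values are made up, and MONTHS is a helper name introduced for illustration.

```python
import pandas as pd

# Calendar order of the month names, written out explicitly (hypothetical helper).
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Made-up grouped data; the first year starts in May, as in the thread.
df_grouped = pd.DataFrame({
    "year": [2016, 2016, 2017, 2017],
    "month": ["June", "May", "February", "January"],
    "value": [20.0, 10.0, 40.0, 30.0],
})

# An ordered categorical sorts by calendar position instead of alphabetically
# or by order of first appearance.
df_grouped["month"] = pd.Categorical(
    df_grouped["month"], categories=MONTHS, ordered=True
)
df_sorted = df_grouped.sort_values(["year", "month"])
print(df_sorted)
```

The other route is to leave the data alone and pass the same list to the plot, for example hue_order=MONTHS in the sns.barplot() call, since the months are the hue variable here.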

Hi @jimisin,
The plotting part of the code you posted looks correct. You are using the plot method from matplotlib.pyplot to plot the data in the df dataframe, with the x-axis as the index (dates) and the y-axis as the “value” column. You are also setting the plot title, x-label, and y-label.

After the plot is created, you call savefig to save it as an image, in this case a PNG image named “line_plot.png”.

Note, however, that in the line fig.savefig('line_plot.png') the variable fig is never defined, so this line will raise a NameError. You can either define fig by assigning the result of the existing plt.figure(figsize=(12, 6)) call to it, or drop the fig variable and simply use plt.savefig('line_plot.png'). If you use plt.savefig, call it before plt.show(), otherwise the saved image may be blank.

Thank you for your answer, Jeremy. In the end, I added the months ‘January’ through ‘April’ for 2016 without any page views. Not the nicest programming solution, but it worked.

Thank you @Oluyemi, thanks to your insights I understand saving a figure and assigning it much better!
