Page time visualiser

I’m having trouble getting the box plot to work. I’ve finished every other project apart from this one, and the box plot is giving me a headache. Thanks in advance.

This is my code for the box plot.

def draw_box_plot():
    # Prepare data for box plots (this part is done!)
    df_box = df.copy()
    df_box.reset_index(inplace=True)
    df_box['year'] = [d.year for d in df_box.date]
    df_box['month'] = [d.strftime('%b') for d in df_box.date]
    # Draw box plots (using seaborn)
    month_order = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
    df_box['month'] = pd.Categorical(df_box['month'], categories=month_order, ordered=True)
    fig, axes = plt.subplots(1, 2, figsize=(12, 6))
    # First box plot with custom title and axis labels
    sns.boxplot(x=df_box['year'], y=df_box['value'], orient='v', ax=axes[0])
    axes[0].set_title("Year-wise Box Plot (Trend)")
    axes[0].set_xlabel("Year")
    axes[0].set_ylabel("Page Views")

    # Second box plot with custom title and axis labels
    sns.boxplot(x=df_box['month'], y=df_box['value'], orient='v', ax=axes[1])
    axes[1].set_title("Month-wise Box Plot (Seasonality)")
    axes[1].set_xlabel("Month")
    axes[1].set_ylabel("Page Views")

    # Save image and return fig (don't change this part)
    fig.savefig('box_plot.png')
    return fig

Do you get an error message? What happens?

Hi,
Sorry, I didn’t think someone would reply. Thank you so much.
The error says: “ERROR: test_box_plot_titles (main.BoxPlotTestCase)”

Traceback (most recent call last):
  File "/workspace/boilerplate-page-view-time-series-visualizer/test_module.py", line 68, in setUp
    self.fig = time_series_visualizer.draw_box_plot()
  File "/workspace/boilerplate-page-view-time-series-visualizer/time_series_visualizer.py", line 76, in draw_box_plot
    ax2 = sns.boxplot(x='Years', y='value', data=df_box, ax=axes[0])
NameError: name 'df_box' is not defined

This is the error in more detail.

I’ve edited your code for readability. When you enter a code block into a forum post, please precede it with a separate line of three backticks and follow it with a separate line of three backticks to make it easier to read.

You can also use the “preformatted text” tool in the editor (</>) to add backticks around text.

See this post to find the backtick on your keyboard.
Note: Backticks (`) are not single quotes (').

Is this your only error message?

This function passes the test for me.

Can you please share your full code for this project?

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from pandas.plotting import register_matplotlib_converters
from calendar import month_name
register_matplotlib_converters()

# Import data (Make sure to parse dates. Consider setting index column to 'date'.)
df = pd.read_csv('fcc-forum-pageviews.csv', index_col='date')

# Clean data
df = df[(df['value'] >= df['value'].quantile(0.025)) & (df['value'] <= df['value'].quantile(0.975))]
df.index = pd.to_datetime(df.index)


def draw_line_plot():
    # Draw line plot
    value = df['value']
    date = value.index
    fig, ax = plt.subplots()
    ax.plot(date, value)
    ax.set_xlabel('Date')
    ax.set_ylabel('Page Views')
    ax.set_title('Daily freeCodeCamp Forum Page Views 5/2016-12/2019')
    # Save image and return fig (don't change this part)
    fig.savefig('line_plot.png')
    return fig


def draw_bar_plot():
    # Copy and modify data for monthly bar plot
    df_bar = df.copy()
    df_bar['year'] = df_bar.index.year
    df_bar['month'] = df_bar.index.month
    
    df_bar = df_bar.groupby(['year','month'])['value'].mean()
    
    df_bar = df_bar.unstack()
    df_bar.columns = ['January','February','March','April','May','June','July','August','September','October','November','December']
    
    # Draw bar plot
    fig = df_bar.plot(kind = 'bar', figsize = (15,10)).figure
  
    plt.xlabel('Years', fontsize = 15)
    plt.ylabel('Average Page Views', fontsize = 15)
    plt.legend(loc = 'upper left', title = 'Months', fontsize = 13)


    # Save image and return fig (don't change this part)
    fig.savefig('bar_plot.png')
    return fig


def draw_box_plot():
    # Import the data
    df = pd.read_csv('fcc-forum-pageviews.csv', parse_dates=['date'], index_col='date')

    # Clean the data
    df_clean = df[
        (df['value'] >= df['value'].quantile(0.025)) &
        (df['value'] <= df['value'].quantile(0.975))
    ]

    # Create a new DataFrame for grouping by year and month
    df_box_year = df_clean.copy()
    df_box_year['year'] = df_box_year.index.year

    # Create a new DataFrame for grouping by month
    df_box_month = df_clean.copy()
    df_box_month['month'] = df_box_month.index.strftime('%b')

    # Plot the box plots
    f, axes = plt.subplots(figsize=(12, 7), ncols=2, sharex=False)
    sns.despine(left=True)

    ax2 = sns.boxplot(x='Years', y='value', data=df_box, ax=axes[0])
    ax2.set_xlabel('Year')
    ax2.set_ylabel('Page Views')
    ax2.set_title('Year-wise Box Plot(Trend)')

    ax2 = sns.boxplot(x='Months', y='value', data=df_box, ax=axes[1])
    ax2.set_xlabel('Month')
    ax2.set_ylabel('Page Views')
    ax2.set_title('Month-wise Box Plot(Seasonality)')


    plt.show()
    # Save image and return fig (don't change this part)
    fig.savefig('box_plot.png')
    return fig

or you can check it here.
time_series_visualizer.py - boilerplate-page-view-time-series-visualizer - Gitpod Code

So far, yes. I’m not sure if I’m going to pass this one.

Please try to format your code in the future.

This code is totally different. It’s clear now why you are getting this error:

NameError: name 'df_box' is not defined

It’s because you do not have a df_box anymore.

The code you posted earlier passed the tests.
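
To spell out the mismatch: the second version builds `df_box_year` and `df_box_month`, but then passes `data=df_box` with `x='Years'` and `x='Months'`, and none of those names exist. It also calls `fig.savefig` although the figure was assigned to `f`. Below is a minimal sketch of the single-DataFrame preparation from the first post, run on made-up data. `prepare_box_data`, `MONTH_ORDER`, and the sample values are my own placeholders, not part of the project.

```python
import pandas as pd

# Abbreviated month names in calendar order (assumes an English locale for '%b').
MONTH_ORDER = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']


def prepare_box_data(df: pd.DataFrame) -> pd.DataFrame:
    """Return ONE DataFrame carrying both 'year' and 'month' columns."""
    df_box = df.copy()
    df_box.reset_index(inplace=True)  # the 'date' index becomes a regular column
    df_box['year'] = df_box['date'].dt.year
    # An ordered categorical keeps the months in calendar order on the x axis.
    df_box['month'] = pd.Categorical(
        df_box['date'].dt.strftime('%b'), categories=MONTH_ORDER, ordered=True
    )
    return df_box


# Hypothetical stand-in for the cleaned page-view data.
dates = pd.date_range('2016-05-01', periods=400, freq='D', name='date')
df = pd.DataFrame({'value': range(400)}, index=dates)

df_box = prepare_box_data(df)
print(df_box.columns.tolist())
```

With a frame like this, both plots can read from the same object, for example `sns.boxplot(x='year', y='value', data=df_box, ax=axes[0])` and `sns.boxplot(x='month', y='value', data=df_box, ax=axes[1])`. The column names passed to `x` have to match the ones created here, and the variable returned from `plt.subplots` has to be the one used in `savefig` and `return`.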

I changed it to see if I get a different result. I changed it back to the original.

Can you please post a link to the project so we can see the instructions?

It doesn’t make sense, since the first code you posted creates df_box and passes the tests as well.

I’m trying to format the code now.

  File "/workspace/.pyenv_mirror/user/current/lib/python3.8/site-packages/seaborn/categorical.py", line 203, in establish_variables
    group_names = categorical_order(groups, order)
  File "/workspace/.pyenv_mirror/user/current/lib/python3.8/site-packages/seaborn/utils.py", line 533, in categorical_order
    np.asarray(values).astype(np.float)
  File "/workspace/.pyenv_mirror/user/current/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(former_attrs[attr])
AttributeError: module 'numpy' has no attribute 'float'.
np.float was a deprecated alias for the builtin float. To avoid this error in existing code, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
NumPy 1.20.0 Release Notes — NumPy v2.2.dev0 Manual


Ran 11 tests in 1.618s

FAILED (errors=4)

time_series_visualizer.py - boilerplate-page-view-time-series-visualizer - Gitpod Code

Sorry, this has me stressed out. I’m moving too fast; I must have changed the code god knows how many times.

This error appears when I add the df_box code.
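
A note on reading that traceback: the failing line, `np.asarray(values).astype(np.float)`, is inside seaborn's own `utils.py`, not in `draw_box_plot`. The `np.float` alias was deprecated in NumPy 1.20 (as the message says) and removed in NumPy 1.24, so the installed seaborn appears to predate that removal. A small sketch for checking which versions the workspace has; `numpy_lacks_float_alias` and `report_versions` are hypothetical helper names of mine.

```python
from importlib.metadata import PackageNotFoundError, version


def numpy_lacks_float_alias(numpy_version: str) -> bool:
    """True if this NumPy version no longer provides the np.float alias (1.24+)."""
    major, minor = (int(part) for part in numpy_version.split('.')[:2])
    return (major, minor) >= (1, 24)


def report_versions() -> None:
    """Print the installed versions of the two packages named in the traceback."""
    for package in ('numpy', 'seaborn'):
        try:
            print(package, version(package))
        except PackageNotFoundError:
            print(package, 'is not installed')


report_versions()
```

If the reported NumPy is 1.24 or newer, the usual ways out are to upgrade seaborn (`pip install --upgrade seaborn`) or to pin NumPy below 1.24. Which of the two fits this boilerplate is an assumption to verify against the project's own requirements.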