Data Analysis with Python Projects - Page View Time Series Visualizer

Tell us what’s happening:
My question is about the unit test test_bar_plot_number_of_bars in this project.
The test expects 49 bars, but my plot has 45. From my plot and the example plot, I expected 44 (3 * 12 months + 8 months).

df_bar=df.groupby([df.date.dt.year,df.date.dt.month])['value'].sum().rename_axis(index=['year', 'month']).reset_index()
fig, ax = plt.subplots(figsize=(8,6))
w1=0.6/12
for month1 in range(1,13):
    dfl=df_bar[df_bar.month==month1]
    ax.bar(dfl.year+w1*(month1-6), dfl.value, w1, )
ax.set_xticks(ticks=df_bar.year.unique().tolist())
ax.ticklabel_format(style='plain')
ax.set_xlabel('Years')
ax.set_ylabel('Average Page Views')
ax.legend([calendar.month_name[i] for i in range(1, 13)], title='Month')

Can you wrap your code in triple backticks or use the code format in the toolbar? I can’t copy/paste it like this.

Or could you link to a Google Colab or Replit?

And could you please post the full code? I can’t replicate your error. Maybe there’s something you did to the df before this? Here is the error I get:


AttributeError                            Traceback (most recent call last)

<ipython-input-9-55f359cbef2a> in <cell line: 3>()
      1 df =pd.read_csv('fcc-forum-pageviews.csv')
      2 
----> 3 df_bar=df.groupby([df.date.dt.year,df.date.dt.month])['value'].sum().rename_axis(index=['year', 'month']).reset_index()
      4 fig, ax = plt.subplots(figsize=(8,6))
      5 w1=0.6/12

2 frames

/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/accessors.py in __new__(cls, data)
    510             return PeriodProperties(data, orig)
    511 
--> 512         raise AttributeError("Can only use .dt accessor with datetimelike values")

AttributeError: Can only use .dt accessor with datetimelike values

Here is the code, wrapped as requested:

def draw_bar_plot():
    # Copy and modify data for monthly bar plot
    df_bar=df.groupby([df.date.dt.year,df.date.dt.month])['value'].sum().rename_axis(index=['year', 'month']).reset_index()
    fig, ax = plt.subplots(figsize=(8,6))
    w1=0.6/12
    # for year1 in df_bar.year.unique().tolist():
    for month1 in range(1,13):
        dfl=df_bar[df_bar.month==month1]
        ax.bar(dfl.year+w1*(month1-6), dfl.value, w1, )
    ax.set_xticks(ticks=df_bar.year.unique().tolist())
    ax.ticklabel_format(style='plain')
    ax.set_xlabel('Years')
    ax.set_ylabel('Average Page Views')
    ax.legend([calendar.month_name[i] for i in range(1, 13)], title='Month')
    # Save image and return fig (don't change this part)
    fig.savefig('bar_plot.png')
    return fig

I still get this error:

Can only use .dt accessor with datetimelike values

Where is the code where you read in the file and convert to date formats?

Sorry, I missed the following line:

df['date']=pd.to_datetime(df['date'], format='%Y-%m-%d')
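For anyone hitting the same AttributeError, here is a minimal sketch of why that line matters. The three-row frame is a hypothetical stand-in for fcc-forum-pageviews.csv, not the real data: read_csv leaves the date column as strings, and .dt only works once the column is converted.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of fcc-forum-pageviews.csv
df = pd.DataFrame({
    "date": ["2016-05-09", "2016-05-10", "2016-06-01"],
    "value": [1201, 2329, 1500],
})

# Before conversion the column holds plain strings, so .dt is unavailable
try:
    df["date"].dt.year
    dt_failed = False
except AttributeError:
    dt_failed = True

# After conversion the .dt accessor works
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")
years = df["date"].dt.year.tolist()
months = df["date"].dt.month.tolist()
```

Passing parse_dates=['date'] to pd.read_csv is another way to get a datetime column at load time.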

Is there any other cleaning of the data that you do before this? That is part of the project.

Clean the data by filtering out days when the page views were in the top 2.5% of the dataset or bottom 2.5% of the dataset.

I need 100% of the code to troubleshoot this. If I’m missing anything, my results won’t match yours, and I can’t tell what’s different or why.
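As a rough sketch of the cleaning step quoted above, the cutoffs can be computed with quantile() and used as a filter. The data below is synthetic (values 1 to 200), and whether the comparison should be inclusive or strict is an assumption here, not something the project text settles.

```python
import pandas as pd

# Synthetic page-view counts 1..200; not the real dataset
df = pd.DataFrame({"value": range(1, 201)})

# Cutoffs for the bottom 2.5% and top 2.5% of values
low = df["value"].quantile(0.025)
high = df["value"].quantile(0.975)

# Keep only rows between the two cutoffs (inclusive bounds are an assumption)
df_clean = df[(df["value"] >= low) & (df["value"] <= high)]
```

Skipping this step changes the monthly averages, which is one reason results can differ between two people running what looks like the same plotting code.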

One thing that sticks out is this requirement:

It should show average daily page views for each month

but your groupby is using .sum() instead of .mean().

Is this where you calculate your averages?

w1=0.6/12

for month1 in range(1,13):
    dfl=df_bar[df_bar.month==month1]
    ax.bar(dfl.year+w1*(month1-6), dfl.value, w1, )
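To make the .sum() versus .mean() point concrete, here is a small sketch on synthetic daily values (not the real dataset). The plotting loop above only positions bars; any averaging has to happen in the groupby itself.

```python
import pandas as pd

# Synthetic daily values, two days in each of two months
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2016-05-01", "2016-05-02", "2016-06-01", "2016-06-02"]
    ),
    "value": [10, 30, 100, 300],
})

# Group by year and month, naming the keys so the index levels are labeled
keys = [df["date"].dt.year.rename("year"), df["date"].dt.month.rename("month")]

monthly_sum = df.groupby(keys)["value"].sum()    # total views per month
monthly_mean = df.groupby(keys)["value"].mean()  # average daily views per month
```

With the sum, the y-axis label 'Average Page Views' would not match what the bars actually show.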

Your groupby also isn’t quite complete. You return a dataframe like this:

    year  month    value
0   2016      5   258582
1   2016      6   573731
2   2016      7   722741
3   2016      8   962525
4   2016      9  1244306

but it needs to look like this format:

month         1         2         3         4         5         6         7         8         9        10        11        12
year
2016        NaN       NaN       NaN       NaN   19432.4   21875.1   24109.7   31049.2   41476.9   27398.3   40448.6   27832.4
2017    32785.2   31113.1   29369.1   30878.7   34244.3   43577.5   65806.8   47712.5   47376.8   47438.7   57701.6   48420.6
2018    58580.1   65679.0   62693.8   62350.8   56562.9   70117.0   63591.1   62831.6   65941.7  111378.1   78688.3   80047.5

Check out .unstack(): https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.unstack.html

This is a tough one, I hope this helps!
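A minimal sketch of the reshaping suggested above, on synthetic data. The date range (May 2016 through December 2019) is an assumption chosen to match the "3 * 12 months + 8 months" count from the original question. It compares the number of rows in the long groupby result with the number of cells in the unstacked frame.

```python
import pandas as pd

# Synthetic data: one value per day; the date range is an assumption
dates = pd.date_range("2016-05-01", "2019-12-31", freq="D")
df = pd.DataFrame({"date": dates, "value": range(len(dates))})

# Long form: one row per (year, month) pair that actually has data
long_form = df.groupby(
    [df["date"].dt.year.rename("year"), df["date"].dt.month.rename("month")]
)["value"].mean()

# unstack() moves the inner index level (month) into columns,
# leaving NaN where a year has no data for that month
wide = long_form.unstack()

n_long = len(long_form)              # rows in the long form
n_cells = wide.size                  # years x months, NaN cells included
n_missing = int(wide.isna().sum().sum())
```

On this synthetic range the long form has 44 rows while the wide frame has 48 cells, 4 of them NaN for January through April 2016. If the test counts one rectangle per cell plus one extra rectangle (for example the axes background, which is a guess here), that would be consistent with 49 expected versus 45 found. Plotting the wide frame with something like wide.plot(kind='bar') is one way to get a bar slot for every cell.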
