(Python) Help with Plotting Data

Hey Y’all,

I’m trying to plot some CSV data with Python, but I’m getting ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I’ve isolated the issue to x = df_bar.index for the bar plot, but I’m not sure how to fix it.

*Edit: I’ve also found out that I can’t use the month column in my plot.

ax = sns.catplot(
    x = df_bar.index,  # this is the problem
    y = 'value',
    hue = 'month',  # this is also a problem
    kind = 'bar',
    data = df_bar
)

Here’s a sample of my data

2016   5    19432.400000
       6    21875.105263
       7    24109.678571
       8    31049.193548
       9    41476.866667
      10    27398.322581
      11    40448.633333
      12    27832.419355
2017   1    32785.161290
       2    31113.071429
       3    29369.096774
       4    30878.73333

Hi there-

I’m getting this same error in the Mean-Var-STD calculator certification project. I’m not much help, but from looking at my code, I suspect it may have something to do with NumPy updating from 1.19.0 to 1.19.1, as that ValueError seems to be associated with NumPy.

Or that could be super off! Hopefully someone with more knowledge will shed some light.


I’m not using numpy for my bar plot, so I don’t think that’s the issue on my end :upside_down_face:

I’ll try copying the index column into a separate column tomorrow and see if that solves it…

Yeah, ignore what I put before. My issue was a function returning the wrong object type. That’s probably not helpful either, but best of luck!

As it stands, hue = 'month' causes problems because there’s no month column in your df_bar.
The month column was moved into the index when .groupby() was called with as_index=True (the default).

df_bar looks like

2016 5       19432.400000
     6       21875.105263

while the index alone looks like

MultiIndex([(2016,  5),
            (2016,  6),
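
To make the point about the index concrete, here is a minimal sketch. The small frame below is made up for illustration (it is not the original CSV), and the year and month columns are assumed to come from a date index. It shows that the default as_index=True leaves only value as a column, and that reset_index() should bring the group keys back as regular columns.

```python
import pandas as pd

# Hypothetical daily data standing in for the original CSV.
df = pd.DataFrame(
    {"value": [10.0, 20.0, 30.0, 50.0]},
    index=pd.to_datetime(
        ["2016-05-01", "2016-05-02", "2016-06-01", "2017-01-01"]
    ),
)
df["year"] = df.index.year
df["month"] = df.index.month

# Default as_index=True: the group keys become a MultiIndex.
grouped = df.groupby(["year", "month"]).mean()
print(list(grouped.columns))      # only 'value' is left as a column
print(list(grouped.index.names))  # the keys now live in the index

# reset_index() moves the keys back into ordinary columns.
df_bar = grouped.reset_index()
print(list(df_bar.columns))
```

After reset_index(), names like 'month' can be passed as strings to a plotting call that takes a data argument.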

So I’ve copied the index column to a new column and set as_index to False. This seemed to clean up my data and I think I’m on the right track now :wink:

df_bar['year'] = df_bar.index
df_bar = df_bar.groupby([df_bar.index, "month"], as_index=False).mean()

Here’s a sample

   month         value    year
0      5  19432.400000  2016.0
1      6  21875.105263  2016.0
2      7  24109.678571  2016.0
3      8  31049.193548  2016.0
4      9  41476.866667  2016.0
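
With year and month as ordinary columns, the plot call from the first post can refer to them by name instead of passing df_bar.index. The following is only a sketch: plot_bar is a hypothetical helper, the frame is rebuilt from a few of the sample rows above with year stored as an integer, and seaborn is assumed to be installed.

```python
import pandas as pd

# A few rows copied from the sample above (year kept as an integer).
df_bar = pd.DataFrame(
    {
        "year": [2016, 2016, 2016],
        "month": [5, 6, 7],
        "value": [19432.400000, 21875.105263, 24109.678571],
    }
)


def plot_bar(data):
    """Hypothetical helper: bar chart of value per year, split by month."""
    # Imported here so the data preparation runs even without seaborn.
    import seaborn as sns

    # x, y and hue are column names (strings); no index object is passed.
    return sns.catplot(
        x="year", y="value", hue="month", kind="bar", data=data
    )
```

Calling plot_bar(df_bar) should return a seaborn FacetGrid, which can then be saved with its savefig method.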