(Python) Help with Plotting Data

Hey Y’all,

I’m trying to plot some CSV data with Python, but I’m getting this error: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I’ve isolated the issue to x = df_bar.index for the bar plot, but I’m not sure how to fix it.

*Edit: I’ve also found out that I can’t use the month column in my plot

ax = sns.catplot(
    x = df_bar.index, # this is the problem
    y = 'value',
    hue = 'month', # this is also a problem
    kind = 'bar',
    data = df_bar
)

Here’s a sample of my data

                    value
     month
2016 5       19432.400000
     6       21875.105263
     7       24109.678571
     8       31049.193548
     9       41476.866667
     10      27398.322581
     11      40448.633333
     12      27832.419355
2017 1       32785.161290
     2       31113.071429
     3       29369.096774
     4        30878.73333

Hi there-

I’m getting this same error in the Mean-Var-STD calculator certification project. I’m not much help, but from looking at my code, I suspect it may have something to do with NumPy updating from 1.19.0 -> 1.19.1, as that ValueError seems to be associated with NumPy.

Or that could be super off! Hopefully someone with more knowledge will shed some light.

I’m not using numpy for my bar plot, so I don’t think that’s the issue on my end :upside_down_face:

I’ll try copying the index column into a separate column tomorrow and see if that solves it…

Yeah, ignore what I put before- my issue was me having a function output the wrong object type. That’s probably not helpful either, but best of luck!

As it stands, hue = 'month' fails because there’s no month column in your df_bar.
The month column was moved into the index when .groupby() was called with as_index=True (the default).

df_bar looks like

                    value
     month               
2016 5       19432.400000
     6       21875.105263
(...)

While the index alone is

MultiIndex([(2016,  5),
            (2016,  6),
(...)
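
To make that concrete, here is a minimal sketch of turning the index levels back into ordinary columns. The frame below is a hypothetical stand-in built from a few of the sample values, and naming the first index level "year" is my assumption (in the printout above that level has no name).

```python
import pandas as pd

# Hypothetical stand-in for df_bar: a few sample values indexed by a
# two-level MultiIndex whose first level is unnamed, as in the printout.
index = pd.MultiIndex.from_tuples(
    [(2016, 5), (2016, 6), (2017, 1)], names=[None, "month"]
)
df_bar = pd.DataFrame(
    {"value": [19432.4, 21875.105263, 32785.16129]}, index=index
)

# 'month' lives in the index, so it is not available as a column.
month_is_column = "month" in df_bar.columns

# Name both levels (the name "year" is an assumption), then move them
# out of the index so they become regular columns.
flat = df_bar.rename_axis(index=["year", "month"]).reset_index()

print(month_is_column)
print(flat)
```

After this, 'year' and 'month' can be referred to by name like any other column.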

So I’ve copied the index into a new column and set as_index to False. This seemed to clean up my data and I think I’m on the right track now :wink:

df_bar['year'] = df_bar.index
df_bar = df_bar.groupby([df_bar.index, "month"], as_index=False).mean()

Here’s a sample

   month         value    year
0      5  19432.400000  2016.0
1      6  21875.105263  2016.0
2      7  24109.678571  2016.0
3      8  31049.193548  2016.0
4      9  41476.866667  2016.0
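
With the data in that flat shape, the plot call can probably refer to every column by name instead of passing the index. This is only a sketch under a few assumptions: the frame is rebuilt by hand from the sample rows, the years are written as integers, and the Agg backend is selected just so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no window is needed

import pandas as pd
import seaborn as sns

# Hypothetical flat frame rebuilt from the sample rows above.
df_bar = pd.DataFrame({
    "year": [2016, 2016, 2016, 2016, 2016],
    "month": [5, 6, 7, 8, 9],
    "value": [19432.4, 21875.105263, 24109.678571,
              31049.193548, 41476.866667],
})

# x, y and hue are all column names now, so seaborn looks them up in data.
grid = sns.catplot(
    x="year",
    y="value",
    hue="month",
    kind="bar",
    data=df_bar,
)
```

Note that catplot returns a FacetGrid rather than a matplotlib Axes; the underlying axes of a single-facet plot are available as grid.ax.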