Hello all, I am trying to do the Machine Learning certification for Python located here, but at the data cleaning step I always end up with an empty dataframe.
I am removing the implicit reviews (ratings of 0) using
df_ratings = df_ratings[df_ratings.rating != 0]
and I believe I am getting the desired output DataFrame. However, when I try to remove the users with fewer than 200 reviews and the books with fewer than 100 reviews, I get an empty dataframe, which is really confusing, as I thought a simple pandas groupby and filter would do the job. When filtering on just the isbn or just the user, I do not get an empty dataframe, but when doing both I end up with no rows. Here is a link to my notebook; if anyone can help or provide feedback, I would be extremely grateful. Thank you.
I would try displaying the dataframe after each step (after groupby() and after filter()) rather than just examining the shape, as you may see the problem then. You may also want to store the results in new data frames instead of overwriting the original, so that you separately have all the ratings, the ratings that meet the per-user criterion, and the ratings that meet the per-book criterion, and can properly “and” them.
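To make that suggestion concrete, here is a minimal sketch of keeping the per-user and per-book results separate and then combining them. The column names (user, isbn, rating), the toy data, and the small thresholds are assumptions for illustration only; the thread's real thresholds are 200 and 100, and the notebook's actual column names may differ.

```python
import pandas as pd

# Toy stand-in for the ratings table (hypothetical data and column names).
df_ratings = pd.DataFrame({
    "user":   [1, 1, 1, 2, 2, 3, 3, 3, 4],
    "isbn":   ["A", "B", "C", "A", "B", "A", "B", "D", "C"],
    "rating": [5, 3, 7, 8, 9, 2, 6, 4, 10],
})

MIN_USER_RATINGS = 3  # illustrative; the thread uses 200
MIN_BOOK_RATINGS = 3  # illustrative; the thread uses 100

# Count ratings per user and per book on the SAME, unmodified frame.
# transform("count") returns one value per row, aligned with df_ratings.
user_counts = df_ratings.groupby("user")["rating"].transform("count")
book_counts = df_ratings.groupby("isbn")["rating"].transform("count")

# Two independent boolean masks, one per criterion.
user_ok = user_counts >= MIN_USER_RATINGS
book_ok = book_counts >= MIN_BOOK_RATINGS

# Keep each result in its own frame so every step can be inspected.
ratings_by_user = df_ratings[user_ok]
ratings_by_book = df_ratings[book_ok]
ratings_both = df_ratings[user_ok & book_ok]  # the "and" of both criteria

print(len(df_ratings), len(ratings_by_user), len(ratings_by_book), len(ratings_both))
```

Because both masks are computed from the same frame, applying one criterion cannot change the counts used by the other. Whether the counts should be taken before or after dropping the zero ratings is a separate choice that this sketch does not settle.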