First, there is a well-documented bug in this project, covered in several forum threads: the final test has the books in reverse order. I think the number of returned books and the data structure are wrong too; you’ll need to look in the forums for discussion of those problems and fix the test accordingly.

The problems are in the processing. Two are here:

```
# Removing users with less than 200 ratings and books with less than 100 ratings
counts1 = ratings['userID'].value_counts()
ratings = ratings[ratings['userID'].isin(counts1[counts1 >= 200].index)]
counts2 = ratings['bookRating'].value_counts()
ratings = ratings[ratings['bookRating'].isin(counts2[counts2 >= 100].index)]
```

`counts2` needs to be counted by the book (`'ISBN'`) and not the rating value. The two filters also need to be and’ed together (you want books with 100 ratings **and** users with 200 ratings). The way you have it, you’re likely pulling in some books that meet one requirement but not the other. Another problem may be here:

```
# Combining ratings and books and removing unnecessary columns
combine_book_rating = pd.merge(ratings, books, on='ISBN')
columns = ['yearOfPublication', 'publisher',
           'bookAuthor', 'imageUrlS', 'imageUrlM', 'imageUrlL']
combine_book_rating = combine_book_rating.drop(columns, axis=1)
# Remove rows with no title
combine_book_rating = combine_book_rating.dropna(axis=0, subset=['bookTitle'])
# Adding the total number of ratings and grouping per book
book_ratingCount = (combine_book_rating.
                    groupby(by=['bookTitle'])['bookRating'].
                    count().
                    reset_index().
                    rename(columns={'bookRating': 'totalRatingCount'})
                    [['bookTitle', 'totalRatingCount']])
# Merging the previous dataframe with the ratings+books dataframe
rating_with_totalRatingCount = combine_book_rating.merge(
    book_ratingCount, left_on='bookTitle', right_on='bookTitle', how='left')
# Removing duplicate ratings
rating_with_totalRatingCount = rating_with_totalRatingCount.drop_duplicates([
    'userID', 'bookTitle'])
# Reshaping rating_with_totalRatingCount to have book titles as indices, user IDs as columns and rating as values
rating_with_totalRatingCount_pivot_with_na = rating_with_totalRatingCount.pivot(
    index='bookTitle', columns='userID', values='bookRating')
rating_with_totalRatingCount_pivot = rating_with_totalRatingCount_pivot_with_na.fillna(
    0)
```

As far as I know, all you need is to merge the ratings and books, drop the duplicates, and create the pivot table. The rest may or may not be necessary. (I cut out most of it and got the correct results, so…)
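
To make that concrete, here is a rough sketch of what the trimmed-down processing could look like. It is my reading of the fixes above, not the project's reference solution: both counts are taken from the same unfiltered frame and the two masks are combined with `&`, then the frame is merged, de-duplicated, and pivoted. The helper names `filter_ratings` and `build_pivot`, the threshold parameters, and the tiny sample data are made up for illustration; the column names are the ones from your code.

```python
import pandas as pd


def filter_ratings(ratings, min_user_ratings=200, min_book_ratings=100):
    """Keep rows whose user AND book both meet the rating-count thresholds."""
    # Count per user and per book (ISBN), both on the unfiltered frame
    user_counts = ratings['userID'].value_counts()
    book_counts = ratings['ISBN'].value_counts()
    keep_user = ratings['userID'].isin(
        user_counts[user_counts >= min_user_ratings].index)
    keep_book = ratings['ISBN'].isin(
        book_counts[book_counts >= min_book_ratings].index)
    # AND the two masks so a row must satisfy both requirements
    return ratings[keep_user & keep_book]


def build_pivot(ratings, books):
    """Merge ratings with titles, drop duplicates, pivot to title x user."""
    merged = pd.merge(ratings, books[['ISBN', 'bookTitle']], on='ISBN')
    merged = merged.dropna(subset=['bookTitle'])
    merged = merged.drop_duplicates(['userID', 'bookTitle'])
    pivot = merged.pivot(
        index='bookTitle', columns='userID', values='bookRating')
    return pivot.fillna(0)


# Tiny made-up data with lowered thresholds, just to exercise the helpers
ratings = pd.DataFrame({
    'userID': [1, 1, 2, 2, 3],
    'ISBN': ['a', 'b', 'a', 'b', 'c'],
    'bookRating': [5, 3, 4, 0, 9],
})
books = pd.DataFrame({
    'ISBN': ['a', 'b', 'c'],
    'bookTitle': ['A', 'B', 'C'],
})

filtered = filter_ratings(ratings, min_user_ratings=2, min_book_ratings=2)
pivot = build_pivot(filtered, books)
print(pivot)
```

With the sample data, user 3 and book `c` each have only one rating, so that row is dropped before the merge. On the real dataset you would leave the thresholds at their defaults of 200 and 100.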