I’m struggling to get the expected test results on the Book Recommendation Machine Learning Challenge.
I removed all ratings from users with fewer than 200 ratings and all ratings for books with fewer than 100 ratings. I then put the remaining ratings into a csr_matrix, replacing missing values with 0. When I fit NearestNeighbors on this matrix, though, I get results that differ from the expected ones. It does give me recommendations, but except for 2 books they are not the expected ones, and the distances are off by orders of magnitude. Could it be that the dataset has changed as well?
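To make the steps above concrete, here is a minimal sketch of that kind of pipeline on a tiny made-up table. It is not the code from my notebook: the column names `user`, `title` and `rating`, the toy data, and the lowered thresholds are all assumptions for illustration, and the metric is left at the scikit-learn default.

```python
import pandas as pd
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Toy stand-in for the ratings table; column names are assumptions.
ratings = pd.DataFrame({
    "user":   [1, 1, 1, 2, 2, 2, 3, 3, 4],
    "title":  ["A", "B", "C", "A", "B", "C", "A", "B", "D"],
    "rating": [10, 8, 0, 9, 7, 5, 6, 0, 3],
})

MIN_USER_RATINGS = 2  # 200 in the real challenge data
MIN_BOOK_RATINGS = 2  # 100 in the real challenge data

# Count ratings per user and per book on the unfiltered table,
# then keep only the rows that pass both thresholds.
user_counts = ratings["user"].map(ratings["user"].value_counts())
book_counts = ratings["title"].map(ratings["title"].value_counts())
filtered = ratings[(user_counts >= MIN_USER_RATINGS)
                   & (book_counts >= MIN_BOOK_RATINGS)]

# Rows are books, columns are users, missing ratings become 0.
book_user = filtered.pivot_table(index="title", columns="user",
                                 values="rating", fill_value=0)
matrix = csr_matrix(book_user.values.astype(float))

# Metric is left at the default here; it is a separate choice.
model = NearestNeighbors(algorithm="brute").fit(matrix)
```

Whether the counts are taken before or after the other filter is applied changes which rows survive, so that ordering is one place where results can diverge.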
These are the recommendations and distances I get for the test:
[ "Where the Heart Is (Oprah's Book Club (Paperback))", [ ['The Surgeon', 61.286213], ['Unspeakable', 61.522354], ['The Perks of Being a Wallflower', 61.579216], ['Gap Creek: The Story Of A Marriage', 61.676575], ['The Weight of Water', 61.75759] ] ]
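One possible reason for distances of this size is the metric. `NearestNeighbors` uses Minkowski distance with `p=2` (Euclidean) unless told otherwise, and Euclidean distances on raw ratings can easily reach the tens, while cosine distance between non-negative rating vectors stays between 0 and 1. The sketch below compares the two on a random made-up matrix; the data and sizes are assumptions, not the challenge dataset.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Made-up book-by-user matrix: 20 books, 300 users,
# ratings 1..10 where present, 0 where missing (about 70% missing).
present = rng.random((20, 300)) < 0.3
dense = (rng.integers(1, 11, size=(20, 300)) * present).astype(float)
matrix = csr_matrix(dense)

results = {}
for metric in ("euclidean", "cosine"):
    model = NearestNeighbors(metric=metric, algorithm="brute").fit(matrix)
    distances, indices = model.kneighbors(matrix[0], n_neighbors=6)
    # Drop the first neighbour, which is the query book itself.
    results[metric] = distances[0][1:]
    print(metric, results[metric])
```

If the expected values are on the 0 to 1 scale, passing `metric="cosine"` when constructing the model would be the first thing to try.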
Below is the link to my Google Colab notebook. I would be thankful for any feedback.