Inconsistency when I run one of the notebooks from the ML course on my Google Colab

I watched the videos on RNNs in the Machine Learning course, and I have a technical question. I downloaded the Natural Language Processing Jupyter notebook.

When I ran the code on Google Colab myself, I noticed something strange while training the movie review model (done at 6:53 in the video):

In each epoch, my notebook seems to process only 625 samples instead of 20,000 as in the video. I attached a screenshot. Since I haven’t changed anything in the code, I don’t understand why it doesn’t train on the whole 80% of the training data (which should be 20,000). When I check the shape of the training data after training, it’s still 25,000, so there is no reason why it should train on only 625. Similarly, when evaluating the model, it appears to use only 782 instead of 25,000; this is also visible in the screenshot.

I would be grateful if anyone could please tell me what could be the problem here.

Side note: 25,000 / 782 ≈ 32 and 20,000 / 625 = 32. This may not be a coincidence, since 32 is a power of 2. But in any case, I’m puzzled :slight_smile:
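
To double-check that arithmetic, here is a small sketch. It assumes (I have not confirmed this) that the counter in the progress bar shows batches rather than individual samples, and that the batch size is 32. The helper name `steps_per_epoch` is my own, not something from the notebook.

```python
import math


def steps_per_epoch(num_samples, batch_size=32):
    """Number of batches needed to cover num_samples once.

    The last batch may be smaller than batch_size, hence the ceiling.
    batch_size=32 is an assumption based on the ratio noted above.
    """
    return math.ceil(num_samples / batch_size)


# 80% of the 25,000 training reviews, as used for training in the notebook
print(steps_per_epoch(20_000))  # 625

# All 25,000 reviews, as used for evaluation
print(steps_per_epoch(25_000))  # 782
```

If that assumption holds, 625 and 782 would be batch counts, and the model would still be seeing all 20,000 and 25,000 samples.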
