Test data for machine learning


I am currently learning machine learning, and lately I have found myself needing data to train and test my algorithms. So far I have been using Python to generate test data like this, but I would like some more image data in particular.

My question is: do you know any other good way to generate test data, with Python or any other language tbh?

Side note: I am really happy with scikit-learn in Python. It’s such a simple machine learning library, and it seems to have almost anything you need. It is therefore a big plus if your suggestion is somewhat similar to sklearn. :)


Generating images is no small task. Even at the most basic level that is still GAN territory, to my knowledge (which, I’ll admit, might be outdated). If that’s the case, then the simplest way to get images to use as test data is to find an existing image dataset. If you don’t find one you like, you could always make one.
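Since you mentioned sklearn: one existing image dataset ships with scikit-learn itself, so it needs no download. Here is a minimal sketch, assuming scikit-learn is installed. The split size and random seed are arbitrary choices on my part, not recommendations.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

# Small handwritten-digit dataset bundled with scikit-learn.
digits = load_digits()

images = digits.images  # 8x8 grayscale images, shape (1797, 8, 8)
X = digits.data         # the same images flattened, shape (1797, 64)
y = digits.target       # digit labels 0..9

# Hold out a quarter of the samples for testing (arbitrary split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

print(images.shape, X_train.shape, X_test.shape)
```

The images are tiny, so this is mostly useful for quick experiments before moving on to a larger dataset.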

Try the UC Irvine Machine Learning Repository: https://archive.ics.uci.edu/ml/index.php

It hosts a large collection of datasets that are widely used for testing machine learning algorithms.
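A few of the classic UCI datasets are also bundled with scikit-learn, which may be the quickest way to try one. As a rough sketch, assuming scikit-learn is installed, this loads its copy of the UCI Wine dataset:

```python
from sklearn.datasets import load_wine

# scikit-learn's bundled copy of the UCI Wine dataset.
wine = load_wine()

X = wine.data    # 178 samples, 13 numeric features
y = wine.target  # 3 wine classes, labelled 0, 1, 2

print(X.shape)
print(wine.feature_names)
```

For anything not bundled this way, you would download the files from the repository and load them yourself.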