Tell us what’s happening:
Hello. I need some help with how to proceed with the RPS challenge. I’ll describe my thoughts and the options I’ve spotted in the following paragraphs. It would be a great help to get your thoughts on my thinking process and tips on how to proceed.
The instructions do not specify that you need to use machine learning algorithms or TensorFlow, but I assumed those are in the spirit of the challenge.
Firstly, I proceeded to prepare the data for input into an algorithm. My program outputs the actions and results into lists, and appends them to another list:
[[1, 1, 0], [0, 2, 1],…]
The first item is an integer representation of my move, the second item is the opponent’s move, and the third item is whether I won (1), tied (0), or lost (-1). I turned this list of lists into a NumPy array and used that to create a tensor. I have not managed to use this tensor as input for an algorithm yet, and I’m a bit lost on how to proceed. I’ve learned that one-hot encoding is probably a good option here. How would I implement that? Also, I’ve seen people make X and y inputs for the TensorFlow algorithms, but I don’t really understand what those are.
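To make the one-hot and X/y question concrete, here is a minimal sketch of what I think is meant, using only NumPy. The sample games, the `WINDOW` size, and the `make_xy` helper are my own assumptions for illustration. The idea is that X holds the last few opponent moves (one-hot encoded) and y holds the move the opponent played next.

```python
import numpy as np

# Hypothetical game log in the format described above:
# [my_move, opponent_move, result]
games = [[1, 1, 0], [0, 2, 1], [2, 0, -1], [1, 0, 1]]
data = np.array(games)

# Column 1 holds the opponent's moves as integers 0, 1, 2.
opponent_moves = data[:, 1]

# One-hot encoding: row i of the 3x3 identity matrix encodes move i,
# so move 1 becomes [0, 1, 0].
one_hot = np.eye(3, dtype=np.float32)[opponent_moves]

WINDOW = 2  # assumed: how many previous opponent moves form one input


def make_xy(encoded, labels, window):
    """Build inputs X (windows of past moves) and targets y (the next move)."""
    X, y = [], []
    for i in range(window, len(labels)):
        X.append(encoded[i - window:i])  # the `window` moves before game i
        y.append(labels[i])              # the move actually played in game i
    return np.array(X), np.array(y)


X, y = make_xy(one_hot, opponent_moves, WINDOW)
print(X.shape, y)
```

If this is right, X has the shape (samples, time steps, features), which as far as I understand is the layout a Keras LSTM layer expects, and y can stay as integer class labels.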
Furthermore, I can see two options for training the AI:
- Accumulate a large dataset of all the opponents and train an AI afterwards, save it, load it into the program and then use that AI to beat the 4 opponents.
- Try to train the AI by feeding it a list after every single game, allowing it to learn while the program is running and using its output as input for the player function, which outputs the results back to the AI as input.
The second option seems more fun, but I do not know how viable it is with only 4000 games, or whether TensorFlow lends itself to iterative updates. In the course I’ve only seen a model trained on a giant dataset all at once, not on data that is generated in the program by the AI itself. Can somebody advise me on this?
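To show what I mean by the second option, here is a minimal sketch of learning while the program runs. It is not TensorFlow: I swapped in a simple count of which move the opponent tends to play after each of its own moves, just to illustrate the update-after-every-game loop. The `make_player` wrapper, the "R"/"P"/"S" strings, and the empty string for the first game are assumptions on my part.

```python
import random
from collections import defaultdict

# Maps each move to the move that beats it (assumed "R"/"P"/"S" encoding).
BEATS = {"R": "P", "P": "S", "S": "R"}


def make_player():
    # counts[a][b] = how often the opponent played b right after playing a
    counts = defaultdict(lambda: {"R": 0, "P": 0, "S": 0})
    history = []  # opponent moves seen so far

    def player(prev_play):
        # Learn from the game that just finished (prev_play is "" on game 1).
        if prev_play in BEATS:
            if history:
                counts[history[-1]][prev_play] += 1
            history.append(prev_play)

        # No usable statistics yet: play randomly.
        if not history or not any(counts[history[-1]].values()):
            return random.choice("RPS")

        # Predict the opponent's most frequent follow-up and counter it.
        following = counts[history[-1]]
        predicted = max(following, key=following.get)
        return BEATS[predicted]

    return player


# Usage: a hypothetical opponent that cycles R, P, S forever.
player = make_player()
prev = ""
for i in range(10):
    mine = player(prev)
    prev = "RPS"[i % 3]
```

Against that cycling opponent the counts are enough after a handful of games, so at least for simple opponents 4000 games looks like plenty. Whether a neural network learns fast enough in the same loop is the part I’m unsure about.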
Additionally, there are two algorithms that seem feasible:
- The LSTM algorithm, due to the time-step nature of RPS.
- The Q-learning algorithm, because the game can be modeled as a sequence of states.
I’m leaning towards LSTM, since it seems to make the most sense. Thoughts?
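For comparison, this is roughly how I picture the Q-learning option, as a tabular sketch in plain Python. Everything here is an assumption for illustration: the state is just the opponent’s previous move, the moves are numbered 0 = rock, 1 = paper, 2 = scissors, the reward is the win/tie/loss value from my lists, and `alpha`, `gamma` and `epsilon` are arbitrary choices.

```python
import random

MOVES = [0, 1, 2]  # assumed encoding: 0 = rock, 1 = paper, 2 = scissors


def result(me, opp):
    """Return 1 if `me` wins, 0 for a tie, -1 for a loss."""
    return [0, 1, -1][(me - opp) % 3]


def q_update(Q, state, action, reward, next_state, alpha, gamma):
    """Standard Q-learning update for one observed transition."""
    best_next = max(Q[next_state])
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])


def train(opponent_moves, alpha=0.1, gamma=0.9, epsilon=0.1, seed=None):
    """Learn a Q-table where the state is the opponent's previous move."""
    rng = random.Random(seed)
    Q = [[0.0] * 3 for _ in range(3)]  # Q[state][action]
    state = opponent_moves[0]
    for opp in opponent_moves[1:]:
        # Epsilon-greedy: mostly exploit the table, sometimes explore.
        if rng.random() < epsilon:
            action = rng.choice(MOVES)
        else:
            action = max(MOVES, key=lambda a: Q[state][a])
        q_update(Q, state, action, result(action, opp), opp, alpha, gamma)
        state = opp
    return Q


# Usage: a hypothetical opponent that cycles rock, paper, scissors.
cycle = [i % 3 for i in range(3000)]
Q = train(cycle, seed=0)
policy = [max(MOVES, key=lambda a: Q[s][a]) for s in range(3)]
```

A state this small probably cannot capture opponents that react to my own moves, so the state might need to include my previous move as well.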
I hope this is not too many questions, but I have too many options at the moment and, as a beginner, I’m not quite sure how to proceed. I tried finding help on forums and Google, but I can’t quite make sense of it all. Thanks in advance for the help.
Your code so far
Replit
Your browser information:
User Agent is: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36
Challenge: Machine Learning with Python Projects - Rock Paper Scissors
Link to the challenge: