Build a Probability Calculator Project

The following code fails test #4 ("The experiment method should return a different probability."):

import copy
import random

class Hat:
    __slots__ = ('contents',)
    def __init__(self, **kwargs):
        self.contents = []

        for kwarg in kwargs.items():
            for _ in range(kwarg[1]):
                self.contents.append(kwarg[0])
        if len(self.contents) == 0:
            raise AttributeError(f'Cannot create class {self.__class__.__name__} with empty contents.')        

    def draw(self, num_balls_to_draw):
        balls_drawn = []
        num_of_balls_in_hat = len(self.contents)
        while num_balls_to_draw > 0 and num_of_balls_in_hat > 0:
            balls_drawn.append(self.contents.pop(int(random.random() * num_of_balls_in_hat)))
            num_balls_to_draw -= 1
            num_of_balls_in_hat = len(self.contents)
        return balls_drawn

def experiment(hat, expected_balls, num_balls_drawn, num_experiments):
    successful_draws = 0
    for _ in range(num_experiments):
        copy_hat = copy.deepcopy(hat)
        balls_drawn = copy_hat.draw(num_balls_drawn)
        successful_draws += 1 if all(balls_drawn.count(color) >= occurrences for color, occurrences in expected_balls.items()) else 0
    return successful_draws / num_experiments

hat = Hat(black=6, red=4, green=3)
probability = experiment(hat=hat,
                  expected_balls={'red':2,'green':1},
                  num_balls_drawn=5,
                  num_experiments=2000)
print('probability:', probability)

In the draw() method, if I replace
balls_drawn.append(self.contents.pop(int(random.random() * num_of_balls_in_hat)))
with
balls_drawn.append(self.contents.pop(random.randint(0, num_of_balls_in_hat - 1)))
the test passes.

I think the two ways of obtaining a random index to pop from self.contents:
int(random.random() * num_of_balls_in_hat)
vs
random.randint(0, num_of_balls_in_hat - 1)
should have the same effect, but the first one fails the test while the second one passes.

I don’t understand why.

Well, random.random() gives you a float between 0 and 1, for example 0.7678. I don’t think multiplying that random decimal by a number will give you the index you want.

random.randint() is a good way to go about this, as it gives you proper integers, which are required for indexes. Hope this helps!

The result of multiplying random() by a number is passed to the int() function, which yields a random integer between 0 (inclusive) and the multiplier (exclusive).

So,
int(random.random() * 5) gets you a random integer in the set {0, 1, 2, 3, 4}
and
random.randint(0, 5 - 1) gets you a random integer in the set {0, 1, 2, 3, 4}
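
A quick way to check this empirically is to sample both expressions many times and compare the distinct values each one produces. This is only a sketch; the seed and the sample count are arbitrary choices, not taken from the project.

```python
import random

random.seed(0)    # arbitrary seed, only so the check is repeatable
n = 5
samples = 10_000  # arbitrary; large enough that every index shows up

# Distinct values produced by the int(random() * n) approach
via_random = {int(random.random() * n) for _ in range(samples)}

# Distinct values produced by the randint(0, n - 1) approach
via_randint = {random.randint(0, n - 1) for _ in range(samples)}

print(sorted(via_random))
print(sorted(via_randint))
```

Both sets cover the same indexes, so neither expression can produce an out-of-range index for pop().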

Admittedly, using randint() is the way to go: it’s clearer and easier to understand. But what I am trying to find out is why the other solution fails Test #4.

Is it a problem in the test, something I fundamentally misunderstand, or something else?

How different are the probabilities you get in the two cases?

I’ve run the program 5 times with each variant, and the results are:

  • using random(): 0.3685 0.3615 0.3625 0.3635 0.361
  • using randint(): 0.3715 0.3625 0.3565 0.371 0.3585

When Test #4 fails (using random()), this is what I find in the browser console:

> FAIL: test_prob_experiment (test_module.UnitTests.test_prob_experiment)
> Traceback (most recent call last):
> File "/home/pyodide/test_module.py", line 16, in test_prob_experiment
> self.assertAlmostEqual(actual, expected, delta = 0.01, msg = 'Expected experiment method to return a different probability.')
> AssertionError: 0.262 != 0.272 within 0.01 delta (0.010000000000000009 difference) : Expected experiment method to return a different probability.
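
The odd-looking difference in that message comes from floating-point arithmetic. With a delta, assertAlmostEqual passes only when abs(actual - expected) <= delta, and the difference between these two floats is a hair above 0.01. A minimal sketch of that comparison, using the values from the traceback:

```python
actual = 0.262
expected = 0.272
delta = 0.01

diff = abs(actual - expected)
print(diff)           # slightly more than 0.01 because of binary floating point
print(diff <= delta)  # the comparison the assertion relies on
```

So the result is right on the edge of the tolerance, not far outside it.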

The test is run with these parameters, if you want to try what happens with the two methods:

hat = Hat(blue=3, red=2, green=6)
probability = experiment(hat=hat, expected_balls={"blue": 2, "green": 1}, num_balls_drawn=4, num_experiments=1000)

Also, the solution used to verify that the tests work uses randrange.
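
For reference, the exact probability for these parameters can be worked out by counting combinations instead of simulating. The helper below is hypothetical (it is not part of the project or its tests) and takes the hat as a plain dict of color counts rather than a Hat object:

```python
from fractions import Fraction
from itertools import product
from math import comb

def exact_probability(hat_counts, expected_balls, num_balls_drawn):
    """Exact chance that a draw without replacement contains at least
    the expected number of balls of each color (hypothetical helper)."""
    colors = list(hat_counts)
    total = sum(hat_counts.values())
    n = min(num_balls_drawn, total)  # cannot draw more balls than the hat holds
    favorable = 0
    # Enumerate every possible per-color count of drawn balls
    for combo in product(*(range(hat_counts[c] + 1) for c in colors)):
        if sum(combo) != n:
            continue
        drawn = dict(zip(colors, combo))
        if all(drawn.get(c, 0) >= k for c, k in expected_balls.items()):
            ways = 1
            for c, k in drawn.items():
                ways *= comb(hat_counts[c], k)  # ways to pick k balls of color c
            favorable += ways
    return Fraction(favorable, comb(total, n))

p = exact_probability({"blue": 3, "red": 2, "green": 6},
                      {"blue": 2, "green": 1}, 4)
print(p, float(p))
```

For the test hat this gives 87/330, roughly 0.2636, so both 0.262 and 0.272 are plausible estimates from 1000 experiments.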

I’ve found this in the Python Documentation regarding randrange():

Changed in version 3.2: randrange() is more sophisticated about producing equally distributed values. Formerly it used a style like int(random()*n) which could produce slightly uneven distributions.

So apparently the recommended way to generate random integers is to use randrange() or randint() (randint(a, b) is an alias for randrange(a, b + 1)).

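Test #4 probably uses a seed in order to produce the same results every time the test is run. With a fixed seed, int(random()*n) and randrange() consume the generator’s output differently, so they produce different sequences of draws and therefore slightly different estimates. The uneven distribution mentioned in the documentation is far too small to matter for a hat with a handful of balls. What matters is that, with only 1000 experiments, the sampling noise is about the same size as the test’s 0.01 tolerance, so an estimate built from a different sequence can easily land just outside the expected delta (0.010000000000000009 difference here).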
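
A small sketch of the seed effect: after seeding with the same value, the two expressions draw from the generator in different ways, so on CPython they yield different index sequences even though each one is reproducible on its own. The seed below is arbitrary; the seed the real test uses is not shown in this thread.

```python
import random

SEED = 0     # arbitrary; the real test's seed is not shown in this thread
n = 11       # number of balls in the test hat (3 + 2 + 6)
count = 200  # arbitrary number of indexes to generate

random.seed(SEED)
via_random = [int(random.random() * n) for _ in range(count)]

random.seed(SEED)
via_randrange = [random.randrange(n) for _ in range(count)]

random.seed(SEED)
via_random_again = [int(random.random() * n) for _ in range(count)]

print(via_random[:10])
print(via_randrange[:10])
print(via_random == via_random_again)  # same seed, same method: identical
print(via_random == via_randrange)     # same seed, different method
```

A seeded test whose expected value was produced with randrange() will therefore see a different series of draws from code that uses int(random()*n).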

Lesson learned and best practice:
use randrange() or randint() when generating random integers in Python.
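
Applied to the draw() method from the question, that could look like the sketch below. The Hat class here is simplified (no __slots__, no empty-hat check) and is only meant to illustrate the index selection, not to be the reference solution.

```python
import random

class Hat:
    def __init__(self, **kwargs):
        # One list entry per ball, e.g. Hat(red=2) -> ['red', 'red']
        self.contents = [color for color, n in kwargs.items() for _ in range(n)]

    def draw(self, num_balls_to_draw):
        balls_drawn = []
        # Stop early if the hat runs out of balls
        while num_balls_to_draw > 0 and self.contents:
            index = random.randrange(len(self.contents))  # 0 <= index < len
            balls_drawn.append(self.contents.pop(index))
            num_balls_to_draw -= 1
        return balls_drawn

hat = Hat(blue=3, red=2, green=6)
drawn = hat.draw(4)
print(drawn, len(hat.contents))
```

randrange(len(self.contents)) states the intent directly: pick one valid index, with no float arithmetic involved.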