Sum All Primes: Why does this work to find prime numbers?

I have been trying to figure out this lesson. Finding the prime numbers, it turns out, is challenging. I found this code, which works, but I don’t understand the logic at all. Why does this code work to get prime numbers? As far as I can tell, sieve is an empty variable, and I don’t see anything that fills it with numbers, so how is it getting the primes from the sieve?

  function getPrimes(max) {
    var sieve = [];
    var i;
    var j;
    var primes = [];
    for (i = 2; i <= max; ++i) {
      if (!sieve[i]) {
        // i has not been marked -- it is prime
        primes.push(i);
        for (j = i << 1; j <= max; j += i) {
          sieve[j] = true;
        }
      }
    }

    return primes;
  }
2 Likes

First of all, please properly format your code. If you put three backticks (below the ESC on your keyboard) on the line before your code block and three more on the line after your code block, it will format nicely and be easy to read.

That being said, this is an add algorithm. It is based on the famous Sieve of Eratosthenes algorithm. Perhaps a quick perusal of that will help. That gif they have basically shows what this algorithm is doing.

sieve is an array that will contain “true” if that index is not a prime number.
primes will contain the primes that we’ve found.

in pseudo code:

loop i from 2 up to max         // 2 is the lowest prime
  if sieve[i] != true then             // check if that number has been marked as not prime
    add i to our list of primes        // if not, add it to our list of primes
    loop j from i*2 through multiples of 2 until max // this is marking new "not primes" above our current prime
       mark each of those indexes in sieve as true   // AKA, not prime

Dose that make sense? If !sieve[i] then i must not be prime. If it were, it would have been marked as “true” already.

The “true” marking code only does it for multiples of prime numbers. For example, we don’t need to mark multiples of 6 because we already got all of those when we did 2 and when we did 3. The only potential new “not primes” are multiples of primes. And it may seem odd that we’re filling in the sieve array as we go, but remember that it’s filling in the “not primes” above the prime that we’ve found - by definition no number is evenly divisible than anything greater than itself so it doesn’t matter that the sieve indexes above the number we’re checking haven’t been filled in.

In theory, you could have not worried about primes at first and just looped and marked anything that is a multiple of your index as “true” indicating that it is “not prime”. But then you would have had to loop again to collect all the primes (anything index in sieve not marked as “true”.)

The only odd thing in this example is i << 1. This is a binary shift, shifting the binary digits to the left and backfilling with zero. Effectively, that is multiplying by 2. (Just like in decimal, adding a zero on the end is multiplying by 10.) That line could have been for (j = i * 2; j <= max; j += i) {

Why use a binary operator? Showing off? Maybe. Is a binary operator a little faster when put into machine code. If I remember correctly it is a faster. But it is a comprehensional speed bump.

I should also point out that that example won’t solve the challenge on fCC since it still needs to sum those numbers.

1 Like

Thanks for the detailed reply.

I’m mostly wrapping my head around how the algorithm works, but not how the code is using the algorithm. Is “sieve” a variable that javascript natively understands? From the way the code is written, it looks like sieve is defined as an array, but then nothing else is done with that variable until the code checks if the array index is false, and then pushes the prime number to the primes array. But it doesn’t look like there should be any way for it to know.

1 Like

As pointed our by @ksjazzguitar formatting the code goes a long way in making the code more readable…

function getPrimes(max) {
    var sieve = [];
    var i;
    var j;
    var primes = [];
    for (i = 2; i <= max; ++i) {
        if (!sieve[i]) {
            // i has not been marked -- it is prime
            primes.push(i);
            for (j = i << 1; j <= max; j += i) {
                sieve[j] = true;
            }
        }
    }

    return primes;
}

I had never seen this code before and it’s great for finding prime numbers. Its simplicity makes it a bit confusing… I admit that my head spun for few minutes trying to understand the implementation.

There were couple things that I learned from this sample:
1. j = i << 1
It basically is the same as j = 2*i

2. sieve[j] = true;
It will put true in the jth position of the array. Even if there are no previous values. If we isolate the statement and we just do something like this

var sieve = [];
sieve[5] = true;

Then we get an array like this:

[null, null, null, null, true];

  1. Be careful with this loop, which is key to the functionality and may be misread if you are not clear

for (j = i << 1; j <= max; j += i) {

which could be properly read as

for (j = 2 * i; j <= max; j = j + i) { // that's an i not a 1 !

So let’s supposed that i = 5, the loop will iterate over these values for j (making the multiples true - not primes)

10, 15, 20, 25, 30, 35, 40, 45, ....

2 Likes

Is “sieve” a variable that javascript natively understands?

No, “sieve” is just a variable name. We could have called it “elephantPajamas”. The author chose that name as a reference to the Sieve of Eratosthenes, the algorithm developed by Erantosthenes in 3rd century BC Greece.

From the way the code is written, it looks like sieve is defined as an array, but then nothing else is done with that variable until the code checks if the array index is false, and then pushes the prime number to the primes array. But it doesn’t look like there should be any way for it to know.

Yes, the way sieve is used is a bit odd. It is declared but never initialized. So, how does an uninitialized element in an array evaluate in JS? Let’s test it:

var myArray = [];

console.log(myArray[3]);

// -> undefined

if (myArray[3])
  console.log("JS evaluated 'undefined' as 'true'!");
else
  console.log("JS evaluated 'undefined' as 'false'!");

// -> JS evaluated 'undefined' as 'false'!

Aha! So, sieve is an array that is never initialized so every index in it evaluates as false. This is what the algorithm uses. Perhaps a better name for that array would be hasBeenMarkedAsNotPrime because that is how the logic is working. It starts out assuming that every number is prime (undefined). As we do our i loop, we will check if an index has not been flipped to “true” (true == not prime) - if it has not then it is prime.

In the for loop, first we check if (!sieve[i]) {, or we’re checking if that index has been marked true yet. It hasn’t of course because we start at 2 and nothing in sieve has been marked true. So, that means that it’s prime (this assumes - accurately - that we are starting with a prime - this would fail if we started at 4) and we add that to our primes array. Then we index through all the multiples of i and mark them as true (in other words, “not prime”):

for (j = i << 1; j <= max; j += i) {  // remember i << 1 is the same as i*2, starting at the next multiple
  sieve[j] = true;
}

So, this will let us know about any non-primes above us. This is where sieve is getting set. It is a little odd that we’re setting up sieve on the fly, but it works because we’re working from the bottom up and sieve is always one step ahead of the number that we’re checking. When we get to 3, sieve[3] is still undefined so we know that 3 is prime. When we get to sieve[4], that is true (marked when we looped through the multiples of 2) so we know that that is not prime. (We don’t need to loop through the multiples of 4 since we already got all of those as multiples of 2). sieve[5] is undefined so that is prime. sieve[6] is true (we got that as a multiple of 2 and then again for 3) so we know that it is not prime, etc, etc.

If the algorithm itself is confusing you, then please go to that wiki and see that gif they have. Perhaps even make a grid on a piece of paper and try it yourself. It’s a strange little algorithm, but it’s ingenious. Hopefully with an understanding of the wacky way sieve is being used, this is falling into place.

Don’t feel bad - I’m pretty good at algorithms and this one took me a couple looks, especially how sieve is being used. That wouldn’t work in many programming languages. But it works in JS.

4 Likes

Another thing that may be confusing if you’re not familiar with the syntax: ! means logical not, so !x evaluates to true if x is falsy and false if x is truthy. Falsy values are 0, '', false, null, undefined, and NaN; truthy values are everything else (1, true, 42, 'potato waffles', 0xFF, etc.)

Thank you so much for the help everyone!

When I read this, it much more completely clicked. I see it now. My understanding is still pretty mucky, but I’m going to experiment. Thank you for taking the time to help me with this!

Also, I’ll be more cognizant of formatting in the future. My apologies!

Hey, I am on the verge of understanding this. I am having a problem/need a clarification with the last part of the ‘j’ loop. the part j += i.

Please tell me if I am correct. When we use the number 2

we go 2 *2 which is the case when we are at (j = I <<1;)

the next part(j <= max) states we should iterate through all the numbers in which case it would be ten(assuming we did not change the number in the freecodecamp example). Then the last part

j = 2 *2 in which case the next number we would iterate through would be 4? am I reading that right? Am I missing something?

I’m not quite sure I understand your questions (a few fingers of bourbon after 6 hours of straight coding probably isn’t helping).

The purpose of the j loop is to mark all the multiples of i.

If we reach the j loop, then we know that i is prime (if (!sieve[i]) {) and we just just added i to the list of primes (primes.push(i);). Now, following the Sieve of Eratosthenes, we’re going to mark all multiples of i as “not prime”, by assigning their array slots as true.

So, for the *for: loop, we start off with j being twice i (the first multiple of i). That is the j = i << 1 (which could have been written as j = i*2).

We’re going to loop and mark while j is less than max - we don’t need to mark above max because the problem only requires us to check up until max.

And on each iteration, we’re going to increment up to the next multiple of i, by adding i to j.

The j loop could also have been written as:

        for (j=2; j*i<=max; j++) {
          sieve[j*i] = true;
        }

Would that be more clearly stepping though the multiples of i? It does the same exact thing, just with different math. But it steps through the same multiples in the same way. It’s just instead of j being the multiple, j is now the multiplier. I think the original would be a hair faster, but they both do the exact same thing.

And remember, we only need to mark multiples of primes - the multiples on nonprimes have already been marked. For example, for the number 6, any multiple of 6 has already been marked first when we marked the multiples of 2 and the multiples of 3. To be clear how this works, I still highly recommend the wikipedia page which has an excellent graphic showing how the multiples of the primes eliminate all the nonprimes.

I think you should step through the code. Whenever I get confused by something like this, I start putting in console.log statements to make sure I understand. Here is a pen of how I would do it.

Okay after a day or two I finally get it. thanks for your explanation. I do have one more question though. Why is it that we have to set up another function instead of just going num.length. like we do with other loops?

I’m not sure I understand? What is the other function?

You seem to be talking about the second statement in the for loop j <= max. But that isn’t a function.

The purpose of the second statement in a for loop definition is to create a condition for the loop to stop. It can be anything - if it evaluates to true, the loop will continue, and once it evaluates to false, then the loop stops.

If you are talking about something else, please show me the code about which you are writing.

Oops sorry about that . I assumed this was also written like Mr. Rafase282 code. He has this function written in another function.

function sumPrimes(num) {
  var res = 0;

  // FUnction to get the primes up to max in an array
  function getPrimes(max) {
    var sieve = [];
    var i;
    var j;
    var primes = [];
    for (i = 2; i <= max; ++i) {
      if (!sieve[i]) {
        // i has not been marked -- it is prime
        primes.push(i);
        for (j = i << 1; j <= max; j += i) {
          sieve[j] = true;
        }
      }
    }

    return primes;
  }

  // Add the primes
  var primes = getPrimes(num);
  for (var p = 0; p < primes.length; p++) {
    res += primes[p];
  }

  return res;
}

After messing around with the code. I came up with this. The reason I like my code is because if you wanted to see the prime numbers in the prime array you can type “return primes” and it shows you the number it holds.

function sumPrimes(num) {
    var res;
    var sieve = [];
    var i;
    var j;
    var primes = [];

    for (i = 2; i <= num; ++i) {
      if (!sieve[i]) {
        // i has not been marked -- it is prime
        primes.push(i);
       
        for (j = i << 1; j <= num; j += i) {
          sieve[j] = true;
          }
         }
       }
    
    var add = primes.reduce(function (a, b) {
     return a + b;
    }, 0);

  return add;
}
sumPrimes(10);

I’ve edited your code to make it readable. Please remember to put a line of three backticks before your code and a like of three backticks after the code.

markdown_Forums

Yeah, they both work.

Why break it off into a separate function, I think was your question?

You don’t have to. Sometimes it makes sense to break different tasks into different functions. Certainly as programs get larger, this makes more sense. If you end up having to repeat the same code in different spots (not the case here) then it makes sense. Also, maybe you come up with a really cool algorithm for generating primes and want to be able to use it again.

Also, as things get bigger, breaking things apart into functions can greatly improve readability. I know that readability may not seem important now, but trust me, the first time you need to go in and fix some code that you wrote 11 months ago and you spend 5 hours just trying to figure out how you did it - then it becomes really important. Now imagine that someone else is having to try to fix your 800 line program. While coding, I always try to keep that idea in the back of my head.

Do we need a separate function here? Not really. With a few comments, this is very readable. But it’s also not wrong to build it that way.

I am trying to get better at the forma thing. I forget it its 2 or 3 back ticks. Anyways that totally makes sense. Thank you so much for all your help. I was worried there for a bit that maybe you wouldn’t be active anymore. Thank’
s again.

1 Like