Sort and Math.random, need simple explanation

Sort and Math.random, need simple explanation
0

#1

Given the following arr:

const arr = ['Jane', 'Kelvin', 'Gabriel', 'Bob', 'Alex'];

If I use the sort method to sort an array, I know the order of the elements will be rearranged based on alphabetic order in case the elements are strings;

console.log(arr.sort()); // output: Alex, Bob, Gabriel, Jane, Kelvin

If the elements are numbers:

const arrNumbers = [20, 1, 17, 30, 28];
console.log(arrNumbers.sort()); // output: 1, 17, 20, 28, 30

I have a solid understanding of how the sort method works, and I can see it’s not completely random, that’s why people would use the following code:

arr.sort(() => 0.5 - Math.random());

With the above code the arr will be rearranged randomly and problem solved, but what’s the logic here? I’d like to know what’s going on and not just use a code blindly, but I don’t understand this part 0.5 - Math.random()

Math.random() returns any number from 0 to 0.99999… but why subtract 0.5 from it?

Thanks in advance.


#2

In the above scenario the sort method does even try to create a random ordering of the elements. In fact, it is a very specific order. The numbers are treated as strings and then sorted in lexical order, so in the following array “100” would come before “2” and “40000” would come before “500” after the sort.

const arrNumbers = [2, 100, 500, 4000];
console.log(arrNumbers.sort()); // output: [ 100, 2, 4000, 500 ]

The above implementation of using sort with the callback returning 0.5 - Math.random() is not the preferred way, because it can behave very differently between browsers based on how the browser implements the sort behind-the-scenes.

The best way to randomly shuffle an array is to use the Fisher-Yates (aka Knuth) Shuffle algorithm. Seethis SO thread for more information.

EDIT: There is a great read here on how Microsoft almost got sued for using the sort method you first mentioned.


#3

Ok I will read this article, thanks.


#4

I was curious how inefficient the Math.random approach was so I wrote a pen. (Ignoring for the moment a debate about the mathematics of randomness.)

const numItems = 10000
const numPasses = 10
const results = { fy: [], rand: [] }
const arrProto = []

const shuffle = (array) => {
  let currentIndex = array.length, temporaryValue, randomIndex
  let count = 0
  while (0 !== currentIndex) {
    count++
    randomIndex = Math.floor(Math.random() * currentIndex)
    currentIndex -= 1
    temporaryValue = array[currentIndex]
    array[currentIndex] = array[randomIndex]
    array[randomIndex] = temporaryValue
  }
  return { array, count } 
}


for (let i = 1; i <= numItems; i++) {
  arrProto.push(Math.random())
}

/*************/

for (let i = 0; i < numPasses; i++) {
  let count
  let arr1 = [...arrProto]
  let arr2 = [...arrProto]
  
  count = 0
  arr1 = arr1.sort((a, b) => {
    count++
    return 0.5 - Math.random()
  })
  
  results.rand.push(count)
  
  results.fy.push(shuffle(arr2).count)
}

// yes, it's silly to take avgFy, it will always be the same as the number of elements
const avgFy = results.fy.reduce((a, c) => a + c)/results.fy.length
const avgRand = results.rand.reduce((a, c) => a + c)/results.rand.length

console.log('\n\n\nAverage iterations for FY =', avgFy)
console.log('Average iterations for rand =', avgRand)

console.log('\nWith ', numItems, 'elements and ', numPasses, ' trials,')
console.log('rand took', (avgRand/avgFy), 'times longer than fy.')

It’s interesting that it gets longer each time (that’s what she said). In small data sets it might not matter, but certainly would in large data sets. For 10M elements, it was 20X slower. (That last test took quite a while.)


#5

Got curious about how well the given algorithms shuffle. Here is a test

https://repl.it/@RoyGunhooGunhoo/Shuffle-test

I’m not so sure about the correctness of the test, but it is testing like this:

  • Get n shuffled arrays using x shuffling method of m unique element base array.
  • Compute average similarity between n shuffled arrays.