If I use the sort method to sort an array, I know the order of the elements will be rearranged based on alphabetic order in case the elements are strings;
I have a solid understanding of how the sort method works, and I can see it’s not completely random, that’s why people would use the following code:
arr.sort(() => 0.5 - Math.random());
With the above code the arr will be rearranged randomly and problem solved, but what’s the logic here? I’d like to know what’s going on and not just use a code blindly, but I don’t understand this part 0.5 - Math.random()
Math.random() returns any number from 0 to 0.99999… but why subtract 0.5 from it?
In the above scenario the sort method does even try to create a random ordering of the elements. In fact, it is a very specific order. The numbers are treated as strings and then sorted in lexical order, so in the following array “100” would come before “2” and “40000” would come before “500” after the sort.
The above implementation of using sort with the callback returning 0.5 - Math.random() is not the preferred way, because it can behave very differently between browsers based on how the browser implements the sort behind-the-scenes.
The best way to randomly shuffle an array is to use the Fisher-Yates (aka Knuth) Shuffle algorithm. Seethis SO thread for more information.
EDIT: There is a great read here on how Microsoft almost got sued for using the sort method you first mentioned.
I was curious how inefficient the Math.random approach was so I wrote a pen. (Ignoring for the moment a debate about the mathematics of randomness.)
const numItems = 10000
const numPasses = 10
const results = { fy: [], rand: [] }
const arrProto = []
const shuffle = (array) => {
let currentIndex = array.length, temporaryValue, randomIndex
let count = 0
while (0 !== currentIndex) {
count++
randomIndex = Math.floor(Math.random() * currentIndex)
currentIndex -= 1
temporaryValue = array[currentIndex]
array[currentIndex] = array[randomIndex]
array[randomIndex] = temporaryValue
}
return { array, count }
}
for (let i = 1; i <= numItems; i++) {
arrProto.push(Math.random())
}
/*************/
for (let i = 0; i < numPasses; i++) {
let count
let arr1 = [...arrProto]
let arr2 = [...arrProto]
count = 0
arr1 = arr1.sort((a, b) => {
count++
return 0.5 - Math.random()
})
results.rand.push(count)
results.fy.push(shuffle(arr2).count)
}
// yes, it's silly to take avgFy, it will always be the same as the number of elements
const avgFy = results.fy.reduce((a, c) => a + c)/results.fy.length
const avgRand = results.rand.reduce((a, c) => a + c)/results.rand.length
console.log('\n\n\nAverage iterations for FY =', avgFy)
console.log('Average iterations for rand =', avgRand)
console.log('\nWith ', numItems, 'elements and ', numPasses, ' trials,')
console.log('rand took', (avgRand/avgFy), 'times longer than fy.')
It’s interesting that it gets longer each time (that’s what she said). In small data sets it might not matter, but certainly would in large data sets. For 10M elements, it was 20X slower. (That last test took quite a while.)