Hello,
I’m wondering why my algorithm for generating data doesn’t perform better when two goroutines are started to generate the data.
I ran the test on a multi-core machine, which means the two goroutines can run in parallel. Each of these two goroutines only generates half of the total amount of data. I therefore assumed that the total running time should be about half as long compared to a single goroutine.
It would be great if somebody could review the attached code sample below. There might be something wrong with my approach, but I couldn’t identify it.
Thanks!
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

func gendata(numGoRoutines, numElements int) []int {
	var wg sync.WaitGroup
	wg.Add(numGoRoutines)
	rand.Seed(time.Now().UnixNano())
	ssi := make([][]int, numGoRoutines)
	for i := 0; i < numGoRoutines; i++ {
		ssi[i] = make([]int, 0, numElements/numGoRoutines)
		go func(i int) {
			defer wg.Done()
			for j := 0; j < numElements/numGoRoutines; j++ {
				ssi[i] = append(ssi[i], rand.Intn(10))
			}
		}(i)
	}
	wg.Wait()
	// Create the result by appending all the sub-slices into one result slice
	var res []int
	for i := 0; i < numGoRoutines; i++ {
		res = append(res, ssi[i]...)
	}
	return res
}
func main() {
	n := 100000000
	// One goroutine
	g := 1
	t1 := time.Now()
	gendata(g, n)
	t2 := time.Now()
	fmt.Println("Data generation of", n, "elements with", g, "goroutine(s) took", t2.Sub(t1))
	// Two goroutines
	g = 2
	t1 = time.Now()
	gendata(g, n)
	t2 = time.Now()
	fmt.Println("Data generation of", n, "elements with", g, "goroutine(s) took", t2.Sub(t1))
}
math/rand produces a predictable sequence of pseudo-random values. To ensure that successive calls to math/rand.Intn always produce the same values, the global source is protected with a mutex, so adding goroutines won’t help.
Instead, use math/rand.NewSource and pass that to rand.New for each goroutine separately. You can’t share a rand.Source between goroutines because only the global source is safe for concurrent use. That should allow multiple cores to generate values.
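A minimal sketch of that advice (the helper name draw and the goroutine count of two are just for illustration):

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// draw returns n pseudo-random ints in [0,10) from a private source,
// so concurrent callers never contend on math/rand's global mutex.
func draw(seed int64, n int) []int {
	r := rand.New(rand.NewSource(seed)) // per-goroutine Rand; not safe to share
	vals := make([]int, n)
	for i := range vals {
		vals[i] = r.Intn(10)
	}
	return vals
}

func main() {
	done := make(chan []int, 2)
	for i := 0; i < 2; i++ {
		go func(i int) {
			// Each goroutine seeds and owns its own Rand.
			done <- draw(time.Now().UnixNano()+int64(i), 5)
		}(i)
	}
	for i := 0; i < 2; i++ {
		fmt.Println(<-done)
	}
}
```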
That’s a good point. It improved the performance. Now, the runtime for the data generation by one goroutine and for the data generation by two goroutines is almost equal.
However, I expected the runtime to be roughly halved with the two-goroutine approach.
I also tested with a fixed value for the data to be appended to the slice, e.g.
ssi[i] = append(ssi[i], 1)
The result is the same. The runtime is the same for both variants. It seems there is something else which influences the runtime of the two goroutines. I still wonder why this happens…
I recommend looking into the runtime/pprof package to investigate where the time is being spent.
I did more research and was able to further improve the runtime of the function.
In addition to the change regarding the random source (thanks for that hint), I’ve changed the following two things:
- The calculation of the exit criterion of the for-loop is performed only once before entering the loop
- New elements are inserted into the slice using the appropriate slice indices
func gendata(numGoRoutines, numElements int) []int {
	var wg sync.WaitGroup
	wg.Add(numGoRoutines)
	ssi := make([][]int, numGoRoutines)
	for i := 0; i < numGoRoutines; i++ {
		ssi[i] = make([]int, numElements/numGoRoutines)
		go func(i int) {
			defer wg.Done()
			s := rand.NewSource(time.Now().UnixNano())
			r := rand.New(s)
			cond := numElements / numGoRoutines
			for j := 0; j < cond; j++ {
				ssi[i][j] = r.Intn(10)
			}
		}(i)
	}
	wg.Wait()
	// Create the result by appending all the sub-slices into one result slice
	var res []int
	for i := 0; i < numGoRoutines; i++ {
		res = append(res, ssi[i]...)
	}
	return res
}
Of course! You’re right; I’m annoyed at myself for not noticing this! Every time you append to the slice, you’re rewriting the new length back into ssi[i], which is adjacent in memory to ssi[i±1], likely resulting in false sharing, which can nullify any benefit of parallelism.
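One way to keep the convenience of append while avoiding those per-iteration writes to the adjacent slice headers, sketched here as a hypothetical gendataLocal variant, is to have each goroutine append to a goroutine-local slice and publish it to the shared slice exactly once:

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// gendataLocal is a variant of gendata where each goroutine appends to a
// goroutine-local slice, so the shared ssi headers are written only once.
func gendataLocal(numGoRoutines, numElements int) []int {
	var wg sync.WaitGroup
	wg.Add(numGoRoutines)
	ssi := make([][]int, numGoRoutines)
	per := numElements / numGoRoutines
	for i := 0; i < numGoRoutines; i++ {
		go func(i int) {
			defer wg.Done()
			r := rand.New(rand.NewSource(time.Now().UnixNano() + int64(i)))
			local := make([]int, 0, per) // local header: appends never touch ssi
			for j := 0; j < per; j++ {
				local = append(local, r.Intn(10))
			}
			ssi[i] = local // single write to the shared slice of headers
		}(i)
	}
	wg.Wait()
	var res []int
	for i := 0; i < numGoRoutines; i++ {
		res = append(res, ssi[i]...)
	}
	return res
}

func main() {
	fmt.Println(len(gendataLocal(2, 1000)))
}
```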