Thursday, December 5, 2024

Colly callback not receiving variable from go func – Getting Help


Basically, it's like this:

x := "It worked earlier lol"
go func(x string) {
  c.OnHTML(..., func(e *colly.HTMLElement) {
    print(x)
  })
  c.Visit("https://www.website.com")
}(x)

So that's the gist. When I print the variables inside the callback they are blank, but outside of the go func they are defined. Here's the full parent function:

eventCollector.OnHTML(".rgMasterTable tr", func(h *colly.HTMLElement) {
    eventName := h.ChildText("td:nth-child(3) a")
    eventURL := h.ChildAttr("td:nth-child(3) a", "href")
    state := h.ChildText("td:nth-child(2)")
    wgFR.Add(1)             // Increment WaitGroup counter for every goroutine
    semaphore <- struct{}{} // Acquire a token

    go func(eventName, eventURL, state string) {
       defer wgFR.Done() // Signal completion when the goroutine exits
       defer func() { <-semaphore }()
       contestCollector := eventCollector.Clone()
       var postedDateStr string
       contestCollector.OnHTML("#ctl00_ContentPlaceHolder1_FormView1_Report_2Label", func(d *colly.HTMLElement) {
          postedDateStr = d.Text
       })

       contestCollector.OnHTML(".rgMasterTable tr", func(c *colly.HTMLElement) { // troubled line

          contestName := c.ChildText("td:nth-child(1)")
          contestURL := c.ChildAttr("td:nth-child(3) a", "href")
          if contestURL == "" {
             contestURL = "FORMAT-ERROR"
          } // tmp handler for doc style results
          postedDate, timeErr := time.Parse("Jan 2, 2006", postedDateStr)
          if timeErr != nil {
             log.Printf("Error parsing time from %s", eventURL)
          }
          contest := Contest{
             EventName:   eventName,
             ContestName: contestName,
             PostedDate:  postedDate,
             ContestURL:  contestURL,
             State:       state,
             Current:     true,
          }
          fResults = append(fResults, contest)
       })
       err := contestCollector.Visit("https://www.judgingcard.com/Results/" + eventURL)
       if err != nil {
          log.Printf("Couldn't find event: %s -- %s", eventURL, eventName)
       }
       contestCollector.Wait() // Wait for the inner collector to finish
    }(eventName, eventURL, state)
})

In the full version, all variables passed to the go func come back blank inside the callback (contestCollector.OnHTML). Unfortunately I'm not sure whether the issue lies in the goroutine, the fact that it is a callback, or something else.
Thanks in advance!

This seems pretty convoluted. It looks like you are using github.com/gocolly/colly. Why is that section in a goroutine exactly? I'd first try eliminating that. It seems as if the collectors are already using goroutines (I'm just guessing since they have a Wait function; I didn't look at the docs to confirm this).

This code looks like it's missing some stuff (like where are you waiting for wgFR? Where is wgFR even defined?), so off the top of my head, something could be modifying the strings in your outer functions in a way you might not be expecting. In the problematic contestCollector.OnHTML callback you are not passing these values to that function, so it could be falling prey to something like this:

func main() {
	myAwesomeValue := "awesome"
	wg := sync.WaitGroup{}
	wg.Add(1)
	go func() {
		time.Sleep(time.Millisecond)
		fmt.Println("The value is", myAwesomeValue)
		wg.Done()
	}()
	myAwesomeValue = "not awesome"
	wg.Wait()
}

… which prints The value is not awesome.
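
To make the difference concrete, here is a small sketch comparing a goroutine that reads an outer variable through its closure with one that receives the value as an argument. The function names `observeCaptured` and `observeCopied` are made up for this illustration, and a channel replaces the sleep so the ordering does not depend on timing.

```go
package main

import (
	"fmt"
	"sync"
)

// observeCaptured starts a goroutine that reads the outer variable through
// the closure, so it sees whatever the variable holds when it finally runs.
func observeCaptured() string {
	value := "awesome"
	ready := make(chan struct{})
	var seen string
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		<-ready      // wait until the outer function has reassigned value
		seen = value // reads the shared variable, not a copy
	}()
	value = "not awesome"
	close(ready)
	wg.Wait()
	return seen
}

// observeCopied passes the value as an argument, so the goroutine works on
// its own copy, taken when the go statement is evaluated.
func observeCopied() string {
	value := "awesome"
	ready := make(chan struct{})
	var seen string
	var wg sync.WaitGroup
	wg.Add(1)
	go func(v string) {
		defer wg.Done()
		<-ready
		seen = v // the copy is unaffected by the later assignment
	}(value)
	value = "not awesome"
	close(ready)
	wg.Wait()
	return seen
}

func main() {
	fmt.Println("captured:", observeCaptured()) // captured: not awesome
	fmt.Println("copied:", observeCopied())     // copied: awesome
}
```

Note that the goroutine in the question already receives eventName, eventURL, and state as arguments, so this only demonstrates the general pitfall; it may not be the actual cause there.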




Thanks. The Colly framework does have its own manager, which doesn't require defining a new wait group. I've built this test script using what I've learned, and it works as expected.

package main

import (
	"fmt"
	"github.com/gocolly/colly"
)

type Event struct {
	event_date   string
	state        string
	event_name   string
	event_url    string
	contest_name string
	contest_url  string
}

func main() {
	var events []Event

	ECollector := colly.NewCollector(colly.Async(true))
	ECollector.Limit(&colly.LimitRule{DomainGlob: "*", Parallelism: 2})

	ECollector.OnHTML(".rgMasterTable tr", func(e *colly.HTMLElement) {
		if e.Index > 5 {
			return
		}

		date := e.ChildText("td:nth-child(1)")
		state := e.ChildText("td:nth-child(2)")
		title := e.ChildText("td:nth-child(3)")
		url := e.ChildAttr("td:nth-child(3) a", "href")

		event := Event{
			event_date: date,
			state:      state,
			event_name: title,
			event_url:  url,
		}
		events = append(events, event)
	})

	ECollector.Visit("https://www.judgingcard.com")
	ECollector.Wait()

	CCollector := ECollector.Clone()

	for _, event := range events {
		CCollector.OnHTML(".rgMasterTable tr", func(c *colly.HTMLElement) {
			// skip header line
			if c.Index == 0 || c.Request.URL.String() == "https://www.judgingcard.com/Results/default.aspx" {
				return
			}
			// More contests than events / build new struct
			contest_name := c.ChildText("td:nth-child(1)")
			contest_url := c.ChildAttr("td:nth-child(3) a", "href")

			fmt.Println(contest_name, contest_url)
		})
		CCollector.Visit("https://www.judgingcard.com" + event.event_url)
	}
	CCollector.Wait()
}
