Can I scrape this?

I’m coding a web scraping CLI app in Ruby and am not sure which CSS selectors I can use to scrape the content in the picture attached.

It’s also the first time I’ve encountered “data-reactid” and I’m not sure how to pull the string “Africa” that comes after it. Any insight would be appreciated.

Site I’m trying to scrape: https://www.surfline.com/surf-reports-forecasts-cams#africa

Thanks!
Emily

data-reactid is just a data attribute

The only problem is that “28” is the value of the “data-reactid” attribute.

For example, if I were to scrape using doc.search('a').attr('data-reactid'), it would yield “28”, not “Africa”.

Do you happen to know of a css selector, or some other selector, that would retrieve “Africa”?

Are you using an XML/HTML parser with Ruby? If so, which one?

Yes, I’m using Nokogiri

Can you provide a little more code so I can understand your current structure for the parsing?

I was reading through the Nokogiri documentation and it mentions a text method, so maybe try:

 doc.search('a').text

Sure! Thanks so much for your help. I have tried .text, but it still returns just the numerical value. Here’s what I have for this section:

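require 'nokogiri'
require 'open-uri'

continents = []
doc = Nokogiri::HTML(URI.open("https://www.surfline.com/surf-reports-forecasts-cams/"))
# Reads the data-reactid attribute of the first link; .text then gives that attribute's value
continent.name = doc.search('a').attr('data-reactid').text
continents << continent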

Without the attribute call, it returns the text of every link on the page as one string.

doc.search('a').text
=> "Ad-free Cams. Expert Forecast Analysis. GO PREMIUM.✕Surfline surf reports, surf forecasts and cams.Cams & Rep
ortsSign in or create accountAll Surf SpotsMapQuad CamForecastsSign in or create accountWave BuoysChartsLOLA Clas
sicMeet the ForecastersNewsAllBreakingFeaturesVideoTravelGearTrainingScienceGo PremiumSign InCamsForecastsFavorit
esNewsFavoritesBreezy Point1-2ftAdd favoritesQuickly access the spots you care about most.AfricaAsiaEuropeNorth A
mericaOceaniaSouth AmericaAfricaAsiaEuropeNorth AmericaOceaniaSouth AmericaAlgeria Surf Reports & CamsAngola Surf
 Reports & CamsCape Verde Surf Reports & CamsGhana Surf Reports & CamsIvory Coast Surf Reports & CamsLiberia Surf
 Reports & CamsMadagascar Surf Reports & CamsMauritius Surf Reports & CamsMorocco Surf Reports & CamsMozambique S
urf Reports & CamsNamibia Surf Reports & CamsNigeria Surf Reports & CamsSenegal Surf Reports & CamsSouth Africa S
urf Reports & CamsSwaziland Surf Reports & CamsSão Tomé and Príncipe Surf Reports & CamsChina Surf Reports & Cams
Hong Kong Surf Reports & CamsIndia Surf Reports & CamsIndonesia Surf Reports & CamsIsrael Surf Reports & CamsJapa
n Surf Reports & CamsLebanon Surf Reports & CamsMalaysia Surf Reports & CamsMaldives Surf Reports & CamsPalestine
 Surf Reports & CamsPhilippines Surf Reports & CamsRussia Surf Reports & CamsSouth Korea Surf Reports & CamsSri L
anka Surf Reports & CamsTaiwan Surf Reports & CamsThailand Surf Reports & CamsTurkey Surf Reports & CamsUnited Ar
ab Emirates Surf Reports & CamsBelgium Surf Reports & CamsBulgaria Surf Reports & CamsDenmark Surf Reports & Cams
Finland Surf Reports & CamsFrance Surf Reports & CamsGermany Surf Reports & CamsGreece Surf Reports & CamsGuernse
y Surf Reports & CamsIceland Surf Reports & CamsIreland Surf Reports & CamsItaly Surf Reports & CamsJersey Surf R
eports & CamsLatvia Surf Reports:

I can work with this!

doc.search('div.quiver-world-taxonomy__continents').text
=> "AfricaAsiaEuropeNorth AmericaOceaniaSouth AmericaAfricaAsiaEuropeNorth AmericaOceaniaSouth America"