UBot Underground

Help With Xpath On Google Local Pack Url


Recommended Posts

Hi,

I'm trying to scrape results from Google using XPath, but I'm struggling with one issue.

As you can see below I am able to scrape:

List01 - top 10 Google search results

List02 - Google Local Pack titles

List03 - Google AdWords URLs

but I am also trying to scrape the URLs associated with List02 into List04.

 

The XPath that I am applying below works when using a tool called SeoTools for Excel (which allows using XPath to scrape data), but it does not seem to be working here.  Any insights would be greatly appreciated.

clear list(%list01)
clear list(%list02)
clear list(%list03)
clear list(%list04)
navigate("https://www.google.com/?gws_rd=ssl","Wait")
wait($rand(3,5))
type text(<name="q">,"Divorce Lawyer Naperville","Standard")
click(<name="btnK">,"Left Click","No")
add list to list(%list01,$plugin function("XpathPlugin.dll", "$Generic Xpath Parser", $document text, "//h3[@class=\'r\']/a", "href", "True"),"Delete","Global")
add list to list(%list02,$plugin function("XpathPlugin.dll", "$Generic Xpath Parser", $document text, "//div[@aria-level=\'3\']", "innertext", "True"),"Delete","Global")
add list to list(%list03,$plugin function("XpathPlugin.dll", "$Generic Xpath Parser", $document text, "//div[@class=\'ads-visurl\']/cite", "innertext", "True"),"Delete","Global")
add list to list(%list04,$plugin function("XpathPlugin.dll", "$Generic Xpath Parser", $document text, "//*[@id=\'rso\']/div[1]/div/div/div[2]/div/div[4]/div[1]/div/div/div/a[2]", "href", "True"),"Delete","Global")
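As an aside, one way to sanity-check an XPath expression like the List04 one outside UBot is to run it against saved page HTML in Python. A common pitfall is the quoting of attribute values such as 'rso': inside a double-quoted XPath string, the value needs single quotes, and doubled double-quotes produce an invalid query. The markup below is a simplified stand-in for the result container, not Google's live DOM:

```python
import xml.etree.ElementTree as ET

# Simplified stand-in markup; the structure is an assumption for
# illustration only, not Google's actual result HTML.
snippet = """<html><body>
<div id="rso">
  <div class="local-result"><a href="https://www.collaborativelawrlh.com/">Website</a></div>
</div>
</body></html>"""

root = ET.fromstring(snippet)

# Single quotes around 'rso' inside the double-quoted path string --
# this is the quoting style the XPath parser expects.
rso = root.find(".//div[@id='rso']")

# Collect the href of every anchor under the container.
urls = [a.get("href") for a in rso.iter("a")]
print(urls)
```

ElementTree only supports a limited XPath subset, but it is enough to verify that the quoting and the general path shape are sound before pasting the expression into the plugin call.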

Thanks!
Chris

Edited by christojuan

The first three lists are working for me with those XPaths using Chrome 49. What is your user agent in UBot? Mine is:

Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36

Hey Nick - thanks for the reply.  The first three lists work fine for me too.  The issue is that List04 is not capturing the URLs associated with each of the three listings in the Google Local Pack.

BTW, I am using Chrome 49, and I tried using the user agent you referenced, but no go :(

 

I'm trying to get each of the three URLs, like the first one in this example: https://www.screencast.com/t/hiZZGuACaLe where you can see https://www.collaborativelawrlh.com/

 

Is it possible to pull that with XPath?
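In general, yes: a relative XPath keyed on each listing's container tends to be more robust than a long absolute path like the List04 one, and it also lets you pair each title with its URL. A rough Python sketch of the idea (every tag name, class, and attribute here is a placeholder; Google's real local-pack markup differs and changes often):

```python
import xml.etree.ElementTree as ET

# Placeholder markup standing in for the local pack; only the aria-level
# attribute mirrors the thread's working List02 XPath, the rest is assumed.
snippet = """<div id="local-pack">
  <div class="listing">
    <div aria-level="3">Example Law Office</div>
    <a class="site" href="https://www.collaborativelawrlh.com/">Website</a>
  </div>
</div>"""

root = ET.fromstring(snippet)

pairs = []
# Walk each listing container, then take the title and link relative to it,
# so titles and URLs stay matched even if one listing lacks a website link.
for listing in root.findall(".//div[@class='listing']"):
    title = listing.find("./div[@aria-level='3']")
    link = listing.find("./a[@class='site']")
    if title is not None and link is not None:
        pairs.append((title.text, link.get("href")))
print(pairs)
```

The same relative approach should translate to the plugin call: match the listing container first, then select the anchor inside it, rather than counting divs from the top of the page.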

 

Any help/insights would be appreciated.

Chris

