christojuan Posted September 4, 2017 (edited)

Hi, I'm trying to scrape results from Google using XPath, but I'm struggling with one issue. As you can see below, I am able to scrape:

List01 - top 10 Google search results
List02 - Google Local Pack titles
List03 - Google AdWords URLs

but I am also trying to scrape the URLs associated with List02 into List04. The XPath I am applying below works in a tool called SeoTools for Excel (which allows use of XPath to scrape data), but it does not seem to work here. Any insights would be greatly appreciated.

clear list(%list01)
clear list(%list02)
clear list(%list03)
clear list(%list04)
navigate("https://www.google.com/?gws_rd=ssl","Wait")
wait($rand(3,5))
type text(<name="q">,"Divorce Lawyer Naperville","Standard")
click(<name="btnK">,"Left Click","No")
add list to list(%list01,$plugin function("XpathPlugin.dll", "$Generic Xpath Parser", $document text, "//h3[@class=\'r\']/a", "href", "True"),"Delete","Global")
add list to list(%list02,$plugin function("XpathPlugin.dll", "$Generic Xpath Parser", $document text, "//div[@aria-level=\'3\']", "innertext", "True"),"Delete","Global")
add list to list(%list03,$plugin function("XpathPlugin.dll", "$Generic Xpath Parser", $document text, "//div[@class=\'ads-visurl\']/cite", "innertext", "True"),"Delete","Global")
add list to list(%list04,$plugin function("XpathPlugin.dll", "$Generic Xpath Parser", $document text, "//*[@id=\'rso\']/div[1]/div/div/div[2]/div/div[4]/div[1]/div/div/div/a[2]/@href", "href", "True"),"Delete","Global")

Thanks!
Chris

Edited September 4, 2017 by christojuan
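The first three XPath extractions above can be sketched outside UBot as well. Below is a minimal Python illustration using the standard library's `xml.etree.ElementTree`; the HTML snippet, domain names, and result values are hand-made stand-ins (real Google pages are not well-formed XML and would need an HTML parser such as `lxml.html` instead).

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed stand-in for part of a Google results page
# (2017-era markup; the domains here are illustrative, not real results).
html = """
<body>
  <h3 class="r"><a href="https://example-firm-one.com/">Example Firm One</a></h3>
  <h3 class="r"><a href="https://example-firm-two.com/">Example Firm Two</a></h3>
  <div class="ads-visurl"><cite>www.example-ad.com</cite></div>
</body>
"""

root = ET.fromstring(html)

# Equivalent of //h3[@class='r']/a -> href  (organic result URLs, List01)
organic = [a.get("href") for a in root.findall(".//h3[@class='r']/a")]

# Equivalent of //div[@class='ads-visurl']/cite -> innertext  (ad display URLs, List03)
ads = [c.text for c in root.findall(".//div[@class='ads-visurl']/cite")]

print(organic)  # ['https://example-firm-one.com/', 'https://example-firm-two.com/']
print(ads)      # ['www.example-ad.com']
```

This is only a sketch of the extraction logic; the class names (`r`, `ads-visurl`) mirror Google's markup at the time of the thread and change frequently.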
HelloInsomnia Posted September 4, 2017

The first three lists are working for me with those XPaths using Chrome 49. What is your user agent in UBot? Mine is:

Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36
christojuan (Author) Posted September 4, 2017

Hey Nick - thanks for the reply. The first three lists work fine for me too. The issue is that List04 is not capturing the URLs associated with each of the three listings in the Google Local Pack. BTW, I am using Chrome 49, and I tried the user agent you referenced, but no go.

I'm trying to get each of the three URLs like the first one in this example: https://www.screencast.com/t/hiZZGuACaLe where you can see https://www.collaborativelawrlh.com/

Is it possible to pull that with XPath? Any help/insights would be appreciated.
Chris
Marani Posted September 4, 2017

This works to get the three URLs you want using XPath:

//div[@class="_M4k"]//a[contains(.,"Website")]/@href

You can test it here: http://videlibri.sourceforge.net/cgi-bin/xidelcgi
christojuan (Author) Posted September 5, 2017

That worked perfectly! Thanks, Marani!