dschmid Posted July 15, 2014

Hi, I'm new to uBot and trying to follow the tutorial videos. I'm now at video 5 (my problem is at 14:30). It's about scraping Google search results. My problem is that Google changed how the search results are delivered, and the method described in the video does not work for me. Here is a screenshot of the advanced element editor. As you can see, there is no class attribute: http://screencast.com/t/q0HnY6Ww

Any idea how I can get all the search results into my list? Thanks for any tips.
sunny9495 Posted July 17, 2014

Yeah, can anyone please give a solution for this?
JohnB Posted July 26, 2014

What are you trying to scrape exactly?
dschmid Posted July 28, 2014 (Author)

Hi, yes, that's right, I'm trying to scrape the links from the SERPs. It's not working as described in the video. Does anybody have a working solution for this?
Code Docta (Nick C.) Posted July 28, 2014

Try this:

clear list(%scrape url)
add list to list(%scrape url, $scrape attribute(<class=w"_zd*">, "innertext"), "Delete", "Global")

Works here. See, the <cite> class is _zd, so make the rest a wildcard, because the class names all differ except for the _zd part. Use a web inspector, either uBot's or Firebug for Firefox, and look at what is the same for each element you wish to scrape, then wildcard the rest. So look at a few to see where they are the same or different.

TC
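For anyone trying to follow the wildcard idea outside uBot, here is a minimal Python sketch of the same logic; the class names below are made-up examples, and only the shared _zd prefix matters:

```python
import fnmatch

# Hypothetical class attributes observed on result elements.
# They differ after the "_zd" prefix, so we wildcard the rest,
# just as the uBot selector <class=w"_zd*"> does.
classes = ["_zd3 result", "_zdA result", "_zdQ result", "nav-link"]

# Keep only the entries matching the wildcard pattern.
matches = [c for c in classes if fnmatch.fnmatch(c, "_zd*")]
```

The point is simply that a shell-style wildcard pins down the stable part of the attribute and ignores the varying suffix.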
sunny9495 Posted July 29, 2014

This code didn't work. Can you please help on how to scrape URLs from Google?
Code Docta (Nick C.) Posted July 29, 2014

You are correct, hmmm, 'twas working a moment ago. I am working on it. It's been a while since I used this method (the advanced editor); I usually use HTTP and/or XPath nowadays.

TC
Code Docta (Nick C.) 639 Posted July 29, 2014 Report Share Posted July 29, 2014 here you go I looked at my gttp and remembered how I did it was like months ago but still good stuff and this should work for you fuys....remember just a quick example here clear list(%keywords) add list to list(%keywords, $list from text("purple green yellow blue red", $new line), "Delete", "Global") clear list(%scrape url) set user agent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0") loop($list total(%keywords)) { set(#KW_next_item, $next list item(%keywords), "Global") clear cookies navigate("https://www.google.com/", "Wait") wait for element(<innertext="Google Search">, 10, "Appear") type text(<name="q">, "{#KW_next_item} balloons", "Standard") click(<name="btnK">, "Left Click", "No") wait($rand(3, 10)) wait for element(<innertext="Help">, 10, "Appear") add list to list(%scrape url, $list from text($find regular expression($scrape attribute(<class="r">, "outerhtml"), "(?<=<h3 class=\"r\"><a href=\").*?(?=\" onm)"), $new line), "Delete", "Global") } ui stat monitor("urls: {$list total(%scrape url)}", "") hope that helps, TC Quote Link to post Share on other sites
Code Docta (Nick C.) 639 Posted July 29, 2014 Report Share Posted July 29, 2014 here you go I looked at my http code and remembered how I did it was like months ago but still good stuff and this should work for you fuys....remember just a quick example here to show page scrape not navigate and all other stuff clear list(%keywords) add list to list(%keywords, $list from text("purple green yellow blue red", $new line), "Delete", "Global") clear list(%scrape url) set user agent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0") loop($list total(%keywords)) { set(#KW_next_item, $next list item(%keywords), "Global") clear cookies navigate("https://www.google.com/", "Wait") wait for element(<innertext="Google Search">, 10, "Appear") type text(<name="q">, "{#KW_next_item} balloons", "Standard") click(<name="btnK">, "Left Click", "No") wait($rand(3, 10)) wait for element(<innertext="Help">, 10, "Appear") add list to list(%scrape url, $list from text($find regular expression($scrape attribute(<class="r">, "outerhtml"), "(?<=<h3 class=\"r\"><a href=\").*?(?=\" onm)"), $new line), "Delete", "Global") } ui stat monitor("urls: {$list total(%scrape url)}", "") hope that helps, TC 1 Quote Link to post Share on other sites
sunny9495 42 Posted July 29, 2014 Report Share Posted July 29, 2014 A bit complicated for a newbie..but it works fine..i need your suggestion..can you tell me where can i learn more about ubot?here you go I looked at my http code and remembered how I did it was like months ago but still good stuff and this should work for you fuys....remember just a quick example here to show page scrape not navigate and all other stuff clear list(%keywords) add list to list(%keywords, $list from text("purple green yellow blue red", $new line), "Delete", "Global") clear list(%scrape url) set user agent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0") loop($list total(%keywords)) { set(#KW_next_item, $next list item(%keywords), "Global") clear cookies navigate("https://www.google.com/", "Wait") wait for element(<innertext="Google Search">, 10, "Appear") type text(<name="q">, "{#KW_next_item} balloons", "Standard") click(<name="btnK">, "Left Click", "No") wait($rand(3, 10)) wait for element(<innertext="Help">, 10, "Appear") add list to list(%scrape url, $list from text($find regular expression($scrape attribute(<class="r">, "outerhtml"), "(?<=<h3 class=\"r\"><a href=\").*?(?=\" onm)"), $new line), "Delete", "Global") } ui stat monitor("urls: {$list total(%scrape url)}", "") hope that helps, TC Quote Link to post Share on other sites
Code Docta (Nick C.) 639 Posted July 29, 2014 Report Share Posted July 29, 2014 http://ubotstudio.com/tutorials http://wiki.ubotstudio.com/wiki/Main_Page put this in your favorite search engine site:udotstudio.com then what ever you are looking to learn like this site:ubotstudio.com how to scrape TC Quote Link to post Share on other sites
sunny9495 42 Posted July 29, 2014 Report Share Posted July 29, 2014 Thanks Traffik Cop i would like to add you as a friend,please accept me.http://ubotstudio.com/tutorials http://wiki.ubotstudio.com/wiki/Main_Page put this in your favorite search engine site:udotstudio.com then what ever you are looking to learn like this site:ubotstudio.com how to scrape TC Quote Link to post Share on other sites