Legend, posted February 10, 2012:

I could have sworn there was a simple Google search URL scraper here (I actually thought there were a few), but now I can't find it. Far be it from me to reinvent the wheel, so would someone kindly direct me to it? I need to modify it to scrape image URLs, and the one I put together is not collecting all the URLs properly. Thanks!

ui text box("Keyword", #keyword)
navigate("http://images.google.com/", "Wait")
wait(3)
type text(<name="q">, #keyword, "Standard")
click(<name="btnG">, "Left Click", "No")
wait(3)
add list to list(%urls, $scrape attribute(<class="rg_i">, "fullsrc"), "Delete", "Global")

JohnB, posted February 10, 2012:

Hey Duane. I've done so many variations of these that I don't know which is which anymore. I believe this one scrapes the green URLs (the ones in the cite tag), but I'm not positive!

ui text box("Keyword: ", #kw)
clear list(%urls)
ui stat monitor("Total Scraped", $list total(%urls))
reset account("Any")
navigate("http://www.google.com", "Wait")
type text(<name="q">, #kw, "Standard")
click(<name="btng">, "Left Click", "No")
wait(2)
loop(10) {
    add list to list(%urls, $scrape attribute(<class="f kv">, "outertext"), "Delete", "Global")
    click(<id="pnnext">, "Left Click", "No")
    wait(4)
}
save to file("{$special folder("Desktop")}/urls.txt", %urls)

John

mariah, posted February 11, 2012 (edited):

I don't think that works for images, John. (My first attempt at replying failed.) I suppose Duane is trying to scrape URLs from Google Images, where the links have JavaScript all over them, and you are showing a text-results scraper sample. Cheers.

navigate("http://images.google.com/", "Wait")

JohnB, posted February 11, 2012:

What do you mean, mariah?

John

Legend, posted February 11, 2012:

Thanks John! That's what I was originally looking for! I thought it might help as a model to scrape http://images.google.com/ for image URLs. The results I'm getting are wacky, though. I don't know where rows 1-14 (data:image...) are coming from, and there are huge gaps in the data that I can't seem to scrape.

ui text box("Keyword: ", #kw)
ui stat monitor("Total Scraped", $list total(%results))
reset account("Any")
clear list(%results)
clear table(&final)
navigate("http://images.google.com/", "Wait")
type text(<name="q">, #kw, "Standard")
click(<login button>, "Left Click", "No")
wait(3)
click(<id="smb">, "Left Click", "No")
wait(10)
add list to list(%results, $scrape attribute(<class="rg_i">, "src"), "Don\'t Delete", "Global")
save to file("{$special folder("Application")}/portraits.csv", &final)

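One possible explanation for the data:image... rows, offered as an assumption rather than something confirmed in this thread: the "src" attribute of some thumbnails may hold an inline base64 data: URI instead of a web address. A minimal Python sketch of telling the two apart (the function name and the sample values are hypothetical, for illustration only):

```python
from urllib.parse import urlparse


def is_web_url(value):
    """Return True for http(s) URLs; False for data: URIs, blanks, and anything else."""
    return urlparse(value.strip()).scheme in ("http", "https")


# Hypothetical sample of scraped "src" values, not real scrape output.
scraped = [
    "data:image/jpeg;base64,/9j/4AAQSkZJRg==",
    "",
    "http://t0.gstatic.com/images?q=tbn:EXAMPLE",
]

# Keep only the entries that are real web addresses.
web_urls = [value for value in scraped if is_web_url(value)]
```

Checking the URL scheme is stricter than a substring match, which may matter if other hosts need to be kept later.
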
JohnB, posted February 11, 2012:

You can lose all the blank entries by deleting duplicates:

ui text box("Keyword: ", #kw)
ui stat monitor("Total Scraped", $list total(%results))
reset account("Any")
clear list(%results)
clear table(&final)
navigate("http://images.google.com/", "Wait")
type text(<name="q">, #kw, "Standard")
click(<login button>, "Left Click", "No")
wait(3)
click(<id="smb">, "Left Click", "No")
wait(10)
add list to list(%results, $scrape attribute(<class="rg_i">, "fullsrc"), "Delete", "Global")
save to file("{$special folder("Application")}/portraits.csv", &final)

John

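For comparison, here is a small Python sketch of order-preserving de-duplication. With plain de-duplication, many blank entries would collapse into a single blank rather than vanish, so this sketch also drops blanks explicitly. Whether uBot's "Delete" option treats blanks the same way is an assumption, and the function name is hypothetical.

```python
def dedupe_nonblank(items):
    """Drop blank entries and duplicates while keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        # Skip empty strings and anything already collected.
        if item and item not in seen:
            seen.add(item)
            result.append(item)
    return result
```
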
JohnB, posted February 11, 2012:

Here, this should work for you:

ui text box("Keyword: ", #kw)
ui stat monitor("Total Scraped", $list total(%results2))
reset account("Any")
clear list(%results)
clear list(%results2)
clear table(&final)
navigate("http://images.google.com/", "Wait")
type text(<name="q">, #kw, "Standard")
click(<login button>, "Left Click", "No")
wait(3)
click(<id="smb">, "Left Click", "No")
wait(10)
add list to list(%results, $scrape attribute(<class="rg_i">, "fullsrc"), "Delete", "Global")
set(#position, 0, "Global")
loop($list total(%results)) {
    if($contains($list item(%results, #position), "gstatic.com")) {
        then {
            add item to list(%results2, $list item(%results, #position), "Delete", "Global")
        }
        else {
        }
    }
    increment(#position)
}
add list to table as column(&final, 0, 0, %results2)
save to file("{$special folder("Desktop")}/portraits.csv", &final)

John

Legend, posted March 6, 2012:

BTW, this worked perfectly. Thanks again!

JohnB, posted March 6, 2012:

Cool!

John

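The filtering step in JohnB's final script (keep only the entries containing "gstatic.com", then write them out as a single column) can be sketched in Python roughly as follows. This is a minimal illustration of the logic under stated assumptions, not a port of the uBot script; the function names and sample data are hypothetical.

```python
import csv
import io


def keep_matching(urls, needle="gstatic.com"):
    """Keep only the entries that contain the given substring."""
    return [url for url in urls if needle in url]


def to_single_column_csv(values):
    """Render the values as CSV text with one value per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for value in values:
        writer.writerow([value])
    return buffer.getvalue()


# Hypothetical scraped values, not real scrape output.
results = ["http://t1.gstatic.com/a", "data:image/png;base64,AA", ""]
csv_text = to_single_column_csv(keep_matching(results))
```

The substring test drops both blank entries and data: URIs in one pass, which would explain why a separate cleanup step is not needed afterwards.
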