Josh 37 Posted October 15, 2012
First of all, I wanted to say thanks again to LoWrIdErTJ and willywonka for their huge lists of footprints. You can find the list as a sticky at the top of the Tutorials, Tips and Trick forum or by clicking here.
I decided to create a bot that searches with footprints and keywords and scrapes the results to find websites based on a list of footprints, because doing it one at a time sucks. I wanted to give it to the ubot community because so many people have helped me with so many questions. There is always someone on this forum eager to help, give ideas and even write some code for you when you're stuck. Thanks to k1lv9h for helping me fix a scraping issue with the bot.
Here it is!
footprint-website-finder-005.ubot
LoWrIdErTJ - BotGuru 904 Posted October 15, 2012
very nice job bud..
Josh 37 Posted October 16, 2012 Author
> very nice job bud..
Thanks!
a2mateit 395 Posted October 16, 2012
Nice share, Josh. Thank you.
Legend 181 Posted October 16, 2012
Very nice... I especially like the UI, too!
Josh 37 Posted October 16, 2012 Author
Happy to offer something of some value. Hope people can put it to use. I've used it quite a lot already in combination with the huge list of footprints.
VentureOnline 49 Posted October 16, 2012
Hey Josh, great share. I'm finding the URLs are all coming out with long strings at the end of them, though. Like this, for example:
http://katymarvelcatering.com/guestbook&sa=U&ei=3Il9UOTJBeqW0QGKz4CoAQ&ved=0CBUQFjAA&usg=AFQjCNFwHUl6NKQ5oK2Yuqk6cXURwY96_w
Any idea why that's happening? I didn't have time to check into it fully. Not sure how you're going about the scraping.
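Until the scrape itself is fixed, one workaround is to trim the extra string after scraping. The sketch below is plain Python, not uBot script, and only illustrates the idea: cut the URL at the first of the parameters visible in the example above (`sa`, `ei`, `ved`, `usg`). The function name `strip_google_tracking` and the parameter list are assumptions made for this illustration, and a site whose own URLs legitimately contain `&sa=` or similar would be cut short as well.

```python
import re

# Parameters seen in the example URL above (an assumed, non-exhaustive list).
_TRACKING = re.compile(r"&(?:sa|ei|ved|usg)=")


def strip_google_tracking(url: str) -> str:
    """Return the part of `url` before the first tracking parameter."""
    return _TRACKING.split(url, maxsplit=1)[0]


scraped = (
    "http://katymarvelcatering.com/guestbook&sa=U&ei=3Il9UOTJBeqW0QGKz4CoAQ"
    "&ved=0CBUQFjAA&usg=AFQjCNFwHUl6NKQ5oK2Yuqk6cXURwY96_w"
)
print(strip_google_tracking(scraped))
```

A URL without any of those parameters is returned unchanged, so the function can be applied to the whole scraped list.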
Josh 37 Posted October 16, 2012 Author
> Hey Josh, great share. I'm finding the URLs are all coming out with long strings at the end of them, though. Like this, for example:
> http://katymarvelcat...Yuqk6cXURwY96_w
> Any idea why that's happening? I didn't have time to check into it fully. Not sure how you're going about the scraping.
I have not had that issue. Is anyone else seeing it?
freddie77 4 Posted October 16, 2012
Nice UI!!
Josh 37 Posted October 17, 2012 Author
> Hey Josh, great share. I'm finding the URLs are all coming out with long strings at the end of them, though. Like this, for example:
> http://katymarvelcat...Yuqk6cXURwY96_w
> Any idea why that's happening? I didn't have time to check into it fully. Not sure how you're going about the scraping.
I am seeing that now. Strange, it wasn't doing that before. I'll try to see if I can scrape another way, and let you know.
a2mateit 395 Posted October 17, 2012
Beat me to this one, Josh. I have been working on one as a side project for a while now. I will post it up for free when I'm done with it. Not to steal your thunder... Mine is pretty different.
Josh 37 Posted October 17, 2012 Author
> Beat me to this one, Josh. I have been working on one as a side project for a while now. I will post it up for free when I'm done with it. Not to steal your thunder... Mine is pretty different.
I live in San Diego. Not too much thunder here. It's always sunny!
Josh 37 Posted October 18, 2012 Author
So I figured out that Google is doing something to the URL on the page; you can see it in the address bar right before you are redirected to the website from the SERP. I found the piece of code in the Google SERP source code, but I don't know how to go about extracting the URL.
<a class="l" onmousedown="return rwt(this,'','','','1','AFQjCNGs3NfmEYRSQfhQPeSOOjJSfHZbFg','','0CCEQFjAA',null,event)" href="http://as.wwu.edu/kugs/playlist/">
Any ideas?
a2mateit 395 Posted October 18, 2012
Hey Josh, this will scrape the URLs very well:
add list to list(%urls, $scrape attribute(<class="l">, "href"), "Delete", "Global")
It's what I'm using in my scraper that will blow yours out of the water. J/K
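For readers who want to see what that scrape does conceptually, here is a rough Python equivalent (not uBot script) that collects the `href` of every `<a>` tag carrying the class `l`. It is a sketch under assumptions: the names `ResultLinkParser` and `scrape_result_urls` are made up for this illustration, the link text in the sample is a placeholder, and the class name `l` comes from the 2012 markup quoted in this thread, so it is unlikely to match what Google serves today.

```python
from html.parser import HTMLParser


class ResultLinkParser(HTMLParser):
    """Collect href values from <a> tags whose class list contains "l"."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # A class attribute may hold several space-separated class names.
        classes = (attr_map.get("class") or "").split()
        href = attr_map.get("href")
        if "l" in classes and href:
            self.urls.append(href)


def scrape_result_urls(html: str) -> list:
    """Return the href of every result link found in `html`."""
    parser = ResultLinkParser()
    parser.feed(html)
    return parser.urls


# The anchor tag quoted earlier in the thread, with placeholder link text.
sample = """<a class="l" onmousedown="return rwt(this,'','','','1','AFQjCNGs3NfmEYRSQfhQPeSOOjJSfHZbFg','','0CCEQFjAA',null,event)" href="http://as.wwu.edu/kugs/playlist/">result title</a>"""
print(scrape_result_urls(sample))
```

Reading the `href` attribute directly avoids the redirect parameters that the `onmousedown` handler adds when the link is clicked.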
Josh 37 Posted October 18, 2012 Author
> Hey Josh, this will scrape the URLs very well:
> add list to list(%urls, $scrape attribute(<class="l">, "href"), "Delete", "Global")
> It's what I'm using in my scraper that will blow yours out of the water. J/K
lol. Well, I tried it and it's not scraping any data. Are you setting a user agent? If so, which one?
a2mateit 395 Posted October 18, 2012
No, that's without a user agent. Just realized that your bot sets the user agent to IE6.
Josh 37 Posted October 18, 2012 Author
> No, that's without a user agent. Just realized that your bot sets the user agent to IE6.
That worked. But then how are you changing the Google search settings to show 100 results per page?
a2mateit 395 Posted October 18, 2012
I'm not. It's just a sacrifice you have to make, or choose not to. I don't think it's possible using the default UA. There was a thread about it a while back, but I couldn't find it...
Josh 37 Posted October 19, 2012 Author
> I'm not. It's just a sacrifice you have to make, or choose not to. I don't think it's possible using the default UA. There was a thread about it a while back, but I couldn't find it...
Found the solution. Look in the bot bank, under search providers! There is a Google scrape bot that has the solution.
Josh 37 Posted October 19, 2012 Author
Got the bot fixed. The newest version is uploaded at the top!
Automator 2 Posted November 6, 2012
I downloaded V4 and it worked great, except for the extra strings on the result URLs. So I downloaded V5, but it doesn't run at all and actually crashed my computer a couple of times - couldn't shut it down or anything. I'm using V4.5 Professional Edition on Windows 7 - could that be why?
Josh 37 Posted November 6, 2012 Author
> I downloaded V4 and it worked great, except for the extra strings on the result URLs. So I downloaded V5, but it doesn't run at all and actually crashed my computer a couple of times - couldn't shut it down or anything. I'm using V4.5 Professional Edition on Windows 7 - could that be why?
It was created with the DEV version. I can compile it and send it to you if you like.
Automator 2 Posted November 6, 2012
> It was created with the DEV version. I can compile it and send it to you if you like.
Ah, so I need the DEV version to get it to work? I compiled it and tried it that way too, but it also won't work for me then. I didn't know it made a difference. If that's the deal, thanks for clearing it up for me. And yes, I would LOVE the compiled version, thank you!! Great bot!
Edit: the first version worked for me, though. Wasn't that made using the DEV version also?
vonragnor 2 Posted November 7, 2012
Too bad I can't use it, as I don't have the dev edition :-(
Josh 37 Posted November 7, 2012 Author
Sorry guys, the forum will not allow me to post the compiled file, probably because it is an executable.