smb1970, posted March 22, 2013: Hi, I've been working on a Yellow Pages scraper for the last few days. It pretty much works, but the browser seems to lock up every now and then, so the click event on the element for the next page stops working. Stopping the script and clicking on the web page manually doesn't work either, but if I manually switch to another web site in the internal browser, everything is fine again. It's almost as though Yell can tell they are being scraped and are somehow stopping me. Is this possible, or is it potentially an issue with my code? I'm using Hide My Ass Pro and switching the proxy on every page, which I guess might look suspicious, since page requests keep coming in from random different places. Any suggestions? Happy to share the code if anyone has the time to take a look. Best regards, Steve
bestmacros, posted March 22, 2013: Set the referrer and set the user agent. Many popular sites today block the uBot default browser, so you need to hide it.
smb1970, posted March 26, 2013 (edited March 26, 2013): Thanks for the tip. I'll give it a go. Best regards, Steve. Edit: I added the referrer and user agent lines. I also added a clear cookies command at the end of each cycle and an extra 2-second wait before each page scrape, and changed the script so the proxy switch only happens at the end of each location scrape. It seems to work a lot better now. Unfortunately it takes a little longer, and I still need to bombproof the HMA proxy switch, but other than that I appear to have a sweet little Yell scraper now.
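For anyone following along, the order of operations described in that edit can be sketched roughly as below. This is only an illustration in Python, not Steve's uBot script: the function name `plan_scrape`, the step labels, and the example locations are all made up, and it assumes "each cycle" means each page. It performs no requests; it just yields the steps in order.

```python
from typing import Iterator, List, Tuple

Step = Tuple[str, str]  # (action label, human-readable detail)


def plan_scrape(locations: List[str],
                pages_per_location: int,
                page_delay_s: float = 2.0) -> Iterator[Step]:
    """Yield the scrape steps in the order described in the post.

    Hypothetical sketch: the labels are illustrative, not uBot commands.
    """
    # Referrer and user agent are set once, up front.
    yield ("set_headers", "referrer + user agent")
    for location in locations:
        for page in range(1, pages_per_location + 1):
            # Extra pause before each page scrape.
            yield ("wait", f"{page_delay_s}s")
            yield ("scrape_page", f"{location} page {page}")
            # Assumption: one "cycle" is one page, so cookies are cleared here.
            yield ("clear_cookies", "")
        # The proxy changes only once a whole location is finished.
        yield ("switch_proxy", f"after {location}")


if __name__ == "__main__":
    for step in plan_scrape(["Leeds", "York"], 3):
        print(step)
```

The point of the layout is that the proxy switch sits in the outer loop rather than the inner one, so all pages for one location come from the same address.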
kapose, posted April 19, 2013: Hi Steve, I am a newbie and have been tasked with writing my first Yellow Pages script. Any chance I could take a look at yours to learn from it? Thanks, Jeff
smb1970, posted April 22, 2013: Hi Jeff, that's a big ask. There are people out there marketing Yellow Pages scrapers, and while mine is not robust enough to release commercially in its current state, it is a solid and reliable bot that I've already used to obtain thousands of pages from Yell. There's really nothing to stop you taking what I've done, giving it a quick polish and releasing it yourself tomorrow. I'm happy to help out with advice if you have specific issues with your own development, though. Feel free to PM me. Best regards, Steve