getting blocked by google...

HarryPotter · December 28, 2010

hey guys,

just wondering if anyone had any insights.

i am running my bot querying "info:url" to find if my urls are indexed... i am running

- 70 proxies from YPP

- random delay of 1 to 10 seconds between queries, with proxy rotation

i have had other bots that use my home connection, and they ran fine with a similar delay setting

i am guessing google is more sensitive to queries like "info:". i am going to increase my random delay to 10 to 60 seconds... any other suggestions to stay under the radar?

thanks!
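For reference, the setup described above (round-robin proxy rotation with a uniformly random pause before each query) can be sketched roughly as follows. This is only an illustration: the function name `schedule`, the proxy addresses, and the URLs are made-up placeholders, not anything from the actual bot.

```python
import itertools
import random

def schedule(urls, proxies, min_delay=1.0, max_delay=10.0, rng=random):
    """Yield (url, proxy, delay) triples.

    Proxies are handed out round-robin, and each query gets a uniformly
    random pause between min_delay and max_delay seconds.
    """
    rotation = itertools.cycle(proxies)
    for url in urls:
        yield url, next(rotation), rng.uniform(min_delay, max_delay)

# Placeholder proxies and URLs, purely for illustration.
proxies = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
urls = ["http://example.com/page%d" % i for i in range(5)]
for url, proxy, delay in schedule(urls, proxies):
    print("wait %.1fs, then query info:%s via %s" % (delay, url, proxy))
```

Passing `min_delay=10, max_delay=60` gives the slower setting mentioned in the post.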

Frank · December 28, 2010

I've had to space out my bots so that they touch Google at random intervals, sometimes as far apart as hours. Google is getting more and more sensitive. I find that about 20 consecutive touches in under 5 minutes will set off the radar.

Frank

HarryPotter · December 28, 2010

interesting... increasing to hours? then the bot will take days to run, as i am running through lists of 10000
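A rough back-of-the-envelope estimate of the run time, assuming the queries go out one at a time, the delay is uniformly random, and the time spent on the requests themselves is ignored. The function names are made up for this sketch.

```python
def estimated_hours(num_queries, min_delay, max_delay):
    """Expected total waiting time, in hours, for sequential queries."""
    mean_delay = (min_delay + max_delay) / 2.0
    return num_queries * mean_delay / 3600.0

def per_proxy_gap_seconds(num_proxies, min_delay, max_delay):
    """Average time between two hits from the same proxy (round-robin)."""
    return num_proxies * (min_delay + max_delay) / 2.0

print(round(estimated_hours(10000, 1, 10), 1))   # 15.3 hours at 1-10 s
print(round(estimated_hours(10000, 10, 60), 1))  # 97.2 hours at 10-60 s
print(per_proxy_gap_seconds(70, 1, 10))          # 385.0 s between hits per proxy
```

So a 10 to 60 second delay already means about four days for 10000 URLs. Under these assumptions, with 70 proxies and a 1 to 10 second delay, each individual proxy only touches Google about once every six minutes.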

JohnB · December 28, 2010

My threshold seemed to be nine queries...after that...suspended for a while
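One way to stay under a threshold like the ones reported above is a per-proxy sliding-window budget. The sketch below is a minimal illustration, not a tested recipe: the class name is made up, and the actual limits Google applies are unknown, so `max_hits` and `window_seconds` are guesses to tune.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most max_hits per proxy within any window_seconds span."""

    def __init__(self, max_hits, window_seconds, clock=time.monotonic):
        self.max_hits = max_hits
        self.window = window_seconds
        self.clock = clock
        self.hits = defaultdict(deque)  # proxy -> timestamps of recent queries

    def wait_time(self, proxy):
        """Seconds until this proxy may be used again (0.0 if it may go now)."""
        now = self.clock()
        recent = self.hits[proxy]
        while recent and now - recent[0] >= self.window:
            recent.popleft()  # drop timestamps that have left the window
        if len(recent) < self.max_hits:
            return 0.0
        return self.window - (now - recent[0])

    def record(self, proxy):
        """Note that a query was just sent through this proxy."""
        self.hits[proxy].append(self.clock())
```

For example, `SlidingWindowLimiter(5, 300)` would hold each proxy to five queries per five minutes, below both of the anecdotal thresholds in this thread. Call `wait_time(proxy)` before a query, sleep for that long if it is nonzero, then call `record(proxy)`.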

HarryPotter · December 29, 2010

here is the interesting part!

with the same set of proxies, i can use them with scrapebox no problem... WHY!?

meter · December 29, 2010

The proxies are imperfect or your browser is bleeding information because it is configured improperly, or both. Scrapebox works fine because it doesn't use a browser; it uses raw HTTP GETs; not much room for information to be leaked there.

Ubot, on the other hand, uses a web browser. Depending on how the browser is configured, it might use the proxy to load the page, but then all the ajax will go through without the proxy, or stupid stuff like that.

I find it much easier to simply use scrapebox or write my own simple scraper with HTTP GETs instead of messing around with browser security configurations.

UBotBuddy · December 29, 2010

Thanks for sharing that meter. Interesting. Is scrapebox a valuable tool to have in a toolbox?

HarryPotter · December 29, 2010

@botbuddy, scrapebox is awesome and for the price, it is WELL worth it.

@meter, wow, i wish i knew that before i spent 2 whole days building my ubot script to scrape google...

how would i start making an HTTP GET program?

HarryPotter · December 29, 2010

discount link here in case anyone is interested: http://www.scrapebox.com/cheapskate

don't worry, i am not an affiliate

meter · December 29, 2010

UBotBuddy wrote: "Thanks for sharing that meter. Interesting. Is scrapebox a valuable tool to have in a toolbox?"

Scrapebox is the most useful tool I've ever bought.

HarryPotter wrote: "how would i start making a HTTP GETs program?"

Just look up how to do an HTTP Get in any language you know.

Also, Ubot sockets bots should be able to scrape google with proxies just fine. This is because they use HTTP GETs, not the WebBrowser, to interact with the target site.
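As a starting point, here is a minimal sketch of a raw HTTP GET through a proxy in Python, using only the standard library. The function names are made up for this example, and the proxy is assumed to be a plain HTTP proxy given as "host:port"; proxies that need authentication or use SOCKS would need more work.

```python
import urllib.request

def build_proxy_opener(proxy):
    """Return an opener that sends http and https requests through proxy.

    proxy is a "host:port" string for a plain HTTP proxy (an assumption).
    """
    handler = urllib.request.ProxyHandler({
        "http": "http://" + proxy,
        "https": "http://" + proxy,
    })
    return urllib.request.build_opener(handler)

def fetch(url, proxy, timeout=15):
    """Issue one GET request through the proxy and return the body as text."""
    opener = build_proxy_opener(proxy)
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch("http://example.com/", "127.0.0.1:8080")` would return the page body if a proxy is actually listening at that placeholder address. Because only this one request is sent, there are no extra page resources or scripts that could bypass the proxy.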

Abs* · December 29, 2010

meter wrote: "Scrapebox is the most useful tool I've ever bought. Just look up how to do an HTTP Get in any language you know. Also, Ubot sockets bots should be able to scrape google with proxies just fine. This is because they use HTTP GETs, not the WebBrowser, to interact with the target site."

Hi - UbotSockets - I heard this is being released as an add-on and should give us some multithreading capability. Is this what you're referring to?

I'm sure I read a note from Seth somewhere saying it will be an advanced feature that is still to be released. I hope tutorials will be available.

If so then it sounds great

thanks

meter · December 29, 2010

Abs, multithreading should already be implemented in Ubot. I just imagine that Seth has some tweaking to do with it to get it in better working order. But yes, sockets will support multithreading.

-meter

Net66 · December 29, 2010

All the 'google' stuff is catching on. Blogger.com triggers a block very quickly now. I suspect it's detecting the use of ubot from the http referrer maybe? Anyone know what ubot identifies itself as?

Andy
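For context, an HTTP client normally identifies itself through the User-Agent request header, while the Referer header carries the address of the previous page. What UBot actually sends is not stated in this thread. With a hand-written client the headers are fully visible, as in this small Python sketch (the function name and the "ExampleClient/1.0" string are made up):

```python
import urllib.request

def build_request(url, user_agent, referer=None):
    """Build a GET request with an explicit User-Agent and optional Referer."""
    headers = {"User-Agent": user_agent}
    if referer is not None:
        headers["Referer"] = referer
    return urllib.request.Request(url, headers=headers)

req = build_request("http://example.com/", "ExampleClient/1.0")
print(req.get_method())     # GET
print(req.header_items())   # the headers this request will carry
```

If no User-Agent is set, urllib sends its own default one, so a script that leaves the header alone announces itself as a Python client.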

Abs* · December 30, 2010

Hi meter,

I was playing with the software when I upgraded and was testing the multithreaded function.

Some things I noted:

It was slow. Two browsers, or even 3 or 4, wouldn't work as they should, in the sense that they would all wait for each browser to complete a step at a time.

I'm not sure if we are able to change the proxy in each separate browser so that we can make multiple requests from different IPs. But then again, as Andy and others have noted, if the referrer is passed even when using different IPs, this will not really help.

A number of browsers will cause many users to have system issues due to the amount of CPU usage. If it were possible to split the browser into separate sections, or open small browsers similar to SEnuke, it would be great.

I've actually got an app that I can't wait to multithread, but I think I will wait for the sockets to see if they are closer to what I'm looking for.

thanks

HarryPotter · January 1, 2011

clear cookies... haven't included that in my bot yet... maybe this will help?
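In a hand-written HTTP client, clearing cookies between queries could look roughly like this. It is a sketch using Python's standard library with made-up function names; whether it helps with the blocking is an open question in this thread.

```python
import http.cookiejar
import urllib.request

def build_cookie_opener():
    """Return (opener, jar); cookies set by responses accumulate in jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar

def clear_cookies(jar):
    """Forget every stored cookie so the next request starts clean."""
    jar.clear()

opener, jar = build_cookie_opener()
clear_cookies(jar)
print(len(jar))  # 0, the jar is empty
```

Calling `clear_cookies(jar)` whenever the proxy changes keeps cookies from one IP from being replayed on another.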
