Ant (posted June 21, 2012):
Hi, I have been playing around with making a proxy scraper/tester. I have the scraping down, but my problem is that the site I get the proxies from inserts banner ads into the table of proxies:
5,66.29.173.139,80,Anonymous,United States,2012-06-21,WHOIS
6,71.30.219.250,8080,Anonymous,United States,2012-06-21,WHOIS
7,77.73.130.70,8081,Anonymous,Kazakhstan,2012-06-21,WHOIS
"<!-- google_ad_client = ""pub-6974495340240718""; /* cooleasy_728x90, 䶿9-3-21 */ google_ad_slot = ""8294635992""; google_ad_width = 728; google_ad_height = 90; //-->",,,,,,
8,88.191.127.178,3128,Anonymous,France,2012-06-21,WHOIS
9,87.98.142.150,8080,Anonymous,France,2012-06-21,WHOIS
My bot works fine testing the proxies until it hits the ad code, which throws an error because it is not a valid ip:port. My question is: what is the best way to skip this ad code and just move on to the next line?
Thanks, Ant
LoWrIdErTJ - BotGuru (posted June 21, 2012):
Add an if statement:
if(table column 1 = $nothing){increment row and set proxy} else { set proxy }
This works because where the ad shows up, table column 1 is blank, and where you have proxies it is not blank.
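The check described above can be sketched outside UBot as well. The following is a minimal Python illustration, not UBot script and not the poster's code: it assumes the scraped table is available as CSV text, uses an abridged copy of the ad row from the first post, and the names `SAMPLE` and `extract_proxies` are made up for this example.

```python
import csv
import io

# Abridged sample of the scraped table; the ad row is shortened for readability.
SAMPLE = '''5,66.29.173.139,80,Anonymous,United States,2012-06-21,WHOIS
6,71.30.219.250,8080,Anonymous,United States,2012-06-21,WHOIS
"<!-- google_ad_client = ""pub-6974495340240718""; google_ad_width = 728; //-->",,,,,,
8,88.191.127.178,3128,Anonymous,France,2012-06-21,WHOIS
'''


def extract_proxies(csv_text):
    """Return "ip:port" strings, skipping rows whose column 1 is blank."""
    proxies = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Ad rows carry their text in column 0 and leave column 1 empty.
        if len(row) < 3 or row[1] == "":
            continue
        proxies.append(f"{row[1]}:{row[2]}")
    return proxies


print(extract_proxies(SAMPLE))
```

The blank-column test is the same idea as the `$nothing` comparison: the row is skipped before anything tries to treat it as a proxy.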
Ant (posted June 25, 2012):
Thanks for your reply, but I am still confused. When the loop hits the ad code line, doesn't the ad code become the first cell, so there is some text there and $nothing wouldn't work? Sorry if I am being a bit thick.
LoWrIdErTJ - BotGuru (posted June 25, 2012):
Based on the information you provided, when it hits the ad code line, column 1 is blank. There is nothing there; the ad code is in column 0. Therefore it would be $nothing. The way I said to do it, based on the above, is working just fine without issue:
save to file("{$special folder("Desktop")}\\savetest.csv", "5,66.29.173.139,80,Anonymous,United States,2012-06-21,WHOIS
6,71.30.219.250,8080,Anonymous,United States,2012-06-21,WHOIS
7,77.73.130.70,8081,Anonymous,Kazakhstan,2012-06-21,WHOIS
\"<!-- google_ad_client = \"\"pub-6974495340240718\"\"; /* cooleasy_728x90, 䶿9-3-21 */ google_ad_slot = \"\"8294635992\"\"; google_ad_width = 728; google_ad_height = 90; //-->\",,,,,,
8,88.191.127.178,3128,Anonymous,France,2012-06-21,WHOIS
9,87.98.142.150,8080,Anonymous,France,2012-06-21,WHOIS")
wait(2)
clear table(&test)
create table from file("{$special folder("Desktop")}\\savetest.csv", &test)
set(#row, 0, "Global")
loop($subtract($table total rows(&test), 1)) {
    if($comparison($table cell(&test, #row, 1), "=", $nothing)) {
        then {
        }
        else {
            change proxy("{$table cell(&test, #row, 1)}:{$table cell(&test, #row, 2)}")
            navigate("http://google.com", "Wait")
            if($exists(<name="q">)) {
                then {
                    add item to list(%good proxy, "{$table cell(&test, #row, 1)}:{$table cell(&test, #row, 2)}", "Delete", "Global")
                }
                else {
                }
            }
        }
    }
    increment(#row)
}
save to file("{$special folder("Desktop")}\\Good Proxys.txt", %good proxy)
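The column layout described in this reply can be checked by parsing the ad row on its own. This is a small Python sketch rather than UBot script, on the assumption that UBot's table import splits quoted CSV fields the same way Python's standard `csv` module does; `AD_ROW` is an abridged copy of the ad line from the thread.

```python
import csv

# Abridged ad row: one quoted field followed by six commas.
AD_ROW = (
    '"<!-- google_ad_client = ""pub-6974495340240718""; '
    'google_ad_slot = ""8294635992""; //-->",,,,,,'
)

# csv.reader accepts any iterable of lines, so a one-item list is enough.
fields = next(csv.reader([AD_ROW]))

# Show each column index next to its parsed value.
for index, value in enumerate(fields):
    print(index, repr(value))
```

The whole quoted ad snippet lands in column 0, and the trailing commas produce empty strings for columns 1 to 6, which is why comparing column 1 against an empty value catches the ad row.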
Ant (posted July 1, 2012):
Sorry, I was totally being an idiot and got the column order mixed up. That worked great, thanks again.