blight (posted February 22, 2010):
Hey there. I know I'm probably missing something really simple here, but it's getting a bit urgent. I need to know how to scrape URLs but exclude URLs with certain words in the link text. E.g., scrape a Google SERP page and add only unique domains to a list, with none of the Google links that have "google" in the link text. The tutorial shows how to do this but not how to exclude certain links. Thanks!
greencat (posted February 23, 2010):
One simple way of doing this might be to refine your Google search so the unwanted results never show up in the first place, e.g.:
blog -google finds blogs that don't mention google.
blog -intitle:google finds blogs that don't have google in the title.
blog -site:google.com -intitle:google finds blogs that aren't on the google.com domain and don't have google in the title.
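If the filtering has to happen after the scrape rather than in the search query, the idea can be sketched roughly like this. This is plain Python with only the standard library, not the tool from the tutorial (which isn't named in this thread), and `LinkCollector`, `unique_domains`, and `sample_html` are hypothetical names made up for illustration. The HTML is a made-up stand-in for a results page, so treat it as a sketch of the logic (collect link text, skip links containing a banned word, keep each domain once) rather than a working Google scraper.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def unique_domains(html, banned_words):
    """Return domains in first-seen order, skipping links whose
    link text contains any banned word (case-insensitive)."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    banned = [word.lower() for word in banned_words]
    domains = []
    for href, text in parser.links:
        domain = urlparse(href).netloc.lower()
        if not domain:
            continue  # relative link, no domain to record
        if any(word in text.lower() for word in banned):
            continue  # link text contains an excluded word
        if domain not in domains:
            domains.append(domain)
    return domains


# Made-up sample markup standing in for a scraped results page.
sample_html = (
    '<a href="http://example.com/a">Example blog</a>'
    '<a href="http://maps.google.com/">Google Maps</a>'
    '<a href="http://example.com/b">Another post</a>'
    '<a href="/local">Local page</a>'
    '<a href="https://example.org/">Org site</a>'
)

print(unique_domains(sample_html, ["google"]))
# ['example.com', 'example.org']
```

The check is on the link text only, as asked in the question. To also drop links that point at Google regardless of their text, the same `any(...)` test could be run against `domain` as well.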