J Bot
Posted March 12, 2013

I am working on a fairly straightforward bot that will scrape Google results. I have several methods in mind for getting to a results page in the browser, and I wonder which would be best. My intent is to sell this bot.

I'm only scraping the titles and URLs, not going into the results. The search query itself is fairly complex (long). I had to trim the original down because Google has a 32-word limit on queries. I want to run the query against a fixed number of sites using Google's site: operator, so the query runs multiple times, once per site.

I want the Google search settings to include:
'Safe Search' = On
Google Instant = Off (required for more than 10 results)
Results = 100 (to improve scrape speed?)

Also: after the results are returned, I go under search tools and change 'any time' to 'last month'.

Now, for the question: which is best?

1. Do all these settings in the bot each time it runs.
2. Do the search and modifications in a browser, save the URL of the Google results page (which includes all of the criteria and the modifications), and hard-code this 'results' URL into the bot.
3. Get the results URL as in option 2, but put a link to it on a page on my website, which the bot could scrape when run. This allows me to update the query without updating the bot.

For 2 and 3, I don't know if this saved URL is permanent or if it will stop working in a couple of weeks or days.

Any thoughts or similar experiences would be greatly appreciated.
AutomationNinja
Posted March 13, 2013

1. Sure.
2. I wouldn't do that, as that link will probably change.
3. I would stick with doing #1.
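For option 1, one way to apply the settings on every run is to build the results URL from its parts instead of clicking through the settings pages. The sketch below is only an illustration in Python, not the bot's own scripting: it assumes Google's search URL accepts the `num`, `safe`, and `tbs` query parameters (these are not guaranteed to stay stable), and the function name `build_search_url` and the site `example.com` are made up for the example. It does not cover the Google Instant setting.

```python
from urllib.parse import urlencode, urlparse, parse_qs

MAX_QUERY_WORDS = 32  # word limit mentioned in the original post


def build_search_url(query, site, results=100, safe=True, last_month=True):
    """Build a results-page URL for one site (hypothetical helper)."""
    full_query = f"{query} site:{site}"
    if len(full_query.split()) > MAX_QUERY_WORDS:
        raise ValueError("query exceeds the 32-word limit")

    params = {"q": full_query, "num": results}
    if safe:
        params["safe"] = "active"   # assumed parameter for Safe Search = On
    if last_month:
        params["tbs"] = "qdr:m"     # assumed parameter for 'last month'
    return "https://www.google.com/search?" + urlencode(params)


# Usage: one URL per site, rebuilt on every run.
url = build_search_url("blue widgets", "example.com")
parsed = parse_qs(urlparse(url).query)
```

Because the URL is rebuilt each run, nothing depends on a saved link staying valid, and changing the query only means changing the inputs.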
J Bot
Posted March 13, 2013

Hey, thanks for the reply. I really appreciate being able to come here and get the opinions of people who have been around the block a few times.