Ira_Hayes, posted January 19, 2012:
Check out the attached video (Speed.mov) to see the speed at which I scraped my last project: 3-4 complete page scrapes per second at times. I wanted to see if I could scrape faster by using this command:
set(#data, $read file("http://www.example.com"), "Global")
That takes the contents of a URL and loads it into a variable. Then you just use regex to find what you need. My script has about a dozen lists and multiple nested regex, replace, and trim functions, and it's still blazing fast on a virtual machine. Here is an example of the code used to get the title of the off-road trail I am scraping:
add item to list(%title, $replace($replace($find regular expression(#data, "<meta name=\"TrailName\" content=\"[^\".]*\" />"), "<meta name=\"TrailName\" content=\"", $nothing), "\" />", $nothing), "Don\'t Delete", "Global")
That looks tricky in code view, but paste it into your uBot Studio and it will make more sense to you. I hope this helps; this is how I will scrape everything in the future.
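
For readers outside uBot Studio, the same idea (download the raw page source, then pull values out with a regex instead of rendering the page) might look roughly like this in Python. This is only a sketch of the technique, not the poster's script: the function names are made up for illustration, the sample HTML is invented, and the pattern adds a capture group so the two nested replace calls are not needed.

```python
import re
from urllib.request import urlopen

# Same pattern as in the post, plus a capture group around the value.
# Like the original, [^".]* stops at a quote or a dot, so a trail name
# containing a dot would not match.
TRAIL_NAME_RE = re.compile(r'<meta name="TrailName" content="([^".]*)" />')


def fetch_html(url: str) -> str:
    """Download the page source as text without rendering it in a browser."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_trail_names(html: str) -> list[str]:
    """Return the content value of every TrailName meta tag found in html."""
    return TRAIL_NAME_RE.findall(html)


if __name__ == "__main__":
    # Invented sample markup; a real run would pass fetch_html(some_url).
    sample = '<head><meta name="TrailName" content="Example Trail" /></head>'
    print(extract_trail_names(sample))
```

Most of the speed gain presumably comes from skipping the browser render, so the regex step stays cheap even with a dozen patterns per page.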

gandensang, posted January 19, 2012:
I can't download the attachment.

Legend, posted January 19, 2012:
Downloaded fine here...

Praney Behl, posted January 19, 2012:
I think if you are just trying to scrape the meta description, it'll be easier to use:
set(#data, "{$title},{$meta description},{$meta keywords}", "Global")
Just my .02c
Praney

odeesuba, posted January 19, 2012:
Interesting idea, worth exploring. Thanks for sharing.

Ira_Hayes (author), posted January 19, 2012:
Quoting Praney Behl: I think if you are just trying to scrape the meta description, it'll be easier to use: set(#data, "{$title},{$meta description},{$meta keywords}", "Global")
I wasn't scraping keywords or descriptions. These were all meta tags, but not the keywords or description tags.
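
When the values of interest live in arbitrary named meta tags like this, one alternative to writing a regex per tag is to collect every name/content pair in a single pass. The Python sketch below uses the standard library's `html.parser`; it is an illustration rather than anything from the thread, and the class name, function name, and the `Difficulty` tag in the usage note are hypothetical.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect the name/content pair of every <meta> tag that has a name."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name is not None:
            # A later tag with the same name overwrites an earlier one.
            self.meta[name] = attributes.get("content") or ""


def collect_meta(html):
    """Return a dict mapping meta tag names to their content values."""
    parser = MetaTagCollector()
    parser.feed(html)
    parser.close()
    return parser.meta
```

For example, `collect_meta('<meta name="TrailName" content="Example Trail" /><meta name="Difficulty" content="3">')` would give a dict with one entry per tag, which can then be looked up by name.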

JustinF, posted April 4, 2012:
This thread was a tremendous help tonight, thank you! My scraper was virtually unusable even with sockets and multithreading until I tried your tip, and now it is blazing fast.

Legend, posted April 4, 2012:
Definitely... good stuff... thanks!

whoami, posted August 12, 2012:
$meta keywords doesn't help on all pages. Any other suggestions for scraping keywords?

sktan7, posted August 14, 2012:
I cannot download the attachment.