Tum (posted November 7, 2010):
Hi, I'm having a little trouble trying to scrape all the URLs from a blog. Let's say I have my bot go to a website with a URL of "http://THISISABLOG.com/". The bot would grab this URL from the address bar (as the address won't always be the same), then search the page and scrape into a list ALL the links that contain the address-bar URL plus something after it, e.g.
"http://THISISABLOG.com/THIS-IS-A-POST"
"http://THISISABLOG.com/THIS-IS-ANOTHER-POST"
Hopefully you can understand what I mean! Thanks
crazyflx (posted November 7, 2010):
Check out my attached example (Example.ubot). Change "thisisablog.com" to a real blog URL, then add a "save to file -> List of Sites URLs" command to see what you've saved. If you've got any questions, let me know.
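The attached Example.ubot is not reproduced in this thread. As a rough illustration of the same idea outside uBot, here is a minimal Python sketch using only the standard library. The names `LinkCollector` and `same_site_links` and the sample HTML are made up for illustration, and it assumes you already have the page's HTML and its address as strings. It is not a translation of the attachment.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against the page URL."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Turn relative links such as "/some-post" into absolute URLs.
            self.links.append(urljoin(self.page_url, href))


def same_site_links(page_url, html):
    """Return unique links that start with page_url, excluding page_url itself."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    unique = []
    for link in collector.links:
        if link.startswith(page_url) and link != page_url and link not in unique:
            unique.append(link)
    return unique


# Hypothetical page content, standing in for a real blog.
sample_html = """
<a href="http://thisisablog.com/this-is-a-post">A post</a>
<a href="/this-is-another-post">Another post</a>
<a href="http://example.org/elsewhere">Off-site link</a>
"""
urls = same_site_links("http://thisisablog.com/", sample_html)
```

The prefix check is what does the job of "links that contain the URL in the address bar"; anything pointing off-site is dropped.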
Tum (posted November 7, 2010):
Thanks crazyflx, that works great. Is there any way I could pull just the URLs that are listed under the recent posts section of the blogs?
crazyflx (posted November 7, 2010):
No problem at all, happy to help. As for pulling only the URLs listed under the recent posts, that would only be possible if those URLs all shared some characteristic. For instance:
http://thisisablog.com/recent-posts/this-is-a-post-url.htm
Then you would use:
Choose by Attribute -> href|*/recent-posts/*|wildcards
Add to List -> List of Sites URLs -> Scrape Chosen Attribute|href
But I doubt that each blog you're navigating to is going to have the same URL format as every other blog.
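For anyone who wants to see what that wildcard filter amounts to outside uBot, here is a small Python sketch. It assumes the URLs have already been scraped into a list, and that uBot's `*` behaves like a shell-style wildcard matching any run of characters (that equivalence is an assumption, not something confirmed in this thread). `filter_by_wildcard` and the sample URLs are hypothetical.

```python
from fnmatch import fnmatchcase


def filter_by_wildcard(urls, pattern):
    """Keep only the URLs matching a shell-style wildcard pattern."""
    # fnmatchcase is case-sensitive, and "*" matches any characters,
    # including "/" separators.
    return [url for url in urls if fnmatchcase(url, pattern)]


# Hypothetical scraped list, following the URL shape from the post above.
scraped = [
    "http://thisisablog.com/recent-posts/this-is-a-post-url.htm",
    "http://thisisablog.com/about.htm",
    "http://thisisablog.com/recent-posts/another-post.htm",
]
recent = filter_by_wildcard(scraped, "*/recent-posts/*")
```

As with the uBot commands, this only helps when the recent-post links really do share a path segment such as `/recent-posts/`.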
Tum (posted November 7, 2010):
OK, well, thanks again for that.