Gogetta Posted January 2, 2010 Sorry if the title doesn't match my post, but I have a problem that is driving me nuts! I have a large script that performs plenty of complex tasks. I still get a couple of errors, but those are not the problem I need help with. What I do need help with is this: part of my script extracts URLs from Google, then saves and navigates to each of those URLs. Once a URL has been visited, I want to store it somewhere so that it is never visited again. I know how to extract the URL of the current page and all that, but how do I tell UBot to check whether a URL has already been visited and, if so, not navigate to it, or skip it? Basically, if I have already posted a message on a site, I don't want to extract the same URL again via a similar keyword, navigate to it, and double-post the message. So is there a way to have UBot check something like a database where all the visited URLs are stored, and if a URL is there, not navigate to that site? Set list position won't work for this, so...
Confuscius Posted January 2, 2010 I can visualise a number of ways to tackle this. In fact, I have a similar situation with a physical external database, where I do not want to add a new record if it already exists but I do want to update its content. I solve that by interfacing UBot to a private, password-protected PHP page: I pass the values forward and let the PHP script handle the update logic. In your case, though, you can probably do it with UBot's internal lists. Your logic, if I understand it, should be something like:

1. Grab the current Visited list, or clear it, depending on what you want to happen.
2. Scrape URLs from Google and add them to a Scraped URL list.
3. Loop through the Scraped URL list.
4. Set a variable to "Continue".
5. Evaluate whether the current scraped URL is in the Visited list by looping through the Visited list and comparing values; if it is, change the variable to "Do not continue".
6. If the variable still says "Continue", then navigate to the URL, post a comment on that specific URL, add it to the Visited list, and continue the Scraped URL loop; otherwise do nothing and continue the loop.

Or something very similar! Of course, as the Visited list gets bigger, each cycle will take longer to vet the current URL against it. It would be interesting to see how performance changes for different list sizes. Lots of things to play with here! HTH.
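The steps above are not UBot syntax, but translated into ordinary code the control flow looks roughly like this. `post_comment` stands in for the real navigate-and-post actions; it is a placeholder for illustration:

```python
# Sketch of the loop described above. post_comment(url) stands in for the
# "navigate and post" step; the caller supplies it.

def process_scraped(scraped_urls, visited, post_comment):
    """Visit and comment on each scraped URL not already in the visited list."""
    for url in scraped_urls:
        proceed = True                # step 4: assume we will continue
        for seen in visited:          # step 5: vet against the Visited list
            if seen == url:
                proceed = False
                break
        if proceed:                   # step 6: post, then record the URL
            post_comment(url)
            visited.append(url)
    return visited
```

On the performance point raised above: the inner loop makes each check cost grow with the size of the Visited list. Keeping the visited URLs in a set instead of a list would make each membership check effectively constant time.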
billywizz Posted January 30, 2010 Did you ever sort this? I'm having the same problem.
Aaron Nimocks Posted January 30, 2010 What I would do is make two new lists as you go through it: a "don't visit again" list and a "visit again" list. Then just use those lists when you need them. I'm sure you figured out a way around this already, but for anyone else who runs into this, I think this might be the easiest way (maybe not the best).
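A minimal sketch of this two-list idea, again in ordinary code rather than UBot syntax. The `should_revisit` predicate is hypothetical; in practice it would be whatever rule decides which bucket a URL belongs in, such as whether the post on that site succeeded:

```python
# Sketch: as URLs are processed, split them into a "visit again" list and
# a "don't visit again" list. should_revisit() is a hypothetical predicate
# supplied by the caller.

def split_urls(urls, should_revisit):
    """Partition URLs into (visit_again, dont_visit_again) lists."""
    visit_again, dont_visit_again = [], []
    for url in urls:
        if should_revisit(url):
            visit_again.append(url)
        else:
            dont_visit_again.append(url)
    return visit_again, dont_visit_again
```

Later runs would then loop over `visit_again` only, which is the "just use those lists when you need them" step.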