UBot Underground

Never Visit The Same URL Twice | Help Please!!!



Sorry if the title doesn't match my post. But I have a problem that is driving me nuts!

 

I have a major script that performs plenty of complex tasks. I still get a couple of errors, but those are not the problem I need help with. What I do need help with is this...

 

Say part of my script extracts URLs from Google, then saves them and navigates to each one.

Once a URL has been visited, I want to store it somewhere so that it is never visited again. I know how to extract the URL of the current page and all that. But how do I tell UBot to check whether a URL has already been visited and, if so, skip it instead of navigating to it?

 

Basically, if I have already posted a message on a site, I don't want to scrape the same URL again under a similar keyword, navigate back to it, and double-post the message. So is there a way to have UBot check something like a database where all the visited URLs are stored, and skip any site that is already in it?

 

Set list position won't work for this, so...


I can visualise a number of ways to tackle this. In fact, I have a similar situation with a physical EXTERNAL database where I do not want to add a new record if it already exists, BUT I do want to update its content. I solved that by interfacing UBot with a private, password-protected PHP page: I pass the values forward to the script and let the PHP page handle the update logic. In your case, though, you can probably do it through UBot's internal lists.
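To make the external-database variant concrete, here is a minimal sketch of that idea in Python with Flask and SQLite rather than PHP. Everything in it (the /record route, the visited.db file, the field names) is an illustrative assumption, not my actual script, and the password protection is omitted for brevity:

# Stand-in for the password-protected PHP page described above,
# sketched in Python/Flask with SQLite.
import sqlite3
from flask import Flask, request

app = Flask(__name__)
DB = "visited.db"  # assumed file name

def init_db():
    with sqlite3.connect(DB) as con:
        con.execute(
            "CREATE TABLE IF NOT EXISTS records (url TEXT PRIMARY KEY, content TEXT)"
        )

@app.route("/record", methods=["POST"])
def record():
    url = request.form["url"]
    content = request.form.get("content", "")
    with sqlite3.connect(DB) as con:
        # Insert if new, otherwise update the existing row's content --
        # the "don't add if it exists, but do update" logic from above.
        con.execute(
            "INSERT INTO records (url, content) VALUES (?, ?) "
            "ON CONFLICT(url) DO UPDATE SET content = excluded.content",
            (url, content),
        )
    return "OK"

if __name__ == "__main__":
    init_db()
    app.run()

UBot then simply POSTs the url and content values to this page and lets the server make the add-or-update decision.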

 

Your logic, if I understand it, should be something like this:

sub GRAB the current Visited List, OR clear it, depending on what you want to happen!
sub SCRAPE URLs from Google and ADD them to a Scraped URL list
sub LOOP through the Scraped URL list
    sub SET a variable to say 'Continue'
    sub EVALUATE whether the current scraped URL is in the Visited List by LOOPing through the Visited List and comparing values; if it is, CHANGE the variable to 'Do not Continue'
    sub IF you EVALUATE the variable and it still says 'Continue', THEN
        sub NAVIGATE to the URL
        sub POST a comment on the specific full URL
        sub ADD the URL to the Visited List and continue the Scraped URL loop, ELSE do nothing and continue the Scraped URL loop

 

or something very similar!
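Since the thread is about the logic rather than exact UBot syntax, here is the same outline as a Python sketch. scrape_google, navigate and post_comment are hypothetical stand-ins for the corresponding UBot actions, and visited.txt is an assumed file name:

# Hypothetical stand-ins for the UBot steps, so the sketch runs on its own.
def scrape_google(keyword):
    return ["http://example.com/a", "http://example.com/b"]

def navigate(url):
    print("navigating to", url)

def post_comment(url):
    print("posting comment on", url)

def load_visited(path="visited.txt"):
    try:
        with open(path) as f:
            return [line.strip() for line in f if line.strip()]
    except FileNotFoundError:
        return []  # no Visited List yet -- start fresh

def save_visited(urls, path="visited.txt"):
    with open(path, "w") as f:
        f.write("\n".join(urls))

visited = load_visited()                 # GRAB current Visited List
scraped = scrape_google("some keyword")  # SCRAPE URLs into Scraped URL list

for url in scraped:                      # LOOP through Scraped URL list
    proceed = True                       # SET variable to 'Continue'
    for seen in visited:                 # EVALUATE against Visited List
        if seen == url:
            proceed = False              # CHANGE to 'Do not Continue'
            break
    if proceed:                          # IF it still says 'Continue' THEN
        navigate(url)                    # NAVIGATE to the URL
        post_comment(url)                # POST a comment on that URL
        visited.append(url)              # ADD to Visited List
    # ELSE do nothing and continue the Scraped URL loop

save_visited(visited)                    # persist for the next run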

 

Of course, as the Visited List gets bigger and bigger, each cycle will take longer to vet the current URL against it. It would be interesting to see how performance changes for different-sized lists. Lots of things to play with here!
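One way to keep that check cheap, at least in the Python sketch above, is a hashed lookup: reusing the helpers from the previous sketch, a set makes membership testing roughly constant-time, so the cost no longer grows with the size of the Visited List:

# Same logic, but with a set instead of the inner comparison loop.
visited = set(load_visited())

for url in scraped:
    if url in visited:   # O(1) on average, regardless of list size
        continue
    navigate(url)
    post_comment(url)
    visited.add(url)

save_visited(sorted(visited))

Whether UBot's own list operations behave this way internally is another question, which is where the performance testing mentioned above would come in.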

 

HTH.

4 weeks later...

I think what I would do is make two new lists as you go through it.

 

A 'don't visit again' list and a 'visit again' list.

 

Then just use those lists when you need them, as in the sketch below. I'm sure you've figured out a way around this already, but for anyone else who runs into it, I think this might be the easiest way (though maybe not the best).
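For what it's worth, here is that two-list idea as a quick Python sketch; try_to_post is a hypothetical placeholder for the navigate-and-post steps, and the example URLs are made up:

def try_to_post(url):
    # Placeholder: returns True when the comment went through,
    # False when the site should be retried later.
    return True

scraped = ["http://example.com/a", "http://example.com/b"]

dont_visit_again = []  # posted successfully -- never go back
visit_again = []       # post failed (site down, captcha, ...) -- retry later

for url in scraped:
    if url in dont_visit_again:
        continue
    if try_to_post(url):
        dont_visit_again.append(url)
    else:
        visit_again.append(url)

A second pass can then loop over the 'visit again' list, and anything that ends up on the 'don't visit again' list is skipped forever.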

