UBot Underground

Creating A Crawler Like Google Bot Possible?



Hi

 

I was wondering: would it be possible to create a bot that acts like a web crawler (like Googlebot)? My idea would require collecting about 1,200,000 domain names, plus DNS records, footprints, etc., for a single top-level domain.

 

I was thinking of offloading the data burden from UBot to a MySQL database, so the data isn't stored locally.
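
To make the idea concrete, here is a rough sketch of the storage side. It is written in Python rather than UBot script, and it uses the standard library's sqlite3 module purely as a stand-in so it runs without a server. The table name, the columns, and the helper names (open_store, save_batch, count_domains) are all made up for illustration. With MySQL the same pattern should carry over, with `INSERT IGNORE` in place of `INSERT OR IGNORE` and, in the common Python MySQL drivers, `%s` placeholders in place of `?`.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) domains table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS domains (
            domain    TEXT PRIMARY KEY,  -- one row per domain, so re-crawls don't duplicate
            ip        TEXT,
            footprint TEXT
        )
        """
    )
    return conn


def save_batch(conn, rows):
    """Insert (domain, ip, footprint) tuples, skipping domains already stored."""
    with conn:  # one transaction (and one commit) per batch
        conn.executemany(
            "INSERT OR IGNORE INTO domains (domain, ip, footprint) VALUES (?, ?, ?)",
            rows,
        )


def count_domains(conn):
    """Return how many distinct domains are stored so far."""
    return conn.execute("SELECT COUNT(*) FROM domains").fetchone()[0]
```

Writing in batches instead of row by row matters at the 1,200,000-domain scale, and the primary key on the domain column lets the database handle de-duplication instead of the bot.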

 

How would you guys go about such a task?

Just collecting ideas about how to do it at this point. 

 

It would allow for some nice statistical analysis.


Well there is this large data plugin (free) to start with :)

 

http://www.ubotstudio.com/forum/index.php?/topic/16308-free-plugin-large-data/

 

And here is some regex to scrape Google result URLs:

 

add list to list(%results, $find regular expression($scrape attribute(<class="r">, "innerhtml"), "(?<=href\\=\\\")http.*?(?=\\\")"), "Delete", "Global")
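
For anyone who wants to test the pattern outside UBot, here is a rough Python equivalent. It is only a sketch: the function name extract_urls is made up, the extra backslashes from the UBot string are dropped because Python raw strings don't need them, and it assumes the "Delete" option above means "delete duplicates".

```python
import re

# Lookbehind for href=" and lookahead for the closing quote,
# so the match is only the URL itself.
HREF_URL = re.compile(r'(?<=href=")http.*?(?=")')


def extract_urls(html):
    """Return http(s) URLs found in href="..." attributes, de-duplicated in order."""
    seen = set()
    urls = []
    for url in HREF_URL.findall(html):
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Relative links such as href="/local" are skipped because the pattern requires the value to start with http.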

 

Hope it helps :)


For a program, but I have found somewhat of a workaround.

 

If I set the UI HTML panel height to e.g. 2000px (a value higher than the screen) and create some custom CSS/HTML where my panels have 100% height, then I can almost get what I want. I can't create a "footer" bar, but I can live with that.


Ahh... that answer didn't make sense - must have been late :D

I thought I was answering a question about UI height.

 

The collected data would not be used directly but, as I said, for statistical analysis.

