UBot Underground


add list to list(%links,$find regular expression($eval("var myLinks = []
var x =document.getElementsByClassName(\"b_algo\")

for (var y = 0 ; y < x.length; y++)\{
  
var z  = x[y].getElementsByTagName(\"a\")

myLinks.push(decodeURI( z[0].attributes[0].nodeValue))
\}

var a = JSON.stringify(myLinks)
a

"),"(?<=\\\")[^,].*?(?=\\\")"),"Delete","Global")

 

 

 

This is just the scraping element; it might save you some time. Go to a search results page and hit Run, or put it in your script.

 

Strangely, in Firefox attributes[0] is actually attributes[1], so the DOM must order the attributes differently or something. If it doesn't work in UBot 4 etc., let me know; I would be interested to hear about it.
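
One possible way around the attribute-order difference is to read the link by name rather than by position. The following is only a sketch, assuming the first anchor in each "b_algo" block carries the result URL in its href attribute; the function name collectLinks is made up for illustration and is not part of the original script.

```javascript
// Collect the first link of every result block by attribute NAME,
// so the order in which a browser lists attributes no longer matters.
// "doc" is expected to be the page's document object.
function collectLinks(doc) {
  var myLinks = [];
  var results = doc.getElementsByClassName("b_algo");
  for (var i = 0; i < results.length; i++) {
    var anchors = results[i].getElementsByTagName("a");
    if (anchors.length === 0) continue; // result block without any link
    var href = anchors[0].getAttribute("href");
    if (href) myLinks.push(decodeURI(href)); // skip anchors with no href
  }
  return myLinks;
}
```

In a browser this would be called as JSON.stringify(collectLinks(document)). Inside a $eval the curly braces would presumably need the same backslash escaping as in the script above.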

 

This returns the links on the page in the format "www.example.com". If anyone knows the regex (mine sucks!) that matches everything after a " and before the following ", so the quotes are removed in the find regular expression node, please post it and I will add it. Thanks. I am always looking for that expression; it drives me nuts!

 

<EDIT>

 

Thanks itexpert and helloInsomnia

Ha ha, the solution you have, itexpert, is pretty much the one I have in my own script, but I will replace my regex with helloInsomnia's. See my request above for turning "www.example.com" into www.example.com. Thanks, lads.

 

</EDIT>


Here you go

 

See we are more powerful when we work together!

 

No More " Signs!

 

add list to list(%links,$find regular expression($eval("var myLinks = []
var x =document.getElementsByClassName(\"b_algo\")

for (var y = 0 ; y < x.length; y++)\{
  
var z  = x[y].getElementsByTagName(\"a\")

myLinks.push(decodeURI( z[0].attributes[0].nodeValue))
\}

var a = JSON.stringify(myLinks)
a

"),"\\\".+?\\\""),"Delete","Global")
set(#replace,$replace(%links,"\"",$nothing),"Global")
add list to list(%List,$list from text(#replace,$new line),"Delete","Global")
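
Another option, sketched here only as an idea, is to avoid producing the quotes in the first place: the quotes come from JSON.stringify, so joining the array with newlines gives one bare link per line. The sample links are placeholders, and whether $eval hands the newlines back unchanged is an assumption I have not verified in UBot.

```javascript
// Sample data standing in for the scraped links.
var myLinks = ["www.example.com", "www.example.org"];

// What the script above returns: brackets, quotes and commas included.
var asJson = JSON.stringify(myLinks);

// Alternative: one link per line, nothing to strip afterwards.
var asLines = myLinks.join("\n");
```

If that works, the result could be split with $list from text and $new line, as in the last line of the script above, without the regex and replace steps.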


Thanks guys, good work!

 

 

This returns the links on the page in the format "www.example.com". If anyone knows the regex (mine sucks!) that matches everything after a " and before the following ", so the quotes are removed in the find regular expression node, please post it and I will add it. Thanks. I am always looking for that expression; it drives me nuts!

 

Generally this is the easy way to do it:

(?<=\").*?(?=\")

That will capture everything between the quotes for you without getting the actual quotes. Here is an example: http://rubular.com/r/Oqw8fFQVPv
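
One thing to watch when the input is a stringified array like the one in this thread: the comma between two items also sits between two quotes, so the plain lookaround pattern picks it up as a match. The sketch below compares it with the [^,] variant from the first post, using placeholder links and plain JavaScript regex syntax (lookbehind needs a reasonably recent engine); UBot's own regex engine may behave differently.

```javascript
// A stringified array, the same shape the scraping snippet returns.
var json = JSON.stringify(["www.example.com", "www.example.org"]);

// Plain lookaround: everything between any two quotes, separators included.
var plain = json.match(/(?<=").*?(?=")/g);

// Variant from the first post: the first character must not be a comma,
// which skips the "," that sits between two items.
var guarded = json.match(/(?<=")[^,].*?(?=")/g);

console.log(plain);   // three matches, the middle one is ","
console.log(guarded); // only the two links
```

The guard only checks the first character, so it is a workaround for this particular input rather than a general fix.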
