deliter 203 Posted March 19, 2015

add list to list(%links,$find regular expression($eval("var myLinks = []; var x = document.getElementsByClassName(\"b_algo\"); for (var y = 0; y < x.length; y++) \{ var z = x[y].getElementsByTagName(\"a\"); myLinks.push(decodeURI(z[0].attributes[0].nodeValue)); \} var a = JSON.stringify(myLinks); a"),"(?<=\\\")[^,].*?(?=\\\")"),"Delete","Global")

This is just the scraping element; it might save you some time. Go to a search result page and hit run, or put it in your script.

Strangely, in Firefox attributes[0] is actually attributes[1], so the DOM must render the page differently or something. If it doesn't work for Ubot 4 etc. let me know, I would be interested to know.

This returns the links on the page in the format "www.example.com". If anyone knows the regex (mine sucks!) for after " and before " so the quotes are removed in the find regular expression node, please post and I will add it, thanks. I am always looking for the expression for after " and before the following ", drives me nuts!!

<EDIT> Thanks itexspert and HelloInsomnia, ha ha. The solution you have, itexspert, is pretty much the solution I have in my own script, but I will replace my regex with HelloInsomnia's. See my request above for turning "www.example.com" into www.example.com. Thanks lads. </EDIT>
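On the attributes[0] versus attributes[1] difference mentioned above: reading the attribute by name should sidestep the ordering question altogether. Below is a minimal plain-JavaScript sketch of the same loop, not the script from the post. The function name collectResultLinks and its doc parameter are made up for illustration, and it has not been tried inside uBot's $eval, where the quotes and braces would need escaping as in the snippet above.

```javascript
// Sketch: collect the first link's href from every search result block.
// `doc` is any object exposing the DOM methods used below
// (in a browser you would pass `document`).
function collectResultLinks(doc) {
  var links = [];
  var results = doc.getElementsByClassName("b_algo");
  for (var i = 0; i < results.length; i++) {
    var anchors = results[i].getElementsByTagName("a");
    if (anchors.length === 0) {
      continue; // skip result blocks that contain no link
    }
    // Look the attribute up by name rather than by position,
    // so the order of attributes in the markup does not matter.
    var href = anchors[0].getAttribute("href");
    if (href) {
      links.push(decodeURI(href));
    }
  }
  return links;
}
```

In a browser console on a results page, collectResultLinks(document) would return a plain array of URLs, which can then be passed through JSON.stringify as in the original.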
itexspert 47 Posted March 20, 2015

Here you go. See, we are more powerful when we work together! No more " signs!

add list to list(%links,$find regular expression($eval("var myLinks = []; var x = document.getElementsByClassName(\"b_algo\"); for (var y = 0; y < x.length; y++) \{ var z = x[y].getElementsByTagName(\"a\"); myLinks.push(decodeURI(z[0].attributes[0].nodeValue)); \} var a = JSON.stringify(myLinks); a"),"\\\".+?\\\""),"Delete","Global")
set(#replace,$replace(%links,"\"",$nothing),"Global")
add list to list(%List,$list from text(#replace,$new line),"Delete","Global")
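For anyone who wants to see the match-then-strip idea outside uBot, here is a rough plain-JavaScript equivalent. It is only a sketch: the sample URLs are invented, and the uBot nodes above are not being reproduced exactly, just the two steps (match each quoted item, then delete the quote characters).

```javascript
// Hypothetical sample of what JSON.stringify(myLinks) might return.
var json = JSON.stringify([
  "http://www.example.com/a",
  "http://www.example.org/b"
]);

// Step 1: match each quoted item, quotes included (same pattern as ".+?").
var quoted = json.match(/".+?"/g) || [];

// Step 2: remove every quote character from each match.
var links = quoted.map(function (item) {
  return item.replace(/"/g, "");
});

console.log(links.join("\n"));
```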
HelloInsomnia 1103 Posted March 20, 2015

Thanks guys, good work!

Quote: This returns the links on the page in the format "www.example.com". If anyone knows the regex (mine sucks!) for after " and before " so the quotes are removed in the find regular expression node, please post and I will add it, thanks. I am always looking for the expression for after " and before the following ", drives me nuts!!

Generally this is the easy way to do it:

(?<=\").*?(?=\")

That will match everything between the quotes for you without getting the actual quotes. Here is an example: http://rubular.com/r/Oqw8fFQVPv
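To see how the lookbehind and lookahead behave on a stringified list, here is a small plain-JavaScript sketch. The sample URLs are made up, and it assumes an engine with lookbehind support (recent Node or Chrome). One thing it shows: on a string like ["a","b"] the plain pattern also picks up the comma sitting between a closing and an opening quote, which is presumably what the [^,] in the first post's regex is there to skip.

```javascript
// Hypothetical sample of what JSON.stringify(myLinks) might return.
var json = JSON.stringify([
  "http://www.example.com/a",
  "http://www.example.org/b"
]);

// Lookbehind (?<=") and lookahead (?=") require a quote on each side
// of the match without including the quotes in the result.
var plain = json.match(/(?<=").*?(?=")/g) || [];

// Same pattern, but the first matched character may not be a comma,
// so the separator between two quoted items is not returned.
var filtered = json.match(/(?<=")[^,].*?(?=")/g) || [];

console.log(plain);    // the two URLs plus the "," between them
console.log(filtered); // only the two URLs
```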
gavind 6 Posted March 30, 2015

Nice of you to share this for free as well! Thumbs up!
Monark 0 Posted July 20, 2017

Nice.