theninjamanz, posted November 12, 2014

I've got a bot that scrapes pretty happily and places all the contents in a nicely formatted CSV, but there's a bug that keeps it from doing its job. When the bot encounters a block of HTML where the user has used carriage returns, the scrape is captured like this (example from this page):

ipb.vars['highlight_color'] = "#ade57a";
ipb.vars['charset'] = "UTF-8";
ipb.vars['time_offset'] = "-7";
ipb.vars['hour_format'] = "12";
ipb.vars['seo_enabled'] = 1;

What I need to do is take the above and run some kind of regex (or anything else) on it to get the following result:

ipb.vars['highlight_color'] = "#ade57a";ipb.vars['charset'] = "UTF-8"; ipb.vars['time_offset'] = "-7"; ipb.vars['hour_format'] = "12"; ipb.vars['seo_enabled'] = 1;

I.e. all the hard breaks get removed and the output ends up on one line. Does anyone have any idea how to accomplish this? My uBot can then return to its duties. Cheers.
UBotDev, posted November 12, 2014

You mean something like $replace(#text, $new line, "")?
LazyBotter, posted November 13, 2014

Something like that?

set(#Content, "ipb.vars[\'highlight_color\'] = \"#ade57a\";
ipb.vars[\'charset\'] = \"UTF-8\";
ipb.vars[\'time_offset\'] = \"-7\";
ipb.vars[\'hour_format\'] = \"12\";
ipb.vars[\'seo_enabled\'] = 1;", "Global")
set(#Pos, 0, "Global")
clear list(%List)
add list to list(%List, $list from text(#Content, $new line), "Delete", "Global")
set(#Output, "", "Global")
loop($list total(%List)) {
    set(#Output, "{#Output}{$list item(%List, #Pos)}", "Global")
    increment(#Pos)
}
deliter, posted February 1, 2015

Use replace regular expression: search text \s+, replace text blank.
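To compare the patterns suggested in this thread outside uBot, here is a minimal sketch in Python. It uses Python's `re` module, not uBot's replace regular expression command, and the function names and the shortened sample string are made up for illustration. One thing it shows: replacing `\s+` with nothing also removes the ordinary spaces inside each statement, whereas matching only the line breaks keeps them.

```python
import re

# Shortened sample of the scraped block, with Windows-style line breaks
# (assumed here; the real scrape may use \n only).
scraped = (
    "ipb.vars['highlight_color'] = \"#ade57a\";\r\n"
    "ipb.vars['charset'] = \"UTF-8\";\r\n"
    "ipb.vars['seo_enabled'] = 1;"
)


def strip_all_whitespace(text: str) -> str:
    """Replace every run of whitespace with nothing (the \\s+ suggestion)."""
    return re.sub(r"\s+", "", text)


def join_lines(text: str) -> str:
    """Replace each line break (plus surrounding whitespace) with one space."""
    return re.sub(r"\s*[\r\n]+\s*", " ", text).strip()


one_line = join_lines(scraped)
print(one_line)
print(strip_all_whitespace(scraped))
```

If the spaces around `=` matter for the CSV, the second pattern (or a plain replace on the new-line character) is probably closer to the result asked for in the first post.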