Randy Role, Posted August 21, 2012
Hey, regex replacement syntax usually refers to capture groups as $n, where n is the number of the group ($1, $2, ...). Does anybody know if there's a way to choose a group by number in uBot? Thanks
MiriamMB, Posted August 21, 2012
This website is the perfect place to try different regexes and see which works best: http://rubular.com/ If you are unsure, try the website for tips and ways to test your regex. It should be of substantial help.
Randy Role (Author), Posted August 22, 2012
I use rubular.com all the time; unfortunately, I couldn't find a way to choose a specific match group there. I know that in Java you can choose which of the matching groups to pick out (first, second, ...). Here's an example of what I'm trying to achieve: http://rubular.com/r/BB7KMMQxyO There are two match groups. I want to be able to always pick the first (or the second) group.
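For readers unfamiliar with the idea, here is a minimal sketch of "always pick group n" in plain JavaScript. It is an illustration only: `pickGroup` is a hypothetical helper name, and the sample text and two-group pattern are made up rather than taken from the rubular link above.

```javascript
// Hypothetical helper: collect capture group `n` from every match.
// Index 0 is the whole match; 1, 2, ... are the parenthesised groups.
function pickGroup(text, source, n) {
  var re = new RegExp(source, "g"); // "g" lets exec() walk through all matches
  var results = [];
  var match;
  while ((match = re.exec(text)) !== null) {
    results.push(match[n]);
    // Avoid an endless loop if the pattern can match an empty string.
    if (match[0] === "") {
      re.lastIndex++;
    }
  }
  return results;
}

var sample = "a=1 b=2";      // made-up input
var pattern = "(\\w)=(\\d)"; // made-up pattern with two capture groups
var firstGroups = pickGroup(sample, pattern, 1);  // ["a", "b"]
var secondGroups = pickGroup(sample, pattern, 2); // ["1", "2"]
```

Changing the last argument is all it takes to switch between the first and the second group.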
brusacco, Posted September 6, 2012
This would be very useful for data scraping. You can, for example, work with match groups in a JavaScript block, but I don't know how to return values from it. Any ideas?
k1lv9h, Posted September 7, 2012
Hi, You could load the regex matches into a ubot list, then select from the ubot list. Sample code:

regexpage()
regextext()
define regexpage {
    load html("<html> <head></head> <body> <div id=data1>!cba123!</div> <div id=data2>!abc123!</div> <div id=data3>!123abcabc!</div> <div id=data4>!321cbacba!</div> </body> </html>")
    set(#loadpagescrapedata, $scrape attribute(<outerhtml=w"<div id=\"data*\">*</div>">, "innertext"), "Global")
    set(#loadpagedata, $find regular expression(#loadpagescrapedata, "!((?:abc|123)+)!"), "Global")
    clear list(%regexpagematches)
    add list to list(%regexpagematches, $list from text(#loadpagedata, $new line), "Delete", "Global")
    set list position(%regexpagematches, 0)
    loop($list total(%regexpagematches)) {
        if($comparison($list position(%regexpagematches), "<", $list total(%regexpagematches))) {
            then {
                if($contains($next list item(%regexpagematches), "3abc")) {
                    then {
                        set(#pagedatamatch, $list item(%regexpagematches, $subtract($list position(%regexpagematches), 1)), "Global")
                    }
                    else {
                    }
                }
            }
            else {
            }
        }
    }
}
define regextext {
    set(#pagedata, "!321cba! !abc123! !123abcabc! !321cbacba!", "Global")
    set(#extractdata, $find regular expression(#pagedata, "!((?:abc|123)+)!"), "Global")
    clear list(%regexmatches)
    add list to list(%regexmatches, $list from text(#extractdata, $new line), "Delete", "Global")
    set list position(%regexmatches, 0)
    loop($list total(%regexmatches)) {
        if($comparison($list position(%regexmatches), "<", $list total(%regexmatches))) {
            then {
                if($contains($next list item(%regexmatches), "c12")) {
                    then {
                        set(#datamatch, $list item(%regexmatches, $subtract($list position(%regexmatches), 1)), "Global")
                    }
                    else {
                    }
                }
            }
            else {
            }
        }
    }
}

Attachment: sample-regex-group-list-001.ubot

Kevin
blumi40, Posted September 7, 2012
Quoting brusacco: "This would be very useful for data scraping. You can, for example, work with match groups in a JavaScript block, but I don't know how to return values from it. Any ideas?"
I don't know if this is what you're looking for, but maybe it helps:
set(#jsvar, $eval("var getJsValue = whatEvergetJsValue"), "Global")
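Building on that suggestion, the JavaScript handed to $eval could do the group selection itself. The snippet below is a sketch under assumptions: it assumes $eval gives back the value produced by the evaluated JavaScript (as the post above implies, not verified here), and the variable name `firstGroups` is made up. The sample text and pattern are the ones from k1lv9h's regextext example.

```javascript
// One self-contained expression: an immediately-invoked function whose
// return value is the newline-separated list of first capture groups.
var firstGroups = (function () {
  var text = "!321cba! !abc123! !123abcabc! !321cbacba!"; // sample text from the thread
  var re = /!((?:abc|123)+)!/g;                            // pattern from the thread
  var groups = [];
  var m;
  while ((m = re.exec(text)) !== null) {
    groups.push(m[1]); // 1 = first capture group; use 2, 3, ... for later groups
  }
  return groups.join("\n");
})();
```

Joining with a newline is a deliberate choice: the result has the same one-item-per-line shape that the earlier sample feeds to $list from text with $new line, so it could in principle be loaded into a ubot list the same way.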
brusacco, Posted August 19, 2016
Thanks blumi40! That worked very well for me.