UBot Underground

Choose specific match result group



This website is the perfect place to try different regexes and see which works best: http://rubular.com/

 

If you are unsure, try the website for tips and ways to test your regex.

 

Should be of substantial help.

 

I use rubular.com all the time. Unfortunately, I couldn't find a way to choose a specific match group.

 

I know that in Java you can choose which of the matched groups to pick (first, second...).

 

Here's an example of what I'm trying to achieve: http://rubular.com/r/BB7KMMQxyO

 

There are two match groups. I'm trying to be able to always pick the first (or second) group.
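
To make "picking a group" concrete, here is a minimal JavaScript sketch of selecting a capture group by index. The pattern and input are hypothetical stand-ins, not the ones behind the rubular link above; the only point is that index 0 is the whole match and indexes 1, 2, ... are the groups.

```javascript
// Hypothetical pattern and input, chosen only for illustration.
const pattern = /(\w+)@(\w+)\.com/;
const input = "contact: kevin@example.com";

const match = input.match(pattern);

// match[0] is the whole match; match[1] and match[2] are the capture groups.
const firstGroup = match ? match[1] : null;
const secondGroup = match ? match[2] : null;

console.log(firstGroup, secondGroup);
```

Guarding against a null match matters in scraping, since a page that lacks the pattern would otherwise throw when indexing.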

  • 3 weeks later...

This would be very useful for data scraping.

You can, for example, use match groups in a JavaScript block, but I don't know how to return values from it.

 

Any ideas?


Hi,

 

You could load the regex matches into a UBot list, then select the item you want from that list.

 

Sample code:

regexpage()
regextext()
define regexpage {
    load html("<html>
<head></head>
<body>
<div id=data1>!cba123!</div>
<div id=data2>!abc123!</div>
<div id=data3>!123abcabc!</div>
<div id=data4>!321cbacba!</div>
</body>
</html>")
    set(#loadpagescrapedata, $scrape attribute(<outerhtml=w"<div id=\"data*\">*</div>">, "innertext"), "Global")
    set(#loadpagedata, $find regular expression(#loadpagescrapedata, "!((?:abc|123)+)!"), "Global")
    clear list(%regexpagematches)
    add list to list(%regexpagematches, $list from text(#loadpagedata, $new line), "Delete", "Global")
    set list position(%regexpagematches, 0)
    loop($list total(%regexpagematches)) {
        if($comparison($list position(%regexpagematches), "<", $list total(%regexpagematches))) {
            then {
                if($contains($next list item(%regexpagematches), "3abc")) {
                    then {
                        set(#pagedatamatch, $list item(%regexpagematches, $subtract($list position(%regexpagematches), 1)), "Global")
                    }
                    else {
                    }
                }
            }
            else {
            }
        }
    }
}
define regextext {
    set(#pagedata, "!321cba!
!abc123!
!123abcabc!
!321cbacba!", "Global")
    set(#extractdata, $find regular expression(#pagedata, "!((?:abc|123)+)!"), "Global")
    clear list(%regexmatches)
    add list to list(%regexmatches, $list from text(#extractdata, $new line), "Delete", "Global")
    set list position(%regexmatches, 0)
    loop($list total(%regexmatches)) {
        if($comparison($list position(%regexmatches), "<", $list total(%regexmatches))) {
            then {
                if($contains($next list item(%regexmatches), "c12")) {
                    then {
                        set(#datamatch, $list item(%regexmatches, $subtract($list position(%regexmatches), 1)), "Global")
                    }
                    else {
                    }
                }
            }
            else {
            }
        }
    }
}

sample-regex-group-list-001.ubot
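
For comparison, the same "collect the matches into a list, then select one" idea can be sketched in plain JavaScript. This is only an illustration, not part of the attached .ubot file: it reuses the pattern and sample text from the regextext definition, and the variable names are made up for the example. Unlike the list approach, it keeps just the capture group, without the surrounding `!` characters.

```javascript
// Same sample text and pattern as in the regextext definition.
const pageData = "!321cba!\n!abc123!\n!123abcabc!\n!321cbacba!";
const pattern = /!((?:abc|123)+)!/g;

// Collect capture group 1 from every match into an array.
const groups = Array.from(pageData.matchAll(pattern), (m) => m[1]);

// Select by position, much like reading an item from a list.
const firstMatchGroup = groups[0];
const secondMatchGroup = groups[1];

console.log(groups);
```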

 

Kevin



I don't know if this is what you're looking for, but maybe it helps. The value of the last expression in the script is what ends up in the variable:

 

set(#jsvar, $eval("var getJsValue = whatEver;
getJsValue;"), "Global")
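
Applied to match groups, the script text passed to $eval might look like the sketch below. It assumes, as the snippet above suggests, that the value of the final expression is what gets returned. The input string is a made-up sample, and single quotes are used so the script does not clash with the double quotes of the surrounding UBot string. This has not been tested inside UBot itself.

```javascript
// Hypothetical input; in practice this would be the scraped text.
var text = '!abc123! !123abcabc!';

// exec() returns null when nothing matches, so guard before indexing.
var m = /!((?:abc|123)+)!/.exec(text);

// m[1] is the first capture group; change the index to pick another group.
var getJsValue = m ? m[1] : '';

// Final expression: this is the value assumed to be handed back.
getJsValue;
```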
