UBot Underground

regex select first instance in ubot4



In UBot 3.5 I could choose the first instance of a regex match. How do I do that in UBot 4? Basically, I want to turn off the GLOBAL option so that only one instance gets selected from my regex:

(?<=\,).*?(?=\,)


Hmmm... I might misinterpret you, but I'll take the chance...

 

Regular expressions are for matching, so everything on e.g. a page that matches a regexp will be returned when you use that regexp. The matched occurrences are returned as a list, so if you only need the first occurrence, get the first position in that list.
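To illustrate the idea outside of UBot, here is a minimal Python sketch. Python is only a stand-in here, not UBot code, and the sample string is made up. It collects every match of the pattern from the question into a list and then takes the first position.

```python
import re

# Invented sample text; the pattern is the one from the question.
text = "forum,alpha,beta,gamma"
pattern = r"(?<=\,).*?(?=\,)"

# Every substring that matches the pattern comes back as a list.
matches = re.findall(pattern, text)

# The first occurrence is simply the first position of that list
# (guarded, because an empty list has no position 0).
first = matches[0] if matches else None
```

With this sample, only "alpha" and "beta" sit between two commas, so the first position holds "alpha".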

 

Are you possibly referring to the first part of the regexp?

 

(?<=\,).*?(?=\,)
^^^^^^^

 

If you don't want the other part, then remove the parentheses there.

 

 

Please elaborate a bit more on what you need to do.


Hi John and anonym, I have attached 2 pics, one from UBot 3.5 and the other from UBot 4. Notice that in 3.5 there is an option to return either a 'single' value or a 'list', while 4 automatically returns a list. Let me know how I can get back the single element in UBot 4, or can you suggest a regex that does the same? Currently I am using (?<=\,).*?(?=\,) as my pattern.

 

 

post-263-0-92937700-1329374862_thumb.png this is in 3.5 (notice the return type=single)

 

 

post-263-0-31804500-1329374881_thumb.png this is in 4 (no way to specify return type)


If you are adding the scrape to a list, simply call the list item (at position 0). This will only work if you are using add list to list and not add item to list.

 

 

John


Ummmm... apart from returning position 0, is there any other way around it? (just asking)

Because currently I'm using

change attribute(<id=$find regular expression($list item(%forum_names_demo3, $list position(%forum_names_demo3)), "(?<=\\,).*?(?=\\,)")> blah blah

 

As you can see, I'm not adding the scrape to a list and I don't want to either. Now what?

 

------------------------------------

 

Just to make things clearer: I use Gskinner to verify my regex. If you visit that site and type co in the pattern box, you will notice that every co on the page gets matched (around 6 times). However, you will see that the GLOBAL checkbox is checked; if you uncheck it, only the first instance is returned (the co in the word WELCOME gets highlighted). UBot 3.5 could return only the first instance. Do you know what regex code I could use to turn off the GLOBAL setting in UBot 4? I have tried (?g)co and even (?-g)co in Gskinner, but it's of no use.
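In many regex engines, "global" is not part of the pattern syntax at all. It is an option of the call that runs the pattern, which would explain why (?g) and (?-g) have no effect. Here is a rough Python analogy (not UBot code, and the sample text is invented): `re.findall` behaves like the checked GLOBAL box, while `re.search` stops at the first occurrence.

```python
import re

text = "Welcome to the cosy code corner"  # invented sample text

# "Global" behaviour: collect every occurrence of "co".
all_hits = re.findall("co", text)

# "Non-global" behaviour: stop at the first occurrence.
first_hit = re.search("co", text)
first_text = first_hit.group(0) if first_hit else None
first_pos = first_hit.start() if first_hit else None

# Python has no inline "g" flag, so "(?g)co" is rejected as a pattern.
try:
    re.compile("(?g)co")
    inline_g_supported = True
except re.error:
    inline_g_supported = False
```

Here the first hit is the "co" inside "Welcome", which is what the unchecked GLOBAL box highlights in the tester.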


C'mon guys, a little help. My bot has been stuck for the last 4 days because of this, and honestly speaking UBot 4 is quite bad; 3.5 was better.

Who said that the learning curve of this software was easy? It is too tough (PHP was easier!)...


Also, putting it like this /(?<=\,).*?(?=\,)/ should make it ungreedy, but it doesn't work with UBot.

The other way I have used, if I remember correctly, is /(?<=\,).*?(?=\,)U/, but again it fails in UBot.
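Note that (un)greediness and the GLOBAL option are separate things. The lazy `.*?` in the pattern already makes each match as short as possible, but it does not limit how many matches come back. As far as I know, the trailing U is a PCRE (PHP) modifier, so other engines may simply not know it. A small Python sketch with an invented sample string shows the difference:

```python
import re

text = "a,b,c,d"  # invented sample string

# Greedy: .* grabs as much as it can between a comma and a comma.
greedy = re.findall(r"(?<=,).*(?=,)", text)

# Lazy: .*? stops at the nearest following comma,
# yet every match is still returned.
lazy = re.findall(r"(?<=,).*?(?=,)", text)
```

Both calls return a list. Laziness only changes where each match ends, so picking the first match is still a separate step.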


In UBot 4, $find regular expression always returns a list. If you only want the first match, you can just use $list item.

 

For example:

 

$find regular expression("apple pie", "p")

Would return a list with 3 items in it [p, p, p]

 

While this example:

 

$list item($find regular expression("apple pie", "p"), 0)

Would return the first item (0) of that list.

 

So putting it in practice you might do something like:

 

set(#first word, $list item($find regular expression($document text, "you\\w+"), 0), "Global")

To find the first word that starts with "you" on a page.

 

Hope that helps, let me know if you need more examples.


Ummm... but I don't want to use a list, as I'm already getting confused by the number of lists that I have.

The code I'm using is

<id=$find regular expression($list item(%forum_names_demo3, $list position(%forum_names_demo3)), "(?<=\\,).*?(?=\\,)")>

 

So if I have to use a list, then I have to add the list and add the regex to it, then change the above code to point the id to position 0 of the new list, and then this new list needs to be cleared and reset every time while I'm looping through the forum_names_demo3 list... see how many steps.

 

And zap, I tried everything to remove the global, but apparently UBot was created in some language that doesn't support any of this. I really cannot believe that UBot loves lists so much. Doesn't anyone ever get confused with all the lists?

 

Had ubot4 for 1 month and already frustrated with it :(



I'm pretty confused. In my example I didn't create any new lists. I just used the $list item function. Notice there are no % signs, so there would be no clearing or resetting of a list. There's only one step: adding $list item. If you find there are too many steps in that selector, you can always use a variable.

 

For example you could change it from:

 

<id=$find regular expression($list item(%forum_names_demo3, $list position(%forum_names_demo3)), "(?<=\\,).*?(?=\\,)")>

to

 

set(#item, $list item(%forum_names_demo3, $list position(%forum_names_demo3)), "Local")
set(#first match, $list item($find regular expression(#item, "(?<=\\,).*?(?=\\,)"), 0), "Local")

Then your selector would be:

 

<id=#first match>

 

There really is just one extra step in what I posted, and that is wrapping your $find regular expression in $list item. There's no clearing lists or anything else like that.
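The same "wrap it and take position 0" idea, sketched in Python for comparison. This is not UBot code, and `first_match` and the sample list are hypothetical names. No temporary match list has to be cleared or reset between iterations.

```python
import re

def first_match(text, pattern):
    """Return the first match of pattern in text, or None if there is none."""
    hit = re.search(pattern, text)
    return hit.group(0) if hit else None

# Hypothetical stand-in for %forum_names_demo3.
forum_names = ["1,general,open", "2,support,closed", "no commas here"]

# One value per item; nothing has to be reset while looping.
ids = [first_match(name, r"(?<=\,).*?(?=\,)") for name in forum_names]
```

Returning None for an item without a match is a design choice of this sketch. It makes the "nothing found" case explicit instead of failing on an empty result.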


I'm pretty confused.

Well, that makes 2 of us, but I saw what you were trying to do. I managed to get it to work through a very complex method, thank you. But I still don't understand why everything was being returned as a list for regex.

Thanks again


That's pretty good, just added it to my library.

That's after UBot let me log in, as I forgot to change the proxy setting back from a custom IP to the system IP. Not every user has a fixed IP, so I don't see why this should be a problem.



Different issue unrelated to this thread

 

 

john

 

 

 
