UBot Underground

My Regex Not Working With $Find Regular Expression Command



My regex works in RegexBuddy and EditPad Lite 7, but not in UBot. What am I doing wrong? This is my regex:
 

(?<=class="business-name" data-impressed="1">)[0-9a-z ]+

I'm trying to grab the company names from the results. There are 32 in total.


Here is the code:

set(#page,$nothing,"Global")
navigate("http://www.yellowpages.com/search?search_terms=realtor&geo_location_terms=Homestead%2C+FL","Wait")
wait(3)
set(#page,$document text,"Global")
set(#Compan Names,$find regular expression(#page,"(?<=class=\"business-name\" data-impressed=\"1\">)[0-9a-z ]+"),"Global")


Hi,

 

This should work with UBot Studio version 5:

Replaced new lines with nothing when loading #page.

Added A-Z to the regex to match capital letters.

set(#page,$nothing,"Global")
navigate("http://www.yellowpages.com/search?search_terms=realtor&geo_location_terms=Homestead%2C+FL","Wait")
wait(3)
set(#page,$replace($document text,$new line,$nothing),"Global")
set(#Compan Names,$find regular expression(#page,"(?<=class=\"business-name\" data-impressed=\"1\">)[0-9A-Za-z ]+"),"Global")

This should work with UBot Studio version 4:

Replaced new lines with nothing when loading #page.

Added A-Z to the regex to match capital letters.

Removed data-impressed=\"1\" from the regex.

set(#page, $nothing, "Global")
navigate("http://www.yellowpages.com/search?search_terms=realtor&geo_location_terms=Homestead%2C+FL", "Wait")
wait(3)
set(#page, $replace($document text, $new line, $nothing), "Global")
set(#Compan Names, $find regular expression(#page, "(?<=class=\"business-name\">)[0-9A-Za-z ]+"), "Global")

Thanks,

Kevin
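
To see why the original pattern came back short, here is a minimal sketch of the case-sensitivity issue Kevin describes. It uses Python's `re` module as a stand-in for UBot's regex engine (an assumption; the two engines differ in details), and the HTML snippet is a made-up sample, not the real yellowpages.com markup.

```python
import re

# Hypothetical sample markup; the real page's HTML may differ.
html = (
    '<a class="business-name" data-impressed="1">Keller Williams Realty</a>\n'
    '<a class="business-name" data-impressed="1">homestead realty 24</a>'
)

# Same lookbehind as in the thread (quotes need no escaping in a Python raw string).
prefix = r'(?<=class="business-name" data-impressed="1">)'

# Original character class: lowercase letters, digits and spaces only.
lower_only = re.findall(prefix + r'[0-9a-z ]+', html)

# Kevin's fix: add A-Z so capitalized names match as well.
both_cases = re.findall(prefix + r'[0-9A-Za-z ]+', html)

print(lower_only)
print(both_cases)
```

With a case-sensitive engine, the first pattern skips any name that starts with a capital letter. Names containing characters outside the class, such as `&` or an apostrophe, would still be cut short by either pattern; a negated class like `[^<]+` is one common alternative.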


Try this. It's not perfect, but it's close enough:

 

set(#done,$replace regular expression($find regular expression($document text,"class=\"business-name\"(?s)\\sdata-impressed=\"1\".+?</a>"),"class.+?>|</a>",""),"Global")

 

I think it gets everything.

 

Here is a version if you don't want the featured listings on the side, just the 30 links:

 

set(#done,$replace regular expression($find regular expression($document text,"class=\"n\">\\d+\\.\\s<a\\shref.+?class=\"business-name\"(?s)\\sdata-impressed=\"1\".+?</a>"),"class=\"n\">\\d+\\.\\s<a\\shref.+?class=\"business-name\"(?s)\\sdata-impressed=\"1\">|</a>",""),"Global")
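
The two commands above share one idea: first find each whole fragment up to the closing `</a>`, then strip the surrounding markup with a second regex. A rough Python sketch of that find-then-strip approach follows. The sample HTML is invented for illustration, and `re.DOTALL` is used where the UBot pattern uses `(?s)`.

```python
import re

# Hypothetical sample markup; the first tag has a newline between attributes.
html = (
    '<a href="/x" class="business-name"\ndata-impressed="1">Keller Williams Realty</a>'
    '<a href="/y" class="business-name" data-impressed="1">A & B Homes</a>'
)

# Step 1: grab each fragment from the class attribute to the closing </a>.
# re.DOTALL lets "." match newlines too, like (?s) in the thread's pattern.
fragments = re.findall(
    r'class="business-name"\sdata-impressed="1".+?</a>', html, flags=re.DOTALL
)

# Step 2: delete the leading attribute text and the trailing </a>,
# leaving only the visible link text.
names = [re.sub(r'class.+?>|</a>', '', f, flags=re.DOTALL) for f in fragments]

print(names)
```

Because the name itself is matched with `.+?` rather than a character class, punctuation such as `&` survives, which a class like `[0-9A-Za-z ]+` would cut off.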

 

Yes, as Kevin said, UBot is case-sensitive, whereas EditPad Lite is not. A tip for that site: there is some lovely JSON data on that page that has everything inside it and builds the list. Using Ayman's JSON parser you could match much more easily, and even build your own database out of it.
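
As a rough illustration of the JSON route, the sketch below pulls an embedded JSON block out of a page and reads the names from it. Everything about the data shape is an assumption: the script tag, its `id`, and the `results`/`name` keys are invented, and Python's standard `json` module stands in for Ayman's JSON parser plugin. The real page would need to be inspected to find the actual structure.

```python
import json
import re

# Hypothetical page source: the script tag, key names and nesting are invented
# for illustration and are NOT taken from the real site.
page = '''
<html><body>
<script type="application/json" id="listings">
{"results": [{"name": "Keller Williams Realty", "phone": "305-555-0100"},
             {"name": "A & B Homes", "phone": "305-555-0101"}]}
</script>
</body></html>
'''

# Pull the JSON text out of the (assumed) script tag, then parse it properly
# instead of matching individual fields with regex.
match = re.search(
    r'<script type="application/json" id="listings">(.*?)</script>',
    page,
    flags=re.DOTALL,
)
data = json.loads(match.group(1))

# Each record is now a plain dictionary, so any field can be read by key.
names = [item["name"] for item in data["results"]]

print(names)
```

Once the data is parsed, other fields in each record can be read the same way, which is what makes building a small database from it practical.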

