BigEfromDaBX 25 Posted March 22, 2015
My regex works in RegexBuddy and EditPad Lite 7, but not in UBot. What am I doing wrong? This is my regex:
(?<=class="business-name" data-impressed="1">)[0-9a-z ]+
I am trying to grab the company names from the results. There are 32 in total. Here is the code:
set(#page,$nothing,"Global")
navigate("http://www.yellowpages.com/search?search_terms=realtor&geo_location_terms=Homestead%2C+FL","Wait")
wait(3)
set(#page,$document text,"Global")
set(#Compan Names,$find regular expression(#page,"(?<=class=\"business-name\" data-impressed=\"1\">)[0-9a-z ]+"),"Global")
k1lv9h 76 Posted March 22, 2015
Hi,
This should work with UBot Studio version 5:
Replace new lines with nothing when loading #page
Added A-Z to the regex for capital letters
set(#page,$nothing,"Global")
navigate("http://www.yellowpages.com/search?search_terms=realtor&geo_location_terms=Homestead%2C+FL","Wait")
wait(3)
set(#page,$replace($document text,$new line,$nothing),"Global")
set(#Compan Names,$find regular expression(#page,"(?<=class=\"business-name\" data-impressed=\"1\">)[0-9A-Za-z ]+"),"Global")
This should work with UBot Studio version 4:
Replace new lines with nothing when loading #page
Added A-Z to the regex for capital letters
Removed data-impressed=\"1\" from the regex
set(#page, $nothing, "Global")
navigate("http://www.yellowpages.com/search?search_terms=realtor&geo_location_terms=Homestead%2C+FL", "Wait")
wait(3)
set(#page, $replace($document text, $new line, $nothing), "Global")
set(#Compan Names, $find regular expression(#page, "(?<=class=\"business-name\">)[0-9A-Za-z ]+"), "Global")
Thanks,
Kevin
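The case-sensitivity fix can be checked outside UBot. This is a minimal Python sketch, not UBot code: the HTML snippet and the company name in it are made up for illustration, and Python's `re` module stands in for UBot's regex engine, which may behave differently in other respects.

```python
import re

# Hypothetical snippet standing in for the page source; the company name is made up.
html = '<a class="business-name" data-impressed="1">Acme Realty 2</a>'

# Lookbehind taken from the original post.
prefix = r'(?<=class="business-name" data-impressed="1">)'

# Original pattern: lowercase only, so it cannot start on the capital "A".
lower_only = re.findall(prefix + r'[0-9a-z ]+', html)

# Fix 1: add A-Z to the character class, as suggested in the reply.
with_caps = re.findall(prefix + r'[0-9A-Za-z ]+', html)

# Fix 2: keep the original class but match case-insensitively.
ignore_case = re.findall(prefix + r'[0-9a-z ]+', html, flags=re.IGNORECASE)

print(lower_only, with_caps, ignore_case)
```

With a case-sensitive engine the lowercase-only class finds nothing for a capitalized name, which would explain why the same pattern appears to work in a tool that ignores case by default.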
deliter 203 Posted March 22, 2015
Try this. It's not perfect, but it's close enough:
set(#done,$replace regular expression($find regular expression($document text,"class=\"business-name\"(?s)\\sdata-impressed=\"1\".+?</a>"),"class.+?>|</a>",""),"Global")
I think that gets everything. Here is a version if you don't want the featured listings on the side, just the 30 links:
set(#done,$replace regular expression($find regular expression($document text,"class=\"n\">\\d+\\.\\s<a\\shref.+?class=\"business-name\"(?s)\\sdata-impressed=\"1\".+?</a>"),"class=\"n\">\\d+\\.\\s<a\\shref.+?class=\"business-name\"(?s)\\sdata-impressed=\"1\">|</a>",""),"Global")
Yes, as Kevin said, UBot is case sensitive, whereas EditPad Lite is not. A tip for that site: there is some lovely JSON data on that page that has everything inside of it and builds the list. Using Ayman's JSON parser you could match much more easily, and even build your own database out of it.
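The match-then-strip approach in the first snippet above can be sketched in Python as well. This is an illustration under assumptions rather than a translation of UBot behavior: the two anchors and their company names are invented, and the `(?s)` flag from the original pattern is passed as `re.DOTALL` instead of being written mid-pattern.

```python
import re

# Hypothetical page fragment; both company names are made up.
html = (
    '<a class="business-name" data-impressed="1">Acme Realty</a>\n'
    '<a class="business-name" data-impressed="1">Best Homes &amp; Co.</a>'
)

# Step 1: grab each anchor from class="business-name" through its closing tag.
chunks = re.findall(
    r'class="business-name"\sdata-impressed="1".+?</a>',
    html,
    flags=re.DOTALL,
)

# Step 2: delete the leading attributes and the closing tag, leaving the name.
names = [re.sub(r'class.+?>|</a>', '', chunk) for chunk in chunks]

print(names)
```

Because the name is whatever sits between the opening and closing tag, this variant keeps punctuation such as `&amp;` or a period, which a `[0-9A-Za-z ]+` character class would cut off.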
BigEfromDaBX 25 Posted March 23, 2015 Author
Thanks, guys. Makes sense now. My regex tools usually ignore case automatically when matching. You guys rock.