BigEfromDaBX, posted October 14, 2015:

I'm trying to create a bot that scrapes the hrefs from a page and then uses a regular expression to grab any href that contains the word "contact". I built the regex in RegexBuddy, which is really good, but the regex doesn't work in UBot.

clear all data
navigate("http://keywestwatertours.com/","Wait")
wait for browser event("Everything Loaded","")
set(#links,$page scrape("<a href=\"","\">"),"Global")
set(#links2,$find regular expression(#links,"(^[-a-zA-Z0-9]*contact\\.[a-zA-Z0-9]*)|(^[-a-zA-Z0-9]*contact[-a-zA-Z0-9]*.[a-zA-Z0-9]*)"),"Global")

The regex I created should grab anything like:

contact.html
yesimacontactpage.php
contact-us.asp
etc.

See snapshot. Thanks for all your help, as always.
Bot-Factory, posted October 14, 2015:

Hi. This site only gives:

contact.html

Your regex could look like:

set(#links2,$find regular expression(#links,".*contact.*"),"Global")

And if you want the full URLs and not just the short ones, you can use:

set(#links,$scrape attribute(<tagname="a">,"fullhref"),"Global")

Cheers
Dan
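For readers who want to try the looser pattern outside UBot, here is a minimal Python sketch of the same idea: keep every scraped href that contains "contact". The sample list and the helper name `contact_links` are hypothetical; only contact.html is confirmed by the thread as coming from the site, and Python's `re` module is not necessarily the engine UBot uses.

```python
import re

# Hypothetical sample of scraped href values. Only "contact.html" is
# confirmed by the thread; the others mirror the poster's test cases.
hrefs = [
    "index.html",
    "tours.html",
    "contact.html",
    "yesimacontactpage.php",
    "contact-us.asp",
    "about.html",
]

# Same pattern Dan suggested: anything, then "contact", then anything.
pattern = re.compile(r".*contact.*")


def contact_links(links):
    """Return the links whose whole text matches the pattern."""
    return [link for link in links if pattern.fullmatch(link)]


matches = contact_links(hrefs)
print(matches)
```

Because each href is tested on its own, no `^` anchor is needed, which sidesteps any question about how a given engine treats line starts.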
BigEfromDaBX (author), posted October 14, 2015:

Thanks Dan. That worked great. Your way is a lot cleaner and easier. I only added those extra pages to my RegexBuddy test to illustrate how my regex should work. I should have mentioned in the post that the page would only return one contact link. My only question is: why didn't the regex I created with RegexBuddy work in UBot?
Bot-Factory, posted October 14, 2015:

Hi. There are a lot of different regex engines out there. Depending on which one RegexBuddy uses and which one UBot uses, the feature sets might differ. If you take a look at:

https://en.wikipedia.org/wiki/Comparison_of_regular_expression_engines

you will see that there are really a LOT of differences.

Dan
BigEfromDaBX (author), posted October 14, 2015:

RegexBuddy gives me the option to switch to .NET, Java, Perl, etc. Is that the engine you're talking about?
Bot-Factory, posted October 14, 2015:

Yes, that's what I was referring to. But I can't tell you which one UBot is actually using. It's something you can experiment with.

Dan
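One behaviour worth checking while experimenting is how the `^` anchor is handled, since the original pattern starts each alternative with it. The sketch below uses Python's `re` module purely for illustration. It assumes, hypothetically, that the scraped list reaches the regex as a single newline-separated string; the thread does not confirm this, nor which engine or default options UBot uses. The pattern is a simplified form of the original with the dot escaped.

```python
import re

# Hypothetical: the scraped hrefs joined into one newline-separated string.
text = "index.html\ntours.html\ncontact.html\ncontact-us.asp"

# Simplified form of the original pattern, anchored with ^.
pattern = r"^[-a-zA-Z0-9]*contact[-a-zA-Z0-9]*\.[a-zA-Z0-9]*"

# Default mode: ^ matches only at the very start of the whole string,
# so nothing after the first line can match.
single_line = re.findall(pattern, text)

# Multiline mode: ^ also matches right after each newline.
multi_line = re.findall(pattern, text, flags=re.MULTILINE)

print(single_line)
print(multi_line)
```

If the same pattern matches in a tester that checks one line at a time but not against a multi-line input, the anchor or the multiline option is a reasonable first thing to look at.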
BigEfromDaBX (author), posted October 15, 2015:

Thanks Dan.