whoami, posted March 4, 2014:

Hello. Let's suppose I have this $list or #variable:

//www.domain.com/folder/anything1
//www.domain.com/folder/anything2
//www.domain.com/folder/anything3
//www.domain.com/folder/anything4
//www.domain.com/folder/anything5/something-different/something-else
http://nothing.com/regular

I need to scrape only the links like //www.domain.com/folder/ and remove the other links. What would be the easiest way to achieve this?

Thanks guys!
Bot-Factory, posted March 4, 2014:

Hi. You can do that with regex, but it depends on how the links are structured. If the ones you want to keep always start with //www, you could use:

set(#tmp1, "//www.domain.com/folder/anything1
//www.domain.com/folder/anything2
//www.domain.com/folder/anything3
//www.domain.com/folder/anything4
//www.domain.com/folder/anything5/something-different/something-else
http://nothing.com/regular", "Global")
set(#tmp2, $find regular expression(#tmp1, "//www.+"), "Global")

If the matching part is different, you need to modify the regex, of course.

Cheers
Dan
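For readers who want to try the pattern outside UBot, here is a minimal Python sketch of the same idea. It is not UBot code: Python's re module stands in for UBot's $find regular expression, and the one-link-per-line layout of the input is an assumption. The stricter pattern named strict_pattern is also only an illustration, not something from the thread.

```python
import re

# Sample input from the question; one link per line is an assumption.
links = """//www.domain.com/folder/anything1
//www.domain.com/folder/anything2
//www.domain.com/folder/anything3
//www.domain.com/folder/anything4
//www.domain.com/folder/anything5/something-different/something-else
http://nothing.com/regular"""

# The pattern from the answer: "//www" followed by one or more characters.
# "." does not match a newline, so each match ends at the end of its line.
matches = re.findall(r"//www.+", links)

# Hypothetical stricter variant: escape the literal dots and pin the folder,
# so that only //www.domain.com/folder/... links can match.
strict_pattern = r"//www\.domain\.com/folder/\S+"
strict_matches = re.findall(strict_pattern, links)

for link in matches:
    print(link)
```

With this input both patterns select the same lines. The stricter one only matters if other hosts starting with //www could appear in the list.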
whoami, posted March 5, 2014:

Thanks a lot! So from now on, + is what I use to extend a regex. I could also have used something like "+//www.+", right?
Bot-Factory, posted March 5, 2014:

The . means any character. The + just means "one or more of" whatever comes right before it.

If you are new to regex, I highly recommend checking out:
http://www.ubotstudio.com/forum/index.php?/topic/15905-sell-learn-regular-expressions-video-course-2-hours-of-content/

Dan
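To make the difference between . and + concrete, here is a small Python sketch (again Python's re module, not UBot, so treat engine-specific behavior as an assumption). It also tries the "+//www.+" pattern from the question: in Python a leading + has nothing in front of it to repeat, so the pattern is rejected. Other regex engines may report this differently.

```python
import re

text = "//www.domain.com/folder/anything1"

# "." alone matches exactly one character (anything except a newline).
one_char = re.search(r"//www.", text).group()

# ".+" matches one or more characters, so it runs to the end of the line.
rest_of_line = re.search(r"//www.+", text).group()

# A leading "+" has no preceding item to quantify; Python raises re.error.
try:
    re.compile(r"+//www.+")
    leading_plus_ok = True
except re.error:
    leading_plus_ok = False

print(one_char)
print(rest_of_line)
print(leading_plus_ok)
```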