Kev 69 Posted March 14, 2013
Hi All, I'm trying to regex the results from this page: https://www.google.com/search?q=site:cnn.com&num=100&&start=000
I only want the result URLs, not the Google webcache links or anything else to do with Google. Here's the regex I'm using:
(http|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?
Unfortunately, this matches every URL on the page, yet it still doesn't give me the correct ones. I need the URLs as they appear on the results page. Thanks in advance!
Aymen 385 Posted March 14, 2013
First, are you using the HTTP Post plugin? If so, I have a link scraper example. Have you taken a look at it?
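
For reference, here is a minimal Python sketch of the filtering asked about in the first post. It is an illustration under assumptions, not the link scraper example mentioned above: it reuses the regex from the question (with the groups made non-capturing, since Python's `re.findall` would otherwise return the groups rather than the whole match) and then drops any match whose host belongs to Google. The names `extract_result_urls`, `is_blocked`, and the `BLOCKED_SUFFIXES` list are hypothetical, and the list may need more entries for a real results page.

```python
import re
from urllib.parse import urlparse

# The pattern from the question, with non-capturing groups so that
# findall() returns each full URL instead of a tuple of groups.
URL_RE = re.compile(
    r"https?://[\w\-]+(?:\.[\w\-]+)+(?:[\w\-.,@?^=%&:/~+#]*[\w\-@?^=%&/~+#])?"
)

# Assumption: hosts treated as "anything to do with Google".
# Extend this tuple if other Google-owned hosts show up in the page.
BLOCKED_SUFFIXES = ("google.com", "googleusercontent.com", "gstatic.com")


def is_blocked(host):
    """Return True if host is a blocked domain or one of its subdomains."""
    host = host.lower()
    return any(host == s or host.endswith("." + s) for s in BLOCKED_SUFFIXES)


def extract_result_urls(html):
    """Return non-Google URLs found in html, in order, without duplicates."""
    seen = set()
    urls = []
    for url in URL_RE.findall(html):
        host = urlparse(url).hostname or ""
        if is_blocked(host) or url in seen:
            continue
        seen.add(url)
        urls.append(url)
    return urls
```

Matching every URL first and filtering by host afterwards keeps the regex unchanged and puts the "not Google" rule in one easy-to-edit list, rather than trying to encode the exclusion inside the pattern itself.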