ILoveUBot, posted August 19, 2015:
Is there a way to check whether an RSS feed has a new entry and, if it does, trigger a bot to run?
Code Docta (Nick C.), posted August 19, 2015:
Hi,
The best way is to do a HEAD request and check the ETag header. The server typically derives it from the content or its last-modified timestamp (often as a hash such as MD5), so if the value equals the one from your last check, nothing has changed. That is the proper way, as it is server friendly. You can do this with the HTTP plugin or UBot's native Python.
Otherwise, scrape the feed titles and compare the lists; if they are different, scrape. Or take an MD5 of the document text or the titles and compare the MD5s as above.
Do you have the HTTP plugin? There is an MD5 plugin, and you can use native UBot Python for MD5 too.
Regards,
CD
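The MD5 fallback described above can be sketched in plain Python 3. This is a minimal illustration, not the HTTP or MD5 plugin: the function names `feed_fingerprint` and `has_new_entries` and the two sample feeds are made up for the example, and it assumes a well-formed RSS 2.0 document whose entries are `<item>` elements with a `<title>` child.

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical sample feeds, used only to demonstrate the comparison.
OLD_FEED = """<rss version="2.0"><channel><title>Demo</title>
<item><title>First story</title></item>
</channel></rss>"""

NEW_FEED = """<rss version="2.0"><channel><title>Demo</title>
<item><title>Second story</title></item>
<item><title>First story</title></item>
</channel></rss>"""


def feed_fingerprint(rss_xml):
    """Return the MD5 hex digest of all item titles in an RSS 2.0 string."""
    root = ET.fromstring(rss_xml)
    # Only <item> titles are hashed, so the channel title is ignored.
    titles = [item.findtext("title", default="") for item in root.iter("item")]
    return hashlib.md5("\n".join(titles).encode("utf-8")).hexdigest()


def has_new_entries(previous_fingerprint, rss_xml):
    """True when the titles differ from the ones seen at the last check."""
    return feed_fingerprint(rss_xml) != previous_fingerprint
```

Store the fingerprint from one check and pass it to `has_new_entries` on the next; any added, removed, or renamed item changes the digest. Unlike the ETag check, this needs a full download of the feed each time.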
Code Docta (Nick C.), posted August 20, 2015:
Here is an ALPHA look at this in UBot Python using urllib2. The logic is there, but it is not production ready and has no error handling. I am stressing the ALPHA stage, but it does work.

run python("import urllib2
import time

#### Define the url
url = \'http://www.prweb.com/rss2/daily.xml\'

### Add your headers
headers = \{\'User-Agent\' : \'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0\'\}

## initial feed scrape
### Create the Request and set the headers for it
request = urllib2.Request(url, None, headers)
#request.get_method = lambda : \'HEAD\'

### navigating to url and getting the response
response = urllib2.urlopen(request)
first_Html = response.read()
resp_Header = response.info()
eTag_first = resp_Header[\'ETag\']
")
set(#first_Req_HTML,$run python with result("first_Html"),"Global")
alert($run python with result("eTag_first"))
set(#index,0,"Global")
loop while(#ubotLW) {
    increment(#index)
    set(#scrape,"I am waiting to check if I need to scrape.","Global")
    run python("time.sleep(60)
req2 = urllib2.Request(url, None, headers)
req2.get_method = lambda : \'HEAD\'
resp2 = urllib2.urlopen(req2)
resp_Header2 = resp2.info()
eTag_2 = resp_Header2[\'ETag\']
# a different ETag means the feed changed since the last check
if eTag_first != eTag_2:
    eTag_first = eTag_2
    scrape = \'true\'
else:
    scrape = \'false\'")
    if($run python with result("scrape")) {
        then {
            set(#scrape,"I scrape!!","Global")
            set(#rss html,$run python with result("req3 = urllib2.Request(url, None, headers)
resp3 = urllib2.urlopen(req3)
resp3_HTML = resp3.read()
resp3_HTML"),"Global")
            load html($run python with result("resp3_HTML"))
        }
        else {
            set(#scrape,"I don\'t need to scrape!!","Global")
        }
    }
}
ui check box("Ubot Loop While",#ubotLW)
ui stat monitor(#scrape,"<br>I am on loop {#index}")

I would do it differently in Python 3.4. I'll do a tutorial some day, lol.
CD
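For anyone on Python 3, the same HEAD and ETag polling idea might look roughly like the sketch below. It is not CD's Python 3.4 version, just one possible shape using the standard `urllib.request` module. The names `fetch_etag`, `etag_changed`, and `watch`, plus the `opener` and `fetch` parameters (there only so the network call can be swapped out), are assumptions made for this example. It also assumes the server sends an ETag header at all; if it does not, the sketch treats every check as "changed".

```python
import time
import urllib.request

# URL and User-Agent taken from the post above.
FEED_URL = "http://www.prweb.com/rss2/daily.xml"
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) "
                  "Gecko/20100101 Firefox/40.0"
}


def fetch_etag(url, headers=HEADERS, opener=urllib.request.urlopen):
    """Send a HEAD request and return the ETag header, or None if absent."""
    request = urllib.request.Request(url, headers=headers, method="HEAD")
    with opener(request, timeout=30) as response:
        return response.headers.get("ETag")


def etag_changed(previous, current):
    """True when the feed should be scraped again.

    A missing ETag counts as changed, because without it there is
    no proof that the feed stayed the same.
    """
    if previous is None or current is None:
        return True
    return previous != current


def watch(url, on_change, interval=60, max_checks=None, fetch=fetch_etag):
    """Poll the feed and call on_change(url) whenever the ETag differs."""
    last = fetch(url)
    checks = 0
    while max_checks is None or checks < max_checks:
        time.sleep(interval)
        current = fetch(url)
        if etag_changed(last, current):
            on_change(url)
            last = current  # remember the new value for the next round
        checks += 1
```

Calling `watch(FEED_URL, my_scraper)` with a function of your own would check once a minute until interrupted. There is still no error handling here, so a timeout or HTTP error will stop the loop.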
ILoveUBot, posted September 7, 2015:
I've just noticed your replies. Thanks a lot. I'll give this a go soon.