Computer Gecko (posted November 16, 2015)
Hi all, I am trying to scrape this URL: http://www.4sgm.com/is-bin/INTERSHOP.enfinity/WFS/4sgm-Storefront-Site/en_US/-/USD/ViewParametricSearch-RSS;pgid=P2mm3vcqkHaR00Ov2sHlGKz50000J7Lx-OUs?SearchCategoryUUID=xRbAwGQT.4sAAAELSNE0E4U1&rsstitle=Artificial+Plant%2FBasket However, when UBot navigates to that URL, it strips out much of the markup that I would use to parse the feed (screenshot: http://screencast.com/t/E9Ac9ycoXff). Any ideas?
MiriamMB (posted November 16, 2015)
What will work is download file, since you are dealing with an XML file and not HTML. Give it a try:
navigate("http://www.npr.org/rss/rss.php?id=1057","Wait")
wait(3)
download file("http://www.npr.org/rss/rss.php?id=1057","C:\\Users\\Desktop\\Documents\\doc text.txt")
Computer Gecko (posted November 16, 2015)
Hi Miriam! Thanks for the insights! Sorry for the ignorance, but how do I parse it now?
pash (posted November 17, 2015)
An RSS feed is XML, so parse it with the XML plugin or with regex.
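As a rough illustration of the XML-parsing route, here is a sketch in plain Python using only the standard library. It is not the UBot XML plugin, whose commands are not shown in this thread. The sample feed text and the function name parse_rss_items are made up for the example, and a real feed may use different or additional tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for the contents of the downloaded feed file.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First product</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second product</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return a list of (title, link) tuples, one per <item> element."""
    root = ET.fromstring(xml_text)
    items = []
    # iter("item") walks the whole tree, so the channel-level <title> is skipped.
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        items.append((title, link))
    return items


if __name__ == "__main__":
    for title, link in parse_rss_items(SAMPLE_RSS):
        print(title, link)
```

The point is the same whichever tool does the parsing: treat the saved file as XML and read each item's child tags, rather than scraping it as a rendered page.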
Computer Gecko (posted November 17, 2015)
Thanks, Pash! Where do I get the XML Plugin?
BoosterBots (posted November 17, 2015)
Yeah, you can use some simple regex to pull the information you need.
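A sketch of what "simple regex" could look like, written in plain Python rather than UBot and assuming each entry's title sits in a <title>...</title> pair inside an <item> block. The patterns, the sample text, and the name extract_item_titles are illustrative only. Regex tends to break on feeds that use CDATA sections or tag attributes, so treat this as a quick option for well-behaved feeds.

```python
import re

# Hypothetical sample standing in for the downloaded feed text.
SAMPLE_RSS = (
    "<rss><channel><title>Example feed</title>"
    "<item><title>First product</title><link>http://example.com/1</link></item>"
    "<item><title>Second product</title><link>http://example.com/2</link></item>"
    "</channel></rss>"
)

# Non-greedy matches so each pattern stops at the nearest closing tag.
ITEM_RE = re.compile(r"<item>(.*?)</item>", re.DOTALL)
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.DOTALL)


def extract_item_titles(xml_text):
    """Return the title text of every <item> block, in document order."""
    titles = []
    for block in ITEM_RE.findall(xml_text):
        match = TITLE_RE.search(block)
        if match:
            titles.append(match.group(1).strip())
    return titles


if __name__ == "__main__":
    print(extract_item_titles(SAMPLE_RSS))
```

Matching the <item> block first and then the title inside it keeps the feed-level title out of the results.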
pash (posted November 17, 2015)
Quoting Computer Gecko: "Thanks, Pash! Where do I get the XML Plugin?"
http://network.ubotstudio.com/forum/index.php/topic/13088-ubot-xml-plugin-ubot-discount/