mdc101, posted June 26, 2011: Is it possible to scrape PDF content via the browser? This would be a great feature.
LoWrIdErTJ - BotGuru, posted June 26, 2011: PDFs are downloadable from IE, if I am correct. Have you tried selecting any text, or using scrape page, while viewing one?
JohnB, posted June 26, 2011: This should be a good indicator of why it's not possible: http://screencast.com/t/BiAuogPDB5U
John
Guerrilla, posted June 29, 2011: There is a programming library I came across once (can't remember the name) that let you feed in a PDF and would export it as HTML. I also remember finding a few websites where you could upload a PDF and get an HTML version of it. This would be the route to go down to scrape the text via UBot.
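As a rough illustration of the second half of that route: once some converter (not named in this thread) has turned the PDF into HTML, pulling the visible text out is straightforward. The sketch below uses only Python's standard `html.parser`; the names `TextExtractor` and `html_to_text` are made up for this example, and the sample HTML is an assumption about what a converter might produce, not real output from any tool.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents.

    Hypothetical helper for illustration; not part of UBot or any converter.
    """

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []      # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html):
    """Return the visible text of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


if __name__ == "__main__":
    # Assumed shape of converter output; real converters will differ.
    sample = (
        "<html><head><style>p{color:red}</style></head>"
        "<body><p>Page 1</p><p>Hello <b>PDF</b> world</p></body></html>"
    )
    print(html_to_text(sample))
```

Converted PDFs often contain heavy positioning markup, so the extracted fragments may need to be rejoined or reordered before they read like the original page.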
LoWrIdErTJ - BotGuru, posted June 29, 2011: It is possible to render the PDF as plain text using C++ and C#, but we will see whether this becomes an option or not. Good idea on downloading the PDF and uploading it to a site to convert it, though. +1
mdc101 (author), posted December 1, 2011: Thanks for the feedback, guys.