Pizza Pro, posted August 16, 2013:

Is there any way to scrape text or URLs or other stuff that's generated with JavaScript? When viewing the source for the page, I can't see any HTML related to what's generated with JavaScript... Thanks for the help in advance!
iDollarsteam, posted August 16, 2013:

No, as far as I know you can't get anything generated with JavaScript from the raw page source. For that you need to load the page in the browser so the JavaScript code can be executed; then the parameters generated by that JavaScript code can be scraped. For example, the JavaScript parameters generated on youtube pages...
Pizza Pro, posted August 17, 2013:

Thanks for the info, iDollarsteam. I see. So what can you do once you scrape the JavaScript code? Is there any way to get anything out of it?
iDollarsteam, posted August 17, 2013:

Usually no. Many of the JavaScript functions are not on the page itself; they are only called from the page, so you can't execute them in ubot. If you know exactly how those functions work, I think you can write your own JavaScript and have ubot run the scripts scraped from the page to get the parameters you need.

Those scripts usually also drop cookies on your computer, so it will not be easy. The JavaScript can implement other security measures as well. On youtube, for example, when a video page loads, the comment form is loaded after a delay of 1 second (by JavaScript on the page), so fetching the page with a plain HTTP request is useless: you get the video but not the commenting part or the parameters required for commenting. There are solutions for this too.

Anyway, the bottom line is: it is almost impossible to obtain the JavaScript parameters from a page unless the entire script is on that page and is not obfuscated. Google sites were the first to use these techniques for protection, but now more and more big websites use this trick. You need to find their "version" for non-JavaScript browsers and use that. Since JavaScript is more and more a requirement for all browsers, I think that in 1 year at most there will be no "backdoor" to this system.
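To illustrate the one case mentioned above where scraping can work (the whole script sits on the page and is not obfuscated), here is a minimal Python sketch. It assumes the page assigns a plain JSON object to a variable inside an inline script tag. The sample HTML, the variable name `playerConfig`, and its fields are all made up for illustration and do not come from any real site; the helper `extract_inline_json` is likewise hypothetical.

```python
import json
import re

# Hypothetical page source: `playerConfig` and its fields are invented
# for this example and are not taken from any real website.
SAMPLE_HTML = """
<html><body>
<script>
  var playerConfig = {"video_id": "abc123", "token": "t0k3n", "delay_ms": 1000};
</script>
</body></html>
"""


def extract_inline_json(html, var_name):
    """Return the JSON object assigned to `var_name` in an inline script.

    Returns None if the assignment is missing or the value is not valid
    JSON (for example, because the script is obfuscated or computes the
    value at runtime).
    """
    # Match `var <name> = { ... };` and capture the braces and their content.
    pattern = re.compile(
        r"\bvar\s+" + re.escape(var_name) + r"\s*=\s*(\{.*?\})\s*;",
        re.DOTALL,
    )
    match = pattern.search(html)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None


if __name__ == "__main__":
    config = extract_inline_json(SAMPLE_HTML, "playerConfig")
    print(config)
```

This only handles values written literally in the page source as valid JSON, and the regular expression can be fooled by a string value that itself contains `};`. Anything the script builds while running, loads from another file, or adds after a delay will not show up this way, which is the limitation described in the post above.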
Pizza Pro, posted August 17, 2013:

I see... That's tough. I hope there will be ways around it in the future...