mwickett, posted March 6, 2014

Can anyone share some advice for scraping obfuscated email addresses? I've run into this:

<tr>
<td bgcolor="dddddd" align="left" valign="top"><b>Room</b></td>
<td bgcolor="#eeeeee">203</td>
<td bgcolor="dddddd"><b>Email</b></td>
<td bgcolor="#eeeeee"><a href="mailto:abelsonj@mcmaster.ca">abelsonj@mcmaster.ca</a></td>
</tr>
<tr>

and this:

<tr>
<td align=RIGHT><b><i>E-mail Address</i></b></td><td><script language='JavaScript' type='text/javascript'>
var data=new Array(133,137,129,132,156,135,210,
140,137,158,129,140,198,155,133,129,156,128,168,157,167,156,156,137,
159,137,198,139,137,232);
var idx=0, n=data[data.length-1];
document.write('<a href=');
while( data[idx]!=n ) { document.write(''+(data[idx++]^n)+';');}
document.write('>');
idx= 7;
while( data[idx]!=n ) { document.write(''+(data[idx++]^n)+';');}
document.write('</a>');
</script>
</td></tr>

Any way to piece these back together? Appreciate any and all insight.

UBotDev, posted March 6, 2014

Google knows everything; you just need to enter it: http://bit.ly/1hNDhQN

Joking aside, these are called HTML entities, and you need to decode the string into a human-readable format. You can decode it with JavaScript, or I think there is even a plugin that will help you do that. Otherwise, you can test it here, for example: http://htmlentities.net/

Anyway, you can see that you actually have this HTML there (actual email is removed):

mailto:****@*****.*">*****@*****.*

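To make the decoding step concrete, here is a minimal JavaScript sketch of both pieces. It rests on a few assumptions: the script in the question seems to XOR each array element with the last element (which doubles as key and terminator) and to emit each result as a numeric HTML entity, although the `&#` prefix does not appear in the snippet as posted. The function names `decodeXorArray` and `decodeNumericEntities` are hypothetical and not part of UBot Studio or any plugin, and the sample data is made up rather than taken from the page.

```javascript
// Sketch only: the function names below are hypothetical helpers.

// Reverses the XOR scheme from the question: the last array element is both
// the key and the terminator, and every earlier element is charCode ^ key.
// This goes straight to characters, skipping the intermediate entity step.
function decodeXorArray(data) {
  const key = data[data.length - 1];
  let out = "";
  for (let i = 0; i < data.length && data[i] !== key; i++) {
    out += String.fromCharCode(data[i] ^ key);
  }
  return out;
}

// Decodes decimal (&#64;) and hexadecimal (&#x40;) numeric character
// references. Named entities such as &amp; are left untouched.
function decodeNumericEntities(text) {
  return text.replace(/&#(x[0-9a-f]+|[0-9]+);/gi, (match, code) => {
    const isHex = code[0] === "x" || code[0] === "X";
    const codePoint = parseInt(isHex ? code.slice(1) : code, isHex ? 16 : 10);
    // Leave out-of-range references as they are instead of throwing.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

// Made-up sample (key 42), built the same way the page's array appears to be.
const sample = [..."mailto:user@example.com"]
  .map((c) => c.charCodeAt(0) ^ 42)
  .concat(42);

const decoded = decodeXorArray(sample); // "mailto:user@example.com"
const address = decoded.slice("mailto:".length); // "user@example.com"
const fromEntities = decodeNumericEntities(
  "&#117;&#115;&#101;&#114;&#64;example.com"
); // "user@example.com"
```

In a scraper, the array literal would first have to be pulled out of the page source (for example with a regular expression on `new Array(...)`) and then passed to `decodeXorArray`. The entity decoder covers the simpler case where the address is only entity-encoded in the `mailto:` link.
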
mwickett, posted March 6, 2014

Cheers, thanks for the pointer.

Aymen, posted March 6, 2014

There are plenty of functions you can try here: http://www.ubotstudio.com/forum/index.php?/topic/16039-free-string-management-plugin/