lars, posted February 14, 2010: I'm trying to scrape email addresses from various sites (NOT for spamming ;-) ). The problem is that almost all emails are in different formats. Some use the simple name@domain.com format, and some use variants such as "name at domain.com" or "name [at] domain dot com". I'm having a hard time matching even a normal email address (name@domain.com). I tried selecting by the innertext attribute with the wildcard pattern *@*.*, but since this doesn't work the way I hoped, I'm sure I'm missing something. Thanks in advance.
Aaron Nimocks, posted February 14, 2010: Getting a normal one is pretty easy. Takes an extra step though. Just scrape the page and then run it through an online extractor, then scrape the results.
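For anyone who would rather do the extraction step locally than through an online extractor, here is a minimal sketch in Python. It assumes the page text has already been scraped into a string. `EMAIL_RE` and `extract_emails` are hypothetical names made up for this example, and the pattern is deliberately simplified, so it will not accept every valid address.

```python
import re

# Simplified pattern for plain addresses (an assumption for illustration,
# not a full implementation of the email address grammar).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Return the unique plain addresses in text, in order of first appearance."""
    seen = []
    for match in EMAIL_RE.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


sample = "Contact name@domain.com or sales@shop.example.org. Again: name@domain.com"
print(extract_emails(sample))
```

This only handles the plain name@domain.com form; spelled-out variants such as "name at domain dot com" need extra rules.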
lars (author), posted February 14, 2010: Thanks a bunch! But the problem in cases like "name at domain dot com" remains. I think the main problem lies in using the wildcards.
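One possible way to handle the spelled-out cases is a regular expression that accepts "at" and "dot" as alternative separators, then normalizes each match. The sketch below is in Python and only covers the variants mentioned in this thread, plus bracketed and parenthesized forms. `OBFUSCATED_RE`, `DOT_RE` and `find_emails` are hypothetical names, and the separator rules are assumptions rather than a complete list.

```python
import re

# Separator alternatives: a literal symbol, a bracketed word, or a bare word
# surrounded by whitespace. These rules are assumptions for illustration.
AT = r"(?:@|\s*[\[(]\s*at\s*[\])]\s*|\s+at\s+)"
DOT = r"(?:\.|\s*[\[(]\s*dot\s*[\])]\s*|\s+dot\s+)"
LABEL = r"[A-Za-z0-9-]+"

# local part, an "at" separator, then two or more domain labels joined by "dot" separators
OBFUSCATED_RE = re.compile(
    rf"(?P<local>[A-Za-z0-9._%+-]+){AT}(?P<domain>{LABEL}(?:{DOT}{LABEL})+)",
    re.IGNORECASE,
)
DOT_RE = re.compile(DOT, re.IGNORECASE)


def find_emails(text):
    """Return addresses found in text, normalized to the name@domain.com form."""
    results = []
    for m in OBFUSCATED_RE.finditer(text):
        # Rewrite every "dot" separator inside the domain as a literal "."
        domain = DOT_RE.sub(".", m.group("domain"))
        results.append(f"{m.group('local')}@{domain}")
    return results


for candidate in ("name@domain.com", "name at domain.com", "name [at] domain dot com"):
    print(find_emails(candidate))
```

Because a bare "at" is also an ordinary English word, this can produce false positives: "look at example dot com" would come back as look@example.com. Results from the bare-word forms are worth checking by hand.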