UBot Underground

Issue scraping emails + names



Hey guys, I'm trying to scrape emails + names in this format:

testemail@gmail.com , Brown

However, it seems that I can only scrape:

mailto:testemail@gmail.com

That's not a big problem, since I can easily remove every "mailto:" with a text editor. However, I also can't scrape the name. I don't know what is going wrong, but I always get a blank document.

Here is a screenshot of the code where I get the error:

http://i.imgur.com/3E8qx1N.png

If anyone needs the website that I'm scraping, I'll send you the link via PM.

Thank you so much!


Hi,

You could use $replace to remove "mailto:" from the email value as it is saved to the table.

For the Name value, "attribute to scrape" should be "outertext", not "name".

Kevin
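
For readers following along outside UBot, here is a minimal Python sketch of the two suggestions above: strip the "mailto:" prefix from the link's href, and take the name from the link's text rather than from a "name" attribute. It is not UBot code. The sample HTML, the `MailtoScraper` class, and the `scrape_contacts` function are hypothetical, and the real page may be structured differently.

```python
from html.parser import HTMLParser


class MailtoScraper(HTMLParser):
    """Collect (email, name) pairs from <a href="mailto:..."> links."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished (email, name) pairs
        self._email = None   # email of the link currently being read
        self._text = []      # text fragments found inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.lower().startswith("mailto:"):
                # Same effect as replacing "mailto:" with an empty string.
                # Anything after "?" (e.g. a subject) is dropped as well.
                self._email = href[len("mailto:"):].split("?")[0].strip()
                self._text = []

    def handle_data(self, data):
        # The visible link text plays the role of the scraped name.
        if self._email is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._email is not None:
            name = "".join(self._text).strip()
            self.rows.append((self._email, name))
            self._email = None


def scrape_contacts(html):
    """Return a list of (email, name) tuples found in the HTML string."""
    parser = MailtoScraper()
    parser.feed(html)
    return parser.rows


# Hypothetical markup, assumed to resemble the page being scraped.
sample = '<p><a href="mailto:testemail@gmail.com">Brown</a></p>'
for email, name in scrape_contacts(sample):
    print(f"{email} , {name}")
```

With the assumed sample markup this prints one line in the format requested in the first post: `testemail@gmail.com , Brown`.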



I'm trying the $replace now. For the name, the attribute turned out to be "innertext".

Anyway, I don't understand why it works with a single link, but when I try multiple links it doesn't save any data (or at least the .txt is blank).


Please let me know if the code that I sent you helped.

To anyone else looking to grab emails from a page, this regex works great:

[\\.\\-_A-Za-z0-9]+?@[\\.\\-A-Za-z0-9]+?[\\.A-Za-z0-9]\{2,\}
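
To try the same pattern outside UBot, a rough Python equivalent might look like the sketch below. It assumes the doubled backslashes and escaped braces above are UBot string escaping, so they are written in plain Python regex syntax here. The sample text and the `find_emails` helper are made up for illustration.

```python
import re

# Python spelling of the pattern above (assumed equivalent, see the note
# about UBot escaping): local part, "@", then a domain of letters, digits,
# dots, and hyphens ending in at least two letters, digits, or dots.
EMAIL_RE = re.compile(r"[.\-_A-Za-z0-9]+?@[.\-A-Za-z0-9]+?[.A-Za-z0-9]{2,}")


def find_emails(text):
    """Return every substring of text that matches EMAIL_RE, in order."""
    return EMAIL_RE.findall(text)


# Hypothetical page text used only to exercise the pattern.
page_text = "Contact testemail@gmail.com, or sales@my-site.org today"
print(find_emails(page_text))
```

One caveat: because the last character class allows a dot, an address followed directly by a sentence-ending period is matched with that period included, so some trimming may be needed on real pages.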
