UBot Underground

Replace Text In String


I'm scraping a website, and the email addresses I scrape have 'mailto:' at the start. I'd like to remove 'mailto:' so I'm left with just the address. The email addresses are currently stored in a list.

Here's the code I have, but it's not doing anything. I can't quite figure it out, because I can't use replace with the if statement. Any suggestions?

define Clean Business Data {
    loop($list total(%email_address)) {
        set(#email_row,$next list item(%email_address),"Global")
        set(#email_row_clean,$replace(#email_row,"mailto:",""),"Global")
    }
}

Hello,

Try something like this:

clear list(%a)
clear list(%clean)
add list to list(%a,$list from text("mail to: lkfjlsdkjf lskjlfj s ljf alkjdf
mail to: what evva
mail to: mail sub
mail to: mail body
mail to: some text",$new line),"Delete","Global")
add list to list(%clean,$list from text($replace(%a,"mail to: ",$nothing),$new line),"Delete","Global")
alert(%clean)

HTHelps,

CD

That let me remove mailto:, but it's also removing the blank values from my first list.

Example:

%email_address has 118 values, some of which are $nothing because there is no email for that entry.

%email_address_clean then has only 44 items; every line containing $nothing has been removed.

Is it possible to use replace but keep the $nothing values intact? I should then have 118 items in my list, even though only 44 of them are email addresses.

Here is the code I'm using to remove mailto:

define Clean Business Data {
    loop($list total(%email_address)) {
        add list to list(%email_address_clean,$list from text($replace(%email_address,"mailto:",$nothing),$new line),"Delete","Global")
    }
}

Alright, so I changed the list setting so it doesn't delete duplicates. However, everything is now being duplicated, a lot. From fewer than 52 items, the following code returned 1144 emails (many are duplicates, and the blank lines still aren't there).

define Clean Business Data {
    clear list(%email_address_clean)
    loop($list total(%email_address)) {
        add list to list(%email_address_clean,$list from text($replace(%email_address,"mailto:",$nothing),$new line),"Don\'t Delete","Global")
    }
}

You are looping once for every item in %email_address, and on each pass you add the entire cleaned list to %email_address_clean, so the whole list gets appended many times over. Drop the loop and try this instead:

define Clean Business Data {
    clear list(%email_address_clean)
    add list to list(%email_address_clean,$list from text($replace(%email_address,"mailto:",$nothing),$new line),"Don\'t Delete","Global")
}
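
To make the difference concrete, here is a small sketch of the same logic in Python rather than UBot script. It is only an illustration: the function names and sample addresses are made up, and it does not show how UBot's own $list from text treats blank lines. It compares one pass over the list with the looped version that re-adds the whole list on every pass.

```python
def clean_once(emails):
    # One pass: strip the prefix from each entry, leaving blank entries in place.
    return [e.replace("mailto:", "") for e in emails]


def clean_in_loop(emails):
    # Mirrors the looped version: every pass appends the whole cleaned list again.
    cleaned = []
    for _ in range(len(emails)):
        cleaned.extend(clean_once(emails))
    return cleaned


# Hypothetical sample data: two addresses and one blank entry.
emails = ["mailto:a@example.com", "", "mailto:b@example.com"]

print(clean_once(emails))          # ['a@example.com', '', 'b@example.com']
print(len(clean_in_loop(emails)))  # 9
```

With three items, the looped version yields nine entries, which is the same kind of blow-up described above, while the single pass keeps the item count (blank entry included) unchanged.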