UBot Underground

What's the regex code for capturing text between two words?



I'm trying to scrape text that appears between <address> and </address>, but when I use <address>.*</address> it finds the text and duplicates it onto another line below. I tested it with another tag, "</a>", and it does the same thing. It seems to duplicate only when I use tags. Is it the angle brackets that throw it off?

 

Thanks in advance.

Edited by daveconor

It's probably displayed twice in the text that you are scraping. You can run the regex, add each scraped item to a list, and have the list delete the duplicates, so that you are left with one unique item.
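For illustration only, here is a minimal sketch of that list-and-dedupe idea in Python rather than UBot script. The sample page text and the variable names are made up, and the pattern is just one reasonable choice for pulling out the text between the tags.

```python
import re

# Hypothetical page text in which the same address block appears twice.
page = (
    "<address>1 Main St</address>"
    "<p>Contact</p>"
    "<address>1 Main St</address>"
    "<address>2 Oak Ave</address>"
)

# Collect the text between every <address>...</address> pair.
matches = re.findall(r"<address>(.*?)</address>", page, flags=re.DOTALL)

# dict.fromkeys keeps the first occurrence of each item and preserves order,
# so converting back to a list drops the duplicates.
unique = list(dict.fromkeys(matches))

print(unique)
```

Whether the duplicate really comes from the page itself is worth checking first; if it does, removing duplicates after scraping is a simple way to handle it.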


Also use .*? (non-greedy) instead of .*, because the * operator is greedy: currently the pattern matches from the first <address> to the last </address> in the string.

 

That said, the greedy version might work well if you always have only 1 address in the input string.
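To make the difference concrete, here is a small Python sketch (not UBot script) comparing the two patterns on a made-up input with two addresses. The sample string and variable names are assumptions for the example.

```python
import re

# Hypothetical input containing two separate address blocks.
html = "<address>1 Main St</address><p>x</p><address>2 Oak Ave</address>"

# Greedy: .* runs as far as it can, so one match spans from the
# first <address> all the way to the last </address>.
greedy = re.findall(r"<address>.*</address>", html, flags=re.DOTALL)

# Non-greedy: .*? stops at the nearest </address>, giving one match
# per block; the capture group returns only the text between the tags.
lazy = re.findall(r"<address>(.*?)</address>", html, flags=re.DOTALL)

print(len(greedy))  # 1
print(lazy)         # ['1 Main St', '2 Oak Ave']
```

With only one address in the input, both patterns cover the same span, which matches the point above about the greedy version being fine in that case.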
