UBot Underground

How To Use A Regular Expression To "capture/parse" The String Between 2 Strings?



Hello,

 

My friend and I are working on a project, and we need to know how to capture/parse a string between two other strings.

 

<div class="phone">+4921661305247</div>

 

We would like to capture/parse whatever value sits between the opening and closing tags (here +4921661305247), no matter what it is.

If the value is present multiple times in the HTML source, only the first occurrence should be captured/parsed.

 

Any help would be greatly appreciated!

Edited by microsoft

There is no need for a regular expression.

 

If you just have strings exactly like the one above and are not scraping a web page, then there is a function called strip tags that will return the inner text of the HTML tags.
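For readers outside UBot, the idea behind a strip-tags function can be sketched in a few lines of Python. This is only an illustration of the technique, not UBot's implementation; the helper name strip_tags is made up for the example, and the simple tag pattern assumes well-formed markup without ">" inside attribute values.

```python
import re
from html import unescape


def strip_tags(fragment: str) -> str:
    """Remove anything that looks like an HTML tag and decode entities."""
    # <[^>]+> matches one tag: '<', a run of non-'>' characters, then '>'.
    text = re.sub(r"<[^>]+>", "", fragment)
    return unescape(text).strip()


print(strip_tags('<div class="phone">+4921661305247</div>'))
```

This is good enough for a single known fragment like the one in the question; for a whole page, scraping the element directly is usually more robust.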

 

Otherwise, load the HTML and scrape the inner text of the first matching element (offset 0):

load html("<div class=\"phone\">+4921661305247</div>
<div class=\"phone\">+4921661305247</div>
<div class=\"phone\">+4921661305247</div>
<div class=\"phone\">+4921661305247</div>")
add item to list(%phoneNumber,$scrape attribute($element offset(<class="phone">,0),"innertext"),"Don\'t Delete","Global")
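The same "first matching element, inner text only" approach can be sketched outside UBot with Python's standard html.parser module. The class and function names below (FirstPhoneParser, first_phone) are hypothetical, and the sketch assumes the phone div contains no nested div elements.

```python
from html.parser import HTMLParser


class FirstPhoneParser(HTMLParser):
    """Collects the text of the first <div class="phone">, ignoring later ones."""

    def __init__(self):
        super().__init__()
        self.inside = False  # True while inside the first matching div
        self.done = False    # True once that div has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "phone" in classes and not self.done:
            self.inside = True

    def handle_endtag(self, tag):
        # Assumption: no nested <div> inside the phone div.
        if tag == "div" and self.inside:
            self.inside = False
            self.done = True

    def handle_data(self, data):
        if self.inside:
            self.parts.append(data)


def first_phone(html: str):
    """Return the inner text of the first phone div, or None if there is none."""
    parser = FirstPhoneParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip() if parser.done else None


page = (
    '<div class="phone">+4921661305247</div>\n'
    '<div class="phone">+4900000000000</div>'
)
print(first_phone(page))
```

Only the first div is read, which matches the "first occurrence only" requirement in the question.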


Hi,

There are many ways to achieve this, and there is nothing wrong with any of them.

Here is the regex, though:

(?<=>).*?(?=<)

This gets the text between > and <.

If you are matching against a larger body of text, make the pattern more specific by including more of the surrounding markup:

 

(?<="phone">).*?(?=<)

or (?<="phone">).*?(?=</div)
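To see how the more specific pattern behaves when the element appears several times, here is a small Python sketch. The two-line sample page is invented for the example. It relies on the fact that re.search stops at the first match, which covers the "first occurrence only" requirement, while re.findall would return every occurrence.

```python
import re

page = (
    '<div class="phone">+4921661305247</div>\n'
    '<div class="phone">+4900000000000</div>'
)

# The lookbehind anchors the match right after "phone">, the lazy .*? takes
# as little as possible, and the lookahead stops it at the closing </div.
pattern = re.compile(r'(?<="phone">).*?(?=</div)')

match = pattern.search(page)            # search() returns only the first match
first = match.group(0) if match else None
every = pattern.findall(page)           # findall() returns all of them

print(first)
print(every)
```

Neither the opening tag nor the closing tag ends up in the result, because lookarounds only assert what surrounds the match.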

 

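Don't need the plus? Add it to the lookbehind, escaped as \+ so that it is treated as a literal plus sign rather than a quantifier on the >:

(?<="phone">\+).*?(?=</div)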
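A quick check of the plus-sign handling, sketched in Python. A literal plus has to be written as \+, since a bare + is a quantifier applied to the preceding character (and Python's re module only accepts fixed-width lookbehinds, so it rejects the unescaped form outright). The second pattern, which uses a capturing group with an optional \+?, is an alternative added for illustration and is not from the post above.

```python
import re

with_plus = '<div class="phone">+4921661305247</div>'
no_plus = '<div class="phone">0216613</div>'

# \+ inside the lookbehind is a literal plus, so it is left out of the match.
digits = re.search(r'(?<="phone">\+).*?(?=</div)', with_plus).group(0)

# Alternative: an optional literal plus followed by a capturing group,
# which works whether or not the number starts with '+'.
optional = re.compile(r'"phone">\+?(.*?)</div')
a = optional.search(with_plus).group(1)
b = optional.search(no_plus).group(1)

print(digits, a, b)
```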

 

Here is my cheat sheet, which I got from somewhere on this forum:

(?=ABC)      - Positive lookahead. Matches a group after your main expression without including it in the result.
 
(?!ABC)      - Negative lookahead. Specifies a group that cannot match after your main expression (i.e. if it matches, the result is discarded).
 
(?<=ABC)     - Positive lookbehind. Matches a group before your main expression without including it in the result.
 
(?<!ABC)     - Negative lookbehind. Specifies a group that cannot match before your main expression (i.e. if it matches, the result is discarded).
 
(?<=ABC).*?(?=ABC) - Extracts the text between the specified groups.
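Each entry in the cheat sheet can be tried out on a tiny string. The following Python sketch uses made-up sample text and single-character main expressions so that backtracking does not blur what each lookaround does.

```python
import re

text = "a=1 b=2 c=3"

# Positive lookahead: a letter that is followed by "=2".
pos_ahead = re.findall(r"[a-z](?==2)", text)

# Negative lookahead: a letter that is NOT followed by "=2".
neg_ahead = re.findall(r"[a-z](?!=2)", text)

# Positive lookbehind: a digit that is preceded by "b=".
pos_behind = re.findall(r"(?<=b=)\d", text)

# Negative lookbehind: a digit that is NOT preceded by "b=".
neg_behind = re.findall(r"(?<!b=)\d", text)

# Lookbehind + lazy match + lookahead: the text between two delimiters.
between = re.findall(r"(?<=\[).*?(?=\])", "x [inner] y")

print(pos_ahead, neg_ahead, pos_behind, neg_behind, between)
```

In every case the lookaround text itself is absent from the result; it only decides whether the main expression is allowed to match at that position.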

 

Here is how I test it:

http://regexhero.net/tester/

 

 

 

Hope this helps,

 

CD

