UBot Underground

Regex Stop Match Before Character


I am working on a URL scraper, and the website tends to include the referral source, which I cannot include when saving the URL to a file later on. I was successful in getting it to grab everything after the "/".
 
testwebsite.com/ubotstudio?ref=referral 
 
This is what I have so far:
 
[^/]*$
testwebsite.com/ubotstudio?ref=referral
 
How should I have it match everything before the "?" sign?


The simplest way to do it is this:

 

.*(?=\?)
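
For illustration, here is a minimal sketch of how that lookahead behaves on the sample URL, written with Python's `re` module (an assumption; the thread itself is about UBot Studio, whose regex engine is presumed to support lookahead in the same way). The variable names are made up for the example. It also shows two possible variants: a negated character class that stops at the first "?", and a combination with the `[^/]*$` idea from the question that keeps only the last path segment.

```python
import re

url = "testwebsite.com/ubotstudio?ref=referral"

# Lookahead from the answer: greedy, so it runs up to the LAST "?" in the string.
greedy = re.search(r".*(?=\?)", url).group(0)

# Negated class: stops at the FIRST "?" and also matches when there is no "?".
before_query = re.match(r"[^?]*", url).group(0)

# Combined with the asker's [^/]*$ idea: last path segment, without the query.
last_segment = re.search(r"[^/?]+(?=\?)", url).group(0)

print(greedy)
print(before_query)
print(last_segment)
```

The greedy version and the negated class agree whenever the URL contains a single "?"; they differ only if a second "?" appears later in the string.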

 

It would help to know whether the format will always be like that, so you can come up with something better. For example, if the URL never includes www or http, you can use something more specific:

 

[a-zA-Z0-9]+\.[a-zA-Z]{2,4}[a-zA-Z0-9\/\._+%!-]+(?=\?)
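
A hedged sketch of the more specific pattern, again in Python for illustration (the `strip_query` helper and the constant name are hypothetical, not part of UBot Studio). The hyphen is placed last in the character class so that it is read as a literal "-" rather than as a range between two characters, which would otherwise let the class match "?" itself.

```python
import re

# Domain, a 2-4 letter TLD, then path characters, stopping right before "?".
# The "-" sits at the end of the class so it is a literal hyphen, not a range.
URL_BEFORE_QUERY = re.compile(
    r"[a-zA-Z0-9]+\.[a-zA-Z]{2,4}[a-zA-Z0-9/._+%!-]+(?=\?)"
)

def strip_query(url):
    """Return the part of url before '?', or None if the pattern does not match."""
    match = URL_BEFORE_QUERY.search(url)
    return match.group(0) if match else None

print(strip_query("testwebsite.com/ubotstudio?ref=referral"))
```

Because of the lookahead, this pattern only matches URLs that actually contain a "?", so a URL without a query string returns None here and would need to be saved as-is.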



http://i.imgur.com/EeVAAOq.png

It works!
