Extracting only one number (1, 2 or 3 digits) out of a page

webpro · June 10, 2013

I'm scratching my head with regex arghhhhh lol

I need to grab the number "100".

In fact it could be any number, but I also need to make sure it doesn't grab any other numbers on the page.

Credits #1: <font color="green">100 Ads

I came up with this

(Credits #1).*?(">)(\d+)\sAds

It selects only

Credits #1: <font color="green">100 Ads

out of the whole page. So maybe I'm getting there?

Thanks

Pete · June 10, 2013

Over a year using UBot and you can't do a simple page scrape?

UBotDev · June 10, 2013

I've just edited your REGEX a bit:

(?<=(Credits #1).*?(">))(\d{1,3})(?=\s+Ads)

webpro · June 10, 2013

Quoting Pete: Over a year using UBot and you can't do a simple page scrape?

I can't do UBot like you guys, 24/7, 365 days a year, sadly.

3 jobs, kids, house and, guess what, sleep, or I'll die before I reach 50, which isn't too long from now...

I forgot: I lost all my hair, went down to 110 pounds and puked like I never thought a human could for more than 4 months, but got rid (so far) of "it".

Trust me, you don't want to go where I went...

Never judge a man before walking a mile in his shoes (I think that's how they say this in English).

I forgot, the page I'm trying to scrape is wayyyy more complex than the simple example I gave you, so I can't use UB's scraping capabilities. RegEx is the only way.

webpro · June 10, 2013

Quoting UBotDev: I've just edited your REGEX a bit:
(?<=(Credits #1).*?(">))(\d{1,3})(?=\s+Ads)

Man, that is so close!!!

<table align="center" width="860" bgcolor="#FFFFFF"><tbody><tr><td align="center">
<font color="red"><b>You Have Clicked Today:</b></font>   <font color="blue">Credits #1: <font color="green">70 Ads</font> | Credits #2: <font color="green">0 Ads</font> | Credits #3: <font color="green">0 Ads</font> | Credits #4: <font color="green">0 Ads</font> | Credits #5: <font color="green">0 Ads</font></font>
</td></tr></tbody></table>

Have a look: it also gets the digits in front of every other "Ads".

Let's say I only want Credits #1.

Thanks

UBotDev · June 10, 2013

Here you go:

(?<=Credits #1[^<]*<font[^>]*>)(\d{1,3})(?=\s+Ads)

And BTW, I strongly recommend you start learning REGEX, because it's super useful.
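For readers trying this outside UBot, here is a minimal Python sketch of the same idea. It rests on two assumptions not stated in the thread: that UBot's regex engine accepts variable-length lookbehind (as .NET-style engines do), and that Python's built-in `re` module is an acceptable stand-in. Since `re` rejects variable-length lookbehind, the sketch keeps the same anchors but pulls the number out through a capture group instead. The `html` string is a shortened copy of the sample row posted above, and the variable names are made up for illustration.

```python
import re

# Shortened copy of the sample row from the thread (assumed representative).
html = (
    '<font color="blue">Credits #1: <font color="green">70 Ads</font> | '
    'Credits #2: <font color="green">0 Ads</font> | '
    'Credits #3: <font color="green">0 Ads</font></font>'
)

# Same anchors as the lookbehind pattern: "Credits #1", any text up to the
# next tag, one <font ...> tag, then 1 to 3 digits followed by " Ads".
# The digits are captured in group 1 instead of being isolated by lookbehind.
pattern = re.compile(r'Credits #1[^<]*<font[^>]*>(\d{1,3})(?=\s+Ads)')

match = pattern.search(html)
credits_1 = match.group(1) if match else None
print(credits_1)
```

One thing to watch: "Credits #1" is also the start of "Credits #10", so if the page can ever list ten or more credit types, anchoring on "Credits #1:" (with the colon) would probably be safer.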

webpro · June 10, 2013

Well, I started, but it's complicated. I looked around for tutorials but I'm not sure which one I should start with, so if you've got a neat site to show me or a place where I could learn this, it would be more than welcome!

Thanks

Pete · June 10, 2013

Never said I was judging you, I just compared your post count to the problem, and this regex is not that hard:

(?<=\"\>)\d{1,3}(?=\sAds) or (?<=Credits\s\#)\d{1,3}(?=\:\s)
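A quick way to see what each of these two patterns actually picks up is to run them against the sample row posted earlier. The sketch below does that in Python, on the assumption that Python's `re` module treats these particular patterns the same way UBot's engine does (both lookbehinds are fixed-width, so `re` accepts them unchanged). The names `pattern_a`, `pattern_b`, `matches_a` and `matches_b` are made up for illustration.

```python
import re

# The "Credits" row from the page snippet posted earlier in the thread.
html = (
    '<font color="blue">Credits #1: <font color="green">70 Ads</font> | '
    'Credits #2: <font color="green">0 Ads</font> | '
    'Credits #3: <font color="green">0 Ads</font> | '
    'Credits #4: <font color="green">0 Ads</font> | '
    'Credits #5: <font color="green">0 Ads</font></font>'
)

# First pattern: digits preceded by "> and followed by " Ads".
pattern_a = r'(?<=\"\>)\d{1,3}(?=\sAds)'
# Second pattern: digits preceded by "Credits #" and followed by ": ".
pattern_b = r'(?<=Credits\s\#)\d{1,3}(?=\:\s)'

matches_a = re.findall(pattern_a, html)
matches_b = re.findall(pattern_b, html)

print(matches_a)  # the ad count of every credit type
print(matches_b)  # the credit type numbers themselves
```

On this sample the first pattern returns the ad count of every credit type, not only Credits #1, and the second returns the credit type numbers (1 to 5) instead of the ad counts, so neither isolates the single value by itself.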

UBotDev · June 10, 2013

I think this is a nice place to start: http://www.regular-expressions.info/quickstart.html

They even provide examples, so it's easier to follow.

VaultBoss · June 10, 2013

One mistake many people make with regex is trying to get EXACTLY the bit of info they want in a single step.

It is NOT necessary! You could scrape in Step 1 something close to what you need, even if 'polluted' with extra characters... then take the result of the scrape and apply regex to it in Step 2, using either $replace regex expression, or $find regex expression, depending on your needs.

And if you need to, DO continue with Step 3, Step 4... Step N, until you get what you want.
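The step-by-step approach described above can be sketched roughly as follows. This is Python standing in for UBot's regex functions (an assumption, chosen only so the example can be run anywhere), and the names `chunk` and `ads` are made up for illustration.

```python
import re

# Shortened copy of the sample row from the thread.
html = (
    '<font color="blue">Credits #1: <font color="green">70 Ads</font> | '
    'Credits #2: <font color="green">0 Ads</font></font>'
)

# Step 1: a rough scrape. Grab everything from "Credits #1" up to the
# first "Ads" that follows, HTML tags and all.
chunk_match = re.search(r'Credits #1.*?Ads', html)
chunk = chunk_match.group(0) if chunk_match else ''

# Step 2: clean up the rough result. Only the digits sitting right
# before the trailing "Ads" are kept.
number_match = re.search(r'(\d{1,3})\s+Ads$', chunk)
ads = number_match.group(1) if number_match else None

print(chunk)
print(ads)
```

Each step uses a pattern simple enough to check by eye, which is the main appeal of splitting the work up.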

UBotDev · June 11, 2013

Good point VaultBoss!

Sometimes it is also useful to add all REGEX matches into a list, so you can count them and process one by one.
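As a rough illustration of collecting every match into a list, the Python sketch below gathers all the credit lines at once and then counts and walks through them. Python is used here as a stand-in for UBot's list commands (an assumption), and `pairs` and `ads_by_credit` are hypothetical names.

```python
import re

# The "Credits" row from the page snippet posted earlier in the thread.
html = (
    '<font color="blue">Credits #1: <font color="green">70 Ads</font> | '
    'Credits #2: <font color="green">0 Ads</font> | '
    'Credits #3: <font color="green">0 Ads</font> | '
    'Credits #4: <font color="green">0 Ads</font> | '
    'Credits #5: <font color="green">0 Ads</font></font>'
)

# Each match yields a (credit number, ad count) pair of strings.
pairs = re.findall(r'Credits #(\d+):\s*<font[^>]*>(\d{1,3})\s+Ads', html)

# Count the matches, then process them one by one.
print(len(pairs))
ads_by_credit = {}
for credit_number, ad_count in pairs:
    ads_by_credit[int(credit_number)] = int(ad_count)

# Look up the one value of interest after the fact.
print(ads_by_credit.get(1))
```

Having the whole list also makes it easy to notice when the page layout changes, for example when the number of matches is no longer what you expect.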

webpro · June 11, 2013

Ohhhh, I didn't think of using UB's regex features once you've got a piece of the puzzle! Thanks for pointing this out, VaultBoss.

Thanks again UbotDev, it's working fine now.

And no hard feelings Zap, after what I've been through, my mentality regarding life and everything that surrounds it changed a lot hahahahahaha

Regarding regex, at least I'm going in the right direction. Now if I could only find time to devote to this and get it sorted, because once you know it, well, you know it, and it must be a killer time saver!

Anyways, if you guys see any neat tutorials about it for newbies, please don't forget about this thread and post the link.

(I'll go to your link Ubotdev right now.)

Cheers guys
