webpro 31 Posted June 10, 2013
I'm scratching my head with regex, arghhhhh lol. I need to grab the number "100". In fact it could be any number, but I also need to make sure it doesn't grab any other numbers on the page.
Credits #1: <font color="green">100 Ads
I came up with this:
(Credits #1).*?(">)(\d+)\sAds
It selects only
Credits #1: <font color="green">100 Ads
out of the whole page. So maybe I'm getting there? Thanks
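For readers following along, the pattern in this post is already close: the full match is the whole "Credits #1 ... Ads" span, but the third capture group holds just the number. Here is a minimal sketch in Python's `re` module (Python is a stand-in chosen for illustration, not UBot; the variable names are made up):

```python
import re

# Sample line from the post; the real page is assumed to contain much more markup.
sample = 'Credits #1: <font color="green">100 Ads'

# The pattern from the post, unchanged. Group 3 is the (\d+) part.
pattern = re.compile(r'(Credits #1).*?(">)(\d+)\sAds')

match = pattern.search(sample)
whole_match = match.group(0)  # everything the pattern matched
credits = match.group(3)      # only the digits captured by (\d+)

print(whole_match)
print(credits)
```

So the choice is between reading a capture group, as here, or using lookarounds so that the match itself is only the digits.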
Pete 121 Posted June 10, 2013
Over a year using ubot and you can't use a simple page scrap,
UBotDev 276 Posted June 10, 2013
I've just edited your REGEX a bit:
(?<=(Credits #1).*?(">))(\d{1,3})(?=\s+Ads)
webpro 31 Posted June 10, 2013 Author
Quoting Pete: "Over a year using ubot and you can't use a simple page scrap,"
I can't do UBot 24/7, 365 days a year like you guys, sadly. 3 jobs, kids, house and, guess what, sleep, or I'll die before I reach 50, which wouldn't be too long from now... I forgot: lost all my hair, went down to 110 pounds, puked like I never thought a human could for more than 4 months, but got rid (so far) of "it". Trust me, you don't want to go where I went... Never judge a man before walking a mile in his shoes (I think that's how they say this in English).
I forgot, the page I'm trying to scrape is wayyyy more complex than the simple example I gave you, so I can't use UB's scraping capabilities. Regex is the only way.
webpro 31 Posted June 10, 2013 Author
Quoting UBotDev: "I've just edited your REGEX a bit: (?<=(Credits #1).*?(">))(\d{1,3})(?=\s+Ads)"
Man, that is so close!!!
<table align="center" width="860" bgcolor="#FFFFFF"><tbody><tr><td align="center"> <font color="red"><b>You Have Clicked Today:</b></font> <font color="blue">Credits #1: <font color="green">70 Ads</font> | Credits #2: <font color="green">0 Ads</font> | Credits #3: <font color="green">0 Ads</font> | Credits #4: <font color="green">0 Ads</font> | Credits #5: <font color="green">0 Ads</font></font> </td></tr></tbody></table>
Have a look, it also gets the other digits in front of "Ads". Let's say I only want Credits #1. Thanks
UBotDev 276 Posted June 10, 2013
Here you go:
(?<=Credits #1[^<]*<font[^>]*>)(\d{1,3})(?=\s+Ads)
And BTW, I strongly recommend you start learning regex, because it's super useful.
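A note for anyone trying this pattern outside UBot: it relies on a variable-length lookbehind, which some engines (such as .NET's) accept but Python's `re` module does not. A rough equivalent, assuming Python and the sample row posted above, moves the lookbehind text into the pattern and reads the number from a capture group instead. The function name `first_credits` is made up for this sketch:

```python
import re

# Sample row from the thread (outer table tags omitted for brevity).
html = (
    '<font color="red"><b>You Have Clicked Today:</b></font> '
    '<font color="blue">Credits #1: <font color="green">70 Ads</font> | '
    'Credits #2: <font color="green">0 Ads</font> | '
    'Credits #3: <font color="green">0 Ads</font> | '
    'Credits #4: <font color="green">0 Ads</font> | '
    'Credits #5: <font color="green">0 Ads</font></font>'
)

# Same building blocks as the posted pattern, but the prefix is matched
# normally and only the digits are captured.
CREDITS_1 = re.compile(r'Credits #1[^<]*<font[^>]*>(\d{1,3})(?=\s+Ads)')

def first_credits(text):
    """Return the Credits #1 count as a string, or None if it is absent."""
    m = CREDITS_1.search(text)
    return m.group(1) if m else None

print(first_credits(html))
```

The `[^<]*` and `[^>]*` pieces are what keep the match from running past the first `<font ...>` tag into the later counters. Note that `\d{1,3}` accepts at most three digits; `\d+` would accept a number of any length.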
webpro 31 Posted June 10, 2013 Author
Well, I started, but it's complicated. I don't know where to start with tutorials. I looked around but I'm not sure what I should start with, so if you've got a neat site to show me or a place where I could learn this, it will be more than welcome! Thanks
Pete 121 Posted June 10, 2013
Never said I was judging you, just noted your post count relative to the problem, and this regex is not that hard:
(?<=\"\>)\d{1,3}(?=\sAds)
or
(?<=Credits\s\#)\d{1,3}(?=\:\s)
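To see what these two patterns pick up, here is a small check against the sample row posted earlier in the thread. This is a sketch in Python's `re` (an assumption; UBot's engine may differ in details), and the variable names are illustrative only:

```python
import re

# Sample row from the thread (outer table tags omitted for brevity).
html = (
    '<font color="red"><b>You Have Clicked Today:</b></font> '
    '<font color="blue">Credits #1: <font color="green">70 Ads</font> | '
    'Credits #2: <font color="green">0 Ads</font> | '
    'Credits #3: <font color="green">0 Ads</font> | '
    'Credits #4: <font color="green">0 Ads</font> | '
    'Credits #5: <font color="green">0 Ads</font></font>'
)

# First pattern: digits right after "> and right before " Ads".
after_font = re.findall(r'(?<=\"\>)\d{1,3}(?=\sAds)', html)

# Second pattern: digits between "Credits #" and ": ".
after_label = re.findall(r'(?<=Credits\s\#)\d{1,3}(?=\:\s)', html)

print(after_font)   # every counter on the row
print(after_label)  # the label numbers, not the counters
```

On this row the first pattern returns all five counters and the second returns the label numbers 1 to 5, so neither isolates the Credits #1 count on its own; the first match of the first pattern happens to be the wanted value only because Credits #1 comes first.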
UBotDev 276 Posted June 10, 2013
I think this is a nice place to start: http://www.regular-expressions.info/quickstart.html
They even provide examples, so it's easier to follow.
VaultBoss 310 Posted June 10, 2013
One mistake many people make with regex is trying to get EXACTLY the bit of info they want in a single step. It is NOT necessary!
In Step 1 you could scrape something close to what you need, even if it is 'polluted' with extra characters... then take the result of the scrape and apply regex to it in Step 2, using either $replace regex expression or $find regex expression, depending on your needs. And if you need to, DO continue with Step 3, Step 4... Step N, till you get what you want.
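The step-by-step idea can be sketched as follows. This uses Python's `re.search` and `re.sub` as stand-ins for the find and replace steps (an assumption for illustration; it is not UBot code), applied to the sample row from earlier in the thread:

```python
import re

# Sample row from the thread (outer table tags omitted for brevity).
html = (
    '<font color="red"><b>You Have Clicked Today:</b></font> '
    '<font color="blue">Credits #1: <font color="green">70 Ads</font> | '
    'Credits #2: <font color="green">0 Ads</font> | '
    'Credits #3: <font color="green">0 Ads</font> | '
    'Credits #4: <font color="green">0 Ads</font> | '
    'Credits #5: <font color="green">0 Ads</font></font>'
)

# Step 1: a coarse "find" that grabs the Credits #1 chunk, markup and all.
chunk = re.search(r'Credits #1:.*?Ads', html).group(0)

# Step 2: a "replace" that strips the leftover tags from that small chunk.
without_tags = re.sub(r'<[^>]+>', '', chunk)

# Step 3: a final "find" for the digits sitting right before " Ads".
number = re.search(r'\d+(?=\s+Ads)', without_tags).group(0)

print(chunk)
print(without_tags)
print(number)
```

Each step works on a smaller string than the one before, so the individual patterns can stay short and easy to debug.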
UBotDev 276 Posted June 11, 2013
Good point, VaultBoss! Sometimes it is also useful to add all regex matches to a list, so you can count them and process them one by one.
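As a rough illustration of the list approach, again in Python rather than UBot (an assumption, with made-up variable names), `re.findall` can collect every label and counter on the sample row so they can be counted and looked up afterwards:

```python
import re

# Sample row from the thread (outer table tags omitted for brevity).
html = (
    '<font color="red"><b>You Have Clicked Today:</b></font> '
    '<font color="blue">Credits #1: <font color="green">70 Ads</font> | '
    'Credits #2: <font color="green">0 Ads</font> | '
    'Credits #3: <font color="green">0 Ads</font> | '
    'Credits #4: <font color="green">0 Ads</font> | '
    'Credits #5: <font color="green">0 Ads</font></font>'
)

# With two capture groups, findall returns a list of (label, count) tuples.
pairs = re.findall(r'Credits #(\d+):[^<]*<font[^>]*>(\d+)\s+Ads', html)

# Count the matches, then index them by label for one-by-one processing.
total_matches = len(pairs)
credits_by_label = {int(label): int(count) for label, count in pairs}

print(total_matches)
for label, count in credits_by_label.items():
    print(f"Credits #{label}: {count} Ads")
```

Looking up `credits_by_label[1]` then gives the Credits #1 value without needing a pattern that targets it alone.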
webpro 31 Posted June 11, 2013 Author
Ohhhh, I hadn't thought about using UB's regex features once you've got a piece of the puzzle! Thanks for pointing this out, VaultBoss. Thanks again UBotDev, it's working fine now. And no hard feelings Zap; after what I've been through, my mentality regarding life and everything that surrounds it changed a lot hahahahahaha
Regarding regex, at least I'm going in the right direction. Now if I could only find time to devote to this and get it sorted, because once you know it, well, you know it, and it must be a killer time saver! Anyway, if you guys see any neat tutorials about it for newbies, don't forget about this thread please, and post the link. (I'll go to your link right now, UBotDev.) Cheers guys