UBot Underground


Hey there guys, I have a bit of a problem: I have literally never used regex, inside UBot or anywhere else, so I need a little help working out the best way to scrape some information.

 

So I am doing a "test ride" on this particular example.

 

Check the attached image first:

 

http://www.ripoffreport.com/r/Routes-Car-and-Truck-Rental-Calgary/Calgary-Alberta/Routes-Car-and-Truck-Rental-Calgary-FRAUD-CHARGING-MY-CREEDIT-CARD-WITHOUT-PERMISSION-1193004

 

I am trying to scrape the Phone Number, Website, and Category with regex.

The thing is, the phone number sometimes looks different; check the link below:

 

 http://www.ripoffreport.com/r/UHAUL/internet/UHAUL-JEFF-vermonts-regional-manager-uhauls-website-said-they-had-a-trailor-avaiable-a-1189474

 

Those are my issues. I tried wildcarding it a while back, but after a few loops it produced errors. I don't have any code to show you, since I have never had to use regex inside UBot before, but I have to learn it, and all the examples I found were confusing to me. I was hoping you could help me with the regex for this simple example and show me how to bring it into UBot.

(I have no issues with Logic commands)

 

I simply don't know a lot about regex!

 

If anyone has the time to look at my example, I would greatly appreciate the help!

 

 

EDIT: OK, I just checked again, and it looks like I was a lot younger when I first tried to scrape this, so here is my new code. It seems to work!

 

But I would still like to know more about how to do this with regex, and the procedure for it!

 

 

set(#Category,$scrape attribute(<outerhtml=w"<li> <strong>Category:</strong> <a href=\"http://www.ripoffreport.com/*\">*</a> </li>">,"innertext"),"Global")
set(#Web,$scrape attribute(<outerhtml=w"<li><strong>Web:</strong> <a href=\"*\" rel=\"nofollow\" target=\"_blank\">*</a></li>">,"innertext"),"Global")
set(#Phone,$scrape attribute(<outerhtml=w"<li><strong>Phone:</strong> *</li>">,"innertext"),"Global")
set(#adresa,$scrape attribute(<class="address">,"innertext"),"Global")

[Attached image: post-18544-0-55360800-1418907786_thumb.png]


This regex code should work about 90% of the time when scraping these pages.

set(#Category, $find regular expression($find regular expression($document text, "(?<=Category:</strong>).*?(?=/a>)"), "(?<=>).*?(?=<)"), "Global")
alert(#Category)
set(#Web, $find regular expression($document text, "(?<=Web:</strong> <a href=\").*?(?=\")"), "Global")
alert(#Web)
set(#Phone, $find regular expression($document text, "(?<=Phone:</strong> ).*?(?=<)"), "Global")
alert(#Phone)
set(#adresa, $trim($replace regular expression($replace regular expression($find regular expression($find regular expression($replace($document text, $new line, $nothing), "(?<=companyBullet\" style=\";background-position:0px 3px;padding-left:9px\"> <span><strong>).*?(?=</td> <td> <ul>)"), "(?<=</span> </div> <span>).*?(?=</span> </div> )"), "<[^>]*>", $nothing), "\\s\{2,\}", " ")), "Global")
alert(#adresa)
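
For anyone who wants to see the procedure behind those patterns outside UBot, here is a minimal Python sketch of the same idea. `(?<=...)` is a lookbehind (the match must come right after that text) and `(?=...)` is a lookahead (the match must stop right before that text), so only the part in between is returned. The Category, Web, and Phone patterns are copied from the code above; `SAMPLE_HTML`, `find_first`, and `scrape_fields` are made-up names for illustration, and the sample markup is only a guess based on the wildcard selectors earlier in the thread, so the real page may differ.

```python
import re

# Hypothetical HTML fragment modeled on the wildcard selectors in this thread.
# The category URL, website, and phone number are invented sample values.
SAMPLE_HTML = (
    '<li> <strong>Category:</strong> '
    '<a href="http://www.ripoffreport.com/example-category">Car Rental</a> </li>'
    '<li><strong>Web:</strong> '
    '<a href="http://www.example.com" rel="nofollow" target="_blank">www.example.com</a></li>'
    '<li><strong>Phone:</strong> 555-0100</li>'
)


def find_first(pattern, text):
    """Return the first match of pattern in text, or '' when nothing matches."""
    match = re.search(pattern, text)
    return match.group(0) if match else ""


def scrape_fields(html):
    # Step 1: grab everything between "Category:</strong>" and "/a>".
    category_chunk = find_first(r"(?<=Category:</strong>).*?(?=/a>)", html)
    # Step 2: inside that chunk, keep only the text between ">" and "<".
    category = find_first(r"(?<=>).*?(?=<)", category_chunk)
    # The href value sits between 'Web:</strong> <a href="' and the next quote.
    web = find_first(r'(?<=Web:</strong> <a href=").*?(?=")', html)
    # The phone number sits between "Phone:</strong> " and the next tag.
    phone = find_first(r"(?<=Phone:</strong> ).*?(?=<)", html)
    return {"category": category, "web": web, "phone": phone.strip()}


if __name__ == "__main__":
    print(scrape_fields(SAMPLE_HTML))
```

Because the phone pattern takes whatever text sits between the label and the next tag, it does not care how the number itself is formatted, which is probably why it copes with the different phone layouts. It will still miss pages where the markup around the label changes.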
Edited by Gogetta
Added additional examples for Website, Category, and Address.
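
The address line is the hardest one to read, so here is a rough Python sketch of just its cleanup steps: remove line breaks, strip every tag with `<[^>]*>`, collapse runs of whitespace with `\s{2,}`, and trim. The two patterns come from the `#adresa` line above; `clean_html_fragment` and the sample address are hypothetical and only there to show the order of operations.

```python
import re


def clean_html_fragment(fragment):
    """Flatten an HTML fragment to single-line plain text."""
    # Remove line breaks first so the remaining steps work on one line.
    one_line = fragment.replace("\r", "").replace("\n", "")
    # Replace every tag such as <span> or </div> with nothing.
    no_tags = re.sub(r"<[^>]*>", "", one_line)
    # Collapse runs of two or more whitespace characters into one space.
    collapsed = re.sub(r"\s{2,}", " ", no_tags)
    # Drop leading and trailing whitespace, like the trim step.
    return collapsed.strip()


if __name__ == "__main__":
    # Invented sample address, not taken from the real page.
    sample = "<span>123 Main St</span>\n   <span>Calgary,   Alberta</span>"
    print(clean_html_fragment(sample))
```

One thing to watch: because tags are replaced with nothing, two pieces of text separated only by tags (no spaces between them) will end up glued together.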

EDIT: Thanks for adding more examples, Gogetta. I still have a lot to learn about this area of UBot, and you were very helpful.

 

Thank you kindly. I hope I will be able to help you too some day!

