UBot Underground


Can anyone help me with this bit of regex? I am using Aymen's http post command to pull all of the page text, and then I want to gather the innertext from links like this one:

<h1 style="margin: 2px 0 0 0;"><b></b> <a href="http://www.rvwholesalers.com/design/AmeriLite/AmeriLite.php?floorplan=21MB">AmeriLite - 21MB</a></h1>

 

The regex needs to key on the h1 tags, since there are other similar links on the same page.

I just want to gather the innertext from each of these results; in this case it would be AmeriLite - 21MB. There are typically 10 results per page, all in a similar format, and the innertext is different on every listing. Any help is much appreciated. I can't wait for the regex builder in UBot 5.


Thanks, Kreatus. That captures the first one, but it also catches some other ones on the page. Here is the URL I'm trying to gather the list from:

http://www.rvwholesalers.com/design/rvsearch.php?SEARCHRVTYPE=&Sleeps=0|50&Fiberglass=Z&pricerange=0|500000&exteriorkitchen=Z&Bunks=Z&searchrvlengthrange=0|600&manufacturer=Z&exteriordoors=Z&searchrvweight=0|50000&brand=Z&Slides=Z

The extra ones it captures are the similar ones that have <img src... after the opening h1 tag.
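
One way to skip those rows is to make the pattern require the <b></b> that sits right after the opening h1 tag in the wanted links. Below is a minimal sketch in Python rather than UBot script, only to test the idea. The second h1 in the sample is a made-up stand-in for the <img src... rows, so its image name and link text are assumptions, not markup copied from the site.

```python
import re

# First h1: the format quoted in the first post.
# Second h1: hypothetical example of the unwanted "<img src..." variant.
html = '''
<h1 style="margin: 2px 0 0 0;"><b></b> <a href="http://www.rvwholesalers.com/design/AmeriLite/AmeriLite.php?floorplan=21MB">AmeriLite - 21MB</a></h1>
<h1 style="margin: 2px 0 0 0;"><img src="example.png"> <a href="http://www.rvwholesalers.com/design/rvsearch.php">Not a listing</a></h1>
'''

# Require <h1 ...> to be followed directly by <b></b> and then the link,
# so h1 blocks that open with <img ...> cannot match.
pattern = re.compile(r'<h1[^>]*>\s*<b></b>\s*<a\s+href="[^"]*">([^<]+)</a>\s*</h1>')

# findall returns only the captured link text for each match.
titles = [t.strip() for t in pattern.findall(html)]
print(titles)  # ['AmeriLite - 21MB']
```

This version uses a capture group to pull out the link text. The pattern posted in this thread uses lookarounds to trim the tags instead, so something similar may be needed when carrying the idea over to $find regular expression.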


It works for me.

Here's the add to list code. Make sure the page is loaded first.

add list to list(%h1 tags, $find regular expression($document text, "(?<==[a-zA-Z0-9]\{3,8\}\">)[a-zA-Z0-9\\s\\W]\{3,50\}?(?=</a></h1>)"), "Delete", "Global")
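
For anyone reading the pattern: assuming the \{, \}, \" and \\s are just UBot's escaping for {, }, " and \s, it looks back for = plus 3 to 8 letters or digits and ">, which is the end of the href. It then lazily takes 3 to 50 characters up to </a></h1>. Here is a rough Python check against the sample link from the first post. Python's re module does not accept a variable-length lookbehind, so this sketch swaps the lookarounds for a capture group and is an adaptation, not the UBot original.

```python
import re

# Sample link from the first post in this thread.
sample = ('<h1 style="margin: 2px 0 0 0;"><b></b> '
          '<a href="http://www.rvwholesalers.com/design/AmeriLite/AmeriLite.php?floorplan=21MB">'
          'AmeriLite - 21MB</a></h1>')

# Capture-group adaptation of the posted pattern:
#   =[a-zA-Z0-9]{3,8}">   end of the href, e.g. =21MB">
#   (...{3,50}?)          the link text, as short as possible
#   </a></h1>             the link must close right before </h1>
posted = re.compile(r'=[a-zA-Z0-9]{3,8}">([a-zA-Z0-9\s\W]{3,50}?)</a></h1>')

matches = posted.findall(sample)
print(matches)  # ['AmeriLite - 21MB']
```

Note that the pattern keys on how the href ends and on the closing </a></h1>, not on what follows the opening h1 tag.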
