UBot Underground

Filtering Lists



I scraped a table of companies and now the information is in a list; however, there are some bold tags and image tags that I need to filter out. I just need the company name and that's it (please see attached image). I've looked on the forum and can't find any wisdom regarding how to filter the list.

 

If you could provide some wisdom to point me in the right direction I would be very grateful!

 

Thanks for your help!

 

Here's the image:

 

[Attached image: post-27387-0-88405900-1452786993_thumb.png]

 

Peace,

LJ


I'm not sure why you'd put those in the list.

Not tested:

set list position(%List,0)
loop($list total(%List)) {
    set(#Data,$next list item(%List),"Global")
    if($comparison($find regular expression(#Data,"<[^>]*>"),"= Equals","")) {
        then {
            alert(#Data)
        }
        else {
        }
    }
}
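
For anyone who wants to see what the `<[^>]*>` pattern in the loop above does outside UBot, here is a minimal Python sketch (not UBot script). The sample list items are made up for illustration, so the real scraped list may look different. It shows two options: keeping only the items that contain no tag, which is what the loop above alerts on, and stripping the tags out of every item.

```python
import re

# Same pattern as in the UBot loop above: "<", anything except ">", then ">".
TAG = re.compile(r"<[^>]*>")

# Hypothetical scraped items; the real list contents are an assumption.
items = [
    "Acme Tools",
    "<strong>Bolt Works</strong>",
    '<img src="logo.png">Crimp Co',
]

# Option 1: keep only items with no tag in them (what the loop alerts on).
untagged = [item for item in items if not TAG.search(item)]

# Option 2: remove the tags and keep the remaining text of every item.
stripped = [TAG.sub("", item).strip() for item in items]

print(untagged)
print(stripped)
```

Option 2 is probably closer to what the original question asks for, since it keeps the company names that were wrapped in bold or image tags instead of skipping them.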


LJ, the first thing to do is go back and try scraping the data differently.

 

Show me the command/process you used to scrape the data into your list.

Thanks for the response Brutal and Pash, here's what I used to scrape the list:

navigate("http://www.theassemblyshow.com/index.php/attend/exhibitor-list","Wait")
add list to list(%companies,$find regular expression($read file("http://www.theassemblyshow.com/index.php/attend/exhibitor-list"),"(?<=\">(<strong>|)).*(?=(</strong>|)</a></td>)"),"Delete","Global")

Thanks again!


Try this:

navigate("http://www.theassemblyshow.com/index.php/attend/exhibitor-list","Wait")
wait for browser event("Everything Loaded","")
wait(1)
set(#Html,$scrape attribute(<class="exhibitor">,"innerhtml"),"Global")
set(#Html,$replace regular expression(#Html,"<img.*(<strong>|png\">)",""),"Global")
clear list(%List)
add list to list(%List,$find regular expression(#Html,"(?<=_blank\">).*?(?=<\\/)"),"Delete","Global")
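
As a rough illustration of the two-step approach in the snippet above (first strip the leading image/bold markup, then pull out whatever sits between `_blank">` and the next closing tag), here is a minimal Python sketch rather than UBot script. The sample HTML rows and URLs are made up, and the real exhibitor page markup is an assumption.

```python
import re

# Hypothetical table rows; the real page markup may differ.
html = """\
<tr><td><a href="http://a.example" target="_blank">Acme Tools</a></td><td>101</td></tr>
<tr><td><a href="http://b.example" target="_blank"><img src="star.png"><strong>Bolt Works</strong></a></td><td>102</td></tr>
<tr><td><a href="http://c.example" target="_blank"><img src="new.png">Crimp Co</a></td><td>103</td></tr>
"""

# Step 1: drop everything from "<img" up to the opening <strong> tag or,
# failing that, up to the end of the image tag (png">). "." does not match
# newlines here, so each removal stays within one row.
cleaned = re.sub(r'<img.*(<strong>|png">)', "", html)

# Step 2: take the text that follows _blank"> up to the next closing tag.
# The lookbehind and lookahead keep the surrounding markup out of the match.
names = re.findall(r'(?<=_blank">).*?(?=</)', cleaned)

print(names)
```

Step 1 matters because, without it, step 2 would return the image and bold markup along with the name for the second and third rows.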

Or this, but it's slow:

navigate("http://www.theassemblyshow.com/index.php/attend/exhibitor-list","Wait")
wait for browser event("Everything Loaded","")
wait(1)
clear list(%List)
set(#Loop,0,"Global")
set(#MaxLoop,$divide($list total($scrape attribute(<tagname="td">,"innertext")),2),"Global")
loop(#MaxLoop) {
    add item to list(%List,$scrape attribute($element offset(<tagname="td">,#Loop),"innertext"),"Don\'t Delete","Global")
    set(#Loop,$add(#Loop,2),"Global")
}
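
The loop above walks the table cells two at a time, so only the first cell of each row ends up in the list. A minimal Python sketch of the same stepping logic, assuming a two-column table where the company name is the first cell (the cell values here are invented):

```python
# Hypothetical inner text of every <td>, in page order: name, booth, name, booth...
cells = ["Acme Tools", "101", "Bolt Works", "102", "Crimp Co", "103"]

companies = []
offset = 0                       # plays the role of #Loop
max_loop = len(cells) // 2       # plays the role of #MaxLoop

for _ in range(max_loop):
    companies.append(cells[offset])  # take the cell at the current offset
    offset += 2                      # skip the second column

print(companies)
```

In Python the same result is simply `cells[::2]`. The UBot version is slower because each pass through the loop runs a separate scrape against the page.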


Pash,

 

Thanks so much, I see what you did and more importantly I understand why you did it!  Can't thank you enough, thanks for your patience while I'm learning :-)

 

Respectfully,

LJ


navigate("http://www.theassemblyshow.com/index.php/attend/exhibitor-list","Wait")
set(#get_companies,$scrape attribute(<outerhtml=w"<td><a href=\"*\" target=\"_blank\">*</a></td>">,"innertext"),"Global")
add list to list(%companies,$list from text(#get_companies,$new line),"Delete","Global")
set(#get_companies,$nothing,"Global")


The beauty of ubot is that there are almost always multiple paths you can use to achieve your desired end result.

 

I always try to use whatever takes the least amount of code, because my bots get pretty big and keeping the code short helps me greatly.

 

The way to find those multiple paths is to just start poking around - Once you achieve your goal and you're looking at your code, let it play through your mind to see if there are any other commands/parameters that you think may achieve the same thing, then give it a try.



Thanks Brutal, admittedly my knowledge is pretty limited now.  Moving from front end development to programming is taking a bit of time but I will get there :-) 

 

There's nothing Front End wise that I cannot do, and it got very boring.  This however is very fun, I'm still having a blast, frustrating at times but still fun. 

 

I got pretty good at fixing PHP scripts and customizing them, but starting from scratch is another issue.  Starting to come together slowly!

 

Seriously, I can't thank you guys enough for your patience and willingness to share and help out!

 

Many thanks!

 

Peace,

LJ
