awesome sauce 0 Posted January 26, 2016
Hi, I'm trying to scrape a website that shows a lot of ads around the listings, and I only want the organic listings. The site uses different classes for the ads and the organic listings. What I'm trying to do is select the "organic" class, then select all of the "business" classes inside it and save them to a list. Is this possible with UBot? P.S. The "business" class also appears inside the ads classes, so I think I need to do it this way.
HelloInsomnia 1103 Posted January 26, 2016
You may be able to use an element child inside of a scrape attribute for this. Otherwise, you may have to scrape the organic class and then use a regular expression. It's hard to say without seeing the code, but one of those will likely work.
deliter 203 Posted January 27, 2016
No, that would not work; scrape attribute functions cannot be nested. The "AND" selector may work if you input both classes, although I have yet to really understand how the AND selector works. You could use the child element selector too, but in my experience it can be hit or miss.
Try this. It's hard to say without seeing the site, but the code below should empty the entire ads class, so every business class inside the ads class is removed from the page. The only business classes left to scrape are then the ones inside the organic class.
change attribute(<class="adsListingHere">,"innerhtml","")
add list to list(%business,$scrape attribute(<class="businessClass">,"innertext"),"Delete","Global")
deliter 203 Posted January 28, 2016
Here's example code:
load html("<div class=\"ads\"> all content in this class will disappear <a class=\"businessListings\" href=\"www.businessOne.com\">www.businessOne.com</a> </div> </br> <div class=\"OrganicBusiness\"> all content in this class will be scraped <a class=\"businessListings\" href=\"www.businessTwo.com\">www.businessTwo.com</a> </div> </br> <div class=\"ads\"> all content in this class will disappear <a class=\"businessListings\" href=\"www.businessThree.com\">www.businessThree.com</a> </div> </br> <div class=\"OrganicBusiness\"> all content in this class will be scraped <a class=\"businessListings\" href=\"www.businessFour.com\">www.businessFour.com</a> </div> </br> ")
wait for browser event("Everything Loaded","")
change attribute(<class="ads">,"innerhtml","")
add list to list(%organicCompanies,$scrape attribute(<class="businessListings">,"innertext"),"Delete","Global")
awesome sauce 0 Posted February 1, 2016 (edited)
Hmmm... I still can't get this to work. If I use change attribute, seemingly nothing happens.
EDIT: I got it to work using deliter's suggestion. However, I had to use 'innertext' instead of 'innerhtml' in change attribute.
Edited February 1, 2016 by awesome sauce
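For anyone who wants to check the expected result outside UBot, here is a minimal sketch of the same filtering idea in Python, using only the standard library. It is not UBot code and not how UBot works internally. It keeps the text of "businessListings" elements only when they sit inside an "OrganicBusiness" container, which gives the same list as emptying the ads blocks first. The names OrganicListingParser, scrape_organic and SAMPLE are made up for illustration, and the sample markup is a trimmed copy of deliter's example.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not be tracked as "open".
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class OrganicListingParser(HTMLParser):
    """Collects text of item_class elements nested inside container_class elements."""

    def __init__(self, container_class, item_class):
        super().__init__()
        self.container_class = container_class
        self.item_class = item_class
        # One entry per open element: (tag, is_container, is_item).
        self.open_tags = []
        self.listings = []

    def _inside(self, index):
        # True if any currently open element has the flag at `index` set.
        return any(entry[index] for entry in self.open_tags)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        is_container = self.container_class in classes
        # An item only counts when an organic container is already open.
        is_item = self.item_class in classes and self._inside(1)
        if is_item:
            self.listings.append("")
        self.open_tags.append((tag, is_container, is_item))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return  # also ignores stray "</br>" as in the sample page
        # Close the nearest matching open element and anything left open inside it.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                del self.open_tags[i:]
                break

    def handle_data(self, data):
        if self._inside(2):
            self.listings[-1] += data


def scrape_organic(html, container_class="OrganicBusiness", item_class="businessListings"):
    parser = OrganicListingParser(container_class, item_class)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.listings]


SAMPLE = """
<div class="ads"><a class="businessListings" href="www.businessOne.com">www.businessOne.com</a></div> </br>
<div class="OrganicBusiness"><a class="businessListings" href="www.businessTwo.com">www.businessTwo.com</a></div> </br>
<div class="ads"><a class="businessListings" href="www.businessThree.com">www.businessThree.com</a></div> </br>
<div class="OrganicBusiness"><a class="businessListings" href="www.businessFour.com">www.businessFour.com</a></div> </br>
"""

if __name__ == "__main__":
    print(scrape_organic(SAMPLE))
```

On the sample markup, only the two links inside the OrganicBusiness blocks should be kept, which is what the UBot list should contain after the ads blocks are emptied.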