stever, posted April 7, 2017

I am trying to scrape a table of URLs on expireddomains.net. The table lists URLs, one per row, with a host of parameters such as trust flow and backlinks arranged in columns. I have found the elements I'm interested in, and am scraping the table.

The problem is that it's hit and miss: I can scrape each of the URLs reliably, but the contents of some other cells get missed for no obvious reason. So if there are 10 URLs (1 per row) in a table, I might get 8 of the associated backlinks and 7 of the other parameters. The items that get missed are not in the same rows, and the results are consistent: if I run the script several times, I get the same results.

I've switched browsers in UBot - no difference. There seems to be no rhyme or reason to which elements are missed. I replicated the page in my Firefox browser and inspected the elements in the table - they're all as they should be. None of the cells in the table are empty. I also put pauses between each add table to table command, wondering if UBot was trying to do too much too quickly.

So why might UBot be missing elements? It's not as if the elements are incorrect - they work some of the time, and the table is well structured as far as I can see.

Thanks
Steve
deliter, posted April 7, 2017

Your code is probably wrong; please post it so someone can take a look. If I had to guess, you have probably set your list to delete duplicates, but that's only guessing without looking at your code. UBot's child and sibling selectors are a disgrace, though; you'd be better off using the add-on I made, which would probably be 1 line of code for this table. But here's every parameter on that table working fine:

navigate("https://www.expireddomains.net/backorder-expired-domains/", "Wait")
wait for browser event("Everything Loaded", "")
add list to list(%name, $scrape attribute(<class="field_domain">, "textcontent"), "Don\'t Delete", "Global")
add list to list(%BL, $scrape attribute(<class="field_bl">, "textcontent"), "Don\'t Delete", "Global")
add list to list(%field_domainpop, $scrape attribute(<class="field_domainpop">, "textcontent"), "Don\'t Delete", "Global")
add list to list(%field_abirth, $scrape attribute(<class="field_abirth">, "textcontent"), "Don\'t Delete", "Global")
add list to list(%field_aentries, $scrape attribute(<class="field_aentries">, "textcontent"), "Don\'t Delete", "Global")
add list to list(%field_similarweb, $scrape attribute(<class="field_similarweb">, "textcontent"), "Don\'t Delete", "Global")
add list to list(%field_similarweb_countrycode, $scrape attribute(<class="field_similarweb_countrycode">, "textcontent"), "Don\'t Delete", "Global")
add list to list(%field_dmoz, $scrape attribute(<class="field_dmoz">, "textcontent"), "Don\'t Delete", "Global")
add list to list(%field_statuscom, $scrape attribute(<class="field_statuscom">, "textcontent"), "Don\'t Delete", "Global")
add list to list(%field_statusnet, $scrape attribute(<class="field_statusnet">, "textcontent"), "Don\'t Delete", "Global")
add list to list(%field_statusorg, $scrape attribute(<class="field_statusorg">, "textcontent"), "Don\'t Delete", "Global")
add list to list(%field_statusde, $scrape attribute(<class="field_statusde">, "textcontent"), "Don\'t Delete", "Global")
add list to list(%field_statustld_registered, $scrape attribute(<class="field_statustld_registered">, "textcontent"), "Don\'t Delete", "Global")
add list to list(%field_enddate, $scrape attribute(<class="field_enddate">, "textcontent"), "Don\'t Delete", "Global")
stever (author), posted April 8, 2017

@deliter - yes, you guessed right! Sorry, such a rookie mistake. Anyway - that fixed it perfectly. Brilliant forum - thank you for all your help!
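For anyone hitting the same symptom, the confirmed cause (a list set to delete duplicates) can be illustrated outside UBot. The following is only a minimal Python sketch, not UBot code: the sample domains and backlink counts are made up, and `add_list_to_list` is a hypothetical stand-in that imitates the delete-duplicates option. It shows why the domain column always scraped completely (every domain is unique) while columns with repeated values came up short, and why the result was identical on every run.

```python
# Hypothetical sample data: three table rows, two of which share a backlink count.
domains = ["example-one.com", "example-two.net", "example-three.org"]
backlinks = ["12", "40", "12"]


def add_list_to_list(target, source, delete_duplicates):
    """Append source to target, optionally skipping values already in target.

    This is an illustrative stand-in for a list command with a
    delete-duplicates option, not a real UBot API.
    """
    for item in source:
        if delete_duplicates and item in target:
            continue  # a repeated cell value is silently dropped
        target.append(item)
    return target


deduped = add_list_to_list([], backlinks, delete_duplicates=True)
kept = add_list_to_list([], backlinks, delete_duplicates=False)

# The deduplicated column is shorter than the domain column, so its
# values no longer line up with the rows they came from.
print(len(domains), len(deduped), len(kept))  # prints: 3 2 3
```

With duplicates kept, every column has one entry per row, so the lists can safely be combined into a table by position.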