davidford76 · Posted May 2, 2013
I am working on scraping Craigslist listings first, then later on collecting the emails. When I collect the href of each listing, the "add list to list" command has delete-duplicates built in, so when viewing the debugger it comes up as it's supposed to, with just 100 listings. But how do I delete duplicates for the "save to file" function?
AutomationNinja · Posted May 2, 2013
First use "add list to list" just like you did, then save the new list, which no longer has any duplicates in it, however you like.
LordPorker · Posted June 20, 2016
Sorry to raise an old thread, but I'm having the same issue. Using "add list to list" (which has its own delete-duplicates feature) is OK, but sometimes you'll still end up with duplicates in the main file. To simplify the steps: I'm scraping a site and saving all of the results straight to a file (via the "append to file" command). How can I target the duplicates in the actual file? Thanks.
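For readers outside UBot: the logic for cleaning up a file that was built with "append to file" is simple enough to sketch in plain Python. This is an illustration of the idea, not UBot syntax; the function name `dedupe_file` is made up for the example, and it keeps the first occurrence of each line while preserving order.

```python
def dedupe_file(path):
    """Remove duplicate lines from a file in place, keeping first occurrences."""
    with open(path, "r", encoding="utf-8") as f:
        lines = f.read().splitlines()
    # dict.fromkeys preserves insertion order, so this dedupes without reordering
    unique = list(dict.fromkeys(lines))
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(unique) + "\n")
```

Running it once over the results file after scraping finishes leaves only one copy of each line.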
BobbyP · Posted June 26, 2016
{$plugin function("Advanced.File.dll", "File Compare", "file a", "file b", "Remove duplicates")}
http://network.ubotstudio.com/forum/index.php/topic/15579-free-plugin-advanced-file-clipboardhandle-big-lists-in-text-files-and-more/
bangali_beta · Posted June 28, 2016
Don't scrape directly to a file. Scrape to a list, delete duplicates, then save to file. End of work.
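The pipeline described above (scrape to a list, dedupe, then write the file once) can be sketched in plain Python for illustration. This is not UBot syntax; `save_unique` is a hypothetical name, and the scraped hrefs are assumed to already be in a list.

```python
def save_unique(hrefs, path):
    """Drop duplicate entries (keeping order), then write the file in one pass."""
    unique = list(dict.fromkeys(hrefs))
    with open(path, "w", encoding="utf-8") as f:
        for href in unique:
            f.write(href + "\n")
    return unique
```

Because the deduplication happens in memory before any writing, the file never contains duplicates in the first place, which avoids the problem with appending results one at a time.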