allcapone1912 7 Posted January 20, 2016

Hi, I have a problem with my email scraping script. Very often the script stops at add list to list and I don't know why.

    navigate("http://www.hksinc.com/","Wait")
    wait(5)
    add list to list(%emails,$find regular expression($document text,"(?i)\\b[!#$%&\'*+./0-9=?_`a-z\{|\}~^-]+@[.0-9a-z-]+\\.[a-z]\{2,6\}\\b"),"Delete","Local")

The script just stops on add list to list and doesn't go any further. The problem should be in the website, but I don't understand how to get past it.

pash 504 Posted January 21, 2016

Try changing the regex to a different one. It stops at the regex.

Varo 28 Posted January 21, 2016

The problem is not the regex, but your site: hksinc.com. Try scraping another site.

One more thing about your code: the add list to list scope is set to Local.

allcapone1912 7 Posted January 21, 2016 Author

Varo wrote:
> The problem is not the regex, but your site: hksinc.com. Try scraping another site.
> One more thing about your code: the add list to list scope is set to Local.

I set it to Local by mistake. It is currently set to Global and it still doesn't work.

That's the problem: I can't skip this site, because there are a lot of sites like this one (where my script stops) and I have to be able to scrape them all.

Varo 28 Posted January 21, 2016

allcapone1912 wrote:
> That's the problem: I can't skip this site, because there are a lot of sites like this one (where my script stops) and I have to be able to scrape them all.

Try socket mode. It works great for me.

    plugin command("SocketCommands.dll", "socket container") {
        plugin command("SocketCommands.dll", "socket set header", "User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36")
        plugin command("SocketCommands.dll", "socket navigate", "GET", "http://www.hksinc.com/")
        add list to list(%scket,$find regular expression($plugin function("SocketCommands.dll", "$socket page html"),"themes"),"Don\'t Delete","Global")
    }

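For anyone who wants to try the same idea outside UBot, here is a minimal Python sketch of the socket approach: fetch the raw HTML with an explicit User-Agent, without rendering the page in a browser, then run the email regex from the first post over that HTML. It is not the SocketCommands plugin. The function names fetch_html and extract_emails and the 15-second timeout are assumptions made for illustration, and the pattern is the original one with the UBot escaping removed.

```python
import re
import urllib.request

# User-Agent string copied from the socket example above.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36"
)

# Email pattern from the first post, without the UBot-specific escaping.
EMAIL_RE = re.compile(
    r"\b[!#$%&'*+./0-9=?_`a-z{|}~^-]+@[.0-9a-z-]+\.[a-z]{2,6}\b",
    re.IGNORECASE,
)


def fetch_html(url, user_agent=USER_AGENT, timeout=15.0):
    """Download the raw HTML of url without rendering it (hypothetical helper)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    # The timeout is an assumption: it makes a stalled connection raise
    # an error instead of blocking indefinitely.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_emails(html):
    """Return the unique email-like strings in html, in first-seen order."""
    return list(dict.fromkeys(EMAIL_RE.findall(html)))
```

Calling extract_emails(fetch_html("http://www.hksinc.com/")) would mirror the UBot snippet, except that it collects emails rather than the word "themes". Whether that particular site responds the same way to a plain HTTP request is not something this sketch can confirm.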
HelloInsomnia 1103 Posted January 21, 2016

Give this regex a try:

    [a-zA-Z\d\.+-_]+\@[a-zA-Z\d-]+(\.[a-zA-Z]{2,4}\.[a-zA-Z]{2,4}|\.[a-zA-Z]{2,4})

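A side note on the character class in that pattern: inside square brackets, +-_ is read as a range from "+" to "_", so the local part accepts more characters than it appears to, including "@", "<" and ">". The Python sketch below only illustrates that point. The helper name accepted and the "hyphen last" variant are assumptions for comparison, and nothing here shows that the range is what makes the script stall.

```python
import re

# Local-part character class exactly as posted: "+-_" forms a range.
AS_POSTED = re.compile(r"[a-zA-Z\d\.+-_]")

# Hypothetical variant with the hyphen moved to the end, so it is literal.
HYPHEN_LAST = re.compile(r"[a-zA-Z\d._+-]")


def accepted(pattern, chars):
    """Return the characters from chars that the one-character class matches."""
    return [c for c in chars if pattern.fullmatch(c)]


probe = "<>@;=-"
print(accepted(AS_POSTED, probe))    # every probe character is inside the range
print(accepted(HYPHEN_LAST, probe))  # only the literal hyphen is accepted
```

With the hyphen placed last, the class matches only letters, digits, ".", "_", "+" and "-".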
allcapone1912 7 Posted January 23, 2016 Author

HelloInsomnia wrote:
> Give this regex a try:
> [a-zA-Z\d\.+-_]+\@[a-zA-Z\d-]+(\.[a-zA-Z]{2,4}\.[a-zA-Z]{2,4}|\.[a-zA-Z]{2,4})

I tried your regex, but the script still stops on the regex and doesn't go on to the next step.