denrich, Posted January 23, 2014
Hello ubotters, I need advice on how to create a bot that scrapes only strings of the length I want from a table (for example, texts of 3 to 15 letters). I have tried different regex patterns such as ^.{1,15}$, which seems to work on a regex test website but didn't work in the bot. Any advice would be appreciated, thanks.
Code Docta (Nick C.), Posted January 23, 2014
Here, use this to get any number of characters:
set(#string, "This text is a test of the substring function.", "Global")
set(#first ten, $substring(#string, 0, 10), "Global")
load html(#first ten)
UBotDev, Posted January 23, 2014
I think this is what you are looking for:
add list to list(%TABLE Cells, $scrape attribute(<outerhtml=r"^<td>.\{1,15\}</td>$">, "innertext"), "Delete", "Global")
I've tested the code on this page: http://www.w3schools.com/html/html_tables.asp
denrich (Author), Posted January 23, 2014
Hey guys, I should have been more specific. I want to get 1-15 strings from a list file, but I only seem to get the first strings from my list.
Code Docta (Nick C.), Posted January 23, 2014
Show us some of your code and then we can help more, hopefully. You could loop through your list 15 times and add each item to a second list, I guess?
set(#list, "1234567891011121314151617181920", "Global")
clear list(%list a)
add list to list(%list a, $list from text(#list, $new line), "Delete", "Global")
loop(15) {
    add item to list(%list B, $next list item(%list a), "Delete", "Global")
}
denrich (Author), Posted January 25, 2014
I appreciate the help. I seem to have a problem with asking the right question. Simply put, I want to scrape only the words of 15 letters or fewer from a file. I tried it with '$Find Regular Expression' and a regex pattern:
set(#var000, $list from file("C:\\Users\\Home\\Desktop\\something.txt"), "Global")
set(#var000, $find regular expression("", "(^.\{1,15\}$)"), "Global")
I hope this helps more.
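One likely reason the attempt in this post returns nothing, besides the empty string passed as the text to search: in most regex engines `^` and `$` anchor to the whole text by default, so a multi-line file only matches if the entire file is 1 to 15 characters long. The Python sketch below illustrates that behaviour with made-up sample text. Python's `re` module is used purely for illustration, and it is an assumption that UBot's own engine behaves the same way.

```python
import re

# Hypothetical sample standing in for the contents of something.txt.
text = "cat\nthis line is far too long to keep\nhello world\n"

pattern = r"^.{1,15}$"

# Without MULTILINE, ^ and $ anchor to the start and end of the whole text,
# so nothing matches: the text as a whole is longer than 15 characters.
whole_text_matches = re.findall(pattern, text)

# With MULTILINE, ^ and $ also match at every line boundary,
# so each sufficiently short line is returned on its own.
per_line_matches = re.findall(pattern, text, flags=re.MULTILINE)

print(whole_text_matches)
print(per_line_matches)
```

This would explain why a pattern can look fine in an online tester that enables multi-line mode and still find nothing when run against a whole file.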
UBotDev, Posted January 25, 2014
You shouldn't call it scraping then; that is understood as getting data from a web page. However, I think this is what you are looking for:
add list to list(%MATCHES, $find regular expression($read file("C:\\Users\\Home\\Desktop\\something.txt"), "(?<=(\\n|^)).\{1,13\}(?=(\\n|$))"), "Delete", "Global")
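The lookaround idea in the reply above can be tried outside UBot. The Python sketch below assumes a made-up sample text and a limit of 15 characters (the names `text` and `MAX_LEN` are hypothetical). It keeps the same structure: a lookbehind for "start of text or a newline" and a lookahead for "a newline or end of text". Python's `re` only allows fixed-width lookbehinds, so the lookbehind is written as `(?<![^\n])` instead of `(?<=(\n|^))`. That is a substitution for illustration, not the pattern from the reply.

```python
import re

# Hypothetical sample standing in for the file contents.
text = "cat\nthis line is far too long to keep\nhello world\nexactly15chars!"
MAX_LEN = 15  # assumed limit; adjust as needed

# (?<![^\n])  : not preceded by a non-newline character,
#               i.e. at the start of the text or right after a newline
# .{1,MAX_LEN}: 1 to MAX_LEN characters, none of them a newline
# (?=\n|$)    : followed by a newline or the end of the text
pattern = re.compile(r"(?<![^\n]).{1,%d}(?=\n|$)" % MAX_LEN)
matches = pattern.findall(text)

# The same filter without regex, for comparison.
short_lines = [line for line in text.splitlines() if 1 <= len(line) <= MAX_LEN]

print(matches)
print(short_lines)
```

Both approaches keep only whole lines within the limit. The lookarounds are what stop the pattern from returning the first 15 characters of a longer line.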
denrich (Author), Posted January 25, 2014
Thanks for the help. You're right about the scraping thing, UBotDev, and that code was what I was looking for.