mdc101 (Posted December 12, 2011)

Hi guys,

I have scraped a list of data and was wondering whether it is possible to delete part of each sentence in a list. Here is an example:

What are the best project management tools for SEO? Search Engine Optimization (SEO) and Project Management
Is there any good cross-platform project management tool? Project Management
Is there any good project management tool based on Django? Project Management
Is there any project management tool like gitorious, redmine, trac, but using external tools for wiki and bugtracker? Project Management
What is the best multi-user, online project management tool?

Basically, in each sentence I want to remove the category that was scraped along with the question. You will see it after the "?". So for

What are the best project management tools for SEO? Search Engine Optimization (SEO) and Project Management

it should delete " Search Engine Optimization (SEO) and Project Management" and leave

What are the best project management tools for SEO?

Any suggestions will be appreciated.
JohnB (Posted December 12, 2011)

There are two possible ways to do this:

1) You can use regex, or
2) You can scrape the categories from the page, compare them against the list items, and use the replace command (replace the category portion with $nothing).

The simple regex solution would be:

clear list(%list)
clear list(%list2)
set(#list, "What are the best project management tools for SEO? Search Engine Optimization (SEO) and Project Management
Is there any good cross-platform project management tool? Project Management
Is there any good project management tool based on Django? Project Management
Is there any project management tool like gitorious, redmine, trac, but using external tools for wiki and bugtracker? Project Management
What is the best multi-user, online project management tool?", "Global")
add list to list(%list, $list from text(#list, $new line), "Delete", "Global")
set(#position, 0, "Global")
loop($list total(%list)) {
    set(#parseditem, $replace regular expression($list item(%list, #position), "\\?.*", "?"), "Global")
    add item to list(%list2, #parseditem, "Delete", "Global")
    increment(#position)
}

(You would just need the part starting with "set(#position...". The lines before it only build a sample list.)

John
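For anyone who wants to check the regex outside UBot Studio, the same idea can be sketched in Python. This is only an illustration of the pattern John uses, not part of his script: `raw_items` and `strip_category` are hypothetical names, and the sample data is a subset of the list from the first post.

```python
import re

# Sample items copied from the thread: a question followed by a scraped category.
raw_items = [
    "What are the best project management tools for SEO? Search Engine Optimization (SEO) and Project Management",
    "Is there any good cross-platform project management tool? Project Management",
    "What is the best multi-user, online project management tool?",
]


def strip_category(item: str) -> str:
    """Replace the first "?" and everything after it with a single "?"."""
    return re.sub(r"\?.*", "?", item, count=1)


# Equivalent of the loop that fills %list2 in the UBot script.
questions = [strip_category(item) for item in raw_items]

for question in questions:
    print(question)
```

An item with no "?" at all passes through unchanged, because the pattern never matches.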
mdc101 (Author, Posted December 12, 2011)

Thanks John, I appreciate the help.

Would we be able to use the same code above to split each string into two strings, and then write them into two columns in a CSV file? How would you go about doing this?

Example:
Column A: Question
Is there any good cross-platform project management tool?
Column B: Category
Project Management

Thanks,
Matt
JohnB (Posted December 12, 2011)

Yes, I'm sorry I got sidetracked. Here you go:

clear list(%list)
clear list(%list2)
clear list(%list3)
clear table(&lists)
set(#list, "What are the best project management tools for SEO? Search Engine Optimization (SEO) and Project Management
Is there any good cross-platform project management tool? Project Management
Is there any good project management tool based on Django? Project Management
Is there any project management tool like gitorious, redmine, trac, but using external tools for wiki and bugtracker? Project Management
What is the best multi-user, online project management tool?", "Global")
add list to list(%list, $list from text(#list, $new line), "Delete", "Global")
set(#position, 0, "Global")
loop($list total(%list)) {
    set(#parseditem, $replace regular expression($list item(%list, #position), "\\?.*", "?"), "Global")
    add item to list(%list2, #parseditem, "Delete", "Global")
    increment(#position)
}
set(#position, 0, "Global")
loop($list total(%list)) {
    set(#parseditem, $replace regular expression($list item(%list, #position), ".*\\?", $nothing), "Global")
    add item to list(%list3, #parseditem, "Don\'t Delete", "Global")
    increment(#position)
}
add list to table as column(&lists, 0, 0, %list2)
add list to table as column(&lists, 0, 1, %list3)
save to file("{$special folder("Desktop")}/lists.csv", &lists)

John

Keep in mind that the second loop saves the categories, which can contain duplicates, so remember to set the "add item to list" command in the second loop to not delete duplicates. Otherwise the two columns will no longer line up.
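The two-column split can also be sketched in Python for comparison. This is a rough equivalent under stated assumptions, not a translation of John's script: `split_item` and `to_csv` are hypothetical names, the header row is an addition based on Matt's "Question" and "Category" example, and the category is trimmed of the leading space that the UBot regex leaves in place. It splits at the first "?", so it would behave differently from the greedy `.*\\?` pattern if a category itself contained a "?".

```python
import csv
import io
import re


def split_item(item):
    """Split a scraped line into (question, category) at the first "?"."""
    match = re.match(r"(.*?\?)\s*(.*)", item)
    if match is None:
        # No "?" found: treat the whole line as the question, with no category.
        return item.strip(), ""
    return match.group(1), match.group(2).strip()


def to_csv(items):
    """Return CSV text with one question/category pair per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Question", "Category"])  # header row is an assumption
    for item in items:
        writer.writerow(split_item(item))
    return buffer.getvalue()


items = [
    "Is there any good cross-platform project management tool? Project Management",
    "What is the best multi-user, online project management tool?",
]
csv_text = to_csv(items)
print(csv_text)
```

Because both columns come from the same pass over each item, rows cannot drift out of alignment the way two separately de-duplicated lists can. The returned text can be written to a file such as `lists.csv`; the `csv` module quotes any field that contains a comma.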
mdc101 (Author, Posted December 12, 2011)

Thanks John,

Man, all I can say is that the deeper I dig and the more I learn about this tool, the better it gets. Thank you for your help and mentoring.

Regards,
Matt
JohnB (Posted December 12, 2011)

Never a problem!

John