Sebastian Rooks, posted February 10, 2016

Hi, this may actually drive me insane. I've been working on what should have been a simple project for days now, and I've come to a sticking point I'd really appreciate some help getting through.

I have a list populated with entries like this:

Penelope Retweeted I'm yer Freckleberry @PicklesnPickles 46 mins46 minutes ago He adores my freckles. I think I'll let him count them. 22 retweets 32 likes Reply Retweet 22 Like 32 More

This is my first project featuring regex. So far, I've been able to select the numbers of retweets and likes from the fourth line from the bottom and add them to a table. I'm pretty happy about that. But I can't figure out how to get the tweet itself. I need something like: everything after the first line containing the "@", but before the fourth line from the bottom. In some way or another.

This worked in testing on EditPadPro, but I can't get anything to select an item after a line break in UBot.

(?<=@.{1,100})(\s\n.*\s\n?.*)(?=\s\n\d{0,6}\s+retweets?\s+\d{0,}\s+likes?\s+\nReply\s+Retweet?\s+\d{0,}\w{0,1})

I've tried shortening it down and cutting everything out of it, but I just can't get it to cross a \n in UBot. My head hurts and I believe that my brains might be melting, so please help a brother out here.

Thanks guys.
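One common reason a pattern will not cross a line break is that "." does not match a newline by default. That holds in Python and in .NET-style engines alike (in .NET the Singleline option, or an inline (?s), changes it; whether UBot's regex functions accept that is not confirmed in this thread). A minimal Python sketch of the behaviour, with the sample's line breaks assumed because the post's formatting was flattened:

```python
import re

# Assumed layout of one scraped entry; the real line breaks may differ.
sample = "\n".join([
    "Penelope Retweeted",
    "I'm yer Freckleberry @PicklesnPickles 46 mins46 minutes ago",
    "He adores my freckles. I think I'll let him count them.",
    "22 retweets 32 likes",
    "Reply Retweet 22",
    "Like 32",
    "More",
])

# By default "." stops at a newline, so this match stays on the "@" line.
same_line = re.search(r"@.*", sample).group(0)

# With DOTALL, "." also matches newlines and the match runs to the end.
across_lines = re.search(r"@.*", sample, re.DOTALL).group(0)

# [\s\S] matches any character, newlines included, with no flag needed.
no_flag = re.search(r"@[\s\S]*", sample).group(0)

print(same_line)
print(across_lines == no_flag)
```

The [\s\S] form is handy when the tool gives no way to set regex options, since it is ordinary pattern syntax.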
pftg4, posted February 10, 2016

(?<=ago)[\w\s\'\"\.\-\,\;\:\&\!\?]+(?=\.)
Sebastian Rooks, posted February 10, 2016

Thank you SOOO much! This is a big advancement in the right direction. That got me something, which is a lot more than I had before, which was nothing.

I just need to figure out how to modify the beginning and the end. The first line doesn't always contain "ago"; the only real constant I see is that it contains an @ symbol somewhere towards the middle of that line. The tweet also doesn't always end with ".", but it is always followed by (?=\[d*\s]retweets?) or what I imagine looks something like that (but probably doesn't).

One weird thing though: your code doesn't catch anything for me in the UBot regex editor, but when I ran it in my bot it worked perfectly. It's going to be hard to learn this if I can't trust the editor. When I got something to work in EditPadPro, it didn't work in my bot. Is this actually a thing that happens, or am I doing something wrong?

Thanks again!
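The two anchors described here (start after the line containing "@", stop before the line with the retweet count) can be sketched in Python. This is an illustration, not UBot code: Python's re module rejects variable-length lookbehinds such as (?<=@.{1,100}), so the sketch uses a capturing group instead. The sample layout and the name extract_body are assumptions made for the example.

```python
import re

# Assumed layout of one scraped entry; the real line breaks may differ.
sample = "\n".join([
    "Penelope Retweeted",
    "I'm yer Freckleberry @PicklesnPickles 46 mins46 minutes ago",
    "He adores my freckles. I think I'll let him count them.",
    "22 retweets 32 likes",
    "Reply Retweet 22",
    "Like 32",
    "More",
])

# Skip the rest of the first line containing "@", then lazily capture
# everything up to the line that starts with "<number> retweet(s)".
TWEET_BODY = re.compile(r"@[^\n]*\n(?P<body>[\s\S]*?)\n(?=\d+\s+retweets?\b)")


def extract_body(entry):
    """Return the tweet text, or None when the anchors are not found."""
    match = TWEET_BODY.search(entry)
    return match.group("body") if match else None


print(extract_body(sample))
```

Because the capture is lazy and uses [\s\S], it also copes with tweets that span more than one line, as long as the count line follows.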
Code Docta (Nick C.), posted February 10, 2016 (marked as solution)

Hi,

By replace regex:

comment("Keep in mind that this regex also runs over the tweet text itself, so if words such as Like, More, or retweets appear in the tweet, they will be removed too.")
set(#tweat,"Penelope Retweeted I\'m yer Freckleberry @PicklesnPickles 46 mins46 minutes ago He adores my freckles. I think I\'ll let him count them. 22 retweets 32 likes Reply Retweet 22 Like 32 More","Global")
set(#clean regex,$replace regular expression(#tweat,".*Retweeted|Reply.*|Like.*|More|.*retweets.*",$nothing),"Global")
alert(#clean regex)

By remove from list:

clear list(%clean)
set(#tweat,"Penelope Retweeted I\'m yer Freckleberry @PicklesnPickles 46 mins46 minutes ago He adores my freckles. I think I\'ll let him count them. 22 retweets 32 likes Reply Retweet 22 Like 32 More","Global")
add list to list(%clean,$list from text(#tweat,$new line),"Delete","Global")
comment("remove last 4 items")
loop(4) {
    remove from list(%clean,$subtract($list total(%clean),1))
}
comment("remove first")
remove from list(%clean,0)
alert(%clean)
stop script
set(#new tweet,$replace regular expression(#tweat,".*Retweeted|Reply.*|Like.*|More",$nothing),"Global")
alert(#new tweet)

Same as above in a function:

set(#tweat,"Penelope Retweeted I\'m yer Freckleberry @PicklesnPickles 46 mins46 minutes ago He adores my freckles. I think I\'ll let him count them. 22 retweets 32 likes Reply Retweet 22 Like 32 More","Global")
alert($clean retweet(#tweat))
define $clean retweet(#TWEET TO CLEAN) {
    clear list(%clean)
    comment("split the parameter, not the global variable, so the function works on any input")
    add list to list(%clean,$list from text(#TWEET TO CLEAN,$new line),"Delete","Global")
    comment("remove last 4 items")
    loop(4) {
        remove from list(%clean,$subtract($list total(%clean),1))
    }
    comment("remove first")
    remove from list(%clean,0)
    return(%clean)
}

In IronPython:

set(#tweat,"Penelope Retweeted I\'m yer Freckleberry @PicklesnPickles 46 mins46 minutes ago He adores my freckles. I think I\'ll let him count them. 
22 retweets 32 likes Reply Retweet 22 Like 32 More","Global")
alert($run python with result("tweet = \'\'\'{#tweat}\'\'\'
split_tweet = tweet.split(\'\\n\')
clean = split_tweet[1:-4]
joined = \'\\n\'.join(clean)
joined"))

Take your pick. The two list approaches seem universally slow when I ran them more than once. I will report this in the tracker; they should generally run as fast as the Python script.

Hope this helps,
CD

parse tweet-example-bug.ubot
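For anyone who wants to try the split-and-slice idea outside UBot, the same logic can be written as a standalone Python function. This is a sketch under assumptions: the sample's line breaks are guessed from the regex alternatives above, and the header_lines and footer_lines parameters are hypothetical names added for the example.

```python
# Assumed layout of one scraped entry; the real line breaks may differ.
sample = "\n".join([
    "Penelope Retweeted",
    "I'm yer Freckleberry @PicklesnPickles 46 mins46 minutes ago",
    "He adores my freckles. I think I'll let him count them.",
    "22 retweets 32 likes",
    "Reply Retweet 22",
    "Like 32",
    "More",
])


def clean_retweet(entry, header_lines=1, footer_lines=4):
    """Drop a fixed number of lines from the top and bottom of an entry."""
    lines = entry.splitlines()
    # Compute the end index explicitly so footer_lines=0 keeps every line
    # (a plain lines[1:-0] slice would return an empty list).
    kept = lines[header_lines:len(lines) - footer_lines]
    return "\n".join(kept)


print(clean_retweet(sample))
```

With the defaults this mirrors split_tweet[1:-4]: the "Retweeted" line and the four footer lines go, and the author line plus the tweet text remain.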
Code Docta (Nick C.), posted February 10, 2016

Also, I use this to play with regex: http://regexhero.net/tester/
It is .NET style, which is the same as UBot.
Code Docta (Nick C.), posted February 10, 2016

Tracker issue; if you can confirm this, please +1 it: http://tracker.ubotstudio.com/issues/966
Sebastian Rooks, posted February 10, 2016

I honestly don't know how I could thank you enough for this. This has been sucking the life and productivity out of me for days now. You gave me options that I'm going to save and study, and surely apply to different things at different times in the future. I promise that when I get to the point that I've got more answers than questions, I'll take time to help people out too.

I went with the remove from list option, and I made a couple of changes to meet my exact needs. Removing the last 4 lines is a consistent win, but the first two lines can vary. I changed how they're handled, and it seems to work every time. It also gives me additional lists of data for my table, such as whether the tweet was a retweet and, if so, who it was retweeted from. None of which would have been possible without your help.

clear list(%clean)
clear list(%retweet or not)
set(#tweat,$next list item(%scrapedtweets),"Global")
add list to list(%clean,$list from text(#tweat,$new line),"Delete","Global")
comment("remove last 4 items")
loop(4) {
    remove from list(%clean,$subtract($list total(%clean),1))
}
comment("remove first")
if($contains($list item(%clean,0),"Retweeted")) {
    then {
        add item to list(%retweeted from,$find regular expression($list item(%clean,1),"@\\w*"),"Don\'t Delete","Global")
        add item to list(%retweet or not,"retweet","Don\'t Delete","Global")
        remove from list(%clean,0)
        remove from list(%clean,0)
    }
    else {
        add item to list(%retweet or not,$nothing,"Don\'t Delete","Global")
        add item to list(%retweeted from,$nothing,"Don\'t Delete","Global")
    }
}
set(#new tweet,%clean,"Global")
add item to list(%final tweet,#new tweet,"Don\'t Delete","Global")
alert(#new tweet)

"also I use this to play with regex http://regexhero.net/tester/ it is .Net style which is same as ubot"

Thank you, that is tremendously helpful. I read that there are different dialects of regex, but I didn't know which was which.

"tracker issue if you can confirm this +1 it please http://tracker.ubotstudio.com/issues/966"

I can absolutely confirm this. It consistently takes just over 30 seconds to get through this loop, which is going to be a bit problematic with runs through hundreds of list items. I tried to confirm in the tracker, but it won't let me log in, won't let me register, and won't recognize my info to send me a new password. I don't know what that's about.

Thank you, Thank you, and Thank you again.
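The retweet handling in the script above can also be expressed as a small Python function, which may be easier to test against many entries at once. This is a sketch only: parse_entry and its dictionary keys are hypothetical names, the sample layout is assumed, and, like the UBot version, it leaves the first lines in place when the entry is not a retweet.

```python
import re

FOOTER_LINES = 4  # the count line plus the three action lines, per the thread

# Assumed layout of one scraped entry; the real line breaks may differ.
sample = "\n".join([
    "Penelope Retweeted",
    "I'm yer Freckleberry @PicklesnPickles 46 mins46 minutes ago",
    "He adores my freckles. I think I'll let him count them.",
    "22 retweets 32 likes",
    "Reply Retweet 22",
    "Like 32",
    "More",
])


def parse_entry(entry):
    """Split one scraped entry into tweet text and retweet metadata."""
    lines = entry.splitlines()
    lines = lines[:len(lines) - FOOTER_LINES]
    is_retweet = bool(lines) and "Retweeted" in lines[0]
    retweeted_from = None
    if is_retweet:
        if len(lines) > 1:
            handle = re.search(r"@\w*", lines[1])
            retweeted_from = handle.group(0) if handle else None
        lines = lines[2:]  # drop the "X Retweeted" line and the author line
    return {
        "tweet": "\n".join(lines),
        "is_retweet": is_retweet,
        "retweeted_from": retweeted_from,
    }


print(parse_entry(sample))
```

Returning one record per entry keeps the tweet, the retweet flag, and the source handle aligned, which the three separate lists in the UBot script have to do by position.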
Code Docta (Nick C.), posted February 10, 2016

Awesome! Can never have enough helpers around here. Ask support and they will set you up in the tracker.