Cnotey (posted December 9, 2012):
Hey everyone, Cnotey here. Luckily I haven't had to ask too many dumb noob questions since I got UBot, and I have not found a more supportive forum on the entire internet. However, I have one task that is giving me a huge headache: I am having trouble verifying emails. The emails I need to verify contain certain data that needs to be extracted, and the confirmation link then takes me to a captcha.
The emails look like this when converted into a table:
Column1: Subject Line
Column2: Email sent to (xxx@gmail.com)
Column3: Welcome!
Column4: Hi and welcome to our site. We need you to verify the link. http://www.randomlink.com/4309483948 Please let us know if you have any questions You can find your username below Username: randomUsername
Column5: random HTML
So I need to be able to:
1. Extract the email address in column #2
2. Extract the username from column #4
3. Click the verification link, which will then take me to a webpage where I need to enter a captcha to finalize verification.
Anyone have any suggestions on this? I don't even know if this is possible with UBot. I have read up a little bit on extracting info from tables created from emails, but the best I could do was extract a whole column, not specific values within the column.
Thanks in advance, Cnotey.
HelloInsomnia (posted December 9, 2012):
You should already have the email and username, right? But if you don't for some reason, you can try regex to get them. The first line is for the email and the second for the username:
\((.*)\)$
Username\:\s(.*)$
Then for the link, scrape it out using a wildcard, set it to a variable, then navigate to it.
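For anyone who wants to try these two patterns outside UBot first, here is a rough Python sketch. It only uses the sample text from the first post. The variable names, the `first_group` helper, and the `http://\S+` "wildcard" for the link are assumptions made for illustration, not something from the posts above, and UBot's own regex engine may behave differently in the details.

```python
import re

# Sample cells copied from the first post; the variable names are illustrative.
email_cell = "Email sent to (xxx@gmail.com)"
body_cell = (
    "Hi and welcome to our site. We need you to verify the link. "
    "http://www.randomlink.com/4309483948 Please let us know if you have any questions "
    "You can find your username below Username: randomUsername"
)


def first_group(pattern, text):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


# Everything between the parentheses at the end of the cell.
email = first_group(r"\((.*)\)$", email_cell)

# Everything after "Username: " up to the end of the cell.
username = first_group(r"Username\:\s(.*)$", body_cell)

# Assumed stand-in for the "wildcard" scrape: from http:// to the next whitespace.
link = first_group(r"(http://\S+)", body_cell)

print(email, username, link)
```

Both suggested patterns rely on the value sitting at the very end of the cell, which holds for this sample text but may not hold for a real email.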
LoWrIdErTJ - BotGuru (posted December 9, 2012):
1. set(#temp url, $find regular expression($table cell(&test, 0, 4), "([a-zA-Z0-9_-]\{2,\})@([a-zA-Z0-9_-]\{2,\})\\.([a-zA-Z0-9_-]\{2,\})"), "Global")
2. Username\:\s(.*)$
3. set(#temp url, $find regular expression($table cell(&test, 0, 4), "http://www.randomlink.com/[0-9]\{10,\}"), "Global")
Cnotey (posted December 9, 2012):
HelloInsomnia wrote: "You should already have the email and username right? But if you don't for some reason you can try regex to get it: The first line is for the email and the second for the username. \((.*)\)$ Username\:\s(.*)$ Then for the link scrape it out using a wildcard, set it to a variable then navigate it to."
Well, I tried grabbing the username before it was submitted, but the table ended up all screwed up. I figure it's easier just to grab it out of the verification email. You guys are awesome. I will plug this in and see if it works.
Cnotey (posted December 10, 2012):
Hrm, it doesn't seem to be working. Does it matter that I only have the Standard version of UBot? Here is what I have for code: http://i1199.photobucket.com/albums/aa477/Cnotey/ubotissue4.png
HelloInsomnia (posted December 10, 2012):
Oops, wrap all of that in a replace. Use it as the original text, set the search text to "Username: " (don't forget the space after the colon), and leave the replace text blank.
Cnotey (posted December 10, 2012):
Well, I guess I'm just a dumbass. Still not working. http://i1199.photobucket.com/albums/aa477/Cnotey/ubotissue5.png
HelloInsomnia (posted December 10, 2012):
When you did it the first way, were you getting "Username: randomusername"?
Cnotey (posted December 10, 2012):
HelloInsomnia wrote: "When you did it the first way were you getting Username: randomusername?"
No, I was getting (0) for the variable value. Thanks for the continued help, man. It makes sense that I am doing something wrong with the syntax somewhere.
Cnotey (posted December 12, 2012):
So I guess no one has an answer for me?
LoWrIdErTJ - BotGuru (posted December 12, 2012):
clear table(&test)
set table cell(&test, 0, 2, "Email sent to (xxx@gmail.com)")
set table cell(&test, 0, 4, "Hi and welcome to our site. We need you to verify the link. http://www.randomlink.com/4309483948 Please let us know if you have any questions You can find your username below Username: randomUsername")
set(#email address, $find regular expression($table cell(&test, 0, 2), "([a-zA-Z0-9_-]\{2,\})@([a-zA-Z0-9_-]\{2,\})\\.([a-zA-Z0-9_-]\{2,\})"), "Global")
set(#username, $replace($find regular expression($table cell(&test, 0, 4), "Username\\:\\s(.*)$"), "Username: ", $nothing), "Global")
set(#verification link, $find regular expression($table cell(&test, 0, 4), "http://www.randomlink.com/[0-9]\{10,\}"), "Global")
Then just use the variables as needed.
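The same three extractions can be checked outside UBot with a short Python sketch. This is a loose port under stated assumptions, not UBot code: the `table` dictionary keyed by (row, column), the `find_first` helper, and the constant names are all hypothetical stand-ins, and the patterns are written with single backslashes because Python raw strings do not need UBot's extra escaping.

```python
import re

# Hypothetical stand-in for the UBot table &test, keyed by (row, column).
table = {
    (0, 2): "Email sent to (xxx@gmail.com)",
    (0, 4): (
        "Hi and welcome to our site. We need you to verify the link. "
        "http://www.randomlink.com/4309483948 Please let us know if you have any questions "
        "You can find your username below Username: randomUsername"
    ),
}

# Patterns from the post, minus the UBot-specific escaping.
EMAIL_RE = re.compile(r"([a-zA-Z0-9_-]{2,})@([a-zA-Z0-9_-]{2,})\.([a-zA-Z0-9_-]{2,})")
USERNAME_RE = re.compile(r"Username:\s(.*)$")
# The dots are left unescaped here, exactly as in the post.
LINK_RE = re.compile(r"http://www.randomlink.com/[0-9]{10,}")


def find_first(pattern, text):
    """Return the full text of the first match, or an empty string if none."""
    match = pattern.search(text)
    return match.group(0) if match else ""


email_address = find_first(EMAIL_RE, table[(0, 2)])
# Mirror the $replace step: find "Username: xyz", then strip the label.
username = find_first(USERNAME_RE, table[(0, 4)]).replace("Username: ", "")
verification_link = find_first(LINK_RE, table[(0, 4)])

print(email_address, username, verification_link)
```

One thing worth knowing about the email pattern: its character classes do not include a dot, so an address with a dotted local part or a multi-part domain would only be partially matched.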
Cnotey (posted December 13, 2012):
Thanks for the help, all. I will give it a shot and see if it works.
AutomationNinja (posted December 13, 2012):
Any luck?
Cnotey (posted December 14, 2012):
AutomationNinja wrote: "any luck?"
Nope. Everyone keeps giving me the same answer too! Very frustrating. I swear I understand the directions correctly. Maybe the Standard version of UBot can't use regex?
LoWrIdErTJ - BotGuru (posted December 14, 2012):
In the search bar of Standard, type "find" to see if find regular expression shows up. If it does, then you have it, and I'm sure you do. I gave exact code that works in my last post.
As for "everyone keeps giving me the same answer, very frustrating": you keep getting the same answer because the code works. Why not post the exact table data, so I can help you further with pulling out the exact information you're getting? If you don't want to share it publicly, send me a private message.
Cnotey (posted December 15, 2012):
Woohoo, I solved the issue. I did a little research on regex, which was my first experience with it. I found that the correct regex to get what I needed was:
Username:\s(.*)?
not
Username\\:\\s(.*)$
However, I have no clue why the first regex above works. Once again, thank you so much for your help! I never would have figured this out on my own. I am also happy you guys didn't completely give me the correct answer and made me work it out myself.
Cnotey (posted December 15, 2012):
A little bit more info if another noob comes along (like me) who needs some help:
HelloInsomnia (posted December 15, 2012):
I think when you copy it out of UBot you get the double backslash like that. As for the $ vs. the ?: the $ indicates the end of the line, which may have been the problem, while the ? means 0 or 1 of something. Anyway, glad you got it solved.
VaultBoss (posted December 15, 2012):
The double backslash comes from the need to escape (using a backslash) certain reserved symbols used by regex itself. Whenever you need such a reserved symbol as an actual character in the string, you have to escape it, so the single backslash becomes a double backslash. For regex, that means you intended to really have a backslash character there, rather than instructing regex to do whatever the backslash would normally do.
When you input your regex using the Node View, UBot automatically escapes some characters within the underlying code. When you switch to Code View, you will see the double backslash, while in Node View it only shows a single backslash. However, if you change the regex string in Code View and remove a backslash there, you will get an error and it won't function anymore.
Cnotey wrote: "Username:\s(.*)? not Username\\:\\s(.*)$"
TJ gave you the code he copy/pasted from his Code View (hence the double backslash). However, you have only the Standard License of UBot, which means you cannot use the Code View, only the Node View, hence the single backslash that you had to use instead.
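The same two layers of escaping exist in most languages, so a small Python analogy may help. This is only an analogy and says nothing certain about how UBot stores patterns internally; the variable names are made up for the example. A raw string plays the role of what you type in Node View, and an ordinary string literal plays the role of what Code View shows.

```python
import re

sample = "Username: randomUsername"

# What you mean: the regex token \s (one whitespace character).
node_view_style = r"Username:\s(.*)"

# The same pattern written inside an ordinary string literal, where the
# backslash itself has to be escaped for the string layer.
code_view_style = "Username:\\s(.*)"

# After the string layer is processed, both are the identical pattern.
same_pattern = node_view_style == code_view_style

# If the doubled backslash reaches the regex engine unprocessed, it now means
# "a literal backslash followed by the letter s", which the sample lacks.
over_escaped = r"Username:\\s(.*)"

good = re.search(node_view_style, sample)
bad = re.search(over_escaped, sample)

print(same_pattern, good.group(1), bad)
```

That would be consistent with what happened in this thread: a pattern copied with doubled backslashes into a field that expects single ones silently stops matching.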
VaultBoss (posted December 15, 2012):
Cnotey wrote: "Username:\s(.*)? not Username\\:\\s(.*)$"
HelloInsomnia wrote: "I think when you copy it out of Ubot you get the double slash like that. As for the $ vs the ? the $ indicates the end of the line which may have been a problem the ? means 0 or 1 of something. Anyways, glad you got it solved."
So the regex string above basically instructs the regex engine to look for a piece of text that:
contains the literal text Username: that you specified (note that if the scraped page loses its capitalization or the colon, it won't work as you expect),
followed by a space, as you instructed with the \s symbol (note that \s matches any single whitespace character, such as a tab, not only a blank space),
followed by ANYTHING (the dot symbol . matches any single character except line break characters)
of ANY length (the * symbol repeats the previous item zero or more times in a greedy fashion, so as many characters as possible will be matched).
In regards to the ending alternatives:
The ? symbol makes the preceding item optional (so something may be there, if found, or nothing at all if it doesn't exist).
The $ symbol, on the other hand, matches at the end of the string the regex pattern is applied to (that means if several places in the text could match, only one that runs up to the end will be returned, while the rest will be dropped).
HTH.
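To see the difference between the two endings in practice, here is a minimal Python sketch. It assumes, purely for illustration, that the real email had more lines after the username (the thread never shows the actual data), and it uses Python's `re` module, whose behavior may differ in details from the engine UBot uses. The sample text and variable names are hypothetical.

```python
import re

# Hypothetical cell content: unlike the sample in the first post, the username
# is NOT the last thing in the text, because more lines follow it.
multi_line_cell = (
    "You can find your username below\n"
    "Username: randomUsername\n"
    "Thanks,\n"
    "The Team"
)

# With $, the match must run to the end of the whole text, but the dot cannot
# cross a line break, so nothing matches.
anchored = re.search(r"Username:\s(.*)$", multi_line_cell)

# Without $, the greedy .* simply stops at the end of the username line.
unanchored = re.search(r"Username:\s(.*)?", multi_line_cell)

# Alternatively, MULTILINE lets $ match at the end of every line.
per_line = re.search(r"Username:\s(.*)$", multi_line_cell, re.MULTILINE)

print(anchored, unanchored.group(1), per_line.group(1))
```

Under that assumption, the `$` version returns no match while the version without it returns the username, which would explain why dropping `$` fixed things here.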
AutomationNinja (posted December 19, 2012):
Yeah, I looked up regex and it looked really complex. That is something I'll have to get back to.