Rob JH, posted August 15, 2013:
Hi guys, how do I go about scraping just the values from the div below? I want to end up with:
peterjeffry86
jeffrypeter456
pjeffry27
Obviously the suggestions are going to change, so I think I need a regex, though I just can't figure out how to get going with regex in UBot.
<div id="username-suggestions" class="username-suggestions" style="display: block;">Available: <a href="">peterjeffry86</a><a href="">jeffrypeter456</a><a href="">pjeffry27</a></div>
Any help appreciated.
LoWrIdErTJ - BotGuru, posted August 15, 2013:
Are the href="" really href="", or do they have links? Are there other links in the result?
LoWrIdErTJ - BotGuru, posted August 15, 2013:
Three ways you can do it are shown below.
load html("<div id=\"username-suggestions\" class=\"username-suggestions\" style=\"display: block;\">Available: <a href=\"\">peterjeffry86</a><a href=\"\">jeffrypeter456</a><a href=\"\">pjeffry27</a></div>")
clear list(%names)
add list to list(%names, $list from text($page scrape("<a href=\"\">", "</a>"), $new line), "Delete", "Global")
comment("OR")
clear list(%names)
add list to list(%names, $list from text($find regular expression($document text, "(?=<a href=\"\">).*?(?=</a>)"), $new line), "Delete", "Global")
comment("OR")
set(#var, $scrape attribute(<id="username-suggestions">, "innerhtml"), "Global")
clear list(%names)
add list to list(%names, $list from text($find regular expression(#var, "(?=<a href=\"\">).*?(?=</a>)"), $new line), "Delete", "Global")
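For anyone following along outside UBot, the first approach (take everything between `<a href="">` and `</a>`) can be sketched in Python. This is only an illustration of the idea: `scrape_between` is a made-up helper, not a UBot function, and it only approximates what `$page scrape` does.

```python
def scrape_between(text, start, end):
    """Return every substring of text that sits between a start and an end marker."""
    results = []
    pos = 0
    while True:
        begin = text.find(start, pos)
        if begin == -1:
            break  # no more start markers
        begin += len(start)  # skip past the marker itself
        finish = text.find(end, begin)
        if finish == -1:
            break  # start marker without a matching end marker
        results.append(text[begin:finish])
        pos = finish + len(end)
    return results


# The exact markup from the original question.
html = ('<div id="username-suggestions" class="username-suggestions" '
        'style="display: block;">Available: '
        '<a href="">peterjeffry86</a><a href="">jeffrypeter456</a>'
        '<a href="">pjeffry27</a></div>')

names = scrape_between(html, '<a href="">', '</a>')
print(names)  # ['peterjeffry86', 'jeffrypeter456', 'pjeffry27']
```

Because the markers are matched literally, this only works while the page has no other links with an empty href.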
positivity13, posted August 15, 2013:
Take the time to learn regex, it's the nuts once it clicks. It took me a little while, but I'm so glad I took the time. I use it to scrape most things now.
TJ, doesn't the last bit of the regex need the < sign? I.e. (?<=</a>)
LoWrIdErTJ - BotGuru, posted August 15, 2013:
Nope, the last bit should work as is.
positivity13, posted August 15, 2013:
Apologies, my bad.
Rob JH, posted August 15, 2013:
Great, thanks BotGuru for the in-depth answer. I do know regex in general, I just could not get it going in UBot. I'll read up on the docs when I get the time, as it is something I need to do; I have only been using UBot a couple of weeks.
And in answer to your question: yes, the hrefs are like that, that's the exact code from the page. The reasoning behind the scrape is to get the page to do the hard work of creating a username, as I think that leaves the smallest footprint when signing up. Putting a username together yourself from #firstname, #lastname, #dob etc. would need so many variations to hide the footprint.
Rob JH, posted August 20, 2013:
I found that the bottom two suggestions you gave were also returning the href in front of each name in the list. What worked in the end was based on your first suggestion. I realised I could just get away with:
clear list(%SuggestedUsernames)
add list to list(%SuggestedUsernames, $list from text($page scrape("<a href=\"\">", "</a>"), $new line), "Delete", "Global")
as the page did not have any other empty hrefs. I could still do with knowing why the regex was bringing back the href before the name and not just the username, if possible please, for future reference when I need regex.
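On the question of why the regex returned the href as well: the pattern opens with `(?=<a href="">)`, which is a lookahead. A lookahead only checks that the tag comes next and does not consume it, so the match starts at the tag and `.*?` then includes it. A lookbehind, `(?<=<a href="">)`, starts the match after the tag instead. The closing `(?=</a>)` was already correct. The Python sketch below shows the difference on the markup from the first post. It assumes UBot's regex engine treats lookarounds the way Python's `re` module does, which has not been checked in UBot itself.

```python
import re

# The exact markup from the original question.
html = ('<div id="username-suggestions" class="username-suggestions" '
        'style="display: block;">Available: '
        '<a href="">peterjeffry86</a><a href="">jeffrypeter456</a>'
        '<a href="">pjeffry27</a></div>')

# Pattern as posted: a lookahead at the front, so each match begins AT the tag.
original = re.findall(r'(?=<a href="">).*?(?=</a>)', html)

# Lookbehind at the front: each match begins just AFTER the tag.
fixed = re.findall(r'(?<=<a href="">).*?(?=</a>)', html)

print(original)  # ['<a href="">peterjeffry86', '<a href="">jeffrypeter456', '<a href="">pjeffry27']
print(fixed)     # ['peterjeffry86', 'jeffrypeter456', 'pjeffry27']
```

If the same holds in UBot, the pattern passed to `$find regular expression` would become `(?<=<a href=\"\">).*?(?=</a>)`.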
solstudioim, posted December 3, 2014:
This thread is really damn helpful. Thanks everyone!