UBot Underground

Google Scraping, How To Tell Ubot To Loop As Many Times As There Are Google Results Pages?



Guys, I am stuck on this and haven't been able to figure out how it's done.

How do you tell UBot to keep looping for as long as "Next" appears at the bottom of the Google results pages?

 

This is what I mean:

http://i63.tinypic.com/4vljzm.jpg


This still comes down to using defines, as in your previous thread. Here's an example. Put these commands in one tab so the script isn't one big pile of code:

define searchGoogle(#query) {
    navigate("www.google.com","Wait")
    type text(<name="q">,#query,"Standard")
    click(<(tagname="button" AND value="Search")>,"Left Click","No")
    wait for browser event("Everything Loaded","")
    scrapeGoogle()
}
define scrapeGoogle {
    add list to list(%titles,$scrape attribute(<class="r">,"innertext"),"Delete","Global")
    if($exists(<(tagname="span" AND innertext="Next")>)) {
        then {
            click(<(tagname="span" AND innertext="Next")>,"Left Click","No")
            wait for browser event("Everything Loaded","")
            scrapeGoogle()
        }
        else {
        }
    }
}

Then put this on another tab and click Run on that tab. I'll explain the code below, but this single command scrapes all the Google results. This tab is your execution tab; the other tabs hold your defines.

searchGoogle("site:http://network.ubotstudio.com/forum/ http post")

The searchGoogle command goes to google.com, searches for the query, and then calls the scrapeGoogle command. The scrapeGoogle command simply scrapes whatever results are on the page when it is called, then checks whether the Next button exists to decide its next action.

 

If the Next button exists, it clicks it and then calls scrapeGoogle again, that is, it calls itself (like the goto you were asking about). It keeps clicking Next and calling itself to scrape each new page until the Next button is no longer there.

 

Let's say you wanted to read a file of search queries and scrape the results of each query. All you would need is a loop in your execution tab that calls searchGoogle on the next list item. Nothing more is needed, because the searchGoogle and scrapeGoogle commands control the flow for each query. There's no confusing pile of nested loops; you only have to write something simple that feeds the query into the searchGoogle command.
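As a rough sketch of that execution tab, something like the following should work. The %queries list name and the file path are placeholders I made up for illustration, so swap in your own; it assumes the searchGoogle and scrapeGoogle defines from the other tab are present.

```ubot
comment("Hypothetical list name and file path; replace with your own.")
clear list(%queries)
add list to list(%queries,$list from file("C:\\queries.txt"),"Delete","Global")
comment("One pass per query; searchGoogle handles the paging itself.")
loop($list total(%queries)) {
    searchGoogle($next list item(%queries))
}
```

Every query's results end up in the same %titles list, since scrapeGoogle adds to it each time it runs.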

 

OR the bad "easy" way: have a loop that goes through each item, and inside it a loop while that keeps checking whether the Next button exists; once it no longer does, you drop out of the inner loop and continue with the outer one. I'm not trying to be funny, those are literally the instructions you'd be giving that script. It does the same thing as my code above, but it is needlessly complicated and messy, and I promise it takes longer to write. In two months' time, when you've forgotten how it works and go to fix or upgrade it, you'll slap yourself too.
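For comparison, here is roughly what that nested-loop version might look like. This is only my guess at how it would be laid out, reusing the selectors from the defines above and assuming a hypothetical %queries list that is already filled.

```ubot
comment("Outer loop: one pass per query in the hypothetical %queries list.")
loop($list total(%queries)) {
    navigate("www.google.com","Wait")
    type text(<name="q">,$next list item(%queries),"Standard")
    click(<(tagname="button" AND value="Search")>,"Left Click","No")
    wait for browser event("Everything Loaded","")
    add list to list(%titles,$scrape attribute(<class="r">,"innertext"),"Delete","Global")
    comment("Inner loop: keep paging while the Next button is present.")
    loop while($exists(<(tagname="span" AND innertext="Next")>)) {
        click(<(tagname="span" AND innertext="Next")>,"Left Click","No")
        wait for browser event("Everything Loaded","")
        add list to list(%titles,$scrape attribute(<class="r">,"innertext"),"Delete","Global")
    }
}
```

Notice that the scrape line has to appear twice, once for the first page and once inside the inner loop, and the search steps are tangled up with the paging steps.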

 

So I hope your problem was controlling the flow of your bot, and not simply that you didn't know about the "exists" function. If it was the latter, I'm sure this will still help someone else out. There are ways to work without the exists function, though, so I assume it is a flow issue. Planning a script's flow without defines (or literal gotos) is actually MORE difficult than organizing it with defines. I hope this makes sense.
