Scrape URLs of Google Search Results

goat · December 26, 2009

Hello, I've been struggling to pull the URLs from a Google search request through the browser for keywords like "insurance", which bring up a ton of ads and things like maps. I have not been able to find a way to grab the URLs using iMacros because I cannot target the links properly, since the number of ads keeps changing or picture results appear. I was recommended to try UBot; however, I am skeptical that UBot has the capacity to do what I'm looking for. If you can convince me it can and provide some guidance on how to filter/target just the search result URLs, I will definitely give this program a shot.

tooltrainer · December 27, 2009

Should be perfectly doable I believe, though I haven't made a bot to do this personally. With UBot's ability to flexibly scrape by attribute and parse a larger subset of scraped data to extract just what you need, I would think this to be a fairly simple task.

Anyone else already done this?

Jonathan

Mike D · December 27, 2009

I'm new to UBot and I'm trying to do this as well, but I'm having a difficult time getting it to work. Any help would be appreciated.

Thanks

tooltrainer · December 27, 2009

What *exactly* happens when you try? Do you get specific errors or other failures in the script? The more detail you give, the better the chances that someone here can lend a hand...

Jonathan

crazyflx · December 27, 2009

@ Goat >> What specifically are you trying to extract? The URLs from the paid ads on the right-hand side of Google, or the actual search results returned to you from Google?

Walk us through EXACTLY what you're trying to do. For instance:

Go to google.com

Search for "insert keyword"

Scrape URLs from search results, or scrape URLs from paid ads, etc.

crazyflx · December 27, 2009

I re-read your description of what you'd like to know if uBot could do (that iMacros can't) and I threw something together for you. I'm uploading the video to YouTube now, and when it's finished, I'll post it here.

crazyflx · December 27, 2009

Here you go (best viewed at YouTube in full screen so you can read what's on the screen)

http://www.youtube.com/watch?v=imaJYFWJ0jU

goat · December 27, 2009

Crazyflx, that is SPOT on for what I'm trying to do; you are definitely a valuable asset to the UBot community. And thanks for the good video, which shows it skips the fluff.

Aaron Nimocks · December 27, 2009

Beat me to it

I was going to throw something together real quick, but it would be the same way Crazyflx did it. But yes, UBot is way better than iMacros in terms of functionality.

crazyflx · December 27, 2009

I'm trying to get up there in terms of being as helpful as you've been, Aaron (it's definitely going to take me a while, haha).

Mike D · December 28, 2009

I'm trying to scrape only Google's organic results and then check PageRank for the top 10 sites. The problem I'm running into is that when I scrape Google, some URLs are shortened.

Example:

http://www.techrescueme.com/image1.jpg

Here's what I'm getting: inventors.about.com/od/.../a/sewing_machine.htm. Without the correct URL, I can't get the correct PageRank.

Here's how I'm scraping:

http://www.techrescueme.com/image2.jpg

Any ideas on how I can get the full URLs? I've tried scraping using other attributes but I've had no luck so far. Any help deeply appreciated.

Thanks

PS sorry to hijack thread :>(

bluegoat · December 29, 2009

Try this instead:

Mike D · December 29, 2009

I didn't know you could change the $scrape by attribute to href. Learn something new every day.

The problem with this is that it will scrape every URL, and I only need the organic results. How can I limit the scrape to only the organic results?

Thanks for the help.
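
For readers without the screenshots, the href point above can be illustrated outside UBot. This is only a rough Python sketch, not the UBot script from the thread: the `LinkCollector` class, the `extract_links` helper, and the example.com markup are all made up for illustration. It shows that the visible link text can be truncated with "..." while the href attribute of the same anchor still holds the full URL.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently being read, if any
        self._text = []     # pieces of visible text inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, visible text) pairs found in the given HTML."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Hypothetical markup: the displayed text is shortened, the href is not.
SAMPLE = (
    '<a href="http://www.example.com/articles/history/sewing_machine.htm">'
    "www.example.com/.../sewing_machine.htm</a>"
)

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE):
        print("displayed:", text)
        print("href:     ", href)
```

Scraping the attribute rather than the displayed text is what makes the full URL available for a later PageRank lookup.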

slash30 · December 29, 2009

That should still work to gather only the organic results. The ads use "<A id=an1". Also, I just use $scrape instead of $scrape attrib. Not sure if that's any faster.
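
As a rough illustration of that filtering idea outside UBot, here is a small Python sketch. It assumes the ad anchors carry ids of the form an1, an2, ... as described in this post; that was Google's markup at the time of this thread and is very unlikely to match the page today. The `OrganicLinkParser` class, the `organic_urls` helper, and the sample HTML are hypothetical.

```python
import re
from html.parser import HTMLParser

# Assumption taken from the post above: ad links look like <A id=an1 ...>.
AD_ID = re.compile(r"an\d+")


class OrganicLinkParser(HTMLParser):
    """Collect the href of every <a> tag whose id does not look like an ad id."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A id=an1> matches.
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        anchor_id = attr.get("id") or ""
        if href and not AD_ID.fullmatch(anchor_id):
            self.urls.append(href)


def organic_urls(html):
    """Return the hrefs of all non-ad links in the given HTML."""
    parser = OrganicLinkParser()
    parser.feed(html)
    return parser.urls


# Hypothetical page fragment with two ads and two ordinary links.
SAMPLE = """
<A id=an1 href="http://ads.example.com/click?1">Ad one</A>
<a href="http://www.example.com/page-one">Result one</a>
<A id=an2 href="http://ads.example.com/click?2">Ad two</A>
<a href="http://www.example.org/page-two">Result two</a>
"""

if __name__ == "__main__":
    for url in organic_urls(SAMPLE):
        print(url)
```

On a real results page there would also be navigation and cache links to exclude, so the id check alone is only a starting point.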

Mike D · December 30, 2009

You are right! I had a typo. That's why I'm not a programmer for a living.

Thanks a bunch that saved me a ton of time.

Mike

evyta · January 5, 2010

I have a problem scraping Google blog search:

http://diamonds-jewelry.net/tesblogsearch.JPG

Any help?

slash30 · January 6, 2010

This is how I get the blogs off of Google's related search.

FeebleOne · August 30, 2012

So have I missed something here? Like, where is the code example for this?

Would be handy and appreciated

Grah

rotem · September 1, 2012

That's a thread from 2009, mate, for UBot 3.
