UBot Underground

Google Serp Unencoded Hrefs



I'm pretty new to UBot Studio and am attempting to scrape Google SERPs using the following code (I'm sure this is familiar to most of you):

 

add list to list(%scraped urls,$scrape attribute($element child(<class="r">),"href"),"Delete","Global")

 

Typically, the href in each class="r" result was the URL of that SERP result, but with some user agent / IP combinations the hrefs on the SERP page are instead a Google redirect string with the target URL embedded in it, largely unencoded:

 

Old: http://www.kohls.com/catalog/barbie-dolls-doll-houses-toys.jsp

 

New: http://www.google.com/url?url=http://www.kohls.com/catalog/barbie-dolls-doll-houses-toys.jsp%3FCN%3D4294732270%2B4294719501%2B4294719592&rct=j&frm=1&q=&esrc=s&sa=U&ei=QyGHVNOHDs7hoAT_14C4Aw&ved=0CFYQFjAL&usg=AFQjCNGb8RP1e5sUaZxBWxCwZ9aSQd8WSg
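For anyone who only needs the destination and not the click itself, the target can be recovered from the `url` parameter of the redirect. A minimal sketch in Python (not UBot script), assuming the redirect always has the `google.com/url?url=...` shape shown above; the function name `unwrap_google_redirect` is made up for illustration:

```python
from urllib.parse import urlsplit, parse_qs


def unwrap_google_redirect(href):
    """Return the target embedded in a google.com/url redirect, else href unchanged."""
    parts = urlsplit(href)
    host = parts.hostname or ""
    is_google = host == "google.com" or host.endswith(".google.com")
    if is_google and parts.path == "/url":
        # parse_qs percent-decodes values, so %3F, %3D and %2B become ?, = and +
        params = parse_qs(parts.query)
        if params.get("url"):
            return params["url"][0]
    return href
```

Plain result URLs pass through untouched, so the same function could be run over a mixed list of scraped hrefs.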

 

If I click this new link from within UBot's browser, the redirect does not occur. I have tried clicking it both with commands and manually from within the UBot browser. If I paste the same link into a browser outside of UBot, the redirect works fine. So I'm assuming the issue is that the UBot browser does not handle this type of unencoded link the way other modern browsers do, but perhaps I'm wrong?

 

Anyone else seeing this?

 

Thanks


But I don't want to just navigate to that URL or record the position; I'd like Google to see it as a click on the SERP page. So I need to be able to click that link and have it take me to the page.

 

Cool regex site you linked to, btw. That will come in handy.



You said that if you paste it into a browser outside of UBot it works fine. I would grab that browser's user agent from here: http://whatsmyuseragent.com/ and use it in UBot. Then you can try others to determine which ones work with the redirect, add them all to a list, and randomly set one with each search.
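The "pick one at random per search" idea can be sketched like this, in Python rather than UBot script. The user agent strings are placeholders, not real or tested values, and `pick_user_agent` is a hypothetical helper name:

```python
import random

# Placeholder strings only; replace with user agents confirmed to work.
working_user_agents = [
    "ExampleAgent/1.0 (placeholder A)",
    "ExampleAgent/2.0 (placeholder B)",
]


def pick_user_agent(agents):
    """Pick one user agent at random for the next search."""
    if not agents:
        raise ValueError("no working user agents collected yet")
    return random.choice(agents)
```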


Thanks HelloInsomnia, good idea.

 

I guess I will have to weed out the user agents in my list that return the Google redirect URL instead of the real URL of the site. I assume the UBot browser's limitations will prevent me from using SERPs with the unencoded redirect URL at all.
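One way to do that weeding is to scrape a test SERP with each user agent and keep only the agents whose hrefs never point at `google.com/url`. A rough Python sketch (not UBot script); `hrefs_by_agent` and both function names are hypothetical, and it assumes the redirect shape from the first post:

```python
from urllib.parse import urlsplit


def is_google_redirect(href):
    """True if href looks like a google.com/url redirect rather than a direct link."""
    parts = urlsplit(href)
    host = parts.hostname or ""
    is_google = host == "google.com" or host.endswith(".google.com")
    return is_google and parts.path == "/url"


def agents_with_direct_links(hrefs_by_agent):
    """Keep agents whose scraped hrefs are all direct links.

    hrefs_by_agent maps a user agent string to the list of hrefs scraped with it.
    Agents with no scraped hrefs are dropped, since nothing was verified for them.
    """
    return [
        agent
        for agent, hrefs in hrefs_by_agent.items()
        if hrefs and not any(is_google_redirect(h) for h in hrefs)
    ]
```

Since the behaviour depends on the user agent / IP combination, an agent that passes on one IP may still need re-checking on another.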

