UBot Underground


Hey guys/gals,

 

I haven't been able to figure out how to scrape the search result URLs from Google correctly.

I keep getting URLs with "..." in them, because it's obviously not getting the whole URL.

Example: http://adsabs.cnn.com/abs/1987ApJ...321..280T

 

I am using this scrape code:

add list to list(%scrape url, $list from text($find regular expression($scrape attribute(<class="r">, "outerhtml"), "(?<=<h3 class=\"r\"><a href=\").*?(?=\" onm)"), $new line), "Delete", "Global")

Any ideas?

 

I appreciate any help I can get on this.

 

BTW, I know it's probably something small, but man, I'll tell you, sometimes the more you look at a problem, the harder it gets... LOL

 

Thanks again!
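
For readers following along outside UBot, here is a minimal Python sketch of what the lookbehind/lookahead pattern in the scrape code above is trying to do: match only the text between `<h3 class="r"><a href="` and `" onm`. The sample markup and the example.com/example.org URLs are made up for illustration, and the real Google markup may well differ, so treat this as a sketch of the regex idea rather than a working scraper.

```python
import re

# Hypothetical markup that imitates the structure the pattern expects.
# The URLs are placeholders, not real search results.
SAMPLE_HTML = (
    '<h3 class="r"><a href="https://example.com/a/very/long/path/page.html" '
    'onmousedown="return rwt(this)">Example result</a></h3>\n'
    '<h3 class="r"><a href="https://example.org/short" '
    'onmousedown="return rwt(this)">Another result</a></h3>'
)

# Same idea as the UBot regex: a lookbehind for the opening markup,
# a lazy match for the URL, and a lookahead for the text that follows it.
HREF_PATTERN = re.compile(r'(?<=<h3 class="r"><a href=").*?(?=" onm)')


def extract_hrefs(html):
    """Return every href value the lookaround pattern finds, in page order."""
    return HREF_PATTERN.findall(html)


if __name__ == "__main__":
    for url in extract_hrefs(SAMPLE_HTML):
        print(url)
```

If the page markup changes even slightly (a different class name, or no `onm...` attribute right after the `href`), the pattern silently matches nothing, which is one reason a regex like this is fragile.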


Put this into Google:

site:ubotstudio.com scraping google

I guarantee you'll find several answers on this forum. I know I have answered this a few times, and others have as well.

Google is a better search than the forum search, and now you know how to search well.

 

CD


Well then, try this.

Sites may change, so we shall adapt:

add list to list(%scrape url, $list from text($scrape attribute(<tagname="cite">, "innertext"), $new line), "Delete", "Global")
alert($scrape attribute(<tagname="cite">, "innertext"))



 



Haha... Yes, we are BORG. Resistance is futile! :D


Nope, same results. Keep getting this "..."

 

http://money.cnn.com/.../user/agg/.../tablet.h...

 

Thanks for trying

Must be something wrong on your end, man.

It works here, even in Python using XPath.

Download the "Large Data" plugin and I will post a script using XPath here in a bit.
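
A likely reason for the "..." problem reported above is that the `cite` element only holds the shortened address Google displays, while the full address lives in the link's `href` attribute. Here is a small Python sketch of that difference using an XPath-style query. The markup and the example.com URL are invented for illustration, and `xml.etree.ElementTree` is used only because this sample is well-formed; a real results page would probably need a proper HTML parser such as lxml.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed snippet imitating one search result.
# The cite text is the shortened form; the href is the full address.
SAMPLE = """
<div>
  <div class="g">
    <h3 class="r"><a href="https://example.com/section/sub/section/article.html">Example result</a></h3>
    <cite>https://example.com/.../article.html</cite>
  </div>
</div>
""".strip()


def displayed_urls(markup):
    """Text shown under each result (what scraping innertext of cite returns)."""
    root = ET.fromstring(markup)
    return [cite.text for cite in root.findall(".//cite")]


def full_urls(markup):
    """href of each result link, selected with the path //h3[@class='r']/a."""
    root = ET.fromstring(markup)
    return [a.get("href") for a in root.findall(".//h3[@class='r']/a")]


if __name__ == "__main__":
    print("displayed:", displayed_urls(SAMPLE))
    print("full:     ", full_urls(SAMPLE))
```

Scraping the visible text can never recover the part that was replaced by "...", so the fix is to read the attribute rather than the text.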


Here is the current working class to scrape:

clear list(%urls)
add list to list(%urls, $scrape attribute(<class="_Rm">, "innertext"), "Delete", "Global")

 

 

https://drive.google.com/file/d/0B0hSg60eXoShRHB0UG1yeUU0dkE/view?usp=sharing

This should work on any Google results page:

plugin command("Bigtable.dll", "Clear all large list")
ui text box("Search Term", #UI_search_term)
navigate("https://www.google.com/search?q={$replace regular expression(#UI_search_term, "\\s", "+")}&ie=utf-8&oe=utf-8", "Wait")
wait for element(<innertext="Help">, 10, "Appear")
plugin command("Bigtable.dll", "large List from Xpath", "text and urls", $document text, "//h3[@class=\'r\']/a/@href", "replace")
alert($plugin function("Bigtable.dll", "Large list return", "text and urls"))
plugin command("Bigtable.dll", "large List from Regex", "urls", $plugin function("Bigtable.dll", "Large list return", "text and urls"), "(?<=href=\").*?(?=\")", "replace")
alert($plugin function("Bigtable.dll", "Large list return", "urls"))
plugin command("Bigtable.dll", "large List from Regex", "h3 text", $plugin function("Bigtable.dll", "Large list return", "text and urls"), "(?<=>).*?(?=</a>)", "replace")
alert($plugin function("Bigtable.dll", "Large list return", "h3 text"))

If that doesn't work, I don't know what to tell ya.

The Large Data plugin is free.

 

CD
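
As a side note on the `navigate` line in the script above, here is a rough Python sketch of how the search URL is put together: every whitespace character in the search term is replaced with "+", and the result is dropped into the query string. The function name `build_search_url` is made up for this example. The standard library's `quote_plus` is shown next to it as an alternative that also escapes characters such as "&" and "+", which the plain whitespace replacement leaves alone.

```python
import re
from urllib.parse import quote_plus

# URL layout copied from the navigate command in the script above.
SEARCH_TEMPLATE = "https://www.google.com/search?q={query}&ie=utf-8&oe=utf-8"


def build_search_url(term):
    """Mirror the script: replace each whitespace character with '+'."""
    query = re.sub(r"\s", "+", term)
    return SEARCH_TEMPLATE.format(query=query)


def build_search_url_escaped(term):
    """Alternative: quote_plus also escapes reserved characters like & and +."""
    return SEARCH_TEMPLATE.format(query=quote_plus(term))


if __name__ == "__main__":
    print(build_search_url("ubot studio scraping"))
    print(build_search_url_escaped("C++ & regex"))
```

The plain replacement is fine for simple terms; for terms containing "&", "#" or "+", the escaped version avoids breaking the query string.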


giganut, thanks for that. However, I ran into the same issue: if there is a long URL, Google truncates it with "..."

Code Docta, F*@#in BINGO!

Awesome job! Works brilliantly!

Thank you!

