UBot Underground

Help me scrape URLs from Google



I am trying to:

- scrape vine.co/v/ URLs along with their descriptions

- scrape a selected number of URLs from Google search results and save them to a MySQL database

Here is my code so far:

ui text box("Enter Search", #src)
navigate("https://www.google.com/", "Wait")
wait(3)
type text(<name="q">, #src, "Standard")
click(<name="btnG">, "Left Click", "No")
wait(3)
add list to list(%urls, $scrape attribute(<(before="<a href="\" AND after="\" onmousedown=")>, %urls), "Delete", "Global")
save to file("", %urls)

What's next?


Try using a regex to scrape Google:

add list to list(%results, $find regular expression($scrape attribute(<class="r">, "innerhtml"), "(?<=href\\=\\\")http.*?(?=\\\")"), "Delete", "Global")
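
Unescaped, that pattern is (?<=href\=\")http.*?(?=\"): it matches text starting with http that sits between href=" and the next double quote. A small check against a hand-written snippet might look like the sketch below. The sample HTML and the names #sample and %test are made up for illustration; real Google markup differs and changes over time.

```ubot
comment("Hypothetical sample markup, not real Google output")
set(#sample, "<h3 class=\"r\"><a href=\"http://example.com/page\">Title</a></h3>", "Global")
clear list(%test)
comment("Same pattern as in the line above, applied to the sample string")
add list to list(%test, $find regular expression(#sample, "(?<=href\\=\\\")http.*?(?=\\\")"), "Delete", "Global")
comment("Show the first match; list positions start at 0")
alert($list item(%test, 0))
```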


How do I make the script go to the next page, repeat what it did on the first page, and so on, then save everything to results.txt?

For example, I want to choose how many pages to navigate, say 10, and then save whatever it found on them to results.txt.


ui drop down("Pages to navigate", "10,20,30,40,50,100,200,300,500,1000", #nav)
ui text box("Enter Search", #src)
navigate("https://www.google.com/", "Wait")
wait(3)
type text(<name="q">, #src, "Standard")
click(<name="btnG">, "Left Click", "No")
wait(3)
add list to list(%results, $find regular expression($scrape attribute(<class="r">, "innerhtml"), "(?<=href\\=\\\")http.*?(?=\\\")"), "Delete", "Global")
wait(3)
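
The script above stops after scraping the first results page. One possible continuation is sketched below. It assumes Google's "Next" link can be selected with <id="pnnext"> (an assumption, so check the page source first) and saves to results.txt on the Desktop, a path picked only for illustration.

```ubot
comment("Page 1 is already scraped above, so visit the remaining pages")
loop($subtract(#nav, 1)) {
    comment("Assumed selector for the Next link; adjust if Google's markup differs")
    click(<id="pnnext">, "Left Click", "No")
    wait(3)
    comment("Same scrape as on page 1; Delete drops duplicate URLs")
    add list to list(%results, $find regular expression($scrape attribute(<class="r">, "innerhtml"), "(?<=href\\=\\\")http.*?(?=\\\")"), "Delete", "Global")
}
comment("Write every collected URL to results.txt, one per line")
save to file("{$special folder("Desktop")}\\results.txt", %results)
```

If the Next link is missing on the last page the click will not find anything, so wrapping it in an $exists check may be worth adding.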
