Gnana Sambandam 1 Posted June 27, 2011 This is the page I want to scrape: http://www.alexa.com/site/linksin/exercise-equipment-review.com I want to scrape the domain URLs on the page, but I can't seem to scrape them, whichever method I try.
UBotBuddy 331 Posted June 27, 2011 I see 20 links on that page for the report. Which ones are you trying to scrape?
Gnana Sambandam 1 Posted June 27, 2011 Author Quoting UBotBuddy: "I see 20 links on that page for the report. Which one are you looking at trying to scrape?" I am trying to extract all 20 links on the page and save them to a text file.
JohnB 255 Posted June 27, 2011 OK, try this: Alexa Scraper.ubot John
Gnana Sambandam 1 Posted June 27, 2011 Author Quoting JohnB: "Ok, try this: Alexa Scraper.ubot" The bot you're showing only extracts the titles of the links. What I am looking for is the hyperlinks behind the titles that your example bot extracts.
k1lv9h 76 Posted June 27, 2011 Try this: alexa-links-001.ubot Kevin
JohnB 255 Posted June 27, 2011 Quoting Gnana Sambandam: "The ubot your showing only extracts title's of the links. What i am looking for is the hyperlink of the titles you have extracted using your example bot." That's probably because I forgot to choose href as the scraped attribute. Just change the attribute from outertext to href and it will scrape what you want. John
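Editor's note: the attached .ubot files are not readable as text, so the difference between scraping a link's outer text and scraping its href is sketched here in Python instead. This is only an illustration under stated assumptions, not the bot from the thread: the `LinkCollector` class, the `extract_links` and `save_hrefs` helpers, and the sample markup are all made up for this example and do not reflect Alexa's real HTML.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the visible text and the href of every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []      # list of (text, href) tuples
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # Only anchors count; an href on other tags (e.g. <link>) is ignored.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# Hypothetical markup, invented for this example.
SAMPLE_HTML = """
<html><head><link rel="icon" href="/favicon.ico"></head>
<body>
<a href="http://example.com/">Example Site</a>
<a href="http://example.org/page">Another Site</a>
<a name="top">No link here</a>
</body></html>
"""


def extract_links(html):
    """Return (text, href) pairs for all anchors that carry an href."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def save_hrefs(links, path):
    """Write one href per line to a text file."""
    with open(path, "w", encoding="utf-8") as f:
        for _text, href in links:
            f.write(href + "\n")


if __name__ == "__main__":
    for text, href in extract_links(SAMPLE_HTML):
        print(text, "->", href)
```

The first element of each pair corresponds to what a title-only scrape returns, and the second is the hyperlink the original poster asked for. Calling `save_hrefs(extract_links(html), "links.txt")` would write the hyperlinks to a text file, one per line.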
Gnana Sambandam 1 Posted June 27, 2011 Author Quoting k1lv9h: "Try this:alexa-links-001.ubot" It seems that one is not working either. The scrape result of the bot you posted is below (only one URL is returned): "http://www.alexa.com/favicon.ico"
JohnB 255 Posted June 27, 2011 Here... Alexa Scraper.ubot
Gnana Sambandam 1 Posted June 27, 2011 Author Quoting JohnB: "That's because I probably forgot to choose href as the scraped attribute. Just change that (outertext to href) and it will scrape what you want." I don't know why, but the "href" field is empty for the Alexa page. Any idea?
Gnana Sambandam 1 Posted June 27, 2011 Author Quoting JohnB: "here... Alexa Scraper.ubot" Now it's working. Thanks for the help, man.