pokerdawg - Posted January 6, 2017

Good morning - I'm trying to spider my own site to get a list of all internal links. For each link I want both the href and the innertext.

So far I have this:

    navigate("http://www.xxxxxxxxx.com/locations/this-page/","Wait")
    add list to list(%my innertext list,$scrape attribute(<href=w"/locations/*">,"innertext"),"Delete","Global")
    alert(%my innertext list)
    add list to list(%my href list,$scrape attribute(<href=w"/locations/*">,"href"),"Delete","Global")
    alert(%my href list)

This does what I expect: it gives me one list of innertext and another list of href. What I want is more of a table / array, where each "row" has both an href and an innertext.

I'm not sure whether I should:

1. Loop over these separately and add them to a table (array) type structure, or
2. Scrape the outerhtml instead, then loop over the outerhtml, which is a list like this:

    <a href="/locations/location-one">This is the first text</a>
    <a href="/locations/location-two">This is the second text</a>
    <a href="/locations/location-three">This is the third text</a>
    ...

and then try to re-parse them individually.

And if I should re-parse them individually, how would I do that, since I can't (as far as I know) $scrape attribute of an item on a list?

Thanks!
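
For illustration only, here is a minimal sketch of option 2 (re-parsing each outerhtml string into an href and its text). It is written in Python rather than UBot script, purely to show the logic; the names `AnchorCollector` and `parse_anchors` and the sample strings are made up for this example and are not part of UBot.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, inner text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished (href, text) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an <a> that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.rows.append((self._href, "".join(self._text).strip()))
            self._href = None


def parse_anchors(outer_html_items):
    """Turn a list of outerhtml strings into a list of (href, text) rows."""
    parser = AnchorCollector()
    for item in outer_html_items:
        parser.feed(item)
    parser.close()
    return parser.rows


# Hypothetical sample data modelled on the list in the post.
sample = [
    '<a href="/locations/location-one">This is the first text</a>',
    '<a href="/locations/location-two">This is the second text</a>',
]
rows = parse_anchors(sample)
```

Each row keeps the href and the text of the same anchor together, so the two values cannot drift out of step the way two separately scraped lists could.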

abbas786 - Posted January 6, 2017

Check these 2 commands:

1) add list to table as column
2) add list to table as row

pokerdawg (Author) - Posted January 6, 2017

Will do, thanks

abbas786 - Posted January 6, 2017

Here is another way:

    set(#counter,0,"Global")
    loop($list total(%links)) {
        set table cell(&result,#counter,0,$next list item(%links))
        set table cell(&result,#counter,1,$next list item(%titles))
        increment(#counter)
    }
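
The idea behind this loop is to walk both lists in step and write item N of each into row N of the table. As a rough, language-neutral illustration (Python, not UBot script; `pair_rows` and the sample values are hypothetical), the same pairing looks like this:

```python
def pair_rows(links, titles):
    """Pair item N of each list into one row: [link, title]."""
    if len(links) != len(titles):
        # If the two scrapes returned different counts, the rows would be
        # misaligned, so stop instead of silently pairing the wrong items.
        raise ValueError("links and titles must have the same length")
    return [[link, title] for link, title in zip(links, titles)]


# Hypothetical sample data standing in for the two scraped lists.
links = ["/locations/location-one", "/locations/location-two"]
titles = ["This is the first text", "This is the second text"]
result = pair_rows(links, titles)
```

The length check matters because this approach only works if both lists came back in the same order and with the same number of items.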

pokerdawg (Author) - Posted January 6, 2017

What is the default "delimiter" for a list? I'm not sure whether it is a comma, and I'm concerned that some anchor texts may contain commas.
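
On the comma concern in general: this sketch does not say what UBot uses internally, but it shows how standard CSV quoting keeps a comma inside a field from being read as a separator. It is Python, for illustration only; `rows_to_csv`, `csv_to_rows`, and the sample anchor text are made up for this example.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows to CSV text; fields containing commas get quoted."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def csv_to_rows(text):
    """Parse CSV text back into a list of rows."""
    return [row for row in csv.reader(io.StringIO(text))]


# Hypothetical anchor text that contains a comma.
rows = [["/locations/location-one", "Austin, Texas"]]
text = rows_to_csv(rows)
round_trip = csv_to_rows(text)
```

Because the writer wraps the second field in double quotes, reading the text back yields the original two fields rather than three.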