steelersfan 38 Posted October 18, 2016

What is the best way to scrape data from a website like craigslist using the HTTP plugin? Should I use http get or the xpath parser, and what are the advantages and disadvantages of each? I am trying to wrap my brain around this plugin and learn good scraping practices, so any help or a push in the right direction would be greatly appreciated. Thanks!
deliter 203 Posted October 18, 2016

PM Aymen for a copy of his documentation for the HTTP Post plugin. He has pretty in-depth documentation on how to use his plugin, and a few tutorial videos too.

http get and the xpath parser are complementary. You use http get to retrieve the document, so "<html><body><p>hello</p></body></html>" is now a variable in your debugger. The xpath parser can then parse that string so you can scrape from it:

set(#htmlString,"<html><body><p>hello</p></body></html>","Global")
alert($plugin function("HTTP post.dll", "$xpath parser", #htmlString, "/html/body/p", "InnerText", "HTML"))

I've also made a few tutorial videos and posts on using my CSS Selector plugin, which is basically an XPath alternative. It has around 6 functions, so it is a more functional scraper.

Dan also has (so I've heard, I haven't purchased it) a very in-depth guide with hours of tutorials on using the HTTP Post plugin. Try reading Aymen's documents, or purchase Dan's video guide. You won't regret it.
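For anyone who wants to see the "fetch first, parse second" split outside UBot, here is a rough Python analogy. It is not the HTTP Post plugin's API, just a sketch using the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath and only parses well-formed markup (real pages usually need a more lenient HTML parser). The helper name `xpath_inner_text` is made up for this example. ElementTree queries are relative to the root element, so the `/html/body/p` path from the post becomes `./body/p` here.

```python
import xml.etree.ElementTree as ET


def xpath_inner_text(html_string, path):
    """Return the text content of every element matching `path`.

    `path` is evaluated relative to the document's root element
    (<html> in the sample below), using ElementTree's XPath subset.
    """
    root = ET.fromstring(html_string)
    # itertext() walks the element and its children, so nested tags
    # contribute their text too, similar to an "inner text" lookup.
    return ["".join(el.itertext()) for el in root.findall(path)]


# The string that an HTTP GET would have returned, as in the post above.
html_string = "<html><body><p>hello</p></body></html>"

result = xpath_inner_text(html_string, "./body/p")
print(result)  # ['hello']
```

The point is the same as in the plugin: the download step only gives you a string, and the parser step is what turns that string into the values you want.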
deliter 203 Posted October 19, 2016

Here's a bit of help using my CSS Selector. I would do it in XPath, but I come from a JS background and don't really know how to select multiple elements with it.

set(#columns,$plugin function("DeliterCSS.dll", "Deliter CSS Selector", $plugin function("HTTP post.dll", "$http get", "http://london.craigslist.co.uk/", "", "", "", ""), ".col .ban", "TextContent"),"Global")
add list to list(%hrefs,$plugin function("DeliterCSS.dll", "Deliter CSS Selector", $plugin function("HTTP post.dll", "$http get", "http://london.craigslist.co.uk/", "", "", "", ""), ".cats ul li a", "href"),"Delete","Global")

I would recommend that tutorial video series. I've heard from a friend it's awesome.
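To show what "select multiple elements and pull an attribute" amounts to, here is a small Python sketch using only the standard library's `html.parser`. It is an analogy, not how the CSS Selector plugin works internally. The class name `CatsLinkCollector` and the sample markup are invented for illustration, and the matching is simplified: it collects the `href` of every `<a>` nested anywhere inside an element with class `cats`, which is looser than the `.cats ul li a` selector in the post. It also assumes every non-void tag is explicitly closed.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not change the depth count.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class CatsLinkCollector(HTMLParser):
    """Collect href values of <a> tags nested inside an element with class "cats"."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # nesting depth inside the "cats" element; 0 means outside
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self.depth > 0:
            if tag not in VOID_TAGS:
                self.depth += 1
        elif "cats" in classes:
            self.depth = 1
        if self.depth > 0 and tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1


# Made-up markup standing in for the page that http get would return.
sample = """
<div class="col"><h4 class="ban">community</h4>
<div class="cats"><ul>
<li><a href="/act">activities</a></li>
<li><a href="/art">artists</a></li>
</ul></div>
</div>
<a href="/outside">not in cats</a>
"""

collector = CatsLinkCollector()
collector.feed(sample)
hrefs = collector.hrefs
print(hrefs)  # ['/act', '/art']
```

Note that the sketch parses one downloaded string. In the same spirit, the page only needs to be fetched once and can then be handed to as many selectors as you like.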
steelersfan 38 Posted October 19, 2016 Author

Thanks Deliter! I have heard that the tutorials by Dan are a bit outdated, but I will look into them. Sadly, all of the videos I found from Aymen himself are also long outdated. I thought he gave me all the documentation with the purchase of the HTTP Post plugin, but if it doesn't come with it, I will see if I can get what you mentioned. I love documentation! Thanks also for the explanation, I get it now! Now if only I could figure out how to scrape more pages in a loop or something, I would be set!