Scraping archive.org

RayMue · September 23, 2011

Hey Guys,

I've got a big problem. My WordPress database crashed and now I have to rebuild my site.

I thought about scraping archive.org. For example scrape the content of

I tried:

set(#content, $page scrape("<div class=\"sale-title\">", "<!--<div class=\"body-right\">"), "Global")

save to file("C:\\content.txt", #content)

to scrape the HTML code between the HTML tag <div class="sale-title"> and <!--<div class="body-right">

but the content.txt file is empty :-( I'm going crazy because I can't find a solution :-(((((
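
For readers unfamiliar with "scrape between two markers", here is a minimal sketch of the idea in Python. It is not how UBot implements $page scrape; the function name scrape_between and the sample page string are made up for illustration. It only shows that this kind of scrape returns nothing whenever either marker does not appear in the page source exactly as typed (quotes, spacing and attribute order all matter).

```python
def scrape_between(html, start_marker, end_marker):
    """Return every substring found between start_marker and end_marker.

    Hypothetical helper for illustration only; plain string search,
    no HTML parsing.
    """
    results = []
    pos = 0
    while True:
        start = html.find(start_marker, pos)
        if start == -1:
            break  # start marker not found (any more)
        start += len(start_marker)
        end = html.find(end_marker, start)
        if end == -1:
            break  # end marker missing, so nothing is captured
        results.append(html[start:end])
        pos = end + len(end_marker)
    return results


# Made-up sample page, assuming both markers occur verbatim.
page = ('<div class="sale-title">My first post</div>\n'
        '<p>Body text</p>\n'
        '<!--<div class="body-right">')

found = scrape_between(page, '<div class="sale-title">',
                       '<!--<div class="body-right">')
missing = scrape_between(page, "<div class='sale-title'>",
                         '<!--<div class="body-right">')
```

In this sketch, found holds one string, while missing is an empty list because the start marker uses single quotes that never occur in the page.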

Eddie Waller · September 23, 2011

It looks like there might be an issue with $page scrape that I can look into.

You can use $scrape attribute, though, to scrape the outerhtml of the <div class="body-middle">.

Here's the code I used that seems to work fine for me:

set(#content, $scrape attribute(<class="body-middle">, "outerhtml"), "Global")
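
To make clear what "outerhtml of the element with class body-middle" means, here is a rough Python equivalent using only the standard library's html.parser. This is a sketch, not UBot's implementation: the class OuterHtmlByClass, the function outer_html and the sample markup are all hypothetical, and it returns only the first matching element.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class OuterHtmlByClass(HTMLParser):
    """Collect the outer HTML of the first element carrying a given class."""

    def __init__(self, class_name):
        super().__init__(convert_charrefs=False)
        self.class_name = class_name
        self.depth = 0       # nesting depth inside the matched element
        self.parts = []      # pieces of markup collected so far
        self.done = False    # set once the matched element has closed

    def _capturing(self):
        return self.depth > 0 and not self.done

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0:
            # Outside any match: only start capturing on the wanted class.
            classes = (dict(attrs).get("class") or "").split()
            if self.class_name not in classes:
                return
            if tag in VOID_TAGS:
                self.parts.append(self.get_starttag_text())
                self.done = True
                return
        self.parts.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> do not change the depth.
        if self._capturing():
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self._capturing() and tag not in VOID_TAGS:
            self.parts.append(f"</{tag}>")
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self._capturing():
            self.parts.append(data)

    def handle_entityref(self, name):
        if self._capturing():
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self._capturing():
            self.parts.append(f"&#{name};")

    def handle_comment(self, data):
        if self._capturing():
            self.parts.append(f"<!--{data}-->")


def outer_html(html, class_name):
    """Return the outer HTML of the first element with class_name, or ''."""
    parser = OuterHtmlByClass(class_name)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Made-up sample markup for illustration.
sample = ('<div class="body-left">x</div>'
          '<div class="body-middle"><p>Hello <b>world</b></p>'
          '<div>inner</div></div>'
          '<div class="body-right">y</div>')
middle = outer_html(sample, "body-middle")
```

Selecting by element rather than by raw text markers means the result does not depend on how the surrounding markup happens to be written, which is presumably why this route is more forgiving.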

LoWrIdErTJ - BotGuru · September 23, 2011

Eddie, along the same lines, you can't use scrape table on it either when selecting from that same body-middle area.
