UBot Underground

Need A Bot To Extract Data From Static Html Files And Transform The Data Into Table Format



Calling all uBot coders for hire!

 

I have a bunch of static HTML files and I need a bot that will extract all of the data and then save it in table format. All of the files are formatted in the same way, although there are some subtle differences between them. Some of the files have more details than others, but in each case I need to ensure that all of the data is scraped and saved.

I believe that this task should be done using regex to extract the data, but if you have a better idea for how to scrape the data, I am open to your suggestions. As long as it works properly, I will be happy.  I have attached 5 sample files for your review.

 

Here is a partial snapshot of how the data looks:

 

http://i827.photobucket.com/albums/zz196/Daniel_Attard/ubot_zpshqaxlxxb.png

W3275917.html

N3232705.html

E3281631.html

N3240566.html

E3275145.html

Edited by APTS

This is probably really close to what you want.

Just put those HTML files in a folder called Data and run the bot from the folder that contains it.

You need the large data plugin (free, has XPath).


clear list(%table rows)
clear list(%files)
comment("Collect every file in the Data folder next to the bot")
add list to list(%files,$get files("{$special folder("Application")}\\Data","Yes"),"Delete","Global")
loop($list total(%files)) {
    navigate($next list item(%files),"Wait")
    comment("Scrape the text of every formitem formfield span, then drop duplicates")
    plugin command("Bigtable.dll", "Clear large list", "scraped")
    plugin command("Bigtable.dll", "large List from Xpath", "scraped", $document text, "//span[@class=\'formitem formfield\']//text()", "replace")
    plugin command("Bigtable.dll", "Large list Remove duplicates", "scraped")
    comment("One row per file: item count, then the values joined with commas")
    add item to list(%table rows,"--{$plugin function("Bigtable.dll", "Large list total", "scraped")}--{$text from list($list from text($plugin function("Bigtable.dll", "Large list return", "scraped"),$new line),",")}","Don\'t Delete","Global")
}
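
If you would rather not depend on the plugin, here is a rough sketch of the same idea in Python (standard library only). It is not uBot code and has not been run against the attached sample files. It assumes, like the XPath above, that every value sits inside a span whose class is exactly "formitem formfield". The names FormFieldParser, extract_row and rows_to_csv are made up for this example.

```python
import csv
import io
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect text found inside <span class="formitem formfield"> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # > 0 while inside a matching span (nested spans are counted)
        self.values = []  # text nodes in document order

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        # Assumption: exact class match, mirroring @class='formitem formfield'
        if self.depth or dict(attrs).get("class") == "formitem formfield":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth and text:
            self.values.append(text)


def extract_row(html):
    """Return the de-duplicated field values of one HTML document."""
    parser = FormFieldParser()
    parser.feed(html)
    parser.close()
    # dict.fromkeys drops duplicates while keeping first-seen order
    return list(dict.fromkeys(parser.values))


def rows_to_csv(rows):
    """Serialize rows as CSV text; values containing commas are quoted."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()
```

To use it, read each file in the Data folder, pass its text to extract_row, and write the collected rows with rows_to_csv. One thing to watch in both versions: removing duplicates also drops a value that legitimately appears in two different fields of the same file, so the columns can shift between rows.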

 

 

I can send you my PayPal info and a suggested price (tip) if you like.

 

Hope this helps,

CD
