wbingham Posted January 21, 2021

I am new to uBot, but not to JS, PHP, and DOM-based scraping. What I'm trying to do seems simple in uBot, yet I'm having a tough time with it. The built-in table scrape works fine (the entire table dumps to a .csv quickly and easily), but that doesn't scale for what I need.

Here is the basic outline. My script connects to a MySQL database and retrieves a url and a target_id. Say I'm scraping target_id 15: uBot stores "15" in an "id" variable and navigates to www.webaddress.com/details.php?item=123456789. On that page is a small table, about 10 rows total. The two columns I am interested in are "description" (varchar) and "cost" (a decimal with a $ and a ,); there are other columns, but I do not need those. Assume column 3 is "description" and column 6 is "cost". Sometimes there are more than 10 rows, sometimes fewer, like 6 or 8.

One value outside the table sits in a div that has no id, but it is always preceded by the words "Grand Total:" followed by a decimal value, again with a $ and a ,.

What I would like to insert is all 10 rows (or however many there are) as id: 15, description: wonderful red item, cost: 10.99, and then a final insert into another MySQL table: id: 15, total: 89.65.

Thanks in advance for any assistance!
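The row extraction described above can be sketched outside of uBot. This is a minimal Python stand-in using only the standard library's `html.parser`: it collects every `<td>` per `<tr>`, picks columns 3 and 6 (indexes 2 and 5), and strips the `$` and `,` from the cost. The sample HTML and the column positions are assumptions taken from the post, not the real page.

```python
from html.parser import HTMLParser
from decimal import Decimal

class RowScraper(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr>."""
    def __init__(self):
        super().__init__()
        self.rows = []       # list of rows, each a list of cell strings
        self._row = []
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_td = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False
        elif tag == "tr" and self._row:
            self.rows.append(self._row)

    def handle_data(self, data):
        if self._in_td:
            self._row[-1] += data.strip()

def parse_cost(text):
    """'$1,234.56' -> Decimal('1234.56')"""
    return Decimal(text.replace("$", "").replace(",", ""))

# Hypothetical sample mimicking the page described in the post.
sample = """
<table>
  <tr><td>a</td><td>b</td><td>wonderful red item</td><td>d</td><td>e</td><td>$10.99</td></tr>
  <tr><td>a</td><td>b</td><td>blue item</td><td>d</td><td>e</td><td>$1,250.00</td></tr>
</table>
"""
scraper = RowScraper()
scraper.feed(sample)
target_id = 15
records = [(target_id, row[2], parse_cost(row[5])) for row in scraper.rows]
```

Because the row count varies (6, 8, 10, ...), iterating over `scraper.rows` handles any length without hard-coding it.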
SourceUltra Posted January 24, 2021

If you know the ending point of the data you want to scrape, page scrape the whole page and then use Heopas's "get before" to grab everything before that "Grand Total". Parse that result and add it to a list, then add that list as a row in a table. Parsing the data will obviously be the biggest challenge. You already know your target ids, so the rest should be easy. Good luck!
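For the value outside the table, anchoring on the literal words "Grand Total:" also works with a plain regex, since the div has no id to target. A minimal sketch (the HTML snippet is a hypothetical example, not the real page):

```python
import re
from decimal import Decimal

def grand_total(page_text):
    """Find the first '$' amount after the literal words 'Grand Total:'."""
    m = re.search(r"Grand Total:\s*\$([\d,]+\.\d{2})", page_text)
    if m is None:
        return None
    return Decimal(m.group(1).replace(",", ""))

html = '<div>Grand Total: $89.65</div>'
print(grand_total(html))  # prints 89.65
```

The `[\d,]+\.\d{2}` pattern tolerates thousands separators, so "$1,089.65" parses to 1089.65 as well.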
wbingham Posted January 25, 2021 (Author)

I guess my first challenge is scraping the table without using the built-in table scrape function; that function only outputs to a file, which isn't scalable. I'm pretty sure I can figure out the grand total, but scraping the table directly into MySQL is imperative.
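Once the rows are in memory, the database step is a pair of parameterized inserts rather than a file import. The sketch below uses sqlite3 (Python's stdlib) as a stand-in so it runs anywhere; with MySQL the placeholders would be `%s` instead of `?`, but the `executemany` pattern is the same. The table names `items` and `totals` and the column layout are assumptions, not the poster's actual schema.

```python
import sqlite3

# sqlite3 stands in for MySQL here purely so the example is self-contained;
# swap in a MySQL connection and %s placeholders for the real thing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, description TEXT, cost TEXT)")
conn.execute("CREATE TABLE totals (id INTEGER, total TEXT)")

# One insert per scraped row, however many there are.
rows = [(15, "wonderful red item", "10.99"), (15, "blue item", "1250.00")]
conn.executemany(
    "INSERT INTO items (id, description, cost) VALUES (?, ?, ?)", rows)

# Final insert of the grand total into the second table.
conn.execute("INSERT INTO totals (id, total) VALUES (?, ?)", (15, "89.65"))
conn.commit()

count = conn.execute(
    "SELECT COUNT(*) FROM items WHERE id = 15").fetchone()[0]
```

Parameterized placeholders also sidestep quoting problems if a description ever contains an apostrophe.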