runsoftware, posted April 1, 2014 (edited):

Case scenario: I am trying to scrape a website that has 15 items, and each of these items contains further items. Some of those inner items are blank, which means the user didn't give enough information about something. So, basically, let's say we have two things:

Square A:
- color: red
- size: big

Square B:
- color: (the color tag is actually not found in the page)
- size: small

I would end up with two lists (%color and %size):

%color (1 item)
(0): red

%size (2 items)
(0): big
(1): small

THE 1 MILLION DOLLAR QUESTION IS: how do I make the list %color get two items, like this:

%color (2 items)
(0): red
(1):

Item (1) is blank because the regex didn't find a match. Sounds confusing, right?

Edited April 1, 2014 by KardoseR
UBotDev, posted April 1, 2014:

Do you have a million dollars?

Since you didn't provide any code or regex, and didn't mention the page you are scraping, it's really hard to help you, but I'll try with two answers.

If you are scraping the container and an item's innertext is blank, you should still get a blank item in the list, so make sure you are also scraping it when it's empty (and make sure you don't remove duplicates from the list when you do it).

If you are using regex to extract the data, you should always scrape only one container at a time (1 color + 1 size) and run the regex on that string, rather than running the regex across all 15 containers at once, which is what I guess you are doing now.

Btw, the problem in your case goes even deeper. If the "Square A" container had no color attribute and "Square B" had color red, you would even get wrong data, since in your list "Square A" would have color red and "Square B" would have none.
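The per-container advice above can be sketched outside UBot. The following is a minimal Python illustration, not UBot script: the container strings, the two patterns, and the helper `first_match` are all hypothetical stand-ins for whatever the real page contains. It shows why a regex run over the whole page loses the gap, while a regex run once per container keeps both lists aligned.

```python
import re

# Hypothetical container texts, one string per scraped item (assumed data).
containers = [
    "Square A color: red size: big",
    "Square B size: small",               # this item has no color at all
    "Square C color: blue size: medium",
]

# Assumed patterns; the real page would need its own.
COLOR_RE = re.compile(r"color:[ \t]*(\w+)")
SIZE_RE = re.compile(r"size:[ \t]*(\w+)")


def first_match(pattern, text, default=""):
    """Return the first captured group, or `default` when nothing matches."""
    m = pattern.search(text)
    return m.group(1) if m else default


# Regex over the whole page: the missing color silently disappears,
# so every later color shifts up one position.
page = "\n".join(containers)
colors_whole_page = COLOR_RE.findall(page)

# Regex over one container at a time: exactly one entry per container,
# blank where the value is missing, so index i always means item i.
colors = [first_match(COLOR_RE, c) for c in containers]
sizes = [first_match(SIZE_RE, c) for c in containers]

print(colors_whole_page)
print(colors)
print(sizes)
```

With this sample data the whole-page list has two colors for three containers, whereas the per-container lists each have three entries, which is the alignment the original question asks for.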
runsoftware, posted April 3, 2014:

I mean that some pages contain a match for the regex and others don't. I just wanted the pages that don't contain the desired match to add a blank item, or a "Not specified" item, to the list.
Bot-Factory, posted April 3, 2014:

    set(#stuff, "testdatata", "Global")
    set(#regexResult, $find regular expression(#stuff, "xxx"), "Global")
    if($comparison(#regexResult, "=", $nothing)) {
        then {
            set(#regexResult, "PLACEHOLDER", "Global")
        }
        else {
        }
    }

Cheers
Dan
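For readers who do not use UBot, the same fallback idea can be written as a short Python sketch. This is an illustration of the logic only, not a translation endorsed by the thread: the helper name `find_or_placeholder`, the sample page texts, and the color pattern are assumptions made up for the example.

```python
import re


def find_or_placeholder(text, pattern, placeholder="PLACEHOLDER"):
    """Return the first regex match in `text`, or `placeholder` if none."""
    match = re.search(pattern, text)
    return match.group(0) if match else placeholder


# Same situation as the snippet above: "xxx" does not occur in the data,
# so the placeholder is used instead of an empty result.
regex_result = find_or_placeholder("testdatata", "xxx")

# Applied once per page, the list always gains exactly one item per page.
color_list = []
for page_text in ["color: red", "size: small"]:   # hypothetical page texts
    color_list.append(
        find_or_placeholder(page_text, r"(?<=color: )\w+", "Not specified")
    )

print(regex_result)
print(color_list)
```

The key design point is that the append happens unconditionally; only the value being appended depends on whether the regex matched.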
runsoftware, posted April 7, 2014 (edited):

Thanks, Dan. Your solution worked (I think).

Edited April 7, 2014 by KardoseR