Chris M 55 Posted July 23, 2013
I'm trying to scrape the multiple dates listed in a drop-down on a web page. I tried scrape attribute, and it does get the dates, but it pulls in the list twice instead of once. So I then tried to scrape the page, and I can't figure this out for the life of me. This is what the HTML of that drop-down looks like:
<select name="fenddate"><option value="0">-</option><option value="2013-07-23">2013-07-23</option><option value="2013-07-24" selected="selected">2013-07-24 (16616)</option><option value="2013-07-25">2013-07-25 (18812)</option><option value="2013-07-26">2013-07-26 (19446)</option><option value="2013-07-27">2013-07-27 (15781)</option><option value="2013-07-28">2013-07-28 (14578)</option><option value="2013-07-29">2013-07-29 (16213)</option><option value="2013-07-30">2013-07-30 (17898)</option><option value="2013-07-31">2013-07-31 (15733)</option><option value="2013-08-01">2013-08-01 (19123)</option><option value="2013-08-02">2013-08-02 (17264)</option><option value="2013-08-03">2013-08-03 (14857)</option><option value="2013-08-04">2013-08-04 (13243)</option><option value="2013-08-05">2013-08-05 (15079)</option><option value="2013-08-06">2013-08-06 (15120)</option><option value="2013-08-07">2013-08-07 (12314)</option><option value="2013-08-08">2013-08-08 (15000)</option><option value="2013-08-09">2013-08-09 (17077)</option><option value="2013-08-10">2013-08-10 (4092)</option><option value="2013-08-11">2013-08-11 (2911)</option><option value="2013-08-12">2013-08-12 (2945)</option><option value="2013-08-13">2013-08-13 (3039)</option><option value="2013-08-14">2013-08-14 (2721)</option><option value="2013-08-15">2013-08-15 (3251)</option><option value="2013-08-16">2013-08-16 (2817)</option><option value="2013-08-17">2013-08-17 (3001)</option><option value="2013-08-18">2013-08-18 (3014)</option><option value="2013-08-19">2013-08-19 (707)</option><option value="2013-08-20">2013-08-20 (476)</option><option value="2013-08-21">2013-08-21 (538)</option><option value="2013-08-22">2013-08-22 (672)</option></select> End Date<br>
These dates change daily, so I'm having a tough time with it. Any ideas?
LoWrIdErTJ - BotGuru 904 Posted July 23, 2013
Page scrape and use the left and right options, then add list to list with list from text. Or use regex.
Chris M 55 Posted July 23, 2013
I already tried the page scrape and it didn't work on that particular element. I don't know anything about regex.
LoWrIdErTJ - BotGuru 904 Posted July 23, 2013
I'll be on my PC in the morning. PM me the link to this thread and I'll code something for you.
Chris M 55 Posted July 23, 2013
Cool...thanks man. I found some sample regex that picks out the dates:
([0-9-()]+)
I tested it on http://regexhero.net/tester/ but I don't know where to insert and use that on the scrape attribute element.
LoWrIdErTJ - BotGuru 904 Posted July 23, 2013
Add list to list(list from text( find regex($document text, regex here)))
Not actual code, but it shows you how to drop everything in.
Chris M 55 Posted July 23, 2013
Cool...I'll give that a shot.
Chris M 55 Posted July 23, 2013
I'm getting an error when I add the regex:
Add list to list(list from text( find regex($document text, ([0-9-()]+))))
Chris M 55 Posted July 23, 2013
So first off I have to thank HelloInsomnia, because I bought his Regex Builder and I was able to get what I need from that...it's not everything, but it's more than I had before the bot, so I'm really excited. I was able to pull the data with this regex:
\d+\-\d+\-\d+\s\(\d+\)
That left me with a list like this:
2013-07-24 (16616)
2013-07-25 (18812)
2013-07-26 (19446)
2013-07-27 (15781)
2013-07-28 (14578)
2013-07-29 (16213)
2013-07-30 (17898)
2013-07-31 (15733)
2013-08-01 (19123)
2013-08-02 (17264)
2013-08-03 (14857)
2013-08-04 (13243)
2013-08-05 (15079)
2013-08-06 (15120)
2013-08-07 (12314)
2013-08-08 (15000)
2013-08-09 (17077)
2013-08-10 (4092)
2013-08-11 (2911)
2013-08-12 (2945)
2013-08-13 (3039)
2013-08-14 (2721)
2013-08-15 (3251)
2013-08-16 (2817)
2013-08-17 (3001)
2013-08-18 (3014)
2013-08-19 (707)
2013-08-20 (476)
2013-08-21 (538)
2013-08-22 (672)
Now there are two choices I can't pull just yet. One is a choice that is just a "-" (without the quotes), which tells the bot to pull everything instead of a single date. The other is a choice without the space and the bracketed number, like this: 2013-07-23. Any ideas on how to pull those two and keep the ones I'm already getting?
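A note on why the bare date is skipped: the pattern above requires the space and the bracketed count, so 2013-07-23 on its own can never match. Below is a minimal sketch in Python rather than UBot script. The `html` sample is trimmed from the first post, the variable names are made up for illustration, and whether the pattern carries over unchanged to UBot's regex engine is an assumption. It makes the count optional and anchors on the surrounding `>` and `<` so the `value="..."` attributes are not picked up as well:

```python
import re

# Trimmed sample of the <select> markup from the first post.
html = (
    '<select name="fenddate"><option value="0">-</option>'
    '<option value="2013-07-23">2013-07-23</option>'
    '<option value="2013-07-24" selected="selected">2013-07-24 (16616)</option>'
    '<option value="2013-07-25">2013-07-25 (18812)</option></select>'
)

# Pattern from the post: the space and "(count)" are mandatory,
# so the bare 2013-07-23 option is never matched.
strict = re.findall(r"\d+-\d+-\d+\s\(\d+\)", html)

# Same date pattern with the count made optional via (?: ... )?.
# The literal ">" and "<" restrict matches to option text, and the
# capture group returns only the text between them.
relaxed = re.findall(r">(\d+-\d+-\d+(?:\s\(\d+\))?)<", html)

print(strict)
print(relaxed)
```

The "-" choice contains no digits, so neither pattern picks it up; it has to be handled separately.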
north_star 16 Posted July 23, 2013
try this: (\d+\-\d+\-\d+)|\s\(\d+\)
LoWrIdErTJ - BotGuru 904 Posted July 23, 2013
set(#var, "<select name=\"fenddate\"><option value=\"0\">-</option><option value=\"2013-07-23\">2013-07-23</option><option value=\"2013-07-24\" selected=\"selected\">2013-07-24 (16616)</option><option value=\"2013-07-25\">2013-07-25 (18812)</option><option value=\"2013-07-26\">2013-07-26 (19446)</option><option value=\"2013-07-27\">2013-07-27 (15781)</option><option value=\"2013-07-28\">2013-07-28 (14578)</option><option value=\"2013-07-29\">2013-07-29 (16213)</option><option value=\"2013-07-30\">2013-07-30 (17898)</option><option value=\"2013-07-31\">2013-07-31 (15733)</option><option value=\"2013-08-01\">2013-08-01 (19123)</option><option value=\"2013-08-02\">2013-08-02 (17264)</option><option value=\"2013-08-03\">2013-08-03 (14857)</option><option value=\"2013-08-04\">2013-08-04 (13243)</option><option value=\"2013-08-05\">2013-08-05 (15079)</option><option value=\"2013-08-06\">2013-08-06 (15120)</option><option value=\"2013-08-07\">2013-08-07 (12314)</option><option value=\"2013-08-08\">2013-08-08 (15000)</option><option value=\"2013-08-09\">2013-08-09 (17077)</option><option value=\"2013-08-10\">2013-08-10 (4092)</option><option value=\"2013-08-11\">2013-08-11 (2911)</option><option value=\"2013-08-12\">2013-08-12 (2945)</option><option value=\"2013-08-13\">2013-08-13 (3039)</option><option value=\"2013-08-14\">2013-08-14 (2721)</option><option value=\"2013-08-15\">2013-08-15 (3251)</option><option value=\"2013-08-16\">2013-08-16 (2817)</option><option value=\"2013-08-17\">2013-08-17 (3001)</option><option value=\"2013-08-18\">2013-08-18 (3014)</option><option value=\"2013-08-19\">2013-08-19 (707)</option><option value=\"2013-08-20\">2013-08-20 (476)</option><option value=\"2013-08-21\">2013-08-21 (538)</option><option value=\"2013-08-22\">2013-08-22 (672)</option></select> End Date<br>", "Global")
add list to list(%values, $list from text($replace($find regular expression(#var, "(?=\">).*?(?=</option>)"), "\">", ""), $new line), "Delete", "Global")
remove from list(%values, 0)
load html($replace(%values, $new line, "<BR>"))
Chris M 55 Posted July 23, 2013
I haven't tried your code, LoWrIdErTJ, because you're adding them as a static choice when these values update dynamically and change daily. I need something that can scrape the values and then add them to a UI drop-down dynamically.
LoWrIdErTJ - BotGuru 904 Posted July 23, 2013
The above will do that. It's scraping the inner text of each possible option. It's the same way I populate a drop-down. In your drop-down, use
$replace($list name, $new line, ",")
That will take the list and convert it to comma-delimited, which separates the values into individual list items to select in your drop-down.
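For readers following along without UBot, here is a rough Python sketch of what the posted regex appears to do. This is my reading of it, not a statement about UBot internals; the `html` sample is trimmed from the first post and the variable names are made up. The lookahead pattern starts each match at a `">` and stops before `</option>`, the `">` is then stripped, and the first item, which is leftover markup from the `<select>` tag, is dropped. A capture-group variant is shown next to it because it also keeps the "-" choice:

```python
import re

# Trimmed sample of the <select> markup from the first post.
html = (
    '<select name="fenddate"><option value="0">-</option>'
    '<option value="2013-07-23">2013-07-23</option>'
    '<option value="2013-07-24" selected="selected">2013-07-24 (16616)</option>'
    '<option value="2013-07-25">2013-07-25 (18812)</option></select>'
)

# Same pattern as the post: begin where '">' comes next, then take
# as little as possible up to the next '</option>'.
raw = re.findall(r'(?=">).*?(?=</option>)', html)

# Strip the '">' marker from each match, as the $replace call does.
values = [item.replace('">', "") for item in raw]

# The first match begins at the <select> tag, so it is leftover markup
# (and it swallows the "-" option); drop it, like removing index 0.
values = values[1:]

# Alternative: capture the text between each <option ...> and </option>.
# This keeps every label, including the "-" default.
all_labels = re.findall(r"<option[^>]*>(.*?)</option>", html)

print(values)
print(all_labels)
```

In the real bot the input would be the scraped innerhtml rather than a hard-coded string, which is the point made a few posts further down.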
Chris M 55 Posted July 23, 2013
Where is it scraping anything? You're setting the variable as static choices? The values I need are pulled from scraping the drop-down on a web page where the values are dynamic and change daily. I can't see how your code does that when you're setting the variables and not scraping anything. I'm SOOO confused.
LoWrIdErTJ - BotGuru 904 Posted July 23, 2013
So change the variable that holds the static values to the scrape attribute, and scrape the item on the page first. I'm just going by what you supplied. If you give a link to where it shows up, I can make the scrape attribute for you as well.
Chris M 55 Posted July 23, 2013
No, it's my bad LoWrIdErTJ, you've given me enough to at least get pointed in the right direction. If I have any more issues I'll post again. Thank you for your continued support here man...you're awesome!
Chris M 55 Posted July 23, 2013
We're almost there guys, here is the code I'm using...
set(#month, $scrape attribute(<name="fenddate">, "innerhtml"), "Global")
add list to list(%month, $find regular expression(#month, "(\\d+\\-\\d+\\-\\d+)|\\s\\(\\d+\\)"), "Delete", "Global")
set(#month, %month, "Global")
set(#month, $replace(#month, $new line, ","), "Global")
set(#Choose Your Month, $plugin function("File Management.dll", "$dropdown dialog", "Choose Your Month", "Choose Your Month:", #month), "Global")
And that produces the list, but the values are separated like this:
#month: 2013-07-23,2013-07-24, (16616),2013-07-25, (18812),2013-07-26, (19446),2013-07-27, (15781),2013-07-28, (14578),2013-07-29, (16213),2013-07-30, (17898),2013-07-31, (15733),2013-08-01, (19123),2013-08-02, (17264),2013-08-03, (14857),2013-08-04, (13243),2013-08-05, (15079),2013-08-06, (15120),2013-08-07, (12314),2013-08-08, (15000),2013-08-09, (17077),2013-08-10, (4092),2013-08-11, (2911),2013-08-12, (2945),2013-08-13, (3039),2013-08-14, (2721),2013-08-15, (3251),2013-08-16, (2817),2013-08-17, (3001),2013-08-18, (3014),2013-08-19, (707),2013-08-20, (476),2013-08-21, (538),2013-08-22, (672)
So first off, there is a choice that is just a "-" character (no quotes) that still isn't pulled yet. And then the choices are not formatted properly. They should look like this:
- (This is the first default choice)
2013-07-23
2013-07-24 (16616)
etc...
But for some reason it's putting the (16616) etc. on a new line. How can I fix that so it pulls in the rest of the string and doesn't add those as new choices in the drop-down?
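The split entries most likely come from the `|` in the pattern: each side of an alternation is a complete match by itself, so the date and the " (16616)" part come back as two separate list items. A small Python sketch of the effect follows. It is Python rather than UBot script, the sample `text` is made up from values in this thread, and the assumption is that UBot's regex engine treats alternation the same way:

```python
import re

# Three option labels as they would appear in the scraped text.
text = "2013-07-23 2013-07-24 (16616) 2013-07-25 (18812)"

# Alternation: the date and the " (count)" each match independently.
split = [m.group(0) for m in re.finditer(r"(\d+-\d+-\d+)|\s\(\d+\)", text)]

# Optional suffix: the count, when present, belongs to the same match.
joined = re.findall(r"\d+-\d+-\d+(?:\s\(\d+\))?", text)

print(split)
print(joined)
```

Writing the count as an optional suffix of the date keeps one list item per choice and still matches the bare 2013-07-23.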
Chris M 55 Posted July 23, 2013
Well, I figured out why the dates were messed up and got that fixed, but I still can't see why I can't pull in the first default choice of "-" (without the quotes).
LoWrIdErTJ - BotGuru 904 Posted July 23, 2013
set(#month, $scrape attribute(<name="fenddate">, "innerhtml"), "Global")
add list to list(%month, $find regular expression(#month, "(\\d+\\-\\d+\\-\\d+)|\\s\\(\\d+\\)"), "Delete", "Global")
set(#month, %month, "Global")
set(#month, $replace(#month, $new line, ","), "Global")
set(#Choose Your Month, "-,{$plugin function("File Management.dll", "$dropdown dialog", "Choose Your Month", "Choose Your Month:", #month)}", "Global")
LoWrIdErTJ - BotGuru 904 Posted July 23, 2013
Just put it there by default.
Chris M 55 Posted July 23, 2013
Hmm...adding that in breaks the bot again. It doesn't add it in, and it breaks the lines again.
Chris M 55 Posted July 23, 2013
I just used add item to list before the add list to list, and it seems to be working fine now. Thank you for all your help man! I appreciate it! A LOT!
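The fix described here (put the "-" into the list first, then add the scraped values, then join everything with commas for the drop-down) can be sketched in Python as follows. This is an illustration of the idea only, not the actual UBot commands; the `scraped` values and variable names are hypothetical, and the duplicate entry stands in for the doubled results mentioned in the first post:

```python
# Hypothetical scraped values, with one duplicate to show de-duplication.
scraped = [
    "2013-07-23",
    "2013-07-24 (16616)",
    "2013-07-24 (16616)",
    "2013-07-25 (18812)",
]

# Start the list with the default "-" choice, like add item to list.
choices = ["-"]

# Append the scraped values in order, skipping ones already present
# (roughly what the "Delete" duplicates option is assumed to do).
for item in scraped:
    if item not in choices:
        choices.append(item)

# Comma-delimited string, the form the drop-down dialog was given above.
dropdown_source = ",".join(choices)

print(dropdown_source)
```

Seeding the list before the scrape keeps the "-" as the first choice without having to match it in the regex at all.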