HarryPotter, posted April 4, 2010

Hi! I am trying to scrape some proxies from samair.ru. After watching Aaron's video, I successfully made a bot for ip-adress.com and was pretty happy about that. Now I am trying to scrape the ones on samair.ru. The difficulty is that its table is a little more complex than the one on ip-adress.com. Here is an example:

<TR><TD>121.58.184.12<script type=text/javascript>document.write(":"+g+r+g+r)</SCRIPT>:8080</TD><TD>anonymous proxy</TD>

If I use the $page scrape, I can match <TD> on the left, but I am not sure what to match on the right. The line <script type=text/javascript>document.write(":"+g+r+g+r)</SCRIPT> sits in the middle of what I am trying to get, because I want 121.58.184.12:8080 from the example above.

Any ideas? Thanks!

Aaron Nimocks, posted April 4, 2010

You need to do this one by choosing by attribute, with wildcards in place of the IP and the port, and then scraping the chosen attribute for its inner text. Attached is the bot that does it: HarryPotter.ubot

HarryPotter, posted April 4, 2010

Hey Aaron, thanks for the response! Wow, I didn't know we could use choose and then $scrape chosen attribute.

I found one more problem after playing with the above code and "http://www.samair.ru/proxy/proxy-01.htm". The <script type=text/javascript>document.write(":"+g+r+g+r)</SCRIPT> part actually changes on every line: the (":"+g+r+g+r) argument is different each time. Some lines have (":"+y+m+y+m), some have (":"+y+m).

Right now we have:

<TD>*<script type=text/javascript>document.write(":"+g+r+g+r)</SCRIPT>*</TD>

I am guessing we need to change it to something like this?

<TD>*<script ...ignore... </SCRIPT>*</TD>

Thanks for the help!

Aaron Nimocks, posted April 4, 2010

<TD>*<script type=text/javascript>document.write(*)</SCRIPT>*</TD>

That should work.

HarryPotter, posted April 5, 2010

Quote: <TD>*<script type=text/javascript>document.write(*)</SCRIPT>*</TD> Should work

Wow, thanks. I didn't know a SINGLE space would make such a big difference!
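
For anyone reading along outside uBot, the wildcard pattern Aaron posted can be approximated with a regular expression. The following is only a rough Python sketch, not the contents of the attached bot. It assumes the markup looks exactly as quoted in this thread, that is, after the browser has run the script; a raw download of the page may not contain the port text at all. The names `PROXY_ROW` and `extract_proxies` are made up for illustration, and the second sample row uses a placeholder address.

```python
import re

# First row is the example quoted in the thread. The second row is a
# hypothetical variant (placeholder IP) with a different document.write
# argument, to show that the script contents are ignored.
html = (
    '<TR><TD>121.58.184.12<script type=text/javascript>'
    'document.write(":"+g+r+g+r)</SCRIPT>:8080</TD><TD>anonymous proxy</TD></TR>'
    '<TR><TD>192.0.2.10<script type=text/javascript>'
    'document.write(":"+y+m)</SCRIPT>:3128</TD><TD>transparent proxy</TD></TR>'
)

# Rough regex counterpart of:
#   <TD>*<script type=text/javascript>document.write(*)</SCRIPT>*</TD>
# The first and last wildcards become named groups for the IP and the port;
# the middle wildcard (the script argument) is matched and thrown away.
PROXY_ROW = re.compile(
    r'<TD>(?P<ip>\d{1,3}(?:\.\d{1,3}){3})'
    r'<script[^>]*>document\.write\(.*?\)</SCRIPT>'
    r':(?P<port>\d+)</TD>',
    re.IGNORECASE,
)


def extract_proxies(page: str) -> list[str]:
    """Return every ip:port pair found in the given markup."""
    return [f"{m['ip']}:{m['port']}" for m in PROXY_ROW.finditer(page)]


print(extract_proxies(html))
```

The cells that hold "anonymous proxy" and similar text are skipped because they do not start with an IP address followed by a script tag.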