SmileyBot 13
Posted September 30, 2015

Hey guys,

I used to run an HTTP GET scrape on http://best-proxy.ru/feed and get around 8k proxies. The website seems to have changed format, and now all the proxy addresses get messed up, because some ports have 4 digits and some have 5, and there is no gap between entries:

74.122.192.166:808024.146.151.199:15889216.25.32.242:3128108.38.69.85:4744364.79.89.66:53108.171.245.163:312954.243.196.214:8050.18.152.183:44369.241.26.102:80107.20.218.207:8067.23.164.34:312854.247.98.212:80173.230.144.98:3128108.6.53.45:312876.107.254.66:3030796.43.130.70:312875.109.182.175:40154216.157.222.25:808071.189.47.2:808175.129.155.43:4590867.216.65.253:808074.122.192.166:3128208.68.37.137:3128174.34.168.242:1080173.213.108.111:808067.55.121.222:811872.64.146.136:8080198.144.176.54:808074.112.203.107:808075.125.40.18:312898.222.169.158:47045184.169.176.213:80173.213.108.112:8080173.244.161.84:3128198.145.120.198:3128199.192.153.54:108096.43.130.70:808068.48.21.91:52059173.213.108.111:3128173.65.254.183:1663754.248.43.129:80107.21.235.56:312874.94.44.210:808068.63.115.11:651554.243.138.45:8075.125.40.19:312854.243.193.17:80100.42.231.109:80808.21.6.225:8054.245.122.198:8054.251.63.7:808071.179.101.41:38067208.68.37.137:808098.192.103.79:54713207.241.164.68:80173.213.108.113:808054.245.108.234:808096.44.168.147:8080216.244.71.143:3128199.116.118.68:8080142.54.173.168:312871.165.232.138:2940754.247.92.45:312850.59.47.233:8076.73.26.76:3128173.45.108.66:3129216.17.106.16:312874.112.203.107:312867.208.247.132:312898.109.199.166:808068.71.76.242:8082173.213.108.114:3128173.213.108.113:312823.23.169.18:80199.116.118.69:8080108.170.19.214:108074.95.120.77:808076.73.26.75:312864.186.149.43:8000198.8.80.76:3128198.144.176.54:3128192.211.49.210:3128192.211.49.210:8080192.198.83.197:8080216.244.71.143:808054.248.146.26:312823.20.1.5:8080107.20.69.173:3128173.213.108.112:3128198.8.80.76:8080199.119.76.111:808071.207.80.223:1854771.116.205.165:3128192.198.83.197:312850.19.77.137:8080216.118.70.13:8098.109.199.166:312868.42.58.76:4601923.21.92.241:312896.44.168.147:3128108.192.136.193:312854.248.43.129:811824.156.69.149:8099.61.177.194:8080173.236.174.22:80199.119.76.111:312876.73.26.74:312866.134.118.78:808223.21.173.215:8080108.166.52.33:443

I'm using the regex below to scrape:

[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\:[0-9]{1,5}

I have found that in the raw data each entry looks like this:

<p>200.0.209.118:8080<br>

Any ideas on how I can add the ">" and "<" to the ends of the regex so it only matches between them? Thanks.

SOLVED

(?<=>)[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\:[0-9]{1,5}(?=<)
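For anyone hitting the same problem, here is a minimal sketch of why the lookaround version works, written in Python only for illustration (the post does not say which tool ran the regex). The names `PLAIN_RE`, `TAGGED_RE`, `extract_proxies`, and both sample strings are made up for this example; the two patterns are the ones from the post.

```python
import re

# Original pattern from the post: IP:port with no anchoring around it.
PLAIN_RE = re.compile(
    r"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\:[0-9]{1,5}"
)

# Solved pattern from the post: the match must be preceded by ">"
# and followed by "<", without consuming either character.
TAGGED_RE = re.compile(
    r"(?<=>)[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\:[0-9]{1,5}(?=<)"
)


def extract_proxies(html):
    """Return every IP:port entry that sits between '>' and '<'."""
    return TAGGED_RE.findall(html)


# Hypothetical input: tags already stripped, entries run together.
stripped = "74.122.192.166:808024.146.151.199:15889216.25.32.242:3128"

# Hypothetical input: the same entries with the <p>/<br> markup intact.
tagged = (
    "<p>74.122.192.166:8080<br>"
    "24.146.151.199:15889<br>"
    "216.25.32.242:3128<br></p>"
)

# The greedy {1,5} port swallows the first digit of the next IP.
print(PLAIN_RE.findall(stripped))
# ['74.122.192.166:80802', '4.146.151.199:15889', '216.25.32.242:3128']

# With the markup present, the lookarounds pin each entry exactly.
print(extract_proxies(tagged))
# ['74.122.192.166:8080', '24.146.151.199:15889', '216.25.32.242:3128']
```

The point is that the lookbehind and lookahead need the raw markup: once the tags are stripped and the entries are fused, there is no reliable way to tell where a port ends and the next IP begins.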