WarrenIndLLC, posted December 26, 2013

Hey guys,

Here's the problem. I'm trying to scrape data from URLs that all vary slightly. I was able to create a list with the data that I want, but each item also contains pieces that I don't want, as follows:

4853/DataThatIWant.html"
0123/DataThatIWant.html"
6879/DataThatIWant.html"
3748/DataThatIWant.html"
6859/DataThatIWant.html"
0794/DataThatIWant.html"

Is there a way to pull everything between the / and the " in that list? Kind of like page scraping that list to create a new list with only the clean data that I want.

Thanks

bestmacros, posted December 26, 2013

You need regex. Try this:

(?<=\d{4,4}/).*(?=\")

set(#qqq, "0794/DataThatIWant.html\"", "Global")
alert($find regular expression(#qqq, "(?<=\\d\{4,4\}/).*(?=\\\")"))
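
For anyone who wants to check the pattern outside UBot Studio, here is a minimal sketch in Python (not UBot script). The sample items come from the original post; the names `PATTERN`, `raw_items`, and `clean_item` are made up for illustration. It writes `\d{4}` instead of the equivalent `\d{4,4}` and uses a non-greedy `.*?`, an assumption on my part so that the match stops at the first quote even if several items end up joined into one string.

```python
import re

# Lookbehind: four digits followed by a slash (not part of the match).
# Lookahead: the closing double quote (also not part of the match).
PATTERN = re.compile(r'(?<=\d{4}/).*?(?=")')

# Sample items taken from the original post.
raw_items = [
    '4853/DataThatIWant.html"',
    '0123/DataThatIWant.html"',
    '6879/DataThatIWant.html"',
]


def clean_item(item):
    """Return the text between the slash and the quote, or None if no match."""
    match = PATTERN.search(item)
    return match.group(0) if match else None


clean_items = [clean_item(item) for item in raw_items]
print(clean_items)
```

If the whole list is one long string rather than separate items, `PATTERN.findall(text)` should return every match in order.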

mk21, posted December 27, 2013

1. Set a variable to 0.
2. Do a loop over each list item.
3. Set a variable as the substring of the list item corresponding to the original variable. The starting position is the find index of / plus 1, so the slash itself is skipped. The number of characters is the find index of " minus the find index of /, minus 1.
4. Add that variable to the list.
5. Remove position 0 from the list.

This should give you the list the way you want it.
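
The substring approach can be sketched the same way in Python (again, not UBot script, and `between_slash_and_quote` is a hypothetical helper name). The sketch assumes 0-based indexes and starts one position past the slash so the slash is not included in the result. It processes the item at position 0, appends the cleaned value to the end, then removes position 0, so after one pass the list holds only cleaned values.

```python
def between_slash_and_quote(item):
    """Return the text between the first '/' and the next '"', or None."""
    slash = item.find("/")
    quote = item.find('"', slash + 1)
    if slash == -1 or quote == -1:
        return None
    start = slash + 1            # skip the slash itself
    length = quote - slash - 1   # characters strictly between / and "
    return item[start:start + length]


# Sample items taken from the original post.
items = [
    '3748/DataThatIWant.html"',
    '6859/DataThatIWant.html"',
    '0794/DataThatIWant.html"',
]

# One pass over the original length: clean the first item, append the
# cleaned value, then drop the raw item from the front of the list.
for _ in range(len(items)):
    items.append(between_slash_and_quote(items[0]))
    items.pop(0)

print(items)
```

Building a second list instead of rotating the original would be simpler to read; the rotation is kept here only to mirror the steps as posted.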