Kev, posted May 10, 2013:
Hi all, I am working through a regex problem, and every time I think I have it working another problem crops up. Essentially I want to remove stray apostrophes from a variable:
('213.94.251.236','kwoconnor@gmail.com','car service','sydney','google.com.au','4','94-98 O'Riordan St, Alexandria NSW 2015','(02) 9318 8800','johnnewellmazda.com.au')
In this example I only want to delete the apostrophe in O'Riordan; another one I saw was Bishop's Rd. So I just need them to look like:
ORiordan
Bishops Rd
How can I regex this without removing the apostrophes that delimit the fields throughout the variable?
Legend, posted May 10, 2013:
One way would be to (a) convert it to a text string, (b) do a simple replace of "," with any obscure string (e.g., #$#), (c) do a simple replace of ' with $nothing, (d) do a simple replace of #$# with ","... or something like that. Not sure if regex will do it otherwise.
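The placeholder steps above can be sketched in Python roughly as follows. This is only an illustration under some assumptions that are not in the post: the separator between fields is read as `','` (as in the sample row), the row starts with `('` and ends with `')`, the placeholder never occurs in the data, and the function name `strip_inner_apostrophes` is made up for the example.

```python
def strip_inner_apostrophes(row: str, placeholder: str = "#$#") -> str:
    """Remove apostrophes inside fields while keeping the field quoting."""
    # Assumption: the row is wrapped as ('...'), like the sample in the thread.
    if not (row.startswith("('") and row.endswith("')")):
        raise ValueError("row is not wrapped in (' ... ')")
    inner = row[2:-2]
    # (b) hide the real field separators behind the placeholder
    inner = inner.replace("','", placeholder)
    # (c) any apostrophe still present sits inside a field, so drop it
    inner = inner.replace("'", "")
    # (d) restore the separators and the outer quoting
    return "('" + inner.replace(placeholder, "','") + "')"


print(strip_inner_apostrophes("('4','94-98 O'Riordan St','Bishop's Rd')"))
```

The outer `('` and `')` are set aside before step (c) because a plain replace of every apostrophe would otherwise strip the opening and closing quotes as well.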
HelloInsomnia, posted May 10, 2013:
I see what you are saying: you want it to match any name, and you do not know the names beforehand, right?
(?<=[a-zA-Z0-9])\'(?=[a-zA-Z0-9])
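For anyone who wants to try the lookaround pattern outside the original tool, here is a minimal Python sketch. The helper name `remove_inner_apostrophes` is an assumption for the example; the pattern is the one from the post with the redundant backslash before the apostrophe dropped.

```python
import re

# Match an apostrophe only when it has a letter or digit on BOTH sides.
INNER_APOSTROPHE = re.compile(r"(?<=[a-zA-Z0-9])'(?=[a-zA-Z0-9])")


def remove_inner_apostrophes(text: str) -> str:
    """Delete apostrophes that sit between two alphanumeric characters."""
    return INNER_APOSTROPHE.sub("", text)


row = (
    "('213.94.251.236','kwoconnor@gmail.com','car service','sydney',"
    "'google.com.au','4','94-98 O'Riordan St, Alexandria NSW 2015',"
    "'(02) 9318 8800','johnnewellmazda.com.au')"
)
cleaned = remove_inner_apostrophes(row)
print(cleaned)
```

The delimiting apostrophes survive because each one has a comma or a parenthesis on at least one side. Note that an apostrophe at the end of a word (for example Plumbers' Rd) is followed by a space, so this pattern would leave it in place.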
Pete, posted May 10, 2013:
Many ways to skin a cat:
\b\'(?=\w)
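A quick Python comparison of the two patterns in this thread, as a sketch rather than a statement about how the original tool behaves. Because an apostrophe is not a word character, `\b` in front of it only succeeds when the previous character is a word character, so both patterns target apostrophes between two "word-like" characters. The difference is that `\w` also accepts the underscore (and, in Python 3, non-ASCII letters). The sample strings are invented for the demonstration.

```python
import re

EXPLICIT = re.compile(r"(?<=[a-zA-Z0-9])'(?=[a-zA-Z0-9])")  # explicit character class
WORD_BOUNDARY = re.compile(r"\b'(?=\w)")                      # word-boundary variant

address = "'94-98 O'Riordan St','Bishop's Rd'"
underscore = "'snake_'case'"

# Both patterns agree on the address data from the thread.
print(EXPLICIT.sub("", address))
print(WORD_BOUNDARY.sub("", address))

# They differ when the neighbouring character is an underscore.
print(EXPLICIT.sub("", underscore))
print(WORD_BOUNDARY.sub("", underscore))
```

For address data like the sample row the two should behave the same; the explicit class is the stricter choice if underscores or accented letters might appear next to a delimiter.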
Kev, posted May 10, 2013:
[Quoting HelloInsomnia] I see what you are saying: you want it to match any name, and you do not know the names beforehand, right? (?<=[a-zA-Z0-9])\'(?=[a-zA-Z0-9])
Works a treat, thank you!
Kev, posted May 10, 2013:
[Quoting Legend] One way would be to (a) convert it to a text string, (b) do a simple replace of "," with any obscure string (e.g., #$#), (c) do a simple replace of ' with $nothing, (d) do a simple replace of #$# with ","... or something like that. Not sure if regex will do it otherwise.
[Quoting Pete] Many ways to skin a cat: \b\'(?=\w)
Thank you guys for your help too, it's appreciated.
Kev, posted May 12, 2013:
Guys, I'm having a problem trying to extract two pieces of data from this markup:
<div class="text vcard indent block" id="panel_A_2"> <div class="name lname" id="link_A_2"> <span class="pp-place-title"><span>Mercedes-Benz of Beverly Hills</span></span> <span class="actbar-local-wrapper"><span id="actbar-A" class="actbar" markerid="A" panelid="actbar-panel-A-pp" iscompact="1" jsvalues="@unique-id: $this.uniqueId;@id: 'actbar-'+$this.uniqueId;@markerid: $this.markerId;@panelId: 'actbar-panel-'+$this.uniqueId;"><span id="actbar-btns-A" jsvalues="@id:'actbar-btns-'+$this.uniqueId" jsdisplay="$this.visible!='none'" jsskip="1"><span jsaction="ab.topLevelClick" action="actbar-more" class="actbar-cmpct"><img src="http://maps.gstatic.com/intl/en_ALL/mapfiles/transparent.png" class="arrow-down"></span></span></span></span> <div class="pp-coverphoto"> <div> <a href="http://maps.google.com/local_url?dq=car+service+90210&q=https://plus.google.com/101849307966960802391/about%3Fgl%3DUS%26hl%3Den&ved=0CF0QhgU&sa=X&ei=vO2PUYKeJYG8iAb1loHAAQ&s=ANYYN7kUNOevIr2Bic1OQQ5iy_VXsd0sdg" target="_blank"> <div> <div id="photo-stack"> <img class="photo-stack-background" src="//maps.gstatic.com/mapfiles/thumbnail.png"> <div class="photo-shadow"> <div class="photo-border"> <img src="https://lh3.googleusercontent.com/-tOBfSgrZH1Q/T5R7taYJVUI/AAAAAAAhNzo/C6K6G8GAqTk/s85/Mercedes%2BBenz%2Bof%2BBeverly%2BHills" alt="Photo" title="Photo" class="pp-linked-photo"> </div> </div> </div> </div> </a> </div> <div class="pp-source-name pp-cover-source"> </div> </div> </div> <div> <div></div> <span dir="ltr" class="pp-headline-item pp-headline-address"><span>9250 Beverly Blvd, Beverly Hills, CA</span></span> <div> <div></div> <span class="pp-headline-item pp-headline-phone"> <span class="telephone" dir="ltr"> <nobr>(310) 659-2980</nobr> <span class="pp-headline-phone-label" style="display:none"> ()</span> </span> </span> <span> · </span> <span 
class="pp-headline-item pp-headline-authority-page"> <span><a href="http://maps.google.com/local_url?dq=car+service+90210&q=http://www.bhbenz.com/&ved=0CGIQ5AQ&sa=X&ei=vO2PUYKeJYG8iAb1loHAAQ&s=ANYYN7mLHKml8bdFWLylwRfGL4-_YLhP8w" target="_blank"><span>bhbenz.com</span></a></span> </span> </div> <div class="rescat-lhp2"><span class="cats-teaser">Category: <b>Car</b> Leasing <b>Service</b> </span> </div> <div> <span jshover="zagat-hover-vO2PUYKeJYG8iAb1loHAAQ-0" jsaction="mouseover:pp.hover;mouseout:pp.hover"> <b class="zagat-score">7</b> <div class="zagat-hover" id="zagat-hover-vO2PUYKeJYG8iAb1loHAAQ-0" reposition="false" jsaction="mouseover:pp.hover;mouseout:pp.hover" style="display:none"> <span class="zagat-hover-score">7</span> <span style="color:gray">/ 30</span> <span class="zagat-hover-explanation">Poor to fair</span> </div> </span> <span id="pp-reviews-headline"> <span><a class="pp-more-content-link" href="http://maps.google.com/local_url?dq=car+service+90210&q=https://plus.google.com/101849307966960802391/about%3Fgl%3DUS%26hl%3Den&ved=0CGEQlQU&sa=X&ei=vO2PUYKeJYG8iAb1loHAAQ&s=ANYYN7kUNOevIr2Bic1OQQ5iy_VXsd0sdg" target="_blank"><span>32 reviews</span></a></span></span> </div> <div><div class="pp-headline-item pp-knownforterms" dir="ltr"><span><span>service dept</span><span> · </span></span><span><span>service advisor</span><span> · </span></span><span><span>maybach</span><span> · </span></span><span><span>mbz</span></span></div></div> <div align="left"><span> <div class="pp-story pp-description" id="pp-desc-ssj"> <div> <span>"About what you would expect from a dealer <b>service</b>. 
All my work has been <b>...</b>"</span> <nobr> <span class="pp-hover-attribution"> - <span></span> </span> </nobr> </div> </div> </span></div> </div> <div class="actbar-local-wrapper"> <span><span id="actbar-panel-A-pp" class="actbar" jsvalues="@unique-id: $this.uniqueId;@markerid: $this.markerId;@id: 'actbar-panel-' + $this.uniqueId;@panelId: 'actbar-panel-' + $this.uniqueId;"><span jsvalues="@id:'actbar-sn-' + $this.uniqueId;"><span jsdisplay="$this.visible=='actbar-sn'"></span></span><span jsvalues="@id:'actbar-saveto-' + $this.uniqueId;"><span jsdisplay="$this.visible=='actbar-saveto'"></span></span></span></span> </div> </div>
What I need is the business name (in this case Mercedes-Benz of Beverly Hills) and also the category (which is car dealer). I already have the other data such as the URL, phone, etc.; I am using Replace Regular Expression to find that data. Thanks in advance for the help.
HelloInsomnia, posted May 12, 2013:
This should get you the title:
(?<=\=\"pp-place-title\"\>\<span\>)[a-zA-Z0-9\s\S]*?(?=[\<])
For the category, are you asking for the text after "Category:"?
Kev, posted May 12, 2013:
Thanks for that. Yes, the text after "Category:". Thanks.
HelloInsomnia, posted May 12, 2013:
Okay, this should work:
(?<=\=\"cats-teaser\"\>Category\:\s).*?(?=(\<\/span))
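To see what the category pattern returns, here is a small Python sketch run against the category fragment of the markup posted earlier in the thread. The pattern is the same as the one above with the unnecessary backslashes removed; the variable names are only for illustration.

```python
import re

# Lookbehind anchors on the cats-teaser span; the lazy .*? stops at the first </span.
CATEGORY = re.compile(r'(?<=="cats-teaser">Category:\s).*?(?=</span)')

snippet = (
    '<div class="rescat-lhp2"><span class="cats-teaser">Category: '
    '<b>Car</b> Leasing <b>Service</b> </span> </div>'
)

match = CATEGORY.search(snippet)
# strip() removes the trailing space that precedes </span> in the source markup.
category = match.group(0).strip() if match else None
print(category)
```

The match still contains the `<b>` tags from the page, and `.` does not cross line breaks by default, so this assumes the span sits on a single line as it does in the sample.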
Kev, posted May 13, 2013:
[Quoting HelloInsomnia] This should get you the title: (?<=\=\"pp-place-title\"\>\<span\>)[a-zA-Z0-9\s\S]*?(?=[\<]) For the category, are you asking for the text after "Category:"?
Just tried this out now. I thought it was working fine, but there's a slight problem. Sometimes part of a result is in bold, and then the regex only captures the first word. For example, go to https://maps.google.co.uk and use the search term: emergency plumber london
It will grab just the word "Lambeth" rather than "Lambeth Plumbers" ("Plumbers" being in bold). Thanks for the help on this. Category works perfectly.
HelloInsomnia, posted May 13, 2013:
Try replacing the [a-zA-Z0-9\s\S] with a period and see if that helps.
Kev, posted May 13, 2013:
Hi, it's still coming back without the bold results, unfortunately.
HelloInsomnia, posted May 13, 2013:
Yup, I see the problem. Try this:
(?<=\=\"pp-place-title\"\>\<span\>).*?(?=(\<\/span))
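The difference between the first title pattern and this one comes down to the lookahead: stopping at any `<` ends the match at the opening `<b>`, while stopping at `</span` lets the match run through the bold tags. A minimal Python sketch of both follows; the HTML string is a hypothetical reconstruction of a bolded result (the thread does not show the actual markup for "Lambeth Plumbers"), and the backslashes that Python does not need have been dropped.

```python
import re

# Earlier pattern: lazy match that ends before the first "<" of any tag.
FIRST_TAG = re.compile(r'(?<=="pp-place-title"><span>)[a-zA-Z0-9\s\S]*?(?=[<])')
# Revised pattern: lazy match that ends before the first closing span tag.
CLOSING_SPAN = re.compile(r'(?<=="pp-place-title"><span>).*?(?=</span)')

# Hypothetical markup, assumed to mirror the structure of the posted sample.
html = '<span class="pp-place-title"><span>Lambeth <b>Plumbers</b></span></span>'

print(repr(FIRST_TAG.search(html).group(0)))
print(repr(CLOSING_SPAN.search(html).group(0)))
```

This also suggests why swapping the character class for a period alone made no difference: the stopping condition was still the first `<`.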
Kev, posted May 13, 2013:
Works a treat, thank you. Any chance the regex could also remove the <b> and </b> found in those results? If not, that's fine; I can loop through them and replace them anyhow, but I wondered if it could be added to the regex line you provided.
Thanks so much for this,
Kev
HelloInsomnia, posted May 13, 2013:
I couldn't come up with anything off the top of my head, but I might try again later.
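On the question of removing the bold tags: a regex match is always one contiguous slice of the input, so a find-only pattern cannot skip over tags in the middle of the title. One possible approach, sketched here in Python rather than in the original tool, is a second replace pass over the extracted text. The function name `extract_title` and the sample markup are assumptions for the example.

```python
import re
from typing import Optional

TITLE = re.compile(r'(?<=="pp-place-title"><span>).*?(?=</span)')
BOLD_TAG = re.compile(r"</?b>")  # matches both <b> and </b>


def extract_title(html: str) -> Optional[str]:
    """Pull the place title out of the markup, then strip any bold tags."""
    match = TITLE.search(html)
    if match is None:
        return None
    # Second pass: delete the tags but keep the text they wrap.
    return BOLD_TAG.sub("", match.group(0)).strip()


# Hypothetical markup for a bolded result.
sample = '<span class="pp-place-title"><span>Lambeth <b>Plumbers</b></span></span>'
print(extract_title(sample))
```

This is the same loop-and-replace idea Kev mentions, folded into one helper; the same second pass would work on the category text too.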