Brutal (posted November 17, 2016)
Hey guys, how would I parse this out? Example: sub.domain.com. I would want to be left with just domain.com. Another way to look at it: how can I tell if a domain name in a variable contains 2 dots, like the example above (sub[dot] and domain[dot])?
pash (posted November 17, 2016)
You want only the domain? Try:

    alert($find regular expression("sub.domain1.com","\\w+\\.\\w+$"))
    alert($find regular expression("sub.domain.com","\\w+\\.\\w+$"))
    alert($find regular expression("domain.com","\\w+\\.\\w+$"))
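For readers outside uBot, here is a minimal Python sketch of the same idea. It reuses the pattern from the post and adds the dot count the original question asked about. The function names are made up for illustration and are not part of uBot or any library.

```python
import re

# Same pattern as in the post: the last two dot-separated labels.
LAST_TWO_LABELS = re.compile(r"\w+\.\w+$")

def last_two_labels(host):
    """Return the last two labels of host, or "" when nothing matches."""
    match = LAST_TWO_LABELS.search(host)
    return match.group(0) if match else ""

def has_subdomain(host):
    """Naive check from the question: two or more dots means a subdomain."""
    return host.count(".") >= 2

for host in ("sub.domain1.com", "sub.domain.com", "domain.com"):
    print(host, "->", last_two_labels(host), has_subdomain(host))
```

Note that \w does not match a hyphen, so in this sketch a host such as sub.my-domain.com comes back as domain.com rather than my-domain.com.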
Brutal (posted November 17, 2016)
Pash - You Are A Total Rock Star Man!! Thank you!
deliter (posted November 17, 2016)
Pash's answer is correct, as you asked about sites that end with .com, but making a generic server URL parser is pretty much impossible because of sites like amazon.co.uk and shopping.amazon.co.uk.
zozo31 (posted November 18, 2016)
If you really need to parse gTLDs or ccTLDs with 100% accuracy, you can use TLDExtract. As the creator says:

TLDExtract accurately separates the gTLD or ccTLD (generic or country code top-level domain) from the registered domain and subdomains of a URL, e.g. domain parser. For example, say you want just the 'google' part of 'http://www.google.com'.

Everybody gets this wrong. Splitting on the '.' and taking the last 2 elements goes a long way only if you're thinking of simple e.g. .com domains. Think parsing http://forums.bbc.co.uk for example: the naive splitting method above will give you 'co' as the domain and 'uk' as the TLD, instead of 'bbc' and 'co.uk' respectively.

TLDExtract on the other hand knows what all gTLDs and ccTLDs look like by looking up the currently living ones according to the Public Suffix List. So, given a URL, it knows its subdomain from its domain, and its domain from its country code.

You can install it via Composer, write a simple form, then put it online so your bot can access it, or try to use TLDExtract directly.
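The failure described in the quote is easy to reproduce. The sketch below is only the naive splitting method, not TLDExtract itself, and naive_split is a hypothetical name chosen for this example.

```python
def naive_split(host):
    """Split on '.' and treat the last two labels as (domain, tld)."""
    labels = host.split(".")
    return labels[-2], labels[-1]

print(naive_split("www.google.com"))    # ('google', 'com')
print(naive_split("forums.bbc.co.uk"))  # ('co', 'uk'), not ('bbc', 'co.uk')
```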
HelloInsomnia (posted November 20, 2016)
Try this:

    [a-zA-Z0-9-]+\.([a-zA-Z]{2,10}\.[a-zA-Z]{2,2}|[a-zA-Z]+)$
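A quick way to see what this pattern does is to run it against a few hosts. The Python sketch below copies the pattern unchanged; the helper name registered_part is an assumption made for illustration.

```python
import re

# Pattern copied from the post: one label, then either a
# "name.xx" style two-part ending or a single alphabetic ending.
PATTERN = re.compile(r"[a-zA-Z0-9-]+\.([a-zA-Z]{2,10}\.[a-zA-Z]{2,2}|[a-zA-Z]+)$")

def registered_part(host):
    """Return the first (leftmost) match of the pattern in host, or ""."""
    match = PATTERN.search(host)
    return match.group(0) if match else ""

for host in ("sub.domain.com", "amazon.co.uk", "shopping.amazon.co.uk", "sub.domain.io"):
    print(host, "->", registered_part(host))
```

The first three hosts reduce to domain.com, amazon.co.uk and amazon.co.uk. The last one shows a limit of the heuristic: any two-letter ending is treated as the second half of a two-part extension, so sub.domain.io is returned whole.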
pash (posted November 21, 2016)
Quoting deliter: "Pash's answer is correct, as you asked that the sites end with .com, but to make a generic server URL parser is pretty much impossible, sites like amazon.co.uk, shopping.amazon.co.uk"

Try:

    define $GetDomainName(#domain) {
        set(#domainOut,$find regular expression(#domain,"\\w+\\.\\w+$"),"Local")
        if($text length($replace regular expression(#domainOut,"\\..*","")) <= 2) {
            then {
                set(#domainOut,$find regular expression(#domain,"\\w+\\.\\w+\\.\\w+$"),"Local")
            }
        }
        return(#domainOut)
    }
    alert($GetDomainName("test.mydomain.com"))
    alert($GetDomainName("amazon.co.uk"))
    alert($GetDomainName("shopping.amazon.co.uk"))
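The logic of $GetDomainName can be followed step by step in a rough Python port. This is a sketch of the heuristic as posted, not a tested uBot equivalent, and get_domain_name and _find are names invented here.

```python
import re

def _find(pattern, text):
    """Return the first regex match in text, or "" when there is none."""
    match = re.search(pattern, text)
    return match.group(0) if match else ""

def get_domain_name(host):
    # Start with the last two labels, e.g. "co.uk" for "amazon.co.uk".
    out = _find(r"\w+\.\w+$", host)
    # Drop everything from the first dot on to get the leading label.
    first_label = re.sub(r"\..*", "", out)
    # A leading label of 1-2 characters is taken to be part of the
    # extension (such as "co"), so fall back to the last three labels.
    if len(first_label) <= 2:
        out = _find(r"\w+\.\w+\.\w+$", host)
    return out

for host in ("test.mydomain.com", "amazon.co.uk", "shopping.amazon.co.uk"):
    print(host, "->", get_domain_name(host))
```

The three hosts from the post give mydomain.com, amazon.co.uk and amazon.co.uk. The length test is a guess about what an extension looks like, so short registered names trip it: in this port www.hp.com comes back whole, and t.co comes back empty because there is no third label to fall back to.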
kev123 (posted November 21, 2016)
I don't have this written in uBot, but this is what I do in .NET to extract the domain extension with 100% accuracy:

1. Add the known domain extensions to a list.
2. Split subdomain.yourdomain.co.uk into a list by dot.
3. In reverse order, check whether each candidate is contained in the list of extensions, each time adding the next list item: uk, co.uk, yourdomain.co.uk, subdomain.yourdomain.co.uk.
4. If a candidate exists in the domain extension list, add it to a new list.
5. Check the new list for the item with the most characters.
6. You now have the domain extension; in the above example it would be .co.uk.
7. If you want to get the subdomain, replace the domain extension with nothing in the original domain variable, leaving subdomain.yourdomain.
8. If what is left contains a dot, there is a subdomain: split again by the dot, and the first item is the subdomain and the second is the domain.

If you only want to check against a few top-level domains (.com, .org, .co.uk, etc.) then this is overkill, but if you want to check for every domain extension possible this works very well.

Thanks,
kev123
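The steps above can be sketched in Python as follows. The extension set is a deliberately tiny, hypothetical stand-in; a real list would be loaded from a maintained source such as the Public Suffix List. The name split_host is also an assumption, and step 8 splits on the last dot so that multi-level subdomains stay together, which is a small departure from the description.

```python
# Hypothetical, deliberately tiny extension list for illustration only.
KNOWN_EXTENSIONS = {"com", "org", "uk", "co.uk"}

def split_host(host, extensions=KNOWN_EXTENSIONS):
    """Return (subdomain, domain, extension); empty strings if unknown."""
    host = host.lower()
    labels = host.split(".")                      # step 2

    # Steps 3-4: build candidates from the right ("uk", "co.uk", ...)
    # and keep the ones found in the extension list.
    matches = []
    for i in range(len(labels) - 1, -1, -1):
        candidate = ".".join(labels[i:])
        if candidate in extensions:
            matches.append(candidate)
    if not matches:
        return "", "", ""

    extension = max(matches, key=len)             # steps 5-6: longest wins

    # Step 7: strip the extension and the dot in front of it.
    remainder = host[: -len(extension)].rstrip(".")

    # Step 8: anything before the last dot is the subdomain.
    if "." in remainder:
        subdomain, domain = remainder.rsplit(".", 1)
    else:
        subdomain, domain = "", remainder
    return subdomain, domain, extension

print(split_host("subdomain.yourdomain.co.uk"))
print(split_host("yourdomain.com"))
```

With this sample list the two calls yield ('subdomain', 'yourdomain', 'co.uk') and ('', 'yourdomain', 'com'). Accuracy depends entirely on how complete and current the extension list is.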