UBot Underground

How To Delete Sub Domains



Hey guys, how would I parse this out?

 

example: sub.domain.com

 

I would want to be left with just: domain.com

 

Or, put another way: how can I tell whether a domain name in a variable contains 2 dots, like the example above (sub[dot] and domain[dot])?


Do you want only the domain?
Try:

alert($find regular expression("sub.domain1.com","\\w+\\.\\w+$"))
alert($find regular expression("sub.domain.com","\\w+\\.\\w+$"))
alert($find regular expression("domain.com","\\w+\\.\\w+$"))
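
For anyone who wants to check the pattern outside UBot, here is a rough Python sketch (not UBot script) of the same "last two labels" regex, plus a dot count for the second half of the question. The function names are made up for illustration.

```python
import re

# Same pattern as in the alerts above: the last two dot-separated labels.
LAST_TWO_LABELS = re.compile(r"\w+\.\w+$")

def last_two_labels(host):
    """Return the last two labels of host, or "" if the pattern finds nothing."""
    match = LAST_TWO_LABELS.search(host)
    return match.group(0) if match else ""

def has_two_or_more_dots(host):
    """True for names like sub.domain.com, False for domain.com."""
    return host.count(".") >= 2

for host in ("sub.domain1.com", "sub.domain.com", "domain.com"):
    print(host, "->", last_two_labels(host), has_two_or_more_dots(host))
```

One thing to watch: in Python, \w does not match a hyphen, so "sub.my-domain.com" comes back as "domain.com". If your domains can contain hyphens, a pattern such as [\w-]+\.[\w-]+$ may be closer to what you want.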

Pash's answer is correct, since you asked about sites that end with .com, but making a generic URL parser is pretty much impossible.

 

Consider sites like amazon.co.uk and shopping.amazon.co.uk.


If you really need to parse gTLDs or ccTLDs accurately, you can use TLDExtract.
 
As the creator says:
 

TLDExtract accurately separates the gTLD or ccTLD (generic or country code top-level domain) from the registered domain and subdomains of a URL, e.g. domain parser. For example, say you want just the 'google' part of 'http://www.google.com'.

Everybody gets this wrong. Splitting on the '.' and taking the last 2 elements goes a long way only if you're thinking of simple e.g. .com domains. Think parsing http://forums.bbc.co.uk for example: the naive splitting method above will give you 'co' as the domain and 'uk' as the TLD, instead of 'bbc' and 'co.uk' respectively.

TLDExtract on the other hand knows what all gTLDs and ccTLDs look like by looking up the currently living ones according to the Public Suffix List. So, given a URL, it knows its subdomain from its domain, and its domain from its country code.
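
To see the failure the quote describes, here is a small Python sketch of the naive split. It is only an illustration of the problem, not how TLDExtract works, and naive_split is a made-up name.

```python
def naive_split(host):
    """Split on '.' and treat the last two labels as (domain, tld)."""
    labels = host.split(".")
    if len(labels) < 2:
        raise ValueError("need at least two labels: " + host)
    return labels[-2], labels[-1]

print(naive_split("www.google.com"))    # ('google', 'com')
print(naive_split("forums.bbc.co.uk"))  # ('co', 'uk'), not ('bbc', 'co.uk')
```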


You can install it via Composer, write a simple form, then put it online so your bot can access it :D

 

Or try to use TLDExtract directly. :)



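Try this. It falls back to the last three labels when the label before the final dot is two characters or fewer (as in co.uk):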

define $GetDomainName(#domain) {
    set(#domainOut,$find regular expression(#domain,"\\w+\\.\\w+$"),"Local")
    if($text length($replace regular expression(#domainOut,"\\..*","")) <= 2) {
        then {
            set(#domainOut,$find regular expression(#domain,"\\w+\\.\\w+\\.\\w+$"),"Local")
        }
    }
    return(#domainOut)
}
alert($GetDomainName("test.mydomain.com"))
alert($GetDomainName("amazon.co.uk"))
alert($GetDomainName("shopping.amazon.co.uk"))
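
If you want to test the idea outside UBot, here is a rough Python port of the function above. It is a sketch, not the original: get_domain_name is a made-up name, and keeping the two-label result when the three-label pattern finds nothing is my assumption, since the thread does not show what the UBot version returns in that case.

```python
import re

def get_domain_name(domain):
    """Heuristic: last two labels, or last three if the second-to-last is short."""
    match = re.search(r"\w+\.\w+$", domain)
    out = match.group(0) if match else ""
    # A label of two characters or fewer before the final dot ("co" in co.uk)
    # is taken as a sign of a two-part extension.
    if len(out.split(".")[0]) <= 2:
        longer = re.search(r"\w+\.\w+\.\w+$", domain)
        if longer:  # assumption: keep the two-label result otherwise
            out = longer.group(0)
    return out

for host in ("test.mydomain.com", "amazon.co.uk", "shopping.amazon.co.uk"):
    print(host, "->", get_domain_name(host))
```

Bear in mind that the length check is only a heuristic. A short registered name trips it: "www.bt.com" comes back as "www.bt.com" rather than "bt.com".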

I don't have this written in UBot, but this is what I do in .NET to extract the domain extension with 100% accuracy:

 

1. Add the domain extensions you want to recognise to a list.

2. Split subdomain.yourdomain.co.uk into a list on the dots.

3. Working in reverse order, build up candidates by adding the next list item each time, and check whether each one is in the list of extensions:

 

uk

co.uk

yourdomain.co.uk

subdomain.yourdomain.co.uk

 

4. If a candidate exists in the domain extension list, add it to a new list.

5. Check the new list for the item with the most characters.

6. You now have the domain extension; in the above example it would be .co.uk

 

7. If you want to get the subdomain, replace the domain extension with nothing in the original domain variable,

leaving subdomain.yourdomain

 

8. If what is left contains a dot, there is a subdomain: split again on the dot, and the first item is the subdomain and the second is the domain.

 

If you only want to check against a few top-level domains (.com, .org, .co.uk, etc.) then this is overkill, but if you want to check for every possible domain extension this works very well.
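
 

The steps above could be sketched roughly like this in Python (not the .NET code, which isn't posted). KNOWN_SUFFIXES is a deliberately tiny, hypothetical list, and split_domain is a made-up name; a real list would hold every extension you care about.

```python
# Hypothetical, deliberately tiny extension list (step 1).
KNOWN_SUFFIXES = {"com", "org", "uk", "co.uk"}

def split_domain(host, suffixes=KNOWN_SUFFIXES):
    """Return (subdomain, domain, extension), or None if no extension matches."""
    host = host.lower()
    labels = host.split(".")                                  # step 2
    # Step 3: build "uk", "co.uk", "yourdomain.co.uk", ... from the right.
    candidates = [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]
    matches = [c for c in candidates if c in suffixes]        # step 4
    if not matches:
        return None
    suffix = max(matches, key=len)                            # steps 5 and 6
    rest = host[: -len(suffix)].rstrip(".")                   # step 7
    # Step 8: the last remaining label is the domain, anything before it
    # is the subdomain (this also copes with more than one subdomain level).
    subdomain, _, domain = rest.rpartition(".")
    return subdomain, domain, suffix

print(split_domain("subdomain.yourdomain.co.uk"))
print(split_domain("yourdomain.com"))
```

Picking the longest match in step 5 is what keeps "co.uk" from being cut down to "uk".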

 

thanks

kev123
