\b
# Match the leading part (proto://hostname, or just hostname)
(
    # ftp://, http://, or https:// leading part
    (ftp|https?)://[-\w]+(\.\w[-\w]*)+
  |
    # or, try to find a hostname with our more specific sub-expression
    (?i: [a-z0-9] (?:[-a-z0-9]*[a-z0-9])? \. )+ # sub domains
    # Now ending .com, etc. For these, require lowercase
    (?-i: com\b
        | edu\b
        | biz\b
        | gov\b
        | in(?:t|fo)\b # .int or .info
        | mil\b
        | net\b
        | org\b
        | [a-z][a-z]\b # two-letter country codes
    )
)

# Allow an optional port number
( : \d+ )?

# The rest of the URL is optional, and begins with / . . .
(
    /
    # The rest are heuristics for what seems to work well
    [^.!,?;"'<>()\[\]{}\s\x7F-\xFF]*
    (?:
        [.!,?]+ [^.!,?;"'<>()\[\]{}\s\x7F-\xFF]+
    )*
)?
-----------------------------------------------------------------------------
Copyright 1997-2024 Jeffrey Friedl
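
The expression above depends on free-spacing mode, where whitespace is ignored and each # comment runs to the end of its line. As a minimal sketch of how it might be applied, the following assumes Python's re module with re.VERBOSE (Python 3.6 or later for the scoped (?i:...) and (?-i:...) groups); the original listing does not name a host language. The names URL_PATTERN and find_urls are hypothetical and introduced here only for illustration.

```python
import re

# The pattern from the listing, compiled in verbose (free-spacing) mode.
# Assumption: Python's re engine; the listing itself is engine-neutral.
URL_PATTERN = re.compile(r"""
\b
# Match the leading part (proto://hostname, or just hostname)
(
    # ftp://, http://, or https:// leading part
    (ftp|https?)://[-\w]+(\.\w[-\w]*)+
  |
    # or, try to find a hostname with our more specific sub-expression
    (?i: [a-z0-9] (?:[-a-z0-9]*[a-z0-9])? \. )+ # sub domains
    # Now ending .com, etc. For these, require lowercase
    (?-i: com\b
        | edu\b
        | biz\b
        | gov\b
        | in(?:t|fo)\b # .int or .info
        | mil\b
        | net\b
        | org\b
        | [a-z][a-z]\b # two-letter country codes
    )
)

# Allow an optional port number
( : \d+ )?

# The rest of the URL is optional, and begins with /
(
    /
    # The rest are heuristics for what seems to work well
    [^.!,?;"'<>()\[\]{}\s\x7F-\xFF]*
    (?:
        [.!,?]+ [^.!,?;"'<>()\[\]{}\s\x7F-\xFF]+
    )*
)?
""", re.VERBOSE)


def find_urls(text):
    """Return every substring of text that the pattern matches, in order."""
    return [m.group(0) for m in URL_PATTERN.finditer(text)]


if __name__ == "__main__":
    sample = "Visit http://www.example.com/index.html, or see regex.info for more."
    for url in find_urls(sample):
        print(url)
```

The trailing-punctuation heuristic means a comma or period that ends a sentence is left out of the match, while a period inside the path (as in index.html) is kept. A bare hostname is accepted only when its final label is one of the listed lowercase endings or a two-letter code.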