
Re: PHP Regular expression help: msg#00017

php.tcphp

Subject: Re: PHP Regular expression help

If you use mine, do the right thing & change all references to
"domain.com" to "example.com", as you're supposed to.

Also, sorry about the line breaks.
Daniel J. Post

On Mon, 21 Mar 2005 15:32:12 -0600, Daniel J. Post wrote:
> I've seen and used weirder and more obfuscated, but nothing more
> complex than this one. I worked it out by hand using the RFCs. It is
> probably not complete, but was as close as I could stand to get it.
>
> // returns false if URL contains spaces or illegal characters.
> // ignores the http:// (programming should strip or require)
> // second parameter is whatever you want to return if $val is empty
> function djp_is_valid_url($val,$allow_empty=0)
> {
> if ($val)
> {
> $http='http(s)?://'; // the "s" is optional; the whole scheme is made optional in $validurl.
>
> // usernames and passwords can theoretically be transmitted
> // through the URL, as username:password@example.com, etc.
> $userpass='[a-z0-9]{1,}:[a-z0-9]{1,}@';
>
> // domain is alphanumeric with hyphens,
> // any number of iterations, separated by periods.
> // followed by at least two alphanumeric with hyphens,
> // followed by a period, then 2-6 alphas
> // (.com, .edu, .tv, .net, .museum, et al)
> $domain='((([0-9a-z-]*\.)*)?[0-9a-z-]{2,})+\.[a-z]{2,6}';
>
> // an ip address is 4 numerics delimited by periods.
> $ip='([0-9]{1,3}\.){3}([0-9]{1,3}){1}'; // the period is escaped so it only matches a literal "."
>
> // and very optional port #,
> $port=':[0-9]{1,5}';
>
> // with an optional following slash.
> $slash='/?';
>
> // path must start with a slash, followed by optional tilde,
> // then alphanumeric with underscores, slashes and periods.
> $path='/(~?[0-9a-z/\._-])*';
>
> // "GET" strings. There's an easier way BEGGING to come out of this....
> $getparams='('.
> // the first get string starts with ?,
> // followed by Alphanumeric,'=',optional Alphanumeric or
> // %(2hex characters)
> '(\?[_0-9a-z]{1,}=(([0-9a-z+/_*.-]*)|(%[a-f0-9]{2}))*)*'.
> // the second and subsequent get parameters start with &,
> // followed by Alphanumeric,'=',optional Alphanumeric or
> // %(2hex characters)
> '(\&[_0-9a-z]{1,}=(([0-9a-z+/_*.-]*)|(%[a-f0-9]{2}))*)*'.
> // allow zero or one of the preceding.
> ')?';
>
> $fragment='(#([0-9a-z_.,]*))?';
> // assemble the parts & check.
>
> $validurl="^(($http)?(($userpass)?($domain)|($ip))($port)?){1}$slash($path$getparams$fragment)?$";
> // for extra credit, uncomment the next line and read the output
> // print("The regular expression that approximates a valid URL is:\n$validurl");
> if (eregi($validurl,$val))
> {
> return 1;
> }
> else
> {
> return 0;
> }
> }
> else return $allow_empty;
> }
>
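
For anyone trying the quoted function on a current PHP: eregi() was deprecated in PHP 5.3 and removed in PHP 7.0, so it will not run there as posted. Below is a rough preg_match() sketch of the same idea. It is a simplification under my own assumptions, not a character-for-character port: the function name is_valid_url_sketch is hypothetical, the sub-patterns are tightened a little (for example, it accepts one-letter host labels, which the original does not), and "!" is assumed as the delimiter only because none of the parts use it.

```php
<?php
// Sketch only: a simplified preg_match() take on the quoted pattern.
// is_valid_url_sketch is a hypothetical name, not from the original post.
function is_valid_url_sketch($val, $allow_empty = false)
{
    // Mirror the original's "return the second parameter when empty" behavior.
    if ($val === '' || $val === null) {
        return $allow_empty;
    }

    $http     = 'https?://';                        // scheme, optional below
    $userpass = '[a-z0-9]+:[a-z0-9]+@';             // user:pass@
    $domain   = '(?:[0-9a-z-]+\.)+[a-z]{2,6}';      // labels, then a 2-6 letter TLD
    $ip       = '(?:[0-9]{1,3}\.){3}[0-9]{1,3}';    // dotted quad, range not checked
    $port     = ':[0-9]{1,5}';
    $path     = '(?:/[~0-9a-z._-]*)*';              // zero or more /segments
    // One name=value pair; values may contain %XX escapes.
    $pair     = '[_0-9a-z]+=(?:[0-9a-z+/_*.-]|%[a-f0-9]{2})*';
    $query    = "(?:\\?$pair(?:&$pair)*)?";         // ?a=1&b=2
    $fragment = '(?:#[0-9a-z_.,]*)?';

    // The "i" modifier stands in for eregi()'s case-insensitive matching.
    $pattern = "!^(?:$http)?(?:(?:$userpass)?$domain|$ip)(?:$port)?$path$query$fragment\$!i";

    return preg_match($pattern, $val) === 1;
}

var_dump(is_valid_url_sketch('http://example.com/index.php?id=5#top'));
var_dump(is_valid_url_sketch('http://exa mple.com'));
```

Defining the query pair once and reusing it for the "?" and "&" cases is one possible reading of the "easier way" the comment in the quoted code asks for.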

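One more caveat on the $ip part: "4 numerics delimited by periods" also accepts things like 999.999.999.999, because the pattern does not check that each number is 0-255. If that matters, a separate check along these lines might do. This is a sketch, and is_dotted_quad_sketch is a hypothetical helper name; filter_var() with FILTER_VALIDATE_IP needs PHP 5.2 or later.

```php
<?php
// Sketch: is_dotted_quad_sketch is a hypothetical helper name.
function is_dotted_quad_sketch($host)
{
    // FILTER_FLAG_IPV4 restricts the check to IPv4 dotted-quad notation;
    // filter_var() returns false when validation fails.
    return filter_var($host, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
}

var_dump(is_dotted_quad_sketch('192.168.0.1'));
var_dump(is_dotted_quad_sketch('999.999.999.999'));
```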
