
Re: Stripping HTML: msg#00021

Subject: Re: Stripping HTML
On Monday, 19 May 2003 at 21:50, A. Pagaltzis wrote:
> * Daniel Cutter <dcutter@xxxxxxxx> [2003-05-19 19:30]:
> > s/<.+?>//g;             # step 1
> > s/<.*?script//g;        # step 2
> > s/</&lt;/g; s/>/&gt;/g; # step 3
> > 
> > Step 2 removes the nasty, step three removes the unknown.
> 
> Step 1 and 2 are both broken. (Examples as to how left as an
> exercise for the reader.) Why not leave it at step three
> and call it a day?

Broken by stuff like:

<
body>
<
script>

The . metacharacter does not match \n unless the /s modifier is given,
so steps 1 and 2 never match these tags.
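
A minimal sketch of that failure, assuming the input above arrives as a
single string with embedded newlines (the variable names are made up for
illustration). It runs step 1 as posted, then the same substitution with
/s added:

```perl
use strict;
use warnings;

# Hypothetical input: a newline sits between "<" and each tag name.
my $html = "<\nbody>\n<\nscript>";

# Step 1 as posted: "." cannot cross the newline, so nothing is removed.
(my $plain = $html) =~ s/<.+?>//g;

# Same substitution with /s, which lets "." match "\n" as well.
(my $dotall = $html) =~ s/<.+?>//gs;

print $plain eq $html ? "without /s: unchanged\n" : "without /s: changed\n";
print $dotall eq "\n" ? "with /s: both tags removed\n" : "with /s: tags remain\n";
```

Adding /s only closes this particular hole. A ">" inside an attribute
value, for instance, still ends the match too early.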

-- 
 Philippe "BooK" Bruhat

 The shortest distance between two points is not always the safest.
                                    (Moral from Groo The Wanderer #69 (Epic))



