Parsing an HTML doc into DOM: msg#00049

mozilla.devel.dom

Subject: Parsing an HTML doc into DOM

Hi,
I'm trying to parse an HTML doc (given its URL) into an
nsIDOMHTMLDocument using the Mozilla htmlparser.
The main question is: which object can I QI to get an
nsIHTMLContentSink? This is the biggest problem right now, because
the rest of the code is pretty much done.

What I actually need is to extract all the URL links from the given
HTML doc. My first idea was to derive a class from
nsIHTMLContentSink, override its OpenXXX/CloseXXX routines, and
intercept the data I need from there. Unfortunately, when I tried
that, the parser invoked only OpenHTML() and OpenBody() and
nothing else, so I'm now trying to turn the HTML into a DOM and
work with that instead.
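
To make the goal concrete, here is a minimal sketch of the
callback-style link extraction described above. It does not use the
Mozilla htmlparser or nsIHTMLContentSink at all; it uses Python's
standard html.parser module purely to illustrate the idea of
intercepting start tags. The names LinkCollector, extract_links and
URL_ATTRS are hypothetical, and the list of URL-carrying attributes
is an assumption that is not exhaustive.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed, non-exhaustive map of tag name -> attribute that carries a URL.
URL_ATTRS = {
    "a": "href",
    "area": "href",
    "link": "href",
    "img": "src",
    "script": "src",
    "frame": "src",
    "iframe": "src",
}


class LinkCollector(HTMLParser):
    """Collects URLs from start tags, similar in spirit to a sink's Open callbacks."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        wanted = URL_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative URLs against the document's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html_text, base_url=""):
    """Return every URL found in html_text, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    collector.close()
    return collector.links
```

With a sink-style parser the links come out in document order and no
DOM has to be built, which is why this was my first idea.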

Any suggestions?
Thanks in advance.

