Subject: Re: Parser for restricted GML... - msg#00411

List: gis.jump.devel

This would be especially important if you had multiple
FeatureCollections stored in one file, and you wanted to load only one
of them.
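
The case of several FeatureCollections in one file can be sketched
with the StAX cursor API (javax.xml.stream), which is the interface a
pull parser such as SJSXP implements. This is only an illustration
under assumptions: the element names "FeatureCollection" and "feature"
and the attributes "name" and "id" are hypothetical, not taken from
any GML schema or from OpenJUMP, and collection names are assumed to
be unique within the file.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class OneCollectionSketch {

    /** Collects the ids of the features inside the collection named wanted. */
    static List<String> featureIds(Reader in, String wanted) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(in);
        List<String> ids = new ArrayList<>();
        boolean inWanted = false;
        try {
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String tag = r.getLocalName();
                    if ("FeatureCollection".equals(tag)) {
                        // Hypothetical "name" attribute identifies the collection.
                        inWanted = wanted.equals(r.getAttributeValue(null, "name"));
                    } else if (inWanted && "feature".equals(tag)) {
                        ids.add(r.getAttributeValue(null, "id"));
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "FeatureCollection".equals(r.getLocalName())) {
                    if (inWanted) {
                        break; // wanted collection is complete; stop parsing here
                    }
                }
            }
        } finally {
            r.close();
        }
        return ids;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<file>"
                + "<FeatureCollection name='roads'><feature id='r1'/></FeatureCollection>"
                + "<FeatureCollection name='buildings'>"
                + "<feature id='b1'/><feature id='b2'/>"
                + "</FeatureCollection>"
                + "<FeatureCollection name='parks'><feature id='p1'/></FeatureCollection>"
                + "</file>";
        System.out.println(featureIds(new StringReader(xml), "buildings")); // prints [b1, b2]
    }
}
```

Features in the other collections are scanned past without any object
being created for them, and parsing stops as soon as the wanted
collection ends.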

Sunburned Surveyor wrote:
> Paul is correct. The pull parser does not reduce the memory used by the
> parsing results, but it does reduce the memory used during the parsing
> process, because an in-memory representation of the entire XML
> document is never constructed.
>
> One advantage of this is using the parser to select only data within
> the XML file that meets specific criteria. For example, if we had a
> 50MB SGF file representing the city of Stockton, I could parse the
> file and create only building features, even though the file might
> contain road features, landmark features, park features, etc.
> In fact, I could even parse the file and create features only for
> buildings whose "building type" attribute was set to "Public". This
> allows me to extract the information I want without holding all 50 MB
> in memory at once.
>
> The Sunburned Surveyor
>
> On 8/30/07, Paul Austin <mail-lists@xxxxxxxxxxxx> wrote:
>
>> Hi Larry,
>>
>> You are correct that the resulting data set will take up a lot of memory
>> at the end. The advantage of the pull parser is that you don't take up
>> a whole bunch of extra memory for the XML DOM structures, which typically
>> get loaded into memory for the whole document. So with the pull parser
>> there is little memory overhead, whereas with DOM you probably need at
>> least 2x the memory to load, if not more.
>>
>> Paul
>>
>> Larry Becker wrote:
>>
>>> It isn't the parser that takes up the memory (except temporarily), but
>>> the memory-resident dataset after loading. This will still limit the
>>> size.
>>>
>>> Larry
>>>
>>> On 8/30/07, Sunburned Surveyor <sunburned.surveyor@xxxxxxxxx> wrote:
>>>
>>>
>>>> Yup. It makes you wonder why they didn't use pull parsers from the
>>>> very beginning, doesn't it?
>>>>
>>>> SS
>>>>
>>>> On 8/30/07, Paul Austin <mail-lists@xxxxxxxxxxxx> wrote:
>>>>
>>>>
>>>>> Agreed, the pull parser is the only way to go for large XML files.
>>>>>
>>>>> Paul
>>>>>
>>>>> Sunburned Surveyor wrote:
>>>>>
>>>>>
>>>>>> Martin,
>>>>>>
>>>>>> If we decide to support a restricted form of GML 2 we could build our
>>>>>> reader and writer on top of the XML Pull Parser from Sun. This would
>>>>>> help us to avoid memory problems when reading in large files.
>>>>>>
>>>>>> https://sjsxp.dev.java.net/
>>>>>>
>>>>>> Just a thought.
>>>>>>
>>>>>> The Sunburned Surveyor
>>>>>>
>>>>>> -------------------------------------------------------------------------
>>>>>> This SF.net email is sponsored by: Splunk Inc.
>>>>>> Still grepping through log files to find problems? Stop.
>>>>>> Now Search log events and configuration files using AJAX and a browser.
>>>>>> Download your FREE copy of Splunk now >> http://get.splunk.com/
>>>>>> _______________________________________________
>>>>>> Jump-pilot-devel mailing list
>>>>>> Jump-pilot-devel@xxxxxxxxxxxxxxxxxxxxx
>>>>>> https://lists.sourceforge.net/lists/listinfo/jump-pilot-devel
>>>>>>
>>>>>>
>>>>>>

--
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022
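
The selective parsing that Sunburned Surveyor describes in the quoted
message (create features only for buildings whose "building type" is
"Public") might look roughly like this with the StAX cursor API. The
element name "building" and the attributes "buildingType" and "id" are
made up for the sketch; a real SGF or GML reader would use the names
from its own schema and would build feature objects rather than
collect ids.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class PullFilterSketch {

    /** Returns the ids of building elements whose buildingType is "Public". */
    static List<String> publicBuildingIds(Reader in) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(in);
        List<String> ids = new ArrayList<>();
        try {
            while (r.hasNext()) {
                // Only the current event is held; no document tree is built.
                if (r.next() == XMLStreamConstants.START_ELEMENT
                        && "building".equals(r.getLocalName())
                        && "Public".equals(r.getAttributeValue(null, "buildingType"))) {
                    ids.add(r.getAttributeValue(null, "id"));
                }
            }
        } finally {
            r.close();
        }
        return ids;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<city>"
                + "<road id='r1'/>"
                + "<building id='b1' buildingType='Public'/>"
                + "<building id='b2' buildingType='Private'/>"
                + "<park id='p1'/>"
                + "</city>";
        System.out.println(publicBuildingIds(new StringReader(xml))); // prints [b1]
    }
}
```

As Larry and Paul note in the thread, this only lowers the peak memory
used while parsing; whatever features are kept still occupy memory
after loading.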


Thread at a glance:

Previous Message by Date:

Re: Parser for restricted GML...

I agree with both Paul and Larry! Another good reason for having a
pull parser (or in general a streaming parser) is that it increases
its reusability for clients which can operate in a streaming fashion.
This isn't OJ (at least currently), but I have other applications for
which this was a necessity. At one point I actually rewrote the
Shapefile parser to be streaming as well...

--
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022
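
A client that operates in a streaming fashion, in the sense Martin
mentions, could be served by a reader that hands over each feature as
soon as it is parsed instead of returning a finished collection. The
sketch below is an assumption about how such an interface might look,
not code from OpenJUMP or JTS; the "feature" element and "id"
attribute are hypothetical.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StreamingClientSketch {

    /** Passes each feature id to the client as soon as it is parsed. */
    static void forEachFeatureId(Reader in, Consumer<String> client)
            throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(in);
        try {
            while (r.hasNext()) {
                if (r.next() == XMLStreamConstants.START_ELEMENT
                        && "feature".equals(r.getLocalName())) {
                    client.accept(r.getAttributeValue(null, "id"));
                }
            }
        } finally {
            r.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<layer><feature id='f1'/><feature id='f2'/><feature id='f3'/></layer>";
        int[] count = {0}; // the client keeps a running total, not the features
        forEachFeatureId(new StringReader(xml), id -> count[0]++);
        System.out.println(count[0]); // prints 3
    }
}
```

Because the client here only counts, nothing accumulates as the file
is read. A client that must display every feature, as OJ does, would
still end up holding the whole dataset.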

Next Message by Date:

Re: Parser for restricted GML...

Hi,

I would prefer not to need one data exchange format for small
datasets and another for big ones :)

I suppose that it is just this high memory consumption during parsing
that is limiting the size of WFS requests in OpenJUMP. The memory is
freed once the parsing is done, but the peak value is what matters.
With WFS it is not such a big problem, because the data can be
collected with several small requests, but what if you received the
data on CD or something?

-Jukka-

-----Original Message-----
From: jump-pilot-devel-bounces@xxxxxxxxxxxxxxxxxxxxx on behalf of Larry Becker

>True, if you have the case of one very large GML layer for your whole
>map, but this is far from normal GIS.

>Larry

Next Message by Thread:

Re: Parser for restricted GML...

>At one point I actually rewrote the Shapefile parser to be streaming as well...

I guess I hadn't actually realized it wasn't until now. Do you
remember what happened to the code, or why you didn't stay with the
streaming version? A very large shape file seems like a more likely
scenario that I actually care about.

Larry

--
http://amusingprogrammer.blogspot.com/