
Re: Large CSV files with headers


The tokenize language has a skipFirst option you can use to skip the header line.

There are also a number of other CSV data formats / components you can
use if <csv> is not good enough for you.
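
For illustration, here is a minimal sketch of the route from the original
message with skipFirst enabled on the tokenizer. Because the header line is
then dropped before unmarshalling, the column names have to be given to the
data format explicitly. The <header> elements and the names "id" and "name"
are assumptions made for this example, and the way headers are declared may
differ between Camel versions.

```xml
<route id="CSVRoute">
    <from uri="file:/tmp/data/" />
    <split streaming="true" parallelProcessing="true">
        <!-- skipFirst drops the first token, i.e. the header line -->
        <tokenize token="\n" skipFirst="true" />
        <unmarshal>
            <!-- hypothetical column names; replace with the columns of your file -->
            <csv delimiter=";" useMaps="true">
                <header>id</header>
                <header>name</header>
            </csv>
        </unmarshal>
        <log message="Got ${body}"/>
        <to uri="mock:nextStageProcessor"/>
    </split>
</route>
```

With this setup each split message should carry a single data row, keyed by
the configured column names rather than by the header line in the file.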

On Tue, Apr 17, 2018 at 6:11 PM, Frizz <frizzthecat@xxxxxxxxxxxxxx> wrote:
> I have some CSV files with a header line, so setting useMaps="true" would
> be the natural thing to do. Works great.
> My CSV files are very big, so using streaming/parallelProcessing would be
> the natural thing to do. Also works great.
> Unfortunately using useMaps="true" AND streaming/parallelProcessing does
> not work: It results in lots of empty Lists/Maps. Which is understandable,
> but not nice.
> >> So the question remains: How to efficiently process large CSV files that
> have a header line? <<
> By the way, this is my route:
> <route id="CSVRoute">
>     <from uri="file:/tmp/data/" />
>     <split streaming="true" parallelProcessing="true">
>         <tokenize token="\n" />
>         <unmarshal>
>             <csv delimiter=";" useMaps="true" />
>         </unmarshal>
>         <log message="Got ${body}"/>
>         <to uri="mock:nextStageProcessor"/>
>     </split>
> </route>

Claus Ibsen
http://davsclaus.com @davsclaus
Camel in Action 2: https://www.manning.com/ibsen2