how to fast processing one million strings to remove quotes

On 08/02/2017 10:05 AM, Daiyue Weng wrote:
> Hi, I am trying to remove extra quotes from a large set of strings (a
> list of strings). Each original string looks like
> """str_value1"",""str_value2"",""str_value3"",1,""str_value4"""
> I would like to remove the start and end quotes and the extra pairs of
> quotes around each string value, so the result looks like
> "str_value1","str_value2","str_value3",1,"str_value4"


This part can also be done fairly efficiently with sed:

time cat hugequote.txt | sed 's/"""/"/g;s/""/"/g' >/dev/null

real    0m2.660s
user    0m2.635s
sys     0m0.055s
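
To see what the two substitutions in that sed command do, here is a minimal sketch that runs the same expression on a single copy of the test string. The function name strip_quotes is made up for illustration. The first substitution collapses the triple quotes at the start and end of the line, and the second collapses the remaining doubled quotes.

```shell
# strip_quotes is a hypothetical helper name, not part of the original post.
# It applies the same two substitutions as the timed command, reading stdin.
strip_quotes() {
    # 1st: """ -> "  (start and end of the line)
    # 2nd: ""  -> "  (around each inner string value)
    sed 's/"""/"/g;s/""/"/g'
}

sample='"""str_value1"",""str_value2"",""str_value3"",1,""str_value4"""'
printf '%s\n' "$sample" | strip_quotes
# prints: "str_value1","str_value2","str_value3",1,"str_value4"
```

The order matters: if the doubled quotes were collapsed first, the triple quotes at the ends would be left as two quotes instead of one.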

hugequote.txt is a file containing 1M copies of your test string above.
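
If you want to reproduce the timing, a test file along these lines could be built as sketched below. This is only one possible way to do it, and it assumes one copy of the string per line; make_quote_file is a hypothetical helper name.

```shell
# make_quote_file is a hypothetical helper, not from the original post.
# Usage: make_quote_file COUNT PATH
# Assumption: one copy of the test string per line.
make_quote_file() {
    awk -v n="$1" \
        -v s='"""str_value1"",""str_value2"",""str_value3"",1,""str_value4"""' \
        'BEGIN { for (i = 0; i < n; i++) print s }' > "$2"
}

# Write one million copies, then confirm the line count.
make_quote_file 1000000 hugequote.txt
wc -l < hugequote.txt
```

The count and file name are arguments, so a smaller file can be used for a quick check before timing the full run.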

This was run on a quad-core i5 under FreeBSD 10.3-STABLE.