osdir.com
mailing list archive F.A.Q. -since 2001!



Subject: Re: [PATCH] sort: Add --threads option, which
parallelizes internal sort. - msg#00028

List: bug-coreutils-gnu

Mail Archive Navigation:
by Date: Prev Next Date Index by Thread: Prev Next Thread Index

Ralf Wildenhues wrote:
> Hi Paul, all,
> Paul Eggert writes:
>>
>> This patch is by Glen Lenker, Matt Pham, Benjamin Nuernberger, Sky
>> Lin, TaeSung Roh, and Paul Eggert. It adds support for parallelism
>> within an internal sort. On our simple tests on a 2-core desktop x86,
>> overall performance improved by roughly a factor of 1.6.
>
> This is too interesting to pass up.
>
> Example run, on an 8-way, and with cat'ed instances of the dictionary,
> on tmpfs, timings best of three:
>
> runtime [s] threads
> file size [MB] 1 2 4 8
> 1 0.06 0.04 0.03 0.04
> 2 0.13 0.09 0.07 0.07
> 4 0.28 0.20 0.16 0.15
> 8 0.61 0.43 0.34 0.32
> 16 1.34 0.94 0.74 0.72
> 32 3.00 2.06 1.63 1.57
> 64 6.36 4.38 3.44 3.32
> 128 13.49 9.30 7.13 7.24
> 256 28.62 19.49 15.17 15.18
>
> Here's the abbreviated 'time' output for the last row:
>
> 26.95user 1.67system 0:28.62elapsed 100%CPU
> 30.78user 1.98system 0:19.49elapsed 168%CPU
> 37.41user 2.04system 0:15.17elapsed 260%CPU
> 60.68user 2.79system 0:15.18elapsed 417%CPU
>
> It suggests to me that too much time is spent busy-waiting in pthread_join,
> or that sort is computing too much (I haven't looked at the patch in detail).
>
> Also, I'd have expected the rate going from 1 to 2 threads to get at least
> a bit better with bigger file size, but it remains remarkably constant,
> around 1.45 for this setup. What am I missing?

Thanks for doing that, Ralf.
Just to give everyone a heads-up,
I'm a little reluctant to apply this patch in its current
state, since it's not achieving reasonable efficiency.

I noticed that running the sole test,

env time make RUN_VERY_EXPENSIVE_TESTS=yes check -C tests \
TESTS=misc/sort-benchmark-random VERBOSE=yes

took about 36s on a pretty fast system.

changing the inner loop to this
print ((map {chr(32+rand(94))} (1..100)) , "\n");
reduced that to 25s

I changed it to write "exp" from within the script
that generated the data, but that didn't make a big difference.

Finally I realized that those 50M calls to "rand" were the
problem, and only about 4% of them were actually ever used
while sorting, since 94^4 is far more than 500k.

Adjusting accordingly brought the test's run time down to ~8s.
However, the resulting sort invocation took barely 1 second here,
and that's obviously too small to be useful. Ramping up to 5M lines,
the resulting test takes almost 2 minutes and the sort itself took 34s
on this particular quad-core system.

A more interesting test would be to ensure that when
run on a multi-core system sorting with --threads=2
is at least X% faster than sorting with --threads=1.

Also, it'd be nice to add the following:
- a test that uses --threads=N to exercise the new option-parsing code,
though if you adjust as suggested to compare --threads=N for
N=1&2, that's not needed.
- a NEWS item


>From b1332b9b429dfa9a8bba7286a78bbc7e37a2e419 Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering@xxxxxxxxxx>
Date: Thu, 2 Apr 2009 12:35:08 +0200
Subject: [PATCH] tests: make sort-benchmark-random more efficient, then use a
larger test

* tests/misc/sort-benchmark-random: cut 75% off run time, then
increase input size by a factor of 10, so that sort runs for more
than 1 second on fast hardware.
---
tests/misc/sort-benchmark-random | 42 +++++++++++++++++++-------------------
1 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/tests/misc/sort-benchmark-random b/tests/misc/sort-benchmark-random
index 2bace4f..0c4d61d 100755
--- a/tests/misc/sort-benchmark-random
+++ b/tests/misc/sort-benchmark-random
@@ -34,33 +34,33 @@ export LC_ALL
fail=0

perl -e '
-my $num_lines = 500000;
+my $num_lines = 5000000;
my $length = 100;

+# Only the first few bytes of each line need to be random.
+# the remaining bytes are solely to simulate a more realistic work load.
+my $rand_bytes = 5; # per line, so that 94^5 >> $num_lines
+
+my @line;
for (my $i=0; $i < $num_lines; $i++)
{
- for (my $j=0; $j < $length; $j++)
- {
- printf "%c", 32 + rand(94);
- }
- print "\n";
-}' > in || framework_failure
-
-# We need to generate a lot of data for sort to show a noticeable
-# improvement in performance. Sorting it in PERL may take awhile.
-
-perl -e '
-open (FILE, "<in");
-my @list = <FILE>;
-print sort(@list);
-close (FILE);
-' > exp || framework_failure
-
-#start=$(date +%s)
+ my $line = join ("", map {chr(32+rand(94))} (1..$rand_bytes))
+ . ("0" x ($length - $rand_bytes)) . "\n";
+ print $line;
+ push @line, $line;
+}
+END {
+ open FH, ">", "exp" or die;
+ print FH sort @line;
+ close FH or die;
+}
+' > in || framework_failure
+
+start=$(date +%s)
time sort in > out || fail=1
-#duration=$(expr $(date +%s) - $start)
+duration=$(expr $(date +%s) - $start)

-#echo sorting the random data took $duration seconds
+echo sorting the random data took $duration seconds

compare out exp || fail=1

--
1.6.2.rc1.285.gc5f54


_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@xxxxxxx
http://lists.gnu.org/mailman/listinfo/bug-coreutils

Thread at a glance:

Previous Message by Date:

Re: git version and automake 1.10b question

hggdh wrote: ... > Thank you, both Eric and Andreas. > > I guess, then, it would be correct to update the references to > automake in README-{prereq|hacking}, and to update bootstrap.conf. > > Patch attached. Please keep in mind this is my first one following the > rules -- I may have made some mistakes. Also, I do not have a Fedora > system, so I do not know if the instructions in README-prereq are > correct in this regard. > > From 771fb4f10ea57b8797a37b2064e88df4630ae421 Mon Sep 17 00:00:00 2001 > From: C de-Avillez <hggdh2@xxxxxxxxx> > Date: Wed, 1 Apr 2009 09:35:28 -0500 > Subject: [PATCH] updated prereqs to automake-1.10b or later. Changed > README-hacking, > README-prereq, and bootstrap.conf Thanks. I've pushed the change below. Small changes: - add newline after summary line in commit log - omit your README-hacking change, since 1.10a is new enough for xz - use a slightly modified ChangeLog style for commit log entries. That means "* FILE: description..." for each file changed. As I write this, I realize that I should have mentioned your addition of the "next"-tracking git checkout command in the commit log. >From 2ade7643de7a5d85a9c029dc6b6e06d339ec26e1 Mon Sep 17 00:00:00 2001 From: C de-Avillez <hggdh2@xxxxxxxxx> Date: Wed, 1 Apr 2009 09:35:28 -0500 Subject: [PATCH] build: require automake-1.10b or newer * bootstrap.conf: Require at least automake-1.10b. * README-prereq: Mention 1.10b, not 1.10a. --- README-prereq | 10 ++++++---- bootstrap.conf | 2 +- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/README-prereq b/README-prereq index d65fe48..f6ff48e 100644 --- a/README-prereq +++ b/README-prereq @@ -16,16 +16,18 @@ getting the prerequisites for particular systems. # yum install emacs #autoconf build requires emacs (20MB) # rpmbuild --rebuild http://download.fedora.redhat.com/pub/fedora/linux/development/source/SRPMS/autoconf-2.63-1.fc10.src.rpm # rpm -Uvh /usr/src/redhat/RPMS/noarch/autoconf-2.63-1.fc8.noarch.rpm - Note Autoconf 2.61a-341 or newer is needed to build automake-1.10a in step 3. + Note Autoconf 2.61a-341 or newer is needed to build automake-1.10b in step 3. Apply the same method to install the lzma package. [FIXME: mention xz when it's packaged] - 3. The latest released automake (1.10.1) was not new enough, so we download - and build automake-1.10a from its repository and make it available + 3. The latest stable automake (1.10.1) was not new enough, so we download + and build automake-1.10b or newer from its repository and make it available just to coreutils: # yum install help2man #required to build automake fully $ git clone git://git.sv.gnu.org/automake.git - $ cd automake && ./configure --prefix=$HOME/coreutils/deps + $ cd automake + $ git checkout --track origin/next -b next + $ ./configure --prefix=$HOME/coreutils/deps $ make install Now we can build coreutils as described in README-hacking diff --git a/bootstrap.conf b/bootstrap.conf index 0747bb8..4baebb0 100644 --- a/bootstrap.conf +++ b/bootstrap.conf @@ -177,7 +177,7 @@ gnulib_tool_option_extras="--tests-base=$bt/gnulib-tests --with-tests" # Build prerequisites buildreq="\ autoconf 2.61 -automake 1.10a +automake 1.10b autopoint - bison - gettext - -- 1.6.2.rc1.285.gc5f54 _______________________________________________ Bug-coreutils mailing list Bug-coreutils@xxxxxxx http://lists.gnu.org/mailman/listinfo/bug-coreutils

Next Message by Date:

Re: git version and automake 1.10b question

Jim Meyering wrote: > diff --git a/README-prereq b/README-prereq > - $ cd automake && ./configure --prefix=$HOME/coreutils/deps > + $ cd automake > + $ git checkout --track origin/next -b next > + $ ./configure --prefix=$HOME/coreutils/deps I needed to switch the git checkout parameters around with my version of git (1.5.3.6), i.e.: git checkout -b next --track origin/next Pádraig. _______________________________________________ Bug-coreutils mailing list Bug-coreutils@xxxxxxx http://lists.gnu.org/mailman/listinfo/bug-coreutils

Previous Message by Thread:

cat.c

Hi, in the coreutils source of cat /* cat -- concatenate files and print on the standard output. Copyright (C) 88, 90, 91, 1995-2009 Free Software Foundation, Inc. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. */ /* Differences from the Unix cat: * Always unbuffered, -u is ignored. * Usually much faster than other versions of cat, the difference is especially apparent when using the -v option. By tege@xxxxxxx, Torbjorn Granlund, advised by rms, Richard Stallman. */ ... many understandable lines ... while (++argind < argc); ... some lines... coreutil 7.2 file cat.c line 776 I realy don't understand why U use a while loop in this case... Ive test that : #include <stdio.h> #include <stdlib.h> int ok(int i, int j){ while(++ i < j); return i; } int ok2(int i, int j){ if (i < j) i = j; else i++; return i; } void test(int i, int j){ printf("%d %d\n", ok(i, j), ok2(i, j)); } int main(){ test(1,1); test(1,3); test(3,1); test(2,1); test(1,2); return 0; } the functions ok and ok2 returns same values, but ok2 is better. plz say me why U write a loop to do that... _______________________________________________ Bug-coreutils mailing list Bug-coreutils@xxxxxxx http://lists.gnu.org/mailman/listinfo/bug-coreutils

Next Message by Thread:

Re: [PATCH] sort: Add --threads option, which parallelizes internal sort.

>> Paul Eggert writes: >>> >>> This patch is by Glen Lenker, Matt Pham, Benjamin Nuernberger, Sky >>> Lin, TaeSung Roh, and Paul Eggert. It adds support for parallelism >>> within an internal sort. On our simple tests on a 2-core desktop x86, >>> overall performance improved by roughly a factor of 1.6. In my local branch with this change, I'm about to merge this into the primary change set. >From 2f4edf2e997e3354ca8179b181b1f95037834473 Mon Sep 17 00:00:00 2001 From: Jim Meyering <meyering@xxxxxxxxxx> Date: Fri, 3 Apr 2009 07:39:31 +0200 Subject: [PATCH] remove obsolete comment --- src/sort.c | 7 ++----- 1 files changed, 2 insertions(+), 5 deletions(-) diff --git a/src/sort.c b/src/sort.c index b4f0dfb..b6e0ee8 100644 --- a/src/sort.c +++ b/src/sort.c @@ -1,5 +1,5 @@ /* sort - sort lines of text (with all kinds of options). - Copyright (C) 1988, 1991-2008 Free Software Foundation, Inc. + Copyright (C) 1988, 1991-2009 Free Software Foundation, Inc. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by @@ -2538,10 +2538,7 @@ sortlines_thread (void *data) singlethreaded, use the optimization suggested by Knuth exercise 5.2.4-10, which requires room for only 1.5*N lines, rather than the usual 2*N lines. Knuth writes that this memory optimization was - originally published by D. A. Bell, Comp J. 1 (1958), 75. - - This function is inline so that its tests for multthreadedness and - inplacedness can be optimized away in common cases. */ + originally published by D. A. Bell, Comp J. 1 (1958), 75. */ static void sortlines (struct line *restrict lines, size_t nlines, -- 1.6.2.rc1.285.gc5f54 _______________________________________________ Bug-coreutils mailing list Bug-coreutils@xxxxxxx http://lists.gnu.org/mailman/listinfo/bug-coreutils
blog comments powered by Disqus

Home | News | Sitemap | FAQ | advertise | OSDir is an Inevitable website. GBiz is too!