Subject: Re: DDN hints? - msg#00070
List: file-systems.lustre.user
Makia Minich wrote:
> How much luck did you have with this tuning and OFED's SRP? What performance
> are you seeing? We had done quite a bit of testing playing with this option,
> but saw very little improvement in performance (if I remember correctly, the
> block sizes did increase, but performance was still down).
That's what I saw as well. I eventually got great write performance
with /dev/sg* devices by tuning srp_sg_tablesize (it defaults to 12,
which sent 48KB I/Os to the array), but I could never get /dev/sd*
devices to perform, and reads were always stuck at 128KB I/Os no matter
what I passed to srp_sg_tablesize.
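The 48KB figure above is consistent with each scatter/gather entry mapping one 4KB page. A minimal sketch of that arithmetic follows. The 4KB page size and the one-page-per-entry rule are assumptions, not something stated in this thread, and the block layer may merge contiguous pages into fewer entries:

```python
# Assumption: 4 KiB pages, and each scatter/gather entry maps exactly one page.
PAGE_SIZE_KB = 4


def max_io_kb(sg_tablesize, page_kb=PAGE_SIZE_KB):
    """Largest single I/O (KiB) if every scatter/gather entry carries one page."""
    return sg_tablesize * page_kb


def sg_entries_needed(io_kb, page_kb=PAGE_SIZE_KB):
    """Scatter/gather entries needed for an I/O of io_kb KiB, rounded up."""
    return -(-io_kb // page_kb)


print(max_io_kb(12))            # 48, matching the default described above
print(sg_entries_needed(1024))  # 256 entries for a 1 MiB request
```

Under the same assumptions, a 1MB request would need an srp_sg_tablesize of at least 256.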
> On Friday 18 May 2007 10:38:29 am chas williams - CONTRACTOR wrote:
> > In message
> > <17997.38024.295869.482039-Ta083ccyJzMiZlmDfzyYUtBPR1lH4CV8@xxxxxxxxxxxxxxxx>,
> > "John R. Dunning" writes:
> > > I tried incorporating the blkdev-max-io-size-selection and
> > > increase-sglist-size patches from cfs, but that didn't really help, my
> > > reads are still maxing out at 256K.
> >
> > the srp initiator creates a virtual scsi device driver. this virtual
> > device driver has a .max_sectors parameter associated with it. you can
> > tune this with max_sect= during login for the openfabrics stack.
> > no idea how this is tuned on ibgold.
> >
> > take a look at
> > /sys/block/sd<whatever>/queue/{max_hw_sectors_kb,max_sectors_kb}
> >
> > if you aren't using direct i/o, use direct i/o. you could just tune
> > the page size of the ddn to 256k.
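For anyone who wants to check the two sysfs limits mentioned in the quoted message, here is a small sketch that reads them for one block device. The function name and the sysfs_root parameter are made up for illustration; only the two file names come from the message:

```python
from pathlib import Path


def read_queue_limits(device, sysfs_root="/sys/block"):
    """Return the request-size limits (in KiB) for one block device.

    `device` is a name such as "sdb". `sysfs_root` is a hypothetical
    parameter added so the function can be pointed at a test directory.
    """
    queue = Path(sysfs_root) / device / "queue"
    limits = {}
    for name in ("max_hw_sectors_kb", "max_sectors_kb"):
        # Each file holds a single decimal integer followed by a newline.
        limits[name] = int((queue / name).read_text().strip())
    return limits
```

Calling read_queue_limits("sdb") on a host with an sdb device returns a dict with both values. As far as I know, max_sectors_kb is the soft limit actually used for requests and cannot be raised above max_hw_sectors_kb, so comparing the two shows whether the driver or the block layer setting is the cap.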
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss-KYPl3Ael/zSakBO8gow8eQ@xxxxxxxxxxxxxxxx
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
Thread at a glance:
Previous Message by Date:
Re: DDN hints?
How much luck did you have with this tuning and OFED's SRP? What performance
are you seeing? We had done quite a bit of testing playing with this option,
but saw very little improvement in performance (if I remember correctly, the
block sizes did increase, but performance was still down).
--
Makia Minich <minich-1Heg1YXhbW8@xxxxxxxxxxxxxxxx>
National Center for Computation Science
Oak Ridge National Laboratory
Phone: 865.574.7460
--*--
Imagine no possessions
I wonder if you can
- John Lennon
Next Message by Date:
Re: DDN hints?
well... i suspect tuning the i/o sizes to be larger didn't make a big
difference on reads. it helps to get the write size to at least match the
page size of the ddn's memory cache (2MB as i recall, but this can be tuned
to a smaller value). this will let most devices "write through" the memory
cache directly to disk. as you get farther and farther away from your
storage, you need to increase the message size to offset bandwidth*delay.
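To make the bandwidth*delay remark concrete, here is a rough sketch of how much data has to be in flight to keep a link busy. The 1 GB/s bandwidth and 1 ms round-trip time are hypothetical illustration values, not measurements from this setup:

```python
def bdp_bytes(bandwidth_bytes_per_s, rtt_us):
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return bandwidth_bytes_per_s * rtt_us // 1_000_000


def requests_in_flight(bdp, io_size_bytes):
    """Outstanding requests of io_size_bytes needed to cover bdp, rounded up."""
    return -(-bdp // io_size_bytes)


# Hypothetical numbers: 1 GB/s link, 1 ms round trip.
bdp = bdp_bytes(1_000_000_000, 1000)
print(bdp)                                   # 1000000 bytes in flight
print(requests_in_flight(bdp, 256 * 1024))   # 4 requests of 256 KiB
print(requests_in_flight(bdp, 1024 * 1024))  # 1 request of 1 MiB
```

With few outstanding requests, a larger request size is the only way to cover the same product, which is the point made above.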
after a bit of fiddling, we managed to get:
        Using Minimum Record Size 1024 KB
        Auto Mode 2. This option is obsolete. Use -az -i0 -i1
        O_DIRECT feature enabled
        Command line used: /data1/iozone.ia64 -f testfile -y 1024k -A -I
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.

            KB  reclen    write  rewrite     read   reread
        524288    1024   270275   430822   354823   355126
        524288    2048   412733   545186   673329   679848
        524288    4096   533884   619260  1048551  1053483
        524288    8192   606596   665102  1201478  1192968
        524288   16384   662077   698136  1333838  1334341
this was a single host using 2 ddr adapters striped across 8 luns on
the ddn. each lun on the ddn was across 14 tiers. obviously a 512MB
test file fits inside the ddn cache. the ddn should be able to go faster,
but my single host couldn't push harder.
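For easier reading, the iozone rows above can be converted from Kbytes/sec to MiB/s with a short script. This is only a sketch. It assumes iozone's Kbytes are 1024 bytes, and the column names are my own labels for the header fields:

```python
# Rows copied from the iozone output above (file size and record length in KB,
# throughput columns in Kbytes/sec).
IOZONE_ROWS = """\
524288 1024 270275 430822 354823 355126
524288 2048 412733 545186 673329 679848
524288 4096 533884 619260 1048551 1053483
524288 8192 606596 665102 1201478 1192968
524288 16384 662077 698136 1333838 1334341
"""

COLUMNS = ("file_kb", "reclen_kb", "write", "rewrite", "read", "reread")


def parse_rows(text):
    """Turn whitespace-separated iozone rows into a list of dicts of ints."""
    return [
        dict(zip(COLUMNS, map(int, line.split())))
        for line in text.splitlines()
        if line.strip()
    ]


def to_mib_per_s(kb_per_s):
    """Convert Kbytes/sec to MiB/s, assuming 1 Kbyte = 1024 bytes."""
    return kb_per_s / 1024


rows = parse_rows(IOZONE_ROWS)
for r in rows:
    print(
        f"{r['reclen_kb']:>6} KB records: "
        f"write {to_mib_per_s(r['write']):7.1f} MiB/s, "
        f"read {to_mib_per_s(r['read']):7.1f} MiB/s"
    )
```

On these numbers the 16384 KB record size gives roughly 650 MiB/s for writes and 1300 MiB/s for reads.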