
Subject: Re: zpool io to 6140 is really slow - msg#00079

List: os.solaris.opensolaris.performance

Asif Iqbal wrote:
> On Nov 19, 2007 11:47 PM, Richard Elling <Richard.Elling@xxxxxxx> wrote:
>
>> Asif Iqbal wrote:
>>
>>> I have the following layout
>>>
>>> A 490 with 8 1.8GHz and 16G mem. 6 6140s with 2 FC controllers using
>>> A1 and B1 controller port 4Gbps speed.
>>> Each controller has 2G NVRAM
>>>
>>> On 6140s I set up a raid0 LUN per SAS disk with 16K segment size.
>>>
>>> On 490 I created a zpool with 8 4+1 raidz1s
>>>
>>> I am getting zpool IO of only 125MB/s with zfs:zfs_nocacheflush = 1 in
>>> /etc/system
>>>
>>> Is there a way I can improve the performance? I would like to get 1GB/sec IO.
>>>
>>>
>> I don't believe a V490 is capable of driving 1 GByte/s of I/O.
>>
>
> Well I am getting ~190MB/s right now. I am surely not hitting anywhere close
> to that ceiling
>
>
>> The V490 has two schizos and the schizo is not a full speed
>> bridge. For more information see Section 1.2 of:
>> http://www.sun.com/processors/manuals/External_Schizo_PRM.pdf
>>

[err - see Section 1.3]

You will notice from Table 1-1 that the read bandwidth limit for a schizo
PCI leaf is 204 MBytes/s. With two schizos, each with two PCI leaves, you
can expect to max out at 816 MBytes/s (4 x 204) or less, depending on
resource contention. It makes no difference that a 4 Gbps FC card could
read 400 MBytes/s; the best you can do for the card is 204 MBytes/s.
1 GByte/s of read throughput will not be attainable with a V490.
-- richard
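
The arithmetic above can be checked with a short sketch. The 204, 400 and 816 MBytes/s figures are the ones quoted in the message; the number of PCI leaves per schizo is not stated there and is inferred here from the quoted total, and 1 GByte/s is taken as 1000 MBytes/s. The function names are made up for illustration.

```python
# Figures quoted in the message.
LEAF_READ_LIMIT_MB_S = 204    # read limit of one schizo PCI leaf (Table 1-1, as cited)
FC_4G_CARD_MB_S = 400         # what a 4 Gbps FC card could read on its own
STATED_CEILING_MB_S = 816     # stated maximum for the whole host
SCHIZOS = 2                   # a V490 has two schizos

# Assumption: leaves per schizo is inferred from the stated ceiling,
# not read from the manual.
LEAVES_PER_SCHIZO = STATED_CEILING_MB_S // (SCHIZOS * LEAF_READ_LIMIT_MB_S)

# Assumption: 1 GByte/s is treated as 1000 MBytes/s.
TARGET_MB_S = 1000


def card_read_rate(card_mb_s: int, leaf_limit_mb_s: int = LEAF_READ_LIMIT_MB_S) -> int:
    """A card cannot read faster than the PCI leaf it sits on."""
    return min(card_mb_s, leaf_limit_mb_s)


def host_read_ceiling() -> int:
    """Best case: every leaf on every schizo is driven at its read limit."""
    return SCHIZOS * LEAVES_PER_SCHIZO * LEAF_READ_LIMIT_MB_S


def can_reach(target_mb_s: int) -> bool:
    """True only if the target fits under the best-case host ceiling."""
    return target_mb_s <= host_read_ceiling()


if __name__ == "__main__":
    print("leaves per schizo (inferred):", LEAVES_PER_SCHIZO)
    print("one 4 Gbps FC card, MBytes/s:", card_read_rate(FC_4G_CARD_MB_S))
    print("host read ceiling, MBytes/s:", host_read_ceiling())
    print("1 GByte/s attainable:", can_reach(TARGET_MB_S))
```

The ceiling is a best case with no resource contention, so real throughput would be expected to land below it.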





