Subject: Question about 2.6.5 kernel scalability (msg#00019)

List: linux.smp

I am currently measuring the performance of a directory server, OpenLDAP,
on an IBM xSeries 455 (8-way).
First, here are the performance results (operations/sec) as the number of
CPUs increases.

1:  6,100
2: 10,000
4: 17,600
5: 14,200
6: 13,200
7: 13,100
8: 11,500

As you can see, the throughput peaks at 4 CPUs and degrades beyond that.
With 8 CPUs, the throughput is about 65% of the 4-CPU figure.
During the experiments, CPU utilization was about 93% in all cases.
The directory server is multithreaded, and 16 to 32 worker threads were
used in the experiments.
I am currently investigating the cause of the degradation.
My guess is that there might be a scheduling anomaly on SMP.
If anyone knows of any strange behavior in the Linux 2.6.5 scheduler,
please share it.
It will help me a lot.

Sang Seok
-
To unsubscribe from this list: send the line "unsubscribe linux-smp" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
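
The throughput figures in the message above can be restated as speedup and per-CPU efficiency relative to the 1-CPU run. The following is only a sketch of that arithmetic: the numbers are copied from the post, while the names `throughput` and `scaling_table` are made up for illustration.

```python
# Throughput (operations/sec) by CPU count, copied from the post.
# The post gives no figure for 3 CPUs, so none is included here.
throughput = {1: 6100, 2: 10000, 4: 17600, 5: 14200, 6: 13200, 7: 13100, 8: 11500}


def scaling_table(results):
    """Return (cpus, ops, speedup, efficiency) rows relative to the 1-CPU run."""
    base = results[1]
    rows = []
    for cpus in sorted(results):
        ops = results[cpus]
        speedup = ops / base          # how many times faster than 1 CPU
        efficiency = speedup / cpus   # fraction of ideal linear scaling
        rows.append((cpus, ops, speedup, efficiency))
    return rows


for cpus, ops, speedup, efficiency in scaling_table(throughput):
    print(f"{cpus} CPUs: {ops:>6,} ops/s  speedup {speedup:.2f}  efficiency {efficiency:.0%}")

# CPU count with the highest throughput, and the 8-CPU result relative to it.
peak = max(throughput, key=throughput.get)
ratio_8_to_peak = throughput[8] / throughput[peak]
print(f"peak at {peak} CPUs; 8 CPUs reach {ratio_8_to_peak:.0%} of the peak")
```

Looking at efficiency rather than raw throughput shows that scaling is already sublinear at 2 CPUs, well before the absolute drop after 4 CPUs.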



Thread at a glance:

Previous Message by Date:

"Unexpected IO-APIC was found" message

Hi - I just updated one of my servers to 2.4.29 and I'm getting a message at boot time saying:

An unexpected IO-APIC was found.
If this kernel release is less than three months old please report this to
linux-smp@xxxxxxxxxxxxxxx

The full APIC message is:

[snip]
ENABLING IO-APIC IRQs
Setting 2 in the phys_id_present_map
...changing IO-APIC physical APIC ID to 2 ... ok.
init IO_APIC IRQs
IO-APIC (apicid-pin) 2-0, 2-5, 2-10, 2-11, 2-12, 2-17, 2-18, 2-19, 2-22 not connected.
..TIMER: vector=0x31 pin1=2 pin2=0
number of MP IRQ sources: 17.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................
IO APIC #2......
.... register #00: 02000000
....... : physical APIC id: 02
....... : Delivery Type: 0
....... : LTS : 0
.... register #01: 00178014
....... : max redirection entries: 0017
....... : PRQ implemented: 1
....... : IO APIC version: 0014
An unexpected IO-APIC was found.
If this kernel release is less than three months old please report this to
linux-smp@xxxxxxxxxxxxxxx
.... register #02: 02000000
....... : arbitration: 02
.... IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
00 000 00 1 0 0 0 0 0 0 00
01 003 03 0 0 0 0 0 1 1 39
02 003 03 0 0 0 0 0 1 1 31
03 003 03 0 0 0 0 0 1 1 41
04 003 03 0 0 0 0 0 1 1 49
05 000 00 1 0 0 0 0 0 0 00
06 003 03 0 0 0 0 0 1 1 51
07 003 03 0 0 0 0 0 1 1 59
08 003 03 0 0 0 0 0 1 1 61
09 003 03 0 0 0 0 0 1 1 69
0a 000 00 1 0 0 0 0 0 0 00
0b 000 00 1 0 0 0 0 0 0 00
0c 000 00 1 0 0 0 0 0 0 00
0d 003 03 0 0 0 0 0 1 1 71
0e 003 03 0 0 0 0 0 1 1 79
0f 003 03 0 0 0 0 0 1 1 81
10 003 03 1 1 0 1 0 1 1 89
11 000 00 1 0 0 0 0 0 0 00
12 000 00 1 0 0 0 0 0 0 00
13 000 00 1 0 0 0 0 0 0 00
14 003 03 1 1 0 1 0 1 1 91
15 003 03 1 1 0 1 0 1 1 99
16 000 00 1 0 0 0 0 0 0 00
17 003 03 1 1 0 1 0 1 1 A1
IRQ to pin mappings:
IRQ0 -> 0:2
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ9 -> 0:9
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15
IRQ16 -> 0:16
IRQ20 -> 0:20
IRQ21 -> 0:21
IRQ23 -> 0:23
.................................... done.
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 3014.5721 MHz.
..... host bus clock speed is 200.9712 MHz.
cpu: 0, clocks: 2009712, slice: 669904
CPU0<T0:2009712,T1:1339808,D:0,S:669904,C:2009712>
cpu: 1, clocks: 2009712, slice: 669904
CPU1<T0:2009712,T1:669904,D:0,S:669904,C:2009712>
checking TSC synchronization across CPUs: passed.
Waiting on wait_init_idle (map = 0x2)
All processors have done init_idle
[snip]

If any more information is needed, mail me (assuming this is a list I'm not subscribed so you may need to CC me).

Thanks

Marcus

--
Marcus Williams -- http://www.cad-schroer.co.uk
CAD Schroer UK, 39 Newnham Road, Cambridge, UK

Next Message by Date:

Re: Question about 2.6.5 kernel scalability

A common bottleneck for systems is memory bandwidth. It could be that by adding more processors you're maxing out the memory bus. Remember that each processor must go through the same ASIC when accessing memory. This is where NUMA is supposed to enhance memory bandwidth; Unisys ES7000 systems also have 4 ASICs to increase memory bandwidth and reduce memory contention.

There are many ways to reduce memory contention, but they usually come in the form of special hardware (e.g. a large third-level or even fourth-level cache) and/or applications that are specifically designed for this purpose (e.g. special programming techniques like processor affinity for bound IO).

This is also the reason why you don't generally see Intel servers with more than 8 physical CPUs. Intel actually uses a technology called "Fusion Bar" to connect 2 groups of 4 processors to reach a total of 8 processors, and its primary purpose is to reduce contention on IO buses (primarily the memory bus).

Here is a link about measuring memory bandwidth: http://www.streambench.org/

Just my 2 cents,
Earle
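
Processor affinity, mentioned in the reply above, can be tried from user space. Below is a minimal sketch, assuming Linux and Python's `os.sched_getaffinity` / `os.sched_setaffinity`; the helper name `pin_to_cpus` is hypothetical. It restricts the calling process to a subset of CPUs, which is one possible way to repeat a measurement at different CPU counts. The original post does not say how the CPU count was varied, so this is not a description of that setup.

```python
import os


def pin_to_cpus(n):
    """Restrict the calling process to the first n CPUs it may currently use.

    Hypothetical helper for illustration; Linux only.
    Returns the set of CPU ids that was applied.
    """
    allowed = sorted(os.sched_getaffinity(0))  # CPUs we are allowed to run on
    if not 1 <= n <= len(allowed):
        raise ValueError(f"n must be between 1 and {len(allowed)}, got {n}")
    chosen = set(allowed[:n])
    os.sched_setaffinity(0, chosen)  # pid 0 means the calling process/thread
    return chosen


if __name__ == "__main__":
    # Pin to at most 4 CPUs, then report the resulting affinity mask.
    count = min(4, len(os.sched_getaffinity(0)))
    print("pinned to CPUs:", sorted(pin_to_cpus(count)))
```

Processes started after the call inherit the mask, so a benchmark launched from the same script would be confined to the chosen CPUs. Note that the first n CPU ids are not necessarily on the same bus or node, so which ids are chosen can matter on hardware like the one discussed here.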
