[summary] What does my Kernel do?: msg#00071

os.solaris.managers.summaries

Subject: [summary] What does my Kernel do?

Summary: How to figure out what my Solaris kernel is doing

Usual Suspects
--------------
* It is serving NFS. This can use a lot of CPU. Make sure you
are running NFS version 3.

* A fast (Gigabit) interface can almost fill a CPU if it is busy.

* It is swapping. If the kernel runs out of memory it will spend most of its
time moving pages back and forth between disk and RAM.

- Run "vmstat 5". The sr (scan rate) column should be very low (<100),
which means the system is not scanning for free memory pages.

- It may make sense to have a lot of swap space configured, as Solaris
does conservative memory allocation. When a process forks, Solaris
immediately reserves all the memory the child could need, even though
most of it is never touched because of "copy on write". So why not
let this reservation be backed by swap instead of real RAM,
assuming it is never going to be used anyway? (Correct me if I
am wrong here.)

* It is forking. This does not have to be a real fork bomb; it can be just some
process quitting and being restarted immediately. Pidentd running
non-multi-threaded may be one such program. Some CGI process could
also be the cause. This is detectable by watching how fast the 'last process
id' climbs in a tool like top.

* It is running Veritas Volume Manager and a disk has failed.
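
As a rough illustration of the 'last process id' check above, the following Python sketch estimates how many processes are being created per second from two 'last pid' readings taken a known interval apart. The function name and the sample readings are made up for illustration. The wrap-around value of 30000 is an assumption based on the usual Solaris default and should be checked against your system. The estimate also assumes the PID counter wrapped at most once between the two readings.

```python
def forks_per_second(first_pid, second_pid, interval_s, pid_max=30000):
    """Estimate process creations per second from two 'last pid' readings.

    pid_max is an assumed wrap-around point (adjust to match your system).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # PIDs are handed out sequentially and wrap around at pid_max,
    # so take the difference modulo pid_max.
    created = (second_pid - first_pid) % pid_max
    return created / interval_s


# Hypothetical readings taken 10 seconds apart.
print(forks_per_second(12000, 12850, 10))  # 85.0
print(forks_per_second(29900, 200, 10))    # 30.0 (counter wrapped)
```

A steady rate of dozens of new processes per second on a box that should be mostly idle would point at the restart-loop scenario described above.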

Useful Tools
------------

* lockstat

lockstat -gkIW sleep 60

gives a 60 second profile of the kernel


* iftop

http://www.ex-parrot.com/~pdw/iftop

will show which box is sending how much traffic through your interface

* se toolkit

www.setoolkit.com

Virtual Adrian may be able to give some hints as to where the performance
issues lie

* prstat

prstat -m

will show user vs system time for each process, so if it is a process
causing the problem it should show here
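
To make the user vs system comparison easier to scan, here is a small Python sketch that ranks processes by the SYS column of saved "prstat -m" output. The sample text is invented for illustration (it is not from the machine in question), and the parser assumes the usual "prstat -m" column header beginning with PID and ending with PROCESS/NLWP.

```python
# Invented sample in the layout of "prstat -m" output.
SAMPLE_PRSTAT = """\
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP
  2211 nobody   4.0  41 0.0 0.0 0.0 0.0  54 1.0 310  40  9K   0 in.identd/1
  1534 www       35 6.2 0.1 0.0 0.0 0.0  58 0.7 120  85  2K   0 httpd/1
   812 root     0.3 0.2 0.0 0.0 0.0 0.0  99 0.0  12   1  90   0 nfsd/18
Total: 3 processes, 20 lwps, load averages: 17.02, 16.50, 15.90
"""


def by_system_time(prstat_text):
    """Return (process, pid, usr, sys) tuples sorted by SYS, highest first."""
    header = None
    rows = []
    for line in prstat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "PID":
            header = fields  # column header line
            continue
        # Skip the "Total:" line and anything that is not a process row.
        if header is None or len(fields) != len(header) or not fields[0].isdigit():
            continue
        row = dict(zip(header, fields))
        rows.append((row["PROCESS/NLWP"], int(row["PID"]),
                     float(row["USR"]), float(row["SYS"])))
    return sorted(rows, key=lambda r: r[3], reverse=True)


for name, pid, usr, sys_pct in by_system_time(SAMPLE_PRSTAT):
    print(f"{name:15s} pid={pid:<6d} usr={usr:5.1f} sys={sys_pct:5.1f}")
```

Processes near the top of this list with SYS well above USR are the first candidates to inspect with truss.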

* truss

truss -c -p PID

can help to identify which system calls a problematic process is spending
its time on. A summary is printed on ctrl-c

* iostat

iostat -xnP 30 30

shows where the system is writing and reading data and how much
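
When the iostat output runs to many samples, a short script can pick out the busiest slices. The Python sketch below assumes the extended "-xn" column layout (a header line ending in "device" with %b, kr/s and kw/s columns). The sample text and device names are invented for illustration.

```python
# Invented sample in the layout of "iostat -xnP" output.
SAMPLE_IOSTAT = """\
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.1    2.3    0.8   18.4  0.0  0.0    0.0    6.1   0   1 c0t0d0s0
   45.0  120.5  360.0  964.0  0.0  1.9    0.0   11.5   0  87 c1t2d0s6
"""


def busiest_devices(iostat_text):
    """Return (device, %busy, kB read/s, kB written/s) sorted by %busy."""
    header = None
    rows = []
    for line in iostat_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "device":
            header = fields  # column header, repeated for every sample
            continue
        if header is None or len(fields) != len(header):
            continue  # banner lines such as "extended device statistics"
        row = dict(zip(header, fields))
        rows.append((row["device"], float(row["%b"]),
                     float(row["kr/s"]), float(row["kw/s"])))
    return sorted(rows, key=lambda r: r[1], reverse=True)


for device, busy, kr, kw in busiest_devices(SAMPLE_IOSTAT):
    print(f"{device:10s} busy={busy:5.1f}% read={kr:8.1f} kB/s write={kw:8.1f} kB/s")
```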

* vmstat

vmstat 5

shows paging activity (check the sr column)
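
The sr check can be automated on saved vmstat output. This Python sketch locates the sr column from the header line and flags samples at or above the threshold of 100 mentioned earlier. The sample text is invented, the function name is illustrative, and the column order is taken from the header rather than hard-coded, since the disk columns vary between machines. Keep in mind that the first data line vmstat prints is an average since boot, so it may be worth ignoring.

```python
# Invented sample in the layout of Solaris "vmstat 5" output.
SAMPLE_VMSTAT = """\
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s6 s7   in   sy   cs us sy id
 0 0 0 9876543 412345  3  12  0  0  0  0   0  1  0  0  0  512  900  400 10  5 85
 2 0 0 9012345  20480 40 310 15 220 480 0 1350 9 0 0 0  980 4100 1500 45 50  5
"""


def scan_rates(vmstat_text):
    """Return the sr (scan rate) value of every data line."""
    sr_index = None
    rates = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if "sr" in fields:
            sr_index = fields.index("sr")  # header line, repeated periodically
            continue
        if sr_index is None or len(fields) <= sr_index:
            continue  # group header ("kthr memory ...") or short line
        if fields[sr_index].isdigit():
            rates.append(int(fields[sr_index]))
    return rates


rates = scan_rates(SAMPLE_VMSTAT)
high = [r for r in rates if r >= 100]
print("scan rates:", rates)
print("samples suggesting memory pressure:", high)
```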

* kstat

Displays kernel statistics. Did not get any useful hints on what could be
discovered here ... but sure gives a lot of numbers

* prex

prex -k

Part of the Solaris tracing architecture. Note that this will just open
a shell where you are expected to enter commands to activate the tracing. I
got the following example ... (reading the output is another issue)

# prex -k 1)
Type "help" for help ...
prex> buffer alloc 10m 2)
Buffer of size 10485760 bytes allocated
prex> enable $all 3)
prex> trace $all 4)
prex> ktrace on 5)
... wait a bit ...
prex> ktrace off
prex> untrace $all
prex> disable $all
prex> quit
# tnfxtract ./tnf.result 6)
# prex -k
Type "help" for help ...
prex> buffer dealloc 7)
prex> quit
# tnfdump ./tnf.result 8)

1) Run prex in kernel trace mode.
2) Allocate a kernel in-core buffer to hold the trace of kernel activity.
3) Enable the probe set named $all. You can specify your own set of probes
(by tnf_name), e.g. all I/O operations. Refer to the prex man page.
4) Trace the $all set.
5) Start the kernel trace. The kernel immediately starts to collect tnf_probe
events and store them in the in-core buffer.
6) Extract the contents of the kernel buffer to the file system.
7) Deallocate the kernel in-core buffer. Extract the contents of the buffer
before deallocating it; the contents are erased immediately
when you issue "buffer dealloc".
8) Convert the raw tnf data to a readable ASCII format.

Reading List
------------

Sun Performance and Tuning: Java and Internet, 2nd Edition (Adrian Cockcroft)
http://www.booksmatter.com/b0130952494.htm

Unlocking the kernel
http://www.sun.com/sun-on-net/itworld/UIR980801perf.html

Performance and Tuning on the Solaris 2.6, 7, and 8
http://developers.sun.com/solaris/articles/tuning_solaris.html


Contributors
------------

Markus Kluge, Ramiro Santos, Allen Wooden, przemol, Casper Dik, Jon Andrews,
Thomas 'Mike' Michlmayr, Amiel Lee Yee, William Hathaway, Jeff Vaneek,
Frank Smith, Darren Dunham, Luc I. Suryo, Joe Fletcher, Mark Pfeiffer,
Joohyun Cha, Karl Vogel, Todd M. Wilkinson.

Yesterday Tobias Oetiker wrote:

> Folks,
>
> We have this 4 Way Sun Enterprise 420R server. With 4GB Ram and
> about 10GB swap. It runs a ton of services (Apache, Postfix,
> Amavis, Spamassassin) and it also acts as a NFS server.
>
> Lately we are experiencing performance issues ... the box goes to
> load 17 and responds rather sluggishly.
> When looking at the load we often see the following picture:
>
> 50% User
> 50% Kernel
> 0% Idle
>
> The 50% User is easy to attribute by looking at the processes. But
> what is the system doing in the 50% kernel time?
>
> Is there something like kernel-top? I played around with lockstat
> a bit, but it did not really answer my questions ...
>
> We are running Solaris 8.
>
> cheers
> tobi
>

--
______ __ _
/_ __/_ / / (_) Oetiker @ ISG.EE, ETZ J97, ETH, CH-8092 Zurich
/ // _ \/ _ \/ / System Manager, Time Lord, Coder, Designer, Coach
/_/ \.__/_.__/_/ http://people.ee.ethz.ch/~oetiker +41(0)1-632-5286

