Update of /cvsroot/ssic-linux/openssi/docs
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv29813/docs
Modified Files:
Tag: OPENSSI-FC
Introduction-to-SSI README.CVIP README.networking
Log Message:
Update to release 1.1.0
Index: README.networking
===================================================================
RCS file: /cvsroot/ssic-linux/openssi/docs/README.networking,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -d -r1.3 -r1.4
--- README.networking 6 Jul 2004 23:15:21 -0000 1.3
+++ README.networking 24 Sep 2004 00:51:25 -0000 1.4
@@ -101,7 +101,8 @@
it in the network-scripts/ifcfg-eth# in each /cluster/node#/etc/sysconfig.
Adding a new interface (non-cluster interconnect) to a node can be
-done using the redhat-config-network or netconfig command on the node
+done using the system-config-network (redhat-config-network on RH9
+systems) or netconfig command on the node
you want to add the interface to. NOTE that there
appears to be no man page for netconfig. "netconfig --help" gives some
hints and it is imperative to run with --device=xxxx (eg --device=eth1).
Index: README.CVIP
===================================================================
RCS file: /cvsroot/ssic-linux/openssi/docs/README.CVIP,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -d -r1.1 -r1.2
--- README.CVIP 6 Jul 2004 23:16:01 -0000 1.1
+++ README.CVIP 24 Sep 2004 00:51:25 -0000 1.2
@@ -142,7 +142,8 @@
used to communicate outside the cluster) may already be configured or may
need to be configured. The can be configured to be on a different subnet
than that being used for the cluster interconnect. Configuring them
-can be done via redhat-configure-network. Once they are configured one
+can be done via system-config-network (redhat-config-network on RH9
+systems). Once they are configured one
can run "onall ifconfig" to see that nodes with 2 interfaces have an eth0
and an eth1.
@@ -186,9 +187,16 @@
they must have a route back to the external environment and the gateway
you specify for them must have IP forwarding on.
h. If there are server nodes on an internal network that is private,
- you must use LVS-NAT and set up NAT on the director nodes. In
+ you must use LVS-NAT and set up NAT on the director node. In
addition, each server node must have a route back to the external
environment and the gateway you specify for them must have IP forwarding on.
+ To set LVS up to be NAT-routing, you must create the file
+ /etc/default/lvs_routing with the contents:
+ LVS_ROUTING=NAT
+ LVS_INTERNAL_GW=<ip-address of cluster interface on CVIP director node>
+ Use the standard mechanisms to set up NAT.
+
+
During the reboot you should see the ha-lvs service started up on all
nodes although it will exit after doing it's thing on non-director nodes;
Index: Introduction-to-SSI
===================================================================
RCS file: /cvsroot/ssic-linux/openssi/docs/Introduction-to-SSI,v
retrieving revision 1.5
retrieving revision 1.6
diff -u -d -r1.5 -r1.6
--- Introduction-to-SSI 22 Sep 2004 17:50:42 -0000 1.5
+++ Introduction-to-SSI 24 Sep 2004 00:51:25 -0000 1.6
@@ -16,6 +16,7 @@
XIII: HPTC Middleware
XIV: libcluster, cluster.h and programming
XV: System Management
+XVI: Functional Updates from the 1.0.0 release
I: Overview
@@ -42,7 +43,7 @@
(e.g. cpuinfo, meminfo, etc.).
The networking model has 2 parts to it. First, each node has one or more
-addresses that are only locally visible. One address is used for
+addresses that act in a per-node manner. One address is used for
kernel-to-kernel communication to support the SSI. That address can
also be used for MPI or other cross-node application communication.
Second, to make the cluster look like a single, highly available
@@ -60,12 +61,11 @@
There are very few new commands, the idea being that standard single
system commands should just work. There is a command to see
the cluster membership (cluster). There are some commands to get
-processes launched on particular nodes (onnode,onall,migrate) and to see
+processes launched on particular nodes (onnode,onall,fast,migrate) and to see
where processes are running (where_pid). There are a couple of commands to
control load levelling (loads, loadlevel) and a command to run other programs
in a "localview" mode (localview - more on this in the process management and
-IPC sections below). This release has a first cut at man pages for many
-of the commands.
+IPC sections below). Man pages should exist for all of the commands.
Many of the opensource HPTC middleware has been run on the SSI cluster,
along with some purchasable capabilities. The list of things tested
include MPICH, LAMPI, HP MPI, openPBS, ScalablePBS, SLURM, ganglia,
@@ -83,7 +83,8 @@
This release has a first cut at some of the man pages. There are also
documents in /usr/share/doc/openssi-tools/ on the following subjects:
README.X-Windows README.clusterfstab README.ipvs cdsl rc-design-notes
-README-mosixll README.cfs README.hardmounts README.upgrade.
+README-mosixll README.cfs README.hardmounts README.upgrade
+README.networking README.CVIP.
II: Installation
Installation has 3 major pieces - installing on the first node, adding nodes
@@ -201,7 +202,7 @@
to another card on another node if the first node leaves the cluster.
The "clustername" command returns the same result on all nodes and should
be associated with the CVIP. In /usr/share/doc/openssi-tools/ there is a
-document on ipvs.
+document called README.CVIP and another called README.ipvs.
X: Load Balancing
There are two forms of load balancing built into the system. First,
@@ -238,7 +239,7 @@
There are very few new commands, the idea being that standard single
system commands should just work. There is a command to see
the cluster membership (cluster). There are some commands to get
-processes launched on particular nodes (onnode,onall,migrate) and to see
+processes launched on particular nodes (onnode,onall,fast,migrate) and to see
where processes are running (where_pid). There are a couple of commands to
control load levelling (loads, loadlevel) and a command to run other programs
in a "localview" mode (localview - more on this in the process management and
@@ -247,10 +248,11 @@
where readdir of /proc only shows local processes. It also limits the
visibility of sysVipc objects. "onnode -l" allows you to
run a command on a specific node and only see the /proc of that node.
- ps shows all the processes on all the nodes (it is
-currently unmodified). We are trying to decide all the variants and
-options to give more selective output. Running ps under localview will
-just show the processes on that node.
+ ps shows all the processes on all the nodes. It is modified to
+include a --shownode option, which adds a column to the display output
+indicating which node each process is running on. A --node option
+restricts the output to the processes running on a particular set of
+nodes.
The unmodified top command is interesting - it does show all the
processes from all the nodes; the %cpu is w.r.t. to the node they are
running (which it doesn't tell you); the header info is a mixture of
@@ -262,8 +264,7 @@
to allow a clusterwide view of booting and service startup/management.
A little more about this is described below in the System Management
section.
- who is clusterwide. last (i.e. wtmp) has some problems with log
-rotate that must still be addressed.
+ who is clusterwide, as is last.
XIII: HPTC Middleware
Many of the opensource HPTC middleware has been run on the SSI cluster,
@@ -304,3 +305,27 @@
if the initnode fails (you can disable this if appropriate).
It is assumed that ntp will be enabled and run on each node so the
time on each node will be almost the same.
+
+XVI: Functional Updates from the 1.0.0 release
+ A. CFS performance enhancements;
+ B. atomic migration of process groups;
+ If you do a migrate command with a negative pid, the pgrp with
+ that number will either completely be migrated or none of the
+ processes in it will be migrated.
+ C. migration of thread groups;
+ If you migrate a member of a thread group, all members will migrate
+ with it.
+ D. support for LVS-NAT;
+ E. support for process migration while holding file record locks;
+ F. support for the Fedora Core 2 distribution;
+ G. upgrade of the base Linux kernel to Fedora Core 1 (2.4.22 with FC1
+ patches);
+ H. Addition of the "fast" and "fastnode" commands and their man pages;
+ I. Added the /proc/cluster/{nm_rate, nm_log_threshold, and
+ nm_nodedown_disabled} files. nm_rate can be used to alter how
+ often node monitoring messages are exchanged (default is 1 per
+ second) and how long before a node is declared down (default
+ 10 seconds); nm_log_threshold indicates how many missed monitoring
+ messages before a kernel printf is put out (default 2);
+ nm_nodedown_disabled can be set to turn off nodedown detection, which
+ is useful if you need to go into the debugger on one of the nodes.
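
The LVS-NAT setup added to README.CVIP in this commit comes down to a two-line
file, /etc/default/lvs_routing. As a rough illustration only, the sketch below
renders that file's contents for a given gateway address. The function name
render_lvs_routing and the address 10.0.0.1 are hypothetical and not part of
OpenSSI; the only details taken from the README are the two variable names and
the value NAT. It prints the text rather than writing the file, and it does not
set up NAT itself.

```python
import ipaddress


def render_lvs_routing(internal_gw: str) -> str:
    """Return the text of /etc/default/lvs_routing for NAT routing.

    internal_gw is the IP address of the cluster interface on the CVIP
    director node, as described in README.CVIP.
    """
    # Reject malformed addresses early; ip_address raises ValueError.
    gw = ipaddress.ip_address(internal_gw)
    return f"LVS_ROUTING=NAT\nLVS_INTERNAL_GW={gw}\n"


if __name__ == "__main__":
    # 10.0.0.1 is a placeholder; substitute the director node's
    # cluster-interface address.
    print(render_lvs_routing("10.0.0.1"), end="")
```

Validating the address before emitting the file is a design choice of this
sketch, not something the README requires.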