We are preparing to implement the rules below to 'touch' a file every XXX
seconds if events were received. To know whether events were received, we count
every new event. I didn't want to have to count every event, but I have observed
situations where events quit being processed (e.g., if we lose DB comm) and the
timer still fires.

While the file is being 'touched' from the rules, a cron job runs a Perl script
that does a stat() on the file to see how long it has been since it was last
modified.
I.e.,
...
$now = time();
# stat() returns a 13-element list; element 9 is the last-modified time (mtime).
$mtime = (stat($Tec_check_file))[9];
$diff = $now - $mtime;
if ($diff > $threshold)
{
    do_something();   # send emails, pages
}
...
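A fuller, self-contained version of that check might look something like the sketch below. The helper names (seconds_since_touch, heartbeat_is_stale) and the 300-second default threshold are made up for illustration, and the alert is only a placeholder warning. Unlike the fragment above, it treats a missing heartbeat file as stale rather than relying on an undefined mtime.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Seconds since $file was last modified, or undef if it cannot be stat()ed.
sub seconds_since_touch {
    my ($file, $now) = @_;
    my $mtime = (stat($file))[9];    # element 9 of stat() is mtime
    return unless defined $mtime;
    return $now - $mtime;
}

# True (1) if the heartbeat file is missing or older than $threshold seconds.
sub heartbeat_is_stale {
    my ($file, $threshold, $now) = @_;
    my $age = seconds_since_touch($file, $now);
    return 1 unless defined $age;    # no file at all counts as stale
    return $age > $threshold ? 1 : 0;
}

# Default path is the one written by the tec_hb_touch rule;
# the threshold default is an assumed example value.
my $file      = shift(@ARGV) // '/Tivoli/custom/log/dm_hb/heartbeat.tec';
my $threshold = shift(@ARGV) // 300;

if (heartbeat_is_stale($file, $threshold, time())) {
    # Placeholder: replace with your own e-mail or paging command.
    warn "TEC heartbeat is stale: $file\n";
}
```

The threshold should be comfortably larger than the timer interval used in the rules, otherwise a single late tick would raise a false alarm.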
(I haven't looked at the other methods posted yet - they might be more
efficient than mine)
-James
--------------------------------------------------
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% This rule counts each event. Used by tec_hb.
%
rule: init_count_each_evt:
(
  event: _event of_class _class,
  reception_action:
  (
    get_global_var('TEC_HB', 'COUNT', _old_count, 0),
    _new_count is _old_count + 1,
    set_global_var('TEC_HB', 'COUNT', _new_count)
  )
).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% This rule schedules a tec_hb timer.
%
rule: init_tec_start_schedule_hb_timer:
(
  event: _event of_class 'TEC_Start',
  reception_action:
  (
    get_global_var('TEC_HB', 'TIMER_STARTED', _started, 'NOPE'),
    _started == 'NOPE',
    set_global_var('TEC_HB', 'TIMER_STARTED', 'YEP'),
    first_instance(event: _tic of_class 'TEC_Tick' where []),
    set_timer(_tic, 30, 'tec_hb')
  )
).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Every _duration seconds we'll 'touch' a file for tec hb.
%
timer_rule: tec_hb_touch:
(
  event: _tic of_class 'TEC_Tick' where [],
  timer_info: equals 'tec_hb',
  timer_duration: _duration,

  action:
  (
    get_global_var('TEC_HB', 'COUNT', _count, 0),
    get_global_var('TEC_HB', 'LAST_COUNT', _last_count, 0),
    set_global_var('TEC_HB', 'LAST_COUNT', _count),
    _interval_count is _count - _last_count,

    % only continue w/ 'touch' if _interval_count > 0.
    _interval_count > 0,

    get_local_time(_time_local_struct),
    resolve_time(_time_local_struct, _seconds, _minutes, _hours,
                 _day_of_month, _month0, _year0, _day_of_week,
                 _day_of_year, _daylight_savings),
    _year4 is _year0 + 1900,
    _month is _month0 + 1,

    sprintf(_log_entry,
            '%04d-%02d-%02d/%02d:%02d:%02d Events(Total/Interval):%d/%d',
            [_year4, _month, _day_of_month, _hours, _minutes, _seconds,
             _count, _interval_count]),

    % Probably want to change file mode from a->w.
    fopen(_hbfile, '/Tivoli/custom/log/dm_hb/heartbeat.tec', a),
    fprintf(_hbfile, '%s\n', [_log_entry]),
    fclose(_hbfile)
  ),

  action:
  (
    set_timer(_tic, _duration, 'tec_hb')
  )
).
Hi list,

Just curious...

How do you monitor the availability of your Tivoli environment?

When you have a single-TMR environment with separate TMR and TEC servers, your
automated incident registration is connected to your TMR. Monitoring the
availability of your TEC is then essential. What we need is an indication when
the TEC server is unavailable.

When a TEC server is shut down using the wstopesvr command, a TEC_Stop event is
generated, which is visible on the TEC console. In this case you get a
notification that the event server is unavailable. When the tec_* processes are
killed (or abort with a core dump), or the event server is flooded with events,
the console is unable to detect the unavailability.

This is because the TEC (Java) console queries the DB directly and does not
communicate with the tec_ui_server unless a person makes changes through the
interface (acknowledging or closing events).

Has anyone found the ultimate solution, or does anyone know about future
developments concerning the TEC console that will deal with this problem?
Cheers,
Peter