Hi,
I'm running PGCluster 1.5 (rc7) in a production environment with the
following setup:
hosts:
db1: clusterdb + replicator
<Cluster_Server_Info>
<Host_Name> db1 </Host_Name>
<Port> 5432 </Port>
<Recovery_Port> 7101 </Recovery_Port>
</Cluster_Server_Info>
<Cluster_Server_Info>
<Host_Name> db2 </Host_Name>
<Port> 5432 </Port>
<Recovery_Port> 7101 </Recovery_Port>
</Cluster_Server_Info>
<Status_Log_File> /srv/pgcluster/pgreplicate.sts </Status_Log_File>
<Error_Log_File> /srv/pgcluster/pgreplicate.log </Error_Log_File>
<Replication_Port> 8001 </Replication_Port>
<Recovery_Port> 8101 </Recovery_Port>
<RLOG_Port> 8301 </RLOG_Port>
<Response_Mode> normal </Response_Mode>
<Use_Replication_Log> yes </Use_Replication_Log>
___________________________________________________________________
<Replicate_Server_Info>
<Host_Name> db1 </Host_Name>
<Port> 8001 </Port>
<Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
<Replicate_Server_Info>
<Host_Name> db2 </Host_Name>
<Port> 8001 </Port>
<Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
<Recovery_Port> 7101 </Recovery_Port>
<Rsync_Path> /usr/bin/rsync </Rsync_Path>
<Rsync_Option> ssh </Rsync_Option>
<Rsync_Compress> yes </Rsync_Compress>
<When_Stand_Alone> read_write </When_Stand_Alone>
<Status_Log_File> /srv/pgcluster/cluster.sts </Status_Log_File>
<Error_Log_File> /srv/pgcluster/cluster.err </Error_Log_File>
db2: clusterdb + replicator
<Cluster_Server_Info>
<Host_Name> db1 </Host_Name>
<Port> 5432 </Port>
<Recovery_Port> 7101 </Recovery_Port>
</Cluster_Server_Info>
<Cluster_Server_Info>
<Host_Name> db2 </Host_Name>
<Port> 5432 </Port>
<Recovery_Port> 7101 </Recovery_Port>
</Cluster_Server_Info>
<Replicate_Server_Info>
<Host_Name> db1 </Host_Name>
<Port> 8001 </Port>
<Recovery_Port> 8101 </Recovery_Port>
<LifeCheck_Port> 8201 </LifeCheck_Port>
</Replicate_Server_Info>
<Status_Log_File> /srv/pgcluster/pgreplicate.sts </Status_Log_File>
<Error_Log_File> /srv/pgcluster/pgreplicate.log </Error_Log_File>
<Replication_Port> 8001 </Replication_Port>
<Recovery_Port> 8101 </Recovery_Port>
<RLOG_Port> 8301 </RLOG_Port>
<Response_Mode> normal </Response_Mode>
<Use_Replication_Log> yes </Use_Replication_Log>
___________________________________________________________
<Replicate_Server_Info>
<Host_Name> db1 </Host_Name>
<Port> 8001 </Port>
<Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
<Replicate_Server_Info>
<Host_Name> db2 </Host_Name>
<Port> 8001 </Port>
<Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
<Recovery_Port> 7101 </Recovery_Port>
<Rsync_Path> /usr/bin/rsync </Rsync_Path>
<Rsync_Option> ssh </Rsync_Option>
<Rsync_Compress> yes </Rsync_Compress>
<When_Stand_Alone> read_write </When_Stand_Alone>
<Status_Log_File> /srv/pgcluster/cluster.sts </Status_Log_File>
<Error_Log_File> /srv/pgcluster/cluster.err </Error_Log_File>
I don't use pglb; I use LVS for load balancing instead. I have also tried
running without load balancing, and with both persistent and non-persistent
connections, but none of this makes any difference.
About every 30 minutes, the application using the database hangs for about 2
minutes.
I ran the replicator in debug mode to get more details:
2006-11-02 20:36:14 [18542] DEBUG:query=PGR_CLOSE_CONNECTION
2006-11-02 20:36:14 [18542] DEBUG:sem_lock [1] req
2006-11-02 20:36:14 [18543] DEBUG:cmdSts=O
2006-11-02 20:36:14 [18543] DEBUG:cmdType=x
2006-11-02 20:36:14 [18543] DEBUG:rlog=64
2006-11-02 20:36:14 [18543] DEBUG:port=5432
2006-11-02 20:36:14 [18543] DEBUG:pid=8457
2006-11-02 20:36:14 [18543] DEBUG:from_host=192.168.10.32
2006-11-02 20:36:14 [18543] DEBUG:dbName=suckerprod
2006-11-02 20:36:14 [18543] DEBUG:userName=pgman
2006-11-02 20:36:14 [18543] DEBUG:recieve sec=1162496174
2006-11-02 20:36:14 [18543] DEBUG:recieve usec=461004
2006-11-02 20:36:14 [18543] DEBUG:query_size=21
2006-11-02 20:36:14 [18543] DEBUG:request_id=0
2006-11-02 20:36:14 [18543] DEBUG:replicate_id=0
2006-11-02 20:36:14 [18543] DEBUG:query=PGR_CLOSE_CONNECTION
2006-11-02 20:36:14 [18543] DEBUG:sem_lock [1] req
2006-11-02 20:36:15 [18544] DEBUG:cmdSts=O
2006-11-02 20:36:15 [18544] DEBUG:cmdType=x
2006-11-02 20:36:15 [18544] DEBUG:rlog=64
2006-11-02 20:36:15 [18544] DEBUG:port=5432
2006-11-02 20:36:15 [18544] DEBUG:pid=8460
2006-11-02 20:36:15 [18544] DEBUG:from_host=192.168.10.32
2006-11-02 20:36:15 [18544] DEBUG:dbName=suckerprod
2006-11-02 20:36:15 [18544] DEBUG:userName=pgman
2006-11-02 20:36:15 [18544] DEBUG:recieve sec=1162496175
2006-11-02 20:36:15 [18544] DEBUG:recieve usec=218061
2006-11-02 20:36:15 [18544] DEBUG:query_size=21
2006-11-02 20:36:15 [18544] DEBUG:request_id=0
2006-11-02 20:36:15 [18544] DEBUG:replicate_id=0
2006-11-02 20:36:15 [18544] DEBUG:query=PGR_CLOSE_CONNECTION
2006-11-02 20:36:15 [18544] DEBUG:sem_lock [1] req
2006-11-02 20:36:15 [18545] DEBUG:cmdSts=Q
2006-11-02 20:36:15 [18545] DEBUG:cmdType=U
2006-11-02 20:36:15 [18545] DEBUG:rlog=0
2006-11-02 20:36:15 [18545] DEBUG:port=5432
2006-11-02 20:36:15 [18545] DEBUG:pid=8462
2006-11-02 20:36:15 [18545] DEBUG:from_host=192.168.10.32
2006-11-02 20:36:15 [18545] DEBUG:dbName=suckerprod
2006-11-02 20:36:15 [18545] DEBUG:userName=pgman
2006-11-02 20:36:15 [18545] DEBUG:recieve sec=1162496175
2006-11-02 20:36:15 [18545] DEBUG:recieve usec=550293
2006-11-02 20:36:15 [18545] DEBUG:query_size=217
2006-11-02 20:36:15 [18545] DEBUG:request_id=1
2006-11-02 20:36:15 [18545] DEBUG:replicate_id=0
2006-11-02 20:36:15 [18545] DEBUG:query=UPDATE com_gbvisit SET visit1 = '3730',
visit2 = '5475', visit3 = '5609', visit4 = '9341', visit5 = '20242', visit6 =
'1', visit7 = '17420', visit8 = '2903',
visit9 = '45', visit10 = '15213' WHERE gbowner_nr = '8453'
2006-11-02 20:36:15 [18545] DEBUG:sem_lock [1] req
2006-11-02 20:36:15 [18546] DEBUG:cmdSts=Q
2006-11-02 20:36:15 [18546] DEBUG:cmdType=U
2006-11-02 20:36:15 [18546] DEBUG:rlog=0
2006-11-02 20:36:15 [18546] DEBUG:port=5432
2006-11-02 20:36:15 [18546] DEBUG:pid=8461
2006-11-02 20:36:15 [18546] DEBUG:from_host=192.168.10.32
2006-11-02 20:36:15 [18546] DEBUG:dbName=suckerprod
2006-11-02 20:36:15 [18546] DEBUG:userName=pgman
2006-11-02 20:36:15 [18546] DEBUG:recieve sec=1162496175
2006-11-02 20:36:15 [18546] DEBUG:recieve usec=552288
2006-11-02 20:36:15 [18546] DEBUG:query_size=113
2006-11-02 20:36:15 [18546] DEBUG:request_id=1
2006-11-02 20:36:15 [18546] DEBUG:replicate_id=0
2006-11-02 20:36:15 [18546] DEBUG:query=UPDATE com_user SET email = 'blabla',
sex = 'w', geburtsdatum = '1993-09-18' WHERE user_nr = '7536'
2006-11-02 20:36:15 [18546] DEBUG:sem_lock [1] req
2006-11-02 20:38:19 [18187] DEBUG:deleteTransactionTbl():
2006-11-02 20:38:19 [18184] DEBUG:sem_unlock[1]
2006-11-02 20:38:19 [18184] DEBUG:PGRdo_replicate():PGRreplicate_packet_send
returns 0
2006-11-02 20:38:19 [18184] DEBUG:replicate_loop():session closed
This is the interesting moment: for about two minutes (20:36:15 to 20:38:19)
none of the processes obtains the semaphore. After some time the replicator
calls deleteTransactionTbl() and unlocks the semaphore, and after that
everything runs again for about 30 minutes.
I can also send more lines of my debug file if needed.
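In case it helps with reading a longer log, here is a minimal sketch of how the silent periods could be located automatically. It assumes every debug line starts with a timestamp in the format shown above; the function name find_stalls and the 30-second threshold are my own choices for illustration, not anything provided by PGCluster.

```python
import re
from datetime import datetime

# Matches the debug line prefix seen in the log excerpt above,
# e.g. "2006-11-02 20:36:15 [18546] DEBUG:sem_lock [1] req"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(\d+)\] DEBUG:")


def find_stalls(lines, min_gap_seconds=30):
    """Return (gap_seconds, line_before, line_after) for each silent period.

    min_gap_seconds is an arbitrary threshold chosen for this sketch.
    """
    stalls = []
    prev_time = None
    prev_line = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if not match:
            # Wrapped query text has no timestamp prefix; skip it.
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if prev_time is not None:
            gap = (stamp - prev_time).total_seconds()
            if gap >= min_gap_seconds:
                stalls.append((gap, prev_line, line))
        prev_time = stamp
        prev_line = line
    return stalls


# Three lines taken from the excerpt above, around the hang.
sample = [
    "2006-11-02 20:36:15 [18546] DEBUG:sem_lock [1] req",
    "2006-11-02 20:38:19 [18187] DEBUG:deleteTransactionTbl():",
    "2006-11-02 20:38:19 [18184] DEBUG:sem_unlock[1]",
]

for gap, before, after in find_stalls(sample):
    print(f"{gap:.0f}s without output between:")
    print(f"  {before}")
    print(f"  {after}")
```

Run against the whole pgreplicate debug output (for example via open(path) instead of the sample list), this would list each hang together with the last line logged before it and the first line logged after it.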
What are PGCluster's requirements for semaphores and shared memory?
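For comparison against whatever the requirements turn out to be, the kernel's current System V semaphore limits could be read as sketched below. This assumes a Linux host where /proc/sys/kernel/sem exists; the helper name parse_sem_limits is hypothetical, and I do not know which values PGCluster actually needs.

```python
from pathlib import Path

# Linux exposes the System V semaphore limits as four whitespace-separated
# integers in this file, in the order SEMMSL, SEMMNS, SEMOPM, SEMMNI.
SEM_FILE = Path("/proc/sys/kernel/sem")


def parse_sem_limits(text):
    """Turn the contents of /proc/sys/kernel/sem into a labelled dict."""
    semmsl, semmns, semopm, semmni = (int(value) for value in text.split())
    return {
        "SEMMSL": semmsl,  # max semaphores per semaphore set
        "SEMMNS": semmns,  # max semaphores system-wide
        "SEMOPM": semopm,  # max operations per semop() call
        "SEMMNI": semmni,  # max number of semaphore sets
    }


if SEM_FILE.exists():
    for name, value in parse_sem_limits(SEM_FILE.read_text()).items():
        print(f"{name} = {value}")
else:
    print("No /proc/sys/kernel/sem here (not Linux, or /proc not mounted)")
```

The semaphore sets currently allocated can be listed separately with `ipcs -s` and compared against these limits.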
I hope someone can help me.
Thank you in advance,
Wanja