I showed this to a friend of mine, Adrian Chadd, who is a FreeBSD committer.
He happens to be setting up a Mac OS X box at the moment and will look into
it, assuming you don't discover the problem first...
Chris
> -----Original Message-----
> From: pgsql-hackers-owner@xxxxxxxxxxxxxx
> [mailto:pgsql-hackers-owner@xxxxxxxxxxxxxx]On Behalf Of Tom Lane
> Sent: Tuesday, 30 April 2002 1:26 PM
> To: pgsql-hackers@xxxxxxxxxxxxxx
> Cc: Francois Suter
> Subject: [HACKERS] Mac OS X: system shutdown prevents checkpoint
>
>
> I've been looking into Francois Suter's recent reports of Postgres not
> shutting down cleanly on Mac OS X 10.1. I find that it's quite
> reproducible. If you tell the system to shut down in the normal
> fashion (e.g., pick "Shut Down" from the Apple menu), the postmaster
> does not terminate, leading to WAL recovery upon restart --- or
> even worse, failure to restart if the postmaster PID recorded in the
> lockfile happens to get assigned to some other daemon.
>
> Observe the normal trace of postmaster shutdown (running with -d4,
> logging of timestamps and PIDs enabled):
>
> 2002-04-30 00:08:30 [315] DEBUG: pmdie 15
> 2002-04-30 00:08:30 [315] DEBUG: smart shutdown request
> 2002-04-30 00:08:30 [331] DEBUG: shutting down
> 2002-04-30 00:08:32 [331] DEBUG: database system is shut down
> 2002-04-30 00:08:32 [331] DEBUG: proc_exit(0)
> 2002-04-30 00:08:32 [331] DEBUG: shmem_exit(0)
> 2002-04-30 00:08:32 [331] DEBUG: exit(0)
> 2002-04-30 00:08:32 [315] DEBUG: reaping dead processes
> 2002-04-30 00:08:32 [315] DEBUG: proc_exit(0)
> 2002-04-30 00:08:32 [315] DEBUG: shmem_exit(0)
> 2002-04-30 00:08:32 [315] DEBUG: exit(0)
>
> The postmaster (here PID 315) forks a subprocess to flush shared buffers
> and checkpoint the WAL log. When the subprocess exits, the postmaster
> removes its lockfile and shuts down. The subprocess takes a minimum of
> 2 seconds because there's a sleep(2) in the checkpoint fsync code.
>
> Now here's what I see in the case of shutting down the OS X system:
>
> 2002-04-30 00:25:35 [376] DEBUG: pmdie 15
> 2002-04-30 00:25:35 [376] DEBUG: smart shutdown request
>
> ... and nothing more. Actual system shutdown (power down) occurred at
> approximately 00:26:06 by my watch, over thirty seconds later than the
> postmaster received SIGTERM. So there was plenty of time to do the
> checkpoint subprocess. (Indeed, I believe that thirty seconds is the
> grace period Darwin's init process allows SIGTERM'd processes before
> giving up and hard-killing them. So the system was actually sitting and
> waiting for the postmaster.)
>
> What we appear to have here is that the kernel is not allowing the
> postmaster to fork a checkpoint subprocess. But there's no indication
> that the postmaster got a fork() error return, either. Seems like it's
> just hung.
>
> Does this ring a bell with anyone? Is it an OS X bug, or a "feature";
> and if the latter, how can we work around it?
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@xxxxxxxxxxxxxx
>
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@xxxxxxxxxxxxxx)