|
|
Subject: 7.4.xx regression - msg#00158
List: db.postgresql.bugs
I have the following tables and indexes
CREATE TABLE cddb
(
id integer PRIMARY KEY,
/* artist name */
name text,
/* disk title */
title text,
/* the type of the artist group, person, orchestra*/
type smallint,
/* the creation date */
created integer
);
CREATE INDEX cddb1 on cddb(title);
CREATE INDEX cddb2 on cddb(name);
CREATE TABLE cddbentry
(
id integer PRIMARY KEY,
diskid text,
entry integer REFERENCES cddb
);
CREATE INDEX cddbentry1 on cddbentry (diskid);
CREATE INDEX cddbentry2 on cddbentry (entry);
I am using the following query:
select cddb.* from cddb,cddbentry where cddbentry.diskid = 'toto' and
cddbentry.entry = cddb.id;
On version 7.3.4 it produces this query plan
Nested Loop (cost=0.00..41.25 rows=5 width=78)
-> Index Scan using cddbentry1 on cddbentry (cost=0.00..17.07
rows=5 width=4)
Index Cond: (diskid = 'toto'::text)
-> Index Scan using cddb_pkey on cddb (cost=0.00..4.82 rows=1 width=74)
Index Cond: ("outer".entry = cddb.id)
I have upgraded to version 7.4.0 (compiling the software and migrating
the database using dump/restore)
Now the following plan is produced
Hash Join (cost=17.08..42.15 rows=7 width=74)
Hash Cond: ("outer".id = "inner".entry)
-> Seq Scan on cddb (cost=0.00..20.00 rows=1000 width=74)
-> Hash (cost=17.07..17.07 rows=6 width=4)
-> Index Scan using cddbentry1 on cddbentry (cost=0.00..17.07
rows=6 width=4)
Index Cond: (diskid = 'toto'::text)
which result in a VERY much slower query as the cddb table has more than
1 million entry.... and there is at most one entry in cddbentry
which matches the diskid !
The workaround is to disable the hash join capability using set
enable_hashjoin to false; resulting in the same query plan as in 7.3.x.
I think this can classified as a regression bug.
N.B: The cost value may be wrong, because to be able to send you this
email, they have been produced on empty tables; but
I can guarantee you they are the same on the full loaded database.
Hope this help to improve this great product
Bernard SNYERS
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faqs/FAQ.html
Was this page helpful?
Thread at a glance:
Previous Message by Date:
click to view message preview
Re: BUG #1061: message type 0x49 arrived from server while
When grilled further on (Sat, 24 Jan 2004 15:54:07 -0500),
Tom Lane <tgl@xxxxxxxxxxxxx> confessed:
> > A custom Perl 5.6.1, DBI 1.30, DBD::Pg 1.22 script on Solaris 5.8.
> > Using pgsql client version 7.3.3. I'm thinking it may be related to
> > my forking? My program has a db connection, which is then overwritten
> > in all child processes, first thing upon forking:
>
> Judging by that trace, there are multiple processes trying to use the
> same database connection. Better take a closer look at what you are
> doing.
>
> I suspect that "overwriting" the connection is not good enough, and that
> you have to somehow actively close the DBI object inherited from the
> parent. The fact that the excess operations all seem to be empty
> queries is suspicious. I wonder whether those are coming from logic
> internal to DBI or DBD::Pg, that might still have access to the
> connection even though your Perl code doesn't anymore.
I have more of a clue now. I'm calling "DBI->connect_cached" when I fork, even
though I only use that particular db connection in the parent. connect_cached
sends a ping to verify the connection is alive, which is probably what the empty
queries are. The logs confirm that the pid associated with that connection
[721] and [1050] don't actually execute any requests. The new server is much
faster than the old one, so it is probably processing the packets faster, thus
occasionally getting into this problem when the fork timing is just wrong and
the dual packets are received before the first can be processed.
I can remove that second connection (which was used to vacuum) now that I have
pg_autovacuum running on the new server. I suspect this will clear up the empty
queries and the message coming back because of the overlapping empty queries.
Presuming this is the problem, I never would of found this without your help.
Thanks for taking the time. I'll not forget to be more careful of using
"connect_cached" next time around.
>
> Hmm ... AFAIR, libpq doesn't provide any clean way to close the socket
> and discard the PGconn without actually sending a Terminate message to
> the backend, which is what you'd want to happen in the child process.
> This seems like a shortcoming in its API.
>
Cheers,
Rob
--
14:08:58 up 27 days, 3:55, 4 users, load average: 2.05, 2.05, 2.05
pgpUeeFbgiY2a.pgp
Description: PGP signature
Next Message by Date:
click to view message preview
installation on windows me
PSQL.EXE: could not connect to server: Socket error, no description available
(0X00002740)
Is the server running
on host localhost and accepting TCP/IP connection on port
5432?
Previous Message by Thread:
click to view message preview
signal 11
============================================================================
POSTGRESQL BUG REPORT TEMPLATE
============================================================================
Your name : Tanya Mamedalin
Your email address : tmamedalin@xxxxxxxxxxxxxxx
System Configuration
---------------------
Architecture (example: Intel Pentium) : XSER305 INTEL PENTIUM 4
Operating System (example: Linux 2.0.26 ELF) : RedHat 7.3
PostgreSQL version (example: PostgreSQL-7.4): PostgreSQL-7.4
Compiler used (example: gcc 2.95.2) : gcc version 2.96 20000731
(Red Hat Linux 7.3 2.96-110)
Please enter a FULL description of your problem:
------------------------------------------------
I was running the same Python code with Postgres 7.2.3 and psycopg 1.0.9
with no errors. As soon as I upgraded to Postgres 7.4-0.3 and pyscopg 1.1.6
I started getting errors when trying to execute a certain query. The
offending multi-part query is shown below. It seemed to specifically not
like the subquery portion. To circumvent the problem I changed the subquery
from ".pathname = (..subquery...)" to "e.pathname IN (...subquery...)"
suspecting that postgres was not happy with the one-to-one relation even
though the subquery was guaranteed to only return one record.
**---------------BEGIN ERROR----------------------------------------------
Traceback (most recent call last):
File "./auto_escalate.py", line 100, in ?
conn3 = c3.execute(third_query)
psycopg.ProgrammingError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
select distinct o.outage_id, o.machine, o.service, o.level,
o.comments, o2.time, e.num
from outages o, outages o2, escalation e
where o.outage_id='31009'
and e.pathname=(select pathname from escalation_link el
where el.machine=o.machine and el.service=o.service)
and e.level=o.level and o2.outage_id=o.outage_id
and o2.status in ('ALERT','WARN','CRIT','DOWN','UNKNOWN')
order by level desc limit 1
There was an error connecting to the database
Traceback (most recent call last):
File "./ops2server.py", line 60, in dbopen
db=psycopg.connect(host=HOST,user=username, database=DB,
password=passwd)
OperationalError: FATAL: the database system is starting up
**-----------------END ERROR----------------------------------------------
>From the postgres logs I was showing that every time the offending query was
attempted the process would receive a signal 11, quit and restart.
**-----------------BEGIN LOG----------------------------------------------
2004-01-23 10:44:00 LOG: database system is ready
2004-01-23 10:44:20 LOG: connection received: host=[local] port=
2004-01-23 10:44:20 LOG: connection authorized: user=postgres database=ops2
2004-01-23 10:44:45 LOG: server process (PID 18880) was terminated by
signal 11
2004-01-23 10:44:45 LOG: terminating any other active server processes
2004-01-23 10:44:45 LOG: all server processes terminated; reinitializing
2004-01-23 10:44:45 LOG: database system was interrupted at 2004-01-23
10:44:00 EST
2004-01-23 10:44:45 LOG: checkpoint record is at 0/D0337C
2004-01-23 10:44:45 LOG: redo record is at 0/D0337C; undo record is at 0/0;
shutdown TRUE
2004-01-23 10:44:45 LOG: next transaction ID: 1242; next OID: 98209
2004-01-23 10:44:45 LOG: database system was not properly shut down;
automatic recovery in progress
2004-01-23 10:44:45 LOG: connection received: host=[local] port=
2004-01-23 10:44:45 LOG: record with zero length at 0/D033BC
2004-01-23 10:44:45 LOG: redo is not required
2004-01-23 10:44:45 FATAL: the database system is starting up
2004-01-23 10:44:47 LOG: database system is ready
**-----------------END LOG----------------------------------------------
Please describe a way to repeat the problem. Please try to provide a
concise reproducible example, if at all possible:
----------------------------------------------------------------------
Possible ways to reproduce the error: Run a multipart query similar to:
SELECT DISTINCT o.outage_id, o.machine, o.service, o.level, o.comments,
o2.time, e.num
FROM outages o, outages o2, escalation e
WHERE o.outage_id='31009'
AND e.pathname = (SELECT pathname from escalation_link el
WHERE el.machine=o.machine AND
el.service=o.service)
AND e.level=o.level
AND o2.outage_id=o.outage_id
AND o2.status in ('ALERT','WARN','CRIT','DOWN','UNKNOWN')
ORDER BY LEVEL DESC LIMIT 1;
Thanks,
Tanya Mamedalin
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faqs/FAQ.html
Next Message by Thread:
click to view message preview
Re: 7.4.xx regression
bs <bs@xxxxxxxxxxxxx> writes:
> I have upgraded to version 7.4.0 (compiling the software and migrating
> the database using dump/restore)
> Now the following plan is produced
> Hash Join (cost=17.08..42.15 rows=7 width=74)
> Hash Cond: ("outer".id = "inner".entry)
> -> Seq Scan on cddb (cost=0.00..20.00 rows=1000 width=74)
> -> Hash (cost=17.07..17.07 rows=6 width=4)
> -> Index Scan using cddbentry1 on cddbentry (cost=0.00..17.07
> rows=6 width=4)
> Index Cond: (diskid = 'toto'::text)
> which result in a VERY much slower query as the cddb table has more than
> 1 million entry.... and there is at most one entry in cddbentry
> which matches the diskid !
> I think this can classified as a regression bug.
No, it can be classified as a "user forgot to vacuum or analyze"
mistake. The estimates shown in the query plan look like the defaults
for an empty table.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your
message can get through to the mailing list cleanly
|
|