logo       
Google Custom Search
    AddThis Social Bookmark Button

Re: cvsdb: The "descs" table: msg#00011

Subject: Re: cvsdb: The "descs" table
On 04/07/06, Troels Arvin <troels@xxxxxxxx> wrote:
Hello,

On Tue, July 4, 2006 12:06, James Henstridge wrote:
> 1. the support for postgresql schemas is quite invasive.  It would
> probably be easier to issue a "set search_path to myschema,public"
> command after creating the DB connection.

Yes, much easier. But maybe also less portable. My guess is that the
"invasive" way would make it easier to add MSSQL/DB2/... support.

The code to create the db connection is database specific.  Adding
some extra database specific initialisation isn't really a problem
here.

> Is there any particular reason you want to put the checkin db tables
> in a non-default schema rather than in a db of their own?  Do you have
> some custom tables you want to form some relationships with?

Yes. And in PostgreSQL, you can't join across databases. This is why I
need schemas.




> 2. the WipeDb() code is the wrong way to do things.  The postgresql
> equivalent of mysql's replace syntax would be:
>  * "SELECT somecolumn FROM table WHERE primarykey = value"
>  * if a result is returned, "UPDATE table SET ... WHERE primarykey =
> value"
>  * otherwise "INSERT INTO table (columns, ...) VALUES (values, ...)"

I obviously don't agree that it's a wrong way to do things.

If doing a complete re-build, there is no need to SELECT first. So I think
that the WipeDb function is still valid to have. But I could change the
code so that it's only used in the complete-rebuild case.

You said above that you want to combine the checkin db data with your
own tables above.  If you're blowing away all the data on an update,
won't that screw with foreign key constraints?

Furthermore, the above logic would be needed when using svndbadmin as
a hook script.  I'd start with the above style logic, and then look at
something more streamlined if it turns out to be necessary for the
"rebuild everything" case (which would generally be an infrequent
operation).


> 3. For the "rebuild everything" use case, you probably want to commit
> more often.

I would actually prefer to have it happen in one big transaction. I
haven't tried my code on our 30000+ revs repository. May I report back
when I have some more real-World experience with this?

Having big long running transactions is definitely a problem if you
have multiple writers accessing the database.  So it is generally a
good idea to commit regularly (although committing too often can
introduce performance problems itself).

I don't know how much of an effect this would have on the viewcvs
checkin db code, but it is worth keeping in mind.

> 4. Too many "if 'pgsql'==self._syntax:" blocks.  It'd be better to
> have one check that early on and set the appropriate regexp operator,
> etc as attributes on the CheckinDatabase.  Imagine if we got patches
> for a third or forth database that hooked in like this: the code would
> become unreadable.

I'll try to re-factor this.

> 5. Try to keep to maintain the coding style of other code in the file.
>  The equality tests in your if statements look backwards :)

I have a habit of using "if CONSTANT == variable", to be able to catch
cases where I have written a "=" instead of "==". I feel rather strongly
about this, as I believe that it really catches bugs. The conding style
isn't all that consistent as it is, so I hope I can keep my "backwards"
equality tests.

I am guessing that you picked up this habbit in a language other than
Python.  A statement like "if variable = CONSTANT: ..." is already a
syntax error in Python since an assignment is not an expression.

If you feel that a change to the coding style would be a good idea, it
should be brought up as a separate patch.  That way it can be
evaluated independently of the postgres support patch.

And if we were going to change the style, we'd change the entire file
(or even the whole project).  Otherwise you'd end up with spagetti.


> 6. The postgresql database schema should include the relevant foreign
> key contstraints.

True.

On 05/07/06, Troels Arvin <troels@xxxxxxxx> wrote:
On Tue, 04 Jul 2006 18:06:03 +0800, James Henstridge wrote:
> 1. the support for postgresql schemas is quite invasive.  It would
> probably be easier to issue a "set search_path to myschema,public"
> command after creating the DB connection.

The reason why I don't just do a "SET SEARCH_PATH TO foo" after connect,
is that I'm afraid what could happen in systems with persistent
connections and/or connection pools.

If ViewVC is run via mod_python and uses a PostgreSQL driver with
connection pooling, I'm afraid of the scenario discussed here:
http://thread.gmane.org/gmane.comp.python.db.psycopg.devel/3274/

As Federico says in that thread, psycopg2 doesn't do connection
pooling behind your back.  Sharing connections between unrelated
programs is asking for trouble unless they are explicitly cooperating
anyway.  And if you issue the "set search_path to ..." statement at
the point you get the connection should ensure the search path is as
expected, right?

- Or if someone were to include support for SQL Relay[1], could it be that
the search path is changed around ViewVC's back?

Such a system seems inherently fragile without a way to reset the
connection state.  From the ViewVC side of things though, it wouldn't
be a problem if it set the search path when it acquired the
connection.

James.



Try Searching:
servers, voip, java, networking, microsoft ...
<Prev in Thread] Current Thread [Next in Thread>