#12: psycopg2 use buffer for binary columns
----------------------+-----------------------------------------------------
Id: 12 | Status: new
Component: psycopg2 | Modified: Sat 16 Apr 2005 08:20:34 PM CEST
Severity: normal | Milestone: PSYCOPG 2.0 beta1
Priority: normal | Version: 1.99.13
Owner: fog | Reporter: fog
----------------------+-----------------------------------------------------
Old description:
> the attached patch makes psycopg2 return a buffer object for bytea
> columns. There are two advantages for this:
>
> - - It provides for transparent DB -> Python -> DB round trips for
> bytea columns. Without this patch, psycopg returns a 'str' object for
> bytea columns. This string cannot be passed directly back to
> cursor.execute(), as psycopg assumes strings are intended for a
> character column and therefore strips out null bytes. The solution
> was to wrap it into psycopg.Binary, but is not transparent as you
> need to know the column is binary. With the patch, psycopg returns a
> 'buffer' object for bytea columns, which is already recognized by
> psycopg and assumed to be going into a binary column.
> - - It is faster. No extra copy is needed to generate the Python
> string. Using the buffer interface, Python code can directly access
> the low-level buffer returned by PQunescapeBytea.
>
> The implementation introduces a new C-level Python type called
> "chunk". A chunk basically holds a memory area, and makes sure it is
> freed at object destruction. The chunk exports the buffer interface
> (PyBufferProcs), and is wrapped in a real Python buffer object using
> PyBuffer_FromObject(). This different from the implementation hint by
> Federico in the comment in typecast.c:
>
> {{{
> /* TODO: using a PyBuffer would make this a zero-copy operation but
> we'll
> need to define our own buffer-derived object to keep a reference to
> the
> memory area: does it buy it? */
> }}}
>
> I think this implementation is a bit simpler, as it doesn't involve
> subclassing builtin types and it doesn't export any new types to the
> end user.
New description:
The attached patch makes psycopg2 return a buffer object for bytea
columns. This has two advantages:
* It provides for transparent DB -> Python -> DB round trips for
bytea columns. Without this patch, psycopg returns a 'str' object for
bytea columns. This string cannot be passed directly back to
cursor.execute(), as psycopg assumes strings are intended for a
character column and therefore strips out null bytes. The workaround
was to wrap it in psycopg.Binary, but that is not transparent, as you
need to know the column is binary. With the patch, psycopg returns a
'buffer' object for bytea columns, which is already recognized by
psycopg and assumed to be going into a binary column.
* It is faster. No extra copy is needed to generate the Python
string. Using the buffer interface, Python code can directly access
the low-level buffer returned by PQunescapeBytea.
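The round-trip problem described in the first point can be sketched without a database. This is only an illustration under stated assumptions: `quote_param` is a hypothetical stand-in for the driver's parameter handling and is not psycopg code, Python 3 `bytes` plays the role of the old 'str' type, and `memoryview` plays the role of the 'buffer' type.

```python
def quote_param(value):
    """Hypothetical stand-in for how a driver decides what a parameter is."""
    if isinstance(value, memoryview):
        # Buffer-like objects are assumed to target a binary column,
        # so every byte is kept, null bytes included.
        return ("bytea", value.tobytes())
    if isinstance(value, bytes):
        # Plain strings are assumed to target a character column,
        # so null bytes are stripped out.
        return ("text", value.replace(b"\x00", b""))
    raise TypeError("unsupported parameter type: %r" % type(value))


payload = b"\x89PNG\x00\x00data"

# Value fetched as a plain string: passing it straight back loses data.
kind_str, sent_str = quote_param(payload)

# Value fetched as a buffer-like object: the round trip is transparent.
kind_buf, sent_buf = quote_param(memoryview(payload))

print(kind_str, sent_str == payload)
print(kind_buf, sent_buf == payload)
```

With a buffer-like return type the caller does not need to know that the column is binary, which is the transparency the patch is after.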
The implementation introduces a new C-level Python type called
"chunk". A chunk basically holds a memory area, and makes sure it is
freed at object destruction. The chunk exports the buffer interface
(PyBufferProcs), and is wrapped in a real Python buffer object using
PyBuffer_FromObject(). This differs from the implementation hinted at by
Federico in the comment in typecast.c:
{{{
/* TODO: using a PyBuffer would make this a zero-copy operation but we'll
   need to define our own buffer-derived object to keep a reference to the
   memory area: does it buy it? */
}}}
I think this implementation is a bit simpler, as it doesn't involve
subclassing builtin types and it doesn't export any new types to the
end user.
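The ownership idea behind the "chunk" can be illustrated in pure Python. This is a rough analogy and not the patch itself: a ctypes array stands in for the chunk (an object that owns a memory area), `memoryview` stands in for the buffer object created with PyBuffer_FromObject(), and the initial `from_buffer_copy` call only simulates the memory handed back by PQunescapeBytea.

```python
import ctypes

raw = b"\x00\x01binary\x00data"

# Stand-in for the "chunk": an object that owns a block of memory and
# releases it when the last reference to it goes away. (Assumption: the
# copy made here only simulates the area allocated by the C library.)
chunk = (ctypes.c_ubyte * len(raw)).from_buffer_copy(raw)

# Stand-in for the buffer object wrapped around the chunk: no bytes are
# copied, the view reads the chunk's memory directly.
view = memoryview(chunk)

assert view.obj is chunk        # the view holds a reference to its owner
assert view.nbytes == len(raw)
assert view.tobytes() == raw

# A write through the owner is visible through the view, which shows
# that both refer to the same memory.
chunk[0] = 0xFF
assert view.tobytes()[0] == 0xFF

# Dropping our own name for the chunk does not free the memory while
# the view is still alive.
del chunk
assert view.tobytes()[1:] == raw[1:]
```

The user only ever sees the view, which matches the point above that no new type needs to be exported to the end user.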
--
Ticket URL: <http://initd.org/tracker/psycopg/ticket/12>
psycopg <http://initd.org/>
_______________________________________________
Psycopg mailing list
Psycopg-IAPFreCvJWPBWskQ1e/+sw@xxxxxxxxxxxxxxxx
http://lists.initd.org/mailman/listinfo/psycopg