
beginning plannings for PEAR for PHP6: msg#00018

php.pear.core

Subject: beginning plannings for PEAR for PHP6

Hi all,

I'd like to start the discussion about the next generation PEAR
installer. First of all, some things I don't want to change:

#1 package.xml version 2.0/2.1 is here to stay
#2 PEAR Channels and Channel REST is here to stay
#3 custom file roles/tasks will continue to work
#4 packages are still downloaded and installed to disk by the PEAR Installer

Now, for the things that I'd like to improve. First of all, I'd like to
shoot for a concurrent release with PHP 6 stable, and leverage the tools
of PHP 6 (which is basically saying the tools of PHP 5, since the major
addition is unicode, and possibly namespaces).

I've been polling many developers for the past 2 years, trying to get
ideas for the next generation PEAR Installer, and there are some
consistent issues that have arisen that I've used to craft a list of my
priorities for the next generation installer.

#1 installer should run out of the box without needing installation
#2 libraries need to be opcode cache-friendly (require instead of
require_once, or a class loader)
#3 simpler configuration
#4 better registry

Only #2 affects the PEAR community at large, although as you'll see #3
could have a significant impact.

#1 installer should be able to run out of the box without needing
installation

The only way we can implement #1 is to distribute the PEAR Installer as
a phar archive as well as a .tgz. However, I would like it to be
installable as well from the phar, and this means we need to write code
that can live happily both in a phar and on disk. So far, in my
experience, the best way to do this is to rely on a class loader and to
use __FILE__ for internal files that need to be retrieved (like the
command pattern and so on). More on this later in the email.
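
A minimal sketch of the __FILE__ approach, assuming a hypothetical helper
demo_internal_path() and a made-up internal file name
(Command/Install.xml). Inside a phar, __FILE__ is a phar:// URL, so the
same string arithmetic yields a usable path in both environments:

```php
<?php
// Hypothetical helper (not part of PEAR): build the path of an internal
// file relative to the directory of the source file that needs it.
// $callerFile is normally __FILE__, which is a plain path on disk and a
// phar:// URL when the code runs from inside a phar archive.
function demo_internal_path($callerFile, $relative)
{
    return dirname($callerFile) . '/' . ltrim($relative, '/');
}

// "Command/Install.xml" is a made-up file name used only for illustration.
echo demo_internal_path(__FILE__, 'Command/Install.xml'), "\n";
```

Because nothing here depends on include_path or on replacement tasks, the
calling code does not need to know whether it was installed or is running
straight out of the archive.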

#2 libraries need to be opcode cache-friendly (require instead of
require_once, or a class loader)

To do this, we must eliminate "_once" from all require/include
statements. There are two ways of implementing this. The first way is
a clever usage of return:

<?php
if (class_exists('MDB2')) return;
class MDB2
{
...
}
?>

As documented in the manual, return simply stops execution of the
included file before it gets to MDB2. HOWEVER, this means that in this
example:

<?php
if (class_exists('MDB2')) return;
class MDB2
{
...
}
echo 1;
?>

The line of code "echo 1;" never gets executed! The reason is that in
pre-execution parsing of included files, the class MDB2 is registered,
so that when step-by-step execution occurs, only the first line of code
is ever executed. The way to fix this is of course to use a sub-section:

<?php
if (class_exists('MDB2')) {
    if (!defined('MDB2_INITIALIZED')) {
        define('MDB2_INITIALIZED', 1);
        echo 1;
    }
    return;
}
class MDB2
{
...
}
?>
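
The guard can be exercised with a throwaway file. This is only a sketch:
Demo_Guarded, DEMO_GUARDED_INITIALIZED and the counter are made-up names,
and the file is written to the temp directory so it can be loaded twice
with a plain require:

```php
<?php
// Counter used to observe how often the one-time setup section runs.
$GLOBALS['demo_init_runs'] = 0;

// Source of the guarded file: same shape as the MDB2 example, with
// hypothetical names.
$source = <<<'CODE'
<?php
if (class_exists('Demo_Guarded', false)) {
    if (!defined('DEMO_GUARDED_INITIALIZED')) {
        define('DEMO_GUARDED_INITIALIZED', 1);
        $GLOBALS['demo_init_runs']++; // one-time setup goes here
    }
    return;
}
class Demo_Guarded
{
}
CODE;

$file = tempnam(sys_get_temp_dir(), 'guard');
file_put_contents($file, $source);

require $file; // first load
require $file; // second load: the guard returns before the class body
unlink($file);

var_dump(class_exists('Demo_Guarded', false));
var_dump($GLOBALS['demo_init_runs']);
```

The class ends up defined exactly once and the setup section runs exactly
once, even though the file was loaded twice without "_once".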

The second way is to require usage of __autoload(). More on this later.

#3 simpler configuration

Several issues with the current system of PEAR configuration have arisen
over the past 7 years.

1) finding non-php role files is impossible without the use of
replacements, or knowledge of the location of the configuration file itself
2) relocating installations to another directory structure is
impossible because we all use replacements to find stuff.

The other day, I realized there is a simple truth that would fix both of
these problems. First of all, the package registry is *always* located
in php_dir. Why not store the config values that can't change inside
php_dir as well?

In other words, if we have a configuration file, install a few packages,
and then change php_dir, but not data_dir or bin_dir, any packages we
install into the new php_dir could potentially conflict with those from
the old php_dir. The same is true for PEAR installs that have 2
configuration files pointing to the same php_dir.

In other words, I want to separate user-specific values (the GPG
signature, login, default channel and whatnot) from repository-specific
configuration information, such as where files should be installed.

In this way, when relocating an installation, the important information
can be updated directly in the repository upon the move.

This would mean that instead of relying upon a replacement task to find
something like so:

$datafile = '@data_dir@/Package_Name/datafile.dat';

we would need to do something like this (assume we're in Package/Name.php):

$datafile = file_get_contents(dirname(dirname(__FILE__)) .
'/.conf/data_dir') . '/Package_Name/datafile.dat';

In other words, relying upon the configuration. The only problem with
this approach is that when the location of any files is changed, we
would also need to recursively relocate the installed files. A solution
for this problem would be to store current configuration values upon
installation of a package in its registry, and to provide a facility to
relocate files on a per-package basis. In any case, php_dir will
disappear as a configurable item, as it will be implicitly defined by
the location of the config file.
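
A minimal sketch of such a lookup, assuming one small text file per
value under "<php_dir>/.conf/" as in the example above. The helper name
demo_conf_value(), the trimming of trailing whitespace and the error
handling are assumptions, not a settled design:

```php
<?php
// Hypothetical helper: read one repository configuration value from
// "<php_dir>/.conf/<name>". trim() drops a trailing newline that a
// hand-edited file would typically contain.
function demo_conf_value($phpDir, $name)
{
    $file = $phpDir . '/.conf/' . $name;
    if (!is_file($file)) {
        throw new RuntimeException("Missing configuration value: $name");
    }
    return trim(file_get_contents($file));
}

// Demo with a throwaway repository layout in the temp directory.
$phpDir = sys_get_temp_dir() . '/demo_repo_' . uniqid();
mkdir($phpDir . '/.conf', 0777, true);
file_put_contents($phpDir . '/.conf/data_dir', $phpDir . "/data\n");

$datafile = demo_conf_value($phpDir, 'data_dir') . '/Package_Name/datafile.dat';
echo $datafile, "\n";

// Clean up the throwaway layout.
unlink($phpDir . '/.conf/data_dir');
rmdir($phpDir . '/.conf');
rmdir($phpDir);
```

Relocating the repository then only requires rewriting the files under
.conf, not every installed source file.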

An obvious side note is that the configuration should not be stored as
serialized PHP, because that makes hand-editing corrupted configurations
very difficult. I imagine storing it as XML, and possibly having each
value in a separate file (as in the example above) for quick package
access to configuration values.

I would also like to have each user configuration file explicitly define
a "pear_path": a list of repositories, starting from system and down to
local. This would allow us to check the system PEAR install to see if a
package is installed when doing dependency checks.
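
The lookup order could be sketched like this. The data structure is purely
hypothetical (each repository is reduced to a list of installed package
names), since the real registry format is still open:

```php
<?php
// Hypothetical sketch: walk the pear_path from system down to local and
// return the name of the first repository that has $package installed,
// or null if none does.
function demo_find_in_pear_path(array $pearPath, $package)
{
    foreach ($pearPath as $repository => $installedPackages) {
        if (in_array($package, $installedPackages, true)) {
            return $repository;
        }
    }
    return null;
}

// Made-up repositories, listed in search order.
$pearPath = array(
    'system' => array('PEAR', 'Archive_Tar'),
    'local'  => array('MDB2'),
);

var_dump(demo_find_in_pear_path($pearPath, 'Archive_Tar'));
var_dump(demo_find_in_pear_path($pearPath, 'MDB2'));
var_dump(demo_find_in_pear_path($pearPath, 'Not_Installed'));
```

A dependency check against a local repository could then accept a package
that is only present in the system install.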

#4 better registry

Currently, we only store serialized PHP. Each package has its own
serialized registry file, and a dependency "database" is stored as a
serialized array.

I would like to store individual package information in these formats:

1) original package.xml per installed package, with added info of
previous version installed
2) [optionally] a diff of changes made to each installed file by
replacements, and which task made the changes, stored in text format
3) current configuration information at the time of installation
4) global database in sqlite format containing
a) installed files
b) dependency information
c) previous version installed
d) [optionally] a diff of changes made to each installed file by
replacements, and which task made the changes

With this redundancy, it would also be easy to reconstruct the global
database, were it to get corrupted, or even to repair a corrupted
installation manually by looking at package.xml and grabbing the right
package versions.
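
One possible shape for the global database, as a sketch only. The table
and column names are assumptions, the sample rows are invented, and the
example uses PDO's sqlite driver with an in-memory database purely so it
can be run without touching disk:

```php
<?php
// Hypothetical schema for the global registry database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE packages (
    channel TEXT NOT NULL,
    name TEXT NOT NULL,
    version TEXT NOT NULL,
    previous_version TEXT,
    PRIMARY KEY (channel, name)
)');
$db->exec('CREATE TABLE files (
    path TEXT NOT NULL PRIMARY KEY,
    channel TEXT NOT NULL,
    package TEXT NOT NULL,
    role TEXT NOT NULL
)');
$db->exec('CREATE TABLE dependencies (
    channel TEXT NOT NULL,
    package TEXT NOT NULL,
    required_channel TEXT NOT NULL,
    required_package TEXT NOT NULL,
    min_version TEXT
)');

// Invented sample data: one package that owns one file.
$db->exec("INSERT INTO packages VALUES
    ('example.channel', 'Demo_Package', '1.1.0', '1.0.0')");
$insert = $db->prepare(
    'INSERT INTO files (path, channel, package, role) VALUES (?, ?, ?, ?)'
);
$insert->execute(array('Demo/Package.php', 'example.channel', 'Demo_Package', 'php'));

// File-conflict check: which package already owns this path?
$query = $db->prepare('SELECT package FROM files WHERE path = ?');
$query->execute(array('Demo/Package.php'));
$owner = $query->fetchColumn();
echo $owner, "\n";
```

Making the file path the primary key lets the database itself reject two
packages that try to install the same file.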

========================================
Auto-loading classes: how to solve the require_once problem
========================================

OK, this is the last thing. I would like to provide an __autoload()
callback for all PEAR packages. It would reside in 'PEAR/Autoload.php'
and look like this:

<?php
if (function_exists('PEAR_Autoload')) {
    spl_autoload_register('PEAR_Autoload');
    return;
}
function PEAR_Autoload($class)
{
    require str_replace('_', '/', $class) . '.php';
}
?>

The biggest problem with __autoload() is that it is magical, and makes
debugging hard. However, we can remove the magic in PEAR with coding
standards. Here is how.

Old way:
<?php
require_once 'PEAR/Config.php';
require_once 'PEAR/Dependency2.php';
...
?>

New way:
<?php
// register classes we need
if (!class_exists('PEAR_Config', true) ||
    !class_exists('PEAR_Dependency2', true)) {
    throw new Exception('Cannot find needed dependency classes ' .
        'PEAR_Config or PEAR_Dependency2');
}
...
?>
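
The same check could be wrapped in a small helper so that the error names
exactly which classes are missing. This is a sketch under assumptions:
demo_require_classes() is a hypothetical function, not an existing PEAR
API, and Demo_Missing_Class is a made-up class name:

```php
<?php
// Hypothetical helper: verify that every class in $classes exists,
// triggering any registered autoloader, and report all missing classes
// in a single readable exception message.
function demo_require_classes(array $classes)
{
    $missing = array();
    foreach ($classes as $class) {
        if (!class_exists($class, true)) {
            $missing[] = $class;
        }
    }
    if ($missing) {
        throw new Exception(
            'Cannot find needed dependency classes: ' . implode(', ', $missing)
        );
    }
}

// Usage: ArrayObject ships with SPL, so only the made-up class is reported.
try {
    demo_require_classes(array('ArrayObject', 'Demo_Missing_Class'));
} catch (Exception $e) {
    echo $e->getMessage(), "\n";
}
```

The dependency list still sits at the top of the file, where the
require_once lines used to be.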

This new way also allows us to deal with the problem of confusing error
messages - we can control what is actually seen by the user. We have
the same benefit of clear dependency requirements stated at the top of
the file. At the same time, this code can be used on disk, or packaged
into a phar archive without *any* modification. All we would need is a
different __autoload() callback for the phar archive, which is easy to
do (and we will provide one in the phar extension, by the way, I talked
to Marcus about this implementation today.)
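
One way to picture the two callbacks is a single factory that binds the
loader to a base location. This is a sketch, not the planned
implementation: demo_make_autoloader() and Demo_Pkg_Widget are made-up
names, and passing a phar:// URL as the base is an assumption that is not
exercised here (the demo only uses a directory on disk):

```php
<?php
// Hypothetical factory: returns an autoload callback bound to $base,
// which may be a plain directory or (assumption) a phar:// URL.
function demo_make_autoloader($base)
{
    return function ($class) use ($base) {
        $file = rtrim($base, '/') . '/' . str_replace('_', '/', $class) . '.php';
        if (is_file($file)) {
            require $file;
        }
    };
}

// Demo on disk: create Demo/Pkg/Widget.php under a temp dir and load it.
$base = sys_get_temp_dir() . '/demo_autoload_' . uniqid();
mkdir($base . '/Demo/Pkg', 0777, true);
file_put_contents(
    $base . '/Demo/Pkg/Widget.php',
    "<?php\nclass Demo_Pkg_Widget\n{\n}\n"
);

spl_autoload_register(demo_make_autoloader($base));
$loaded = class_exists('Demo_Pkg_Widget', true);
var_dump($loaded);

// Clean up the throwaway files.
unlink($base . '/Demo/Pkg/Widget.php');
rmdir($base . '/Demo/Pkg');
rmdir($base . '/Demo');
rmdir($base);
```

Code that only calls class_exists() never sees the difference between the
two locations; only the registered callback changes.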

Please note that using "require" will not allow code to coexist on
disk or in a phar archive, as include_path cannot contain stream
wrappers (and with good reason, I might add).

With these changes, we would be able to do something revolutionary:
distribute an install-or-not PEAR Installer that can be run right out of
a phar or installed and run on the disk. This means that we can
distribute applications with little built-in installers based on phar
archives, without having to rewrite the code or modify it to fit into a
phar.

The only new PHP extensions required to make this new installer happen are:

spl
sqlite

In addition, the PEAR Installer itself would be much simpler in many
ways, while allowing much more robust management of complicated
repositories.

I know this is a long email, but we're on the cusp of a major change to
the installer as well as to the PEAR repository itself, and I need to
let you all know what I'm thinking before I start tasking things out,
devising a roadmap and moving on this.

Thanks,
Greg


