Bertrand Delacretaz wrote:
> However, this requires measurable tests of one's success.
> Automated tests would be hard to implement for FOP I think
I think so. There are however a few interesting approaches.
The easiest case is to test for correct diagnosis of known
problems: invalid input, invalid options, problems with the
environment like missing user fonts. This can be fully
automated.
It would be interesting to check generated output for validity
with respect to the output format, e.g. detecting invalid PDF.
Maybe one of the OSS tools for rendering PDF can help here, for
example Xpdf or Ghostscript.
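Short of wiring in an external renderer, a very cheap pre-check
is possible in plain Java. The following is only a sketch under
my own assumptions: the class and method names are made up for
illustration, and it only checks that the byte stream starts
with the "%PDF-" header and has an "%%EOF" marker in its last
1024 bytes. It says nothing about whether the file in between
is actually valid, so it complements a real renderer rather
than replacing one.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of FOP: a cheap structural sanity check.
final class PdfSanityCheck {

    /** Returns true if the bytes start with "%PDF-" and "%%EOF" occurs in the last 1024 bytes. */
    static boolean looksLikePdf(byte[] data) {
        String header = "%PDF-";
        if (data.length < header.length()) {
            return false;
        }
        // ISO-8859-1 maps every byte to exactly one char, so binary data is safe here.
        String start = new String(data, 0, header.length(), StandardCharsets.ISO_8859_1);
        if (!start.equals(header)) {
            return false;
        }
        int tailLength = Math.min(data.length, 1024);
        String tail = new String(data, data.length - tailLength, tailLength,
                StandardCharsets.ISO_8859_1);
        return tail.contains("%%EOF");
    }
}
```

A test would feed the formatter output into looksLikePdf() and
fail early if it returns false, before any more expensive check
is attempted.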
Another easy case is detecting regressions during code cleanup,
refactoring or implementing features which don't influence a
certain test case. In this case the test case can be expected
to produce an identical PDF (or other format). We can write a
test which runs the formatting, computes an MD5 from the PDF
byte stream and compares it with a precomputed value. We'll
need some framework for maintaining the expected values because
the checksums can be expected to change fairly often, but I
think this approach is still of some help overall.
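The checksum comparison itself could look roughly like this.
It is a minimal sketch using only the JDK (MessageDigest);
the class and method names are hypothetical, and how the PDF
gets produced and where the expected values are stored is left
open. One assumption to watch: if the output embeds something
that varies between runs, such as a creation timestamp, it
would have to be fixed or stripped before the bytes can be
expected to match.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of FOP: compares output bytes against a stored MD5.
final class ChecksumRegression {

    /** Returns the MD5 digest of the given bytes as 32 lowercase hex digits. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every conforming Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Returns true if the file's MD5 equals the precomputed hex value (case-insensitive). */
    static boolean matchesExpected(Path generatedFile, String expectedHex) throws IOException {
        byte[] bytes = Files.readAllBytes(generatedFile);
        return md5Hex(bytes).equalsIgnoreCase(expectedHex.trim());
    }
}
```

A JUnit test would then run the formatting into a temporary
file and assert that matchesExpected() returns true for the
value recorded for that test case.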
A slightly more advanced idea is to render the PDF page(s) into
standardized bitmap(s), then use a tool providing fuzzy bitmap
matches to compare with an expected result. This could check
whether a 30pt text really comes out at 30pt but should be
somewhat tolerant of many kinds of small changes in the layout
algorithm. Is anybody willing to take on the task of tracking
down useful tools and providing a proof of concept?
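To make "fuzzy" a bit more concrete, here is a deliberately
simple sketch, not a recommendation of a particular tool. It
assumes the pages have already been rendered to BufferedImage
instances of equal size by some external renderer, and all
names and thresholds are made up for illustration. It counts
the pixels whose colour channels differ by more than a
tolerance and accepts the pair if that fraction stays below a
limit. This tolerates anti-aliasing noise and small local
differences, but not text that has shifted by several pixels;
a real solution would probably need something smarter, such as
comparing downsampled images.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of FOP: tolerant pixel-by-pixel comparison.
final class FuzzyBitmapCompare {

    /**
     * Returns the fraction (0.0 to 1.0) of pixels in which the red, green
     * or blue channel differs by more than channelTolerance.
     * Images of different dimensions are treated as completely different.
     */
    static double differingFraction(BufferedImage a, BufferedImage b, int channelTolerance) {
        int width = a.getWidth();
        int height = a.getHeight();
        if (width != b.getWidth() || height != b.getHeight()) {
            return 1.0;
        }
        long differing = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int p = a.getRGB(x, y);
                int q = b.getRGB(x, y);
                if (channelDiff(p, q, 16) > channelTolerance
                        || channelDiff(p, q, 8) > channelTolerance
                        || channelDiff(p, q, 0) > channelTolerance) {
                    differing++;
                }
            }
        }
        return (double) differing / ((long) width * height);
    }

    /** Absolute difference of the 8-bit channel found at the given bit shift. */
    private static int channelDiff(int p, int q, int shift) {
        return Math.abs(((p >> shift) & 0xff) - ((q >> shift) & 0xff));
    }

    /** Returns true if at most maxFraction of the pixels differ beyond the tolerance. */
    static boolean similar(BufferedImage a, BufferedImage b,
                           int channelTolerance, double maxFraction) {
        return differingFraction(a, b, channelTolerance) <= maxFraction;
    }
}
```

Suitable values for the tolerance and the allowed fraction
would have to be found by experiment on real output.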
An even more complicated approach would be to get some OCR
components, for example from
http://freshmeat.net/projects/gocr/
http://freshmeat.net/projects/claraocr/
and actually retrieve text from the rendered bitmaps. Uh,
well, xpdf can do this too without going through bitmaps.
I'm not a JUnit guru, so I'd like to have someone else
volunteer. I *do* already have half a zillion FO
files, mostly related to defects filed on Bugzilla,
but also some tests specific to some of the features
I implemented/fixed recently. They will need cleanup though
in order to serve as good test data. I think a single test
case should preferably test only a single feature, and in
the most stable manner. Actually, writing good tests is
as difficult as writing the code implementing the features.
I will make a tarball from my FO files and put it into
my home directory on cvs.apache.org.
Other ideas: should we have a Wiki page where everyone can
add his/her favorite test case, from a short description in
prose ("test JPG with ICC profiles") up to ready-to-run
FO code?
Another issue is that there should be at least a short
description of the purpose of the available test cases,
preferably one which can be rendered into a web page. I
don't know whether JUnit or any other OSS test framework
already provides for this.
J.Pietschmann