On 1/7/06, David Mundie <mundie@xxxxxxxxxx> wrote:
markus wrote:
>> sort of. However, as far as I understand RelaxNG and XML Schema, the
>> distinction between elements and attributes is somewhat blurred, as
>> you can base the validation on attribute values of an ancestor
>> element.
This is true of RELAX NG, but not XML Schema.
> Ah, I'd forgotten that. Now I see why you wanted to move to RelaxNG,
> although I don't see why you think having a single element is preferable.
The reason is that you cannot find a finite list of
reference types that will suit everyone. I've tried to do it, and
it's impossible.
> In fact, I believe the main distinction is whether a reference type
> requires or allows part and series data.
In the model I favor (which I based citeproc on), you have a few
subclasses of a reference base class:
monograph (standalone item, published singly; not all electronic
resources fit here)
serial (something published continually; periodicals, court reporters, etc.)
part-inMonograph
part-inSerial
Here's how I was writing it out in Ruby very schematically (where I
also add collection to include series and archival collections):
# the base class
class Reference
  attr_reader :title, :creator, :year, :type, :bibparams

  # creator is an array of agents; bibparams is an optional hash
  def initialize(title, creator, year, type, bibparams = {})
    @title = title
    @creator = creator
    @year = year
    @type = type
    @bibparams = bibparams
  end
end
# standalone items like books, reports, or albums
class Monograph < Reference
  def initialize(title, creator, year, type, publisher)
    super(title, creator, year, type)
    @publisher = publisher
  end
end

# book chapters, parts of reports, etc.
class PartInMonograph < Reference
  def initialize(title, creator, year, type, partOf, pages)
    super(title, creator, year, type)
    @partOf = partOf
    @pages = pages
  end
end

# articles, bills, etc.; items published in serials/periodicals
class PartInSerial < Reference
  def initialize(title, creator, year, type, partOf, pages, volume, issue)
    super(title, creator, year, type)
    @partOf = partOf
    @pages = pages
    @volume = volume
    @issue = issue
  end
end
# non-citable resources such as periodicals, series, or
# archival collections
class Collection
  def initialize(title)
    @title = title
  end
end

class Periodical < Collection
  def initialize(title)
    super(title)
  end
end

class Series < Collection
  def initialize(title)
    super(title)
  end
end
# == Agents
class Agent
  attr_reader :name, :sortname

  def initialize(name, sortname)
    @name = name
    @sortname = sortname
  end

  def inspect
    "#<#{self.class}: #{@name}>"
  end
end

class Person < Agent
  def initialize(name, sortname)
    super
  end
end

class Organization < Agent
  def initialize(name, sortname = nil)
    super
  end
end

class Publisher < Organization
  def initialize(name, place)
    super(name)
    @place = place
  end
end
# == Events
class Event
  def initialize(name, date, sponsor)
    @name = name
    @date = date
    @sponsor = sponsor
  end
end

class Conference < Event
  def initialize(name, date, sponsor)
    super(name, date, sponsor)
  end
end
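To show how the pieces are meant to fit together, here is a minimal usage sketch. It repeats condensed copies of a few of the classes above so that it runs on its own; the attr_readers, the container_title helper, and the sample values (year, pages) are assumptions added for illustration, not part of the original outline.

```ruby
# Condensed copies of the classes sketched above (assumed readers added).
class Reference
  attr_reader :title, :creator, :year, :type

  def initialize(title, creator, year, type)
    @title = title
    @creator = creator
    @year = year
    @type = type
  end
end

class PartInSerial < Reference
  attr_reader :partOf, :pages, :volume, :issue

  def initialize(title, creator, year, type, partOf, pages, volume, issue)
    super(title, creator, year, type)
    @partOf = partOf
    @pages = pages
    @volume = volume
    @issue = issue
  end
end

class Collection
  attr_reader :title

  def initialize(title)
    @title = title
  end
end

class Periodical < Collection; end

# Stand-in for the Person agent class, kept short for this sketch.
Person = Struct.new(:name, :sortname)

# Hypothetical helper: title of the containing collection, or nil for
# references that are not part of anything.
def container_title(ref)
  ref.respond_to?(:partOf) ? ref.partOf.title : nil
end

# Sample data; the year and page range are made up for illustration.
journal = Periodical.new("Some Journal")
author  = Person.new("Jane Doe", "Doe, Jane")
article = PartInSerial.new("Some Title", [author], 2005, :article,
                           journal, "1-20", 23, 2)

puts container_title(article)
```

The point of the design is that the article carries only its own data plus a link to the periodical; the periodical's title lives in one place.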
> That was my going-in position, but the more I stare at the RIS
> documentation, the less true that seems - there are overloaded fields all
> over the place. For example, Bills and Statutes have "Code" in slot 16,
> normally used for Pub Place. Bills have Bill/Res Number in slot 12, while
> Statutes have Title/Code. This says to me that risy will really have to have
> bill-pubinfo and statute-pubinfo elements.
I think you may as well forget about RIS if you want to do this right.
A "code" for a bill is really just a specific kind of document number,
BTW. Likewise, a "court reporter" is just a kind of periodical, etc.
So there are general structures underneath the type-specific language.
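A rough sketch of what I mean by general structures underneath the type-specific language, written as a simple label mapping. The GENERAL_FIELD table, the field names, the normalize helper, and the sample record are all hypothetical; a real mapping would need far more care.

```ruby
# Hypothetical mapping from type-specific labels to general fields.
GENERAL_FIELD = {
  "Code"            => :document_number,
  "Bill/Res Number" => :document_number,
  "Court Reporter"  => :periodical,
  "Journal"         => :periodical
}.freeze

# Rewrites the keys of a record using the mapping above.
# Labels with no general equivalent are passed through unchanged.
def normalize(record)
  record.each_with_object({}) do |(label, value), out|
    out[GENERAL_FIELD.fetch(label, label)] = value
  end
end

# Made-up sample record, for illustration only.
bill = { "Title" => "Some Bill", "Code" => "X-123" }
p normalize(bill)
```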
> Let me ask you something. Have you ever considered using a native XML
> database like XML as your data store? Could RefDB do everything it does if
> the internal representation were Risx instead of SQL? It would certainly
> avoid all the messiness of mapping Risx onto tables, and allow more freedom
> in the grammar. But then, I'm an XML bigot.
I think if you were going to ditch the SQL-based storage it'd be
better to adopt an RDF-based solution (say Redland with a SQL
backend).
>> If you're looking for a one-size-fits-all reference data format, it is going
>> to look like MODS.
>
> Do you really think so? I certainly hope not, although the history of
> bibliography in the last 20 years seems to point to that conclusion.
I'm really the one that was advocating MODS for a long time. I do
think it provides a nice meeting ground between more end-user-oriented
XML and MARC.
However, my more recent thinking is that RDF is a better way to do
this: it can be expressed in XML, it is fundamentally a relational
model (and so is easier to map to RDBMSes if needed), and it can be
both rich and expressive while staying fairly simple.
Consider something like:
<ex:Article rdf:about="x">
  <dc:title>Some Title</dc:title>
  <dc:creator rdf:resource="y"/>
  <dcq:isPartOf rdf:resource="z"/>
  <prism:volume>23</prism:volume>
  <prism:number>2</prism:number>
</ex:Article>
<ex:Journal rdf:about="z">
  <dc:title>Some Journal</dc:title>
</ex:Journal>
<foaf:Person rdf:about="y">
  <givenname>Jane</givenname>
  <family_name>Doe</family_name>
</foaf:Person>
It's fairly easy to validate with RELAX NG too, and metadata providers
like Ingenta provide RDF/RSS feeds of their holdings; see, for
example:
http://api.ingentaconnect.com/content/bpl/anna/latest?format=rss
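To make the "relational model" point concrete, the RDF example above can be read as a flat list of subject/predicate/object statements. This is only a sketch in plain Ruby with no RDF library; the TRIPLES constant and the value_of helper are names invented for illustration.

```ruby
# Each statement is [subject, predicate, object], mirroring the RDF/XML
# example above. Element names become rdf:type statements.
TRIPLES = [
  ["x", "rdf:type",     "ex:Article"],
  ["x", "dc:title",     "Some Title"],
  ["x", "dc:creator",   "y"],
  ["x", "dcq:isPartOf", "z"],
  ["x", "prism:volume", "23"],
  ["x", "prism:number", "2"],
  ["z", "rdf:type",     "ex:Journal"],
  ["z", "dc:title",     "Some Journal"],
  ["y", "rdf:type",     "foaf:Person"],
  ["y", "givenname",    "Jane"],
  ["y", "family_name",  "Doe"]
].freeze

# Hypothetical helper: the first object recorded for a given subject
# and predicate, or nil if there is no such statement.
def value_of(subject, predicate)
  triple = TRIPLES.find { |s, p, _| s == subject && p == predicate }
  triple && triple[2]
end

# Follow the isPartOf link from the article to its journal.
journal_id = value_of("x", "dcq:isPartOf")
puts value_of(journal_id, "dc:title")
```

Every row has the same three columns, which is why this maps onto a single SQL table (or a triple store with a SQL backend) without per-type schemas.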
> When you have a moment, I'd really love to get a couple of sentences from
> you about what you think of the TEI bibliographic elements and Bibtex. (I
> know what you think of MODS.)
I think the level metaphor of TEI and RIS is wrong. Indeed, look at
the revised bib model as implemented (I think!) in TEI 5, which is
more like MODS or DC in structure.
I think BibTeX is a hack from a data modelling perspective. It only
really works if you're a hard scientist, and even then can be a
problem.
Bruce