
Re: risx + ELEC: msg#00023

Subject: Re: risx + ELEC
On 1/7/06, David Mundie <mundie@xxxxxxxxxx> wrote:

markus wrote:

>> sort of. However, as far as I understand RelaxNG and XML Schema, the
>> distinction between elements and attributes is somewhat blurred, as
>> you can base the validation on attribute values of an ancestor
>> element.

This is true of RELAX NG, but not XML Schema.

> Ah, I'd forgotten that. Now I see why you wanted to move to RelaxNG,
> although I don't see why you think having a single element is preferable.

The reason is that you cannot find a finite list of reference types
that will suit everyone.  I've tried to do it, and it's impossible.

> In fact, I believe the main distinction is whether a reference type
> requires or allows part and series data.

In the model I favor (which I based citeproc on), you have a few
subclasses of a reference base class:

monograph (standalone item, published singly; not all electronic
resources fit here)
serial (something published continually; periodicals, court reporters, etc.)
part-inMonograph
part-inSerial

Here's how I was writing it out in Ruby very schematically (where I
also add collection to include series and archival collections):

# the base class
class Reference
  attr_reader :title, :creator, :year, :type, :bibparams

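  # Ruby does not accept a required parameter between two optional ones,
  # so creator is required here and only bibparams keeps a default.
  def initialize(title, creator, year, type, bibparams={})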
    @title = title
    @creator = creator
    @year = year
    @type = type
    @bibparams = bibparams
  end
end

# standalone items like books, reports, or albums
class Monograph < Reference
  def initialize(title, creator, year, type, publisher)
    super(title, creator, year, type)
    @publisher = publisher
  end
end

# book chapters, parts of reports, etc.
class PartInMonograph < Reference
  def initialize(title, creator, year, type, partOf, pages)
    super(title, creator, year, type)
    @partOf = partOf
    @pages = pages
  end
end

# articles, bills, etc.; items published in serials/periodicals
class PartInSerial < Reference
  def initialize(title, creator, year, type, partOf, pages, volume,
                 issue)
    super(title, creator, year, type)
    @partOf = partOf
    @pages = pages
    @volume = volume
    @issue = issue
  end
end

# non-citable resources such as periodicals, series, or
# archival collections
class Collection
  def initialize(title)
    @title = title
  end
end

class Periodical < Collection
  def initialize(title)
    super(title)
  end
end

class Series < Collection
  def initialize(title)
    super(title)
  end
end


# == Agents

class Agent
  attr_reader :name, :sortname
  def initialize(name, sortname)
    @name = name
    @sortname = sortname
  end

  def inspect()
    "#<#{self.class}: #{@name}>"
  end
end

class Person < Agent
  def initialize(name, sortname)
    super
  end
end

class Organization < Agent
  def initialize(name, sortname=nil)
    super
  end
end

class Publisher < Organization
  def initialize(name, place)
    super(name)
    @place = place
  end
end


# == Events

class Event
  def initialize(name, date, sponsor)
    @name = name
    @date = date
    @sponsor = sponsor
  end
end

class Conference < Event
  def initialize(name, date, sponsor)
    super(name, date, sponsor)
  end
end
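
A minimal usage sketch, assuming a cut-down copy of the classes above so that it runs on its own. The snake_case readers and the cite helper are hypothetical additions for illustration; they are not part of the schematic code or of citeproc.

```ruby
# Cut-down restatement of the hierarchy so the sketch is self-contained.
class Reference
  attr_reader :title, :creator, :year, :type

  def initialize(title, creator, year, type)
    @title = title
    @creator = creator
    @year = year
    @type = type
  end
end

class Collection
  attr_reader :title

  def initialize(title)
    @title = title
  end
end

class Periodical < Collection; end

class PartInSerial < Reference
  attr_reader :part_of, :pages, :volume, :issue

  def initialize(title, creator, year, type, part_of, pages, volume, issue)
    super(title, creator, year, type)
    @part_of = part_of
    @pages = pages
    @volume = volume
    @issue = issue
  end
end

# Hypothetical helper: formats any part-in-serial the same way,
# whatever its type label says.
def cite(ref)
  "#{ref.creator.join(', ')} (#{ref.year}). #{ref.title}. " \
    "#{ref.part_of.title} #{ref.volume}(#{ref.issue}): #{ref.pages}."
end

journal = Periodical.new("Some Journal")
article = PartInSerial.new("Some Title", ["Jane Doe"], 2005, "article",
                           journal, "1-20", 23, 2)
puts cite(article)
```

The type string is only a label here; the structure (part, container, volume, issue) comes from the class, so a new type does not need a new class unless its structure differs.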


> That was my going-in position, but the more I stare at the RIS
> documentation, the less true that seems - there are overloaded fields all
> over the place. For example, Bills and Statutes have "Code" in slot 16,
> normally used for Pub Place. Bills have Bill/Res Number in slot 12, while
> Statutes have Title/Code. This says to me that risy will really have to have
> bill-pubinfo and statute-pubinfo elements.

I think you may as well forget about RIS if you want to do this right.

A "code" for a bill is really just a specific kind of document number,
BTW. Likewise, a "court reporter" is just a kind of periodical, etc.

So there are general structures underneath the type-specific language.
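
A rough sketch of that idea: type-specific labels map onto a small set of generic properties. The label strings, property names, and sample values are all made up for illustration; none of them come from the RIS documentation.

```ruby
# Hypothetical lookup from [type, type-specific label] to a generic property.
GENERIC_FIELD = {
  ["bill", "code"]           => :document_number,
  ["case", "court reporter"] => :periodical,
  ["article", "journal"]     => :periodical
}.freeze

# Rewrites a record's labels to generic property names. Labels with no
# entry in GENERIC_FIELD are kept, converted to symbols.
def normalize(type, fields)
  fields.each_with_object({}) do |(label, value), out|
    key = GENERIC_FIELD.fetch([type, label], label.to_sym)
    out[key] = value
  end
end

bill = normalize("bill", { "title" => "Some Act", "code" => "X-123" })
kase = normalize("case", { "court reporter" => "Some Reporter" })
p bill
p kase
```

With a mapping like this, the data model only needs document_number and periodical; the user-facing vocabulary can stay type-specific.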

> Let me ask you something. Have you ever considered using a native XML
> database like XML as your data store? Could RefDB do everything it does if
> the internal representation were Risx instead of SQL? It would certainly
> avoid all the messiness of mapping Risx onto tables, and allow more freedom
> in the grammar. But then, I'm an XML bigot.

I think if you were going to ditch the SQL-based storage it'd be
better to adopt an RDF-based solution (say Redland with a SQL
backend).

>> If you're looking for a one-size-fits-all reference data format, it is going
>> to look like MODS.
>
> Do you really think so? I certainly hope not, although the history of
> bibliography in the last 20 years seems to point to that conclusion.

I'm really the one that was advocating MODS for a long time.  I do
think it provides a nice meeting ground between more end-user-oriented
XML and MARC.

However, my more recent thinking is that RDF is a better way to do
this: it can be expressed in XML, it is fundamentally a relational
model (and so is easier to map to RDBMSes if needed), and it can be
both rich and expressive while staying fairly simple.

Consider something like:

<ex:Article rdf:about="x">
  <dc:title>Some Title</dc:title>
  <dc:creator rdf:resource="y"/>
  <dcq:isPartOf rdf:resource="z"/>
  <prism:volume>23</prism:volume>
  <prism:number>2</prism:number>
</ex:Article>

<ex:Journal rdf:about="z">
  <dc:title>Some Journal</dc:title>
</ex:Journal>

<foaf:Person rdf:about="y">
  <givenname>Jane</givenname>
  <family_name>Doe</family_name>
</foaf:Person>

It's fairly easy to validate with RELAX NG too, and metadata providers
like Ingenta provide RDF/RSS feeds of their holdings; see, for
example:

http://api.ingentaconnect.com/content/bpl/anna/latest?format=rss
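
To make the "fundamentally a relational model" point concrete, the article example above can be read as subject/predicate/object rows. This is only a sketch using plain Ruby structs; it does not use Redland or any other RDF library, and the Triple struct and objects helper are hypothetical names.

```ruby
# Each RDF statement is one (subject, predicate, object) row,
# which is why it maps easily onto a three-column SQL table.
Triple = Struct.new(:subject, :predicate, :object)

TRIPLES = [
  Triple.new("x", "rdf:type", "ex:Article"),
  Triple.new("x", "dc:title", "Some Title"),
  Triple.new("x", "dc:creator", "y"),
  Triple.new("x", "dcq:isPartOf", "z"),
  Triple.new("x", "prism:volume", "23"),
  Triple.new("x", "prism:number", "2"),
  Triple.new("z", "rdf:type", "ex:Journal"),
  Triple.new("z", "dc:title", "Some Journal"),
  Triple.new("y", "rdf:type", "foaf:Person")
].freeze

# Returns the objects for a subject/predicate pair, much like a WHERE clause.
def objects(triples, subject, predicate)
  triples.select { |t| t.subject == subject && t.predicate == predicate }
         .map(&:object)
end

# Follow isPartOf from the article to its journal: effectively a join.
journal_id = objects(TRIPLES, "x", "dcq:isPartOf").first
journal_title = objects(TRIPLES, journal_id, "dc:title").first
puts journal_title
```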

> When you have a moment, I'd really love to get a couple of sentences from
> you about what you think of the TEI bibliographic elements and Bibtex. (I
> know what you think of MODS.)

I think the level metaphor of TEI and RIS is wrong. Indeed, look at
the revised bib model as implemented (I think!) in TEI 5, which is
more like MODS or DC in structure.

I think BibTeX is a hack from a data modelling perspective. It only
really works if you're a hard scientist, and even then can be a
problem.

Bruce

