[jira] Created: (XERCESJ-1061) Regex "$" and "^" characters treated as spec: msg#00056

Subject: [jira] Created: (XERCESJ-1061) Regex "$" and "^" characters treated as special chars in conflict with XML Schema spec
Regex "$" and "^" characters treated as special chars in conflict with XML 
Schema spec
--------------------------------------------------------------------------------------

         Key: XERCESJ-1061
         URL: http://issues.apache.org/jira/browse/XERCESJ-1061
     Project: Xerces2-J
        Type: Bug
  Components: XML Schema datatypes  
    Versions: 2.6.2    
 Environment: Win XP SP1, JDK v1.5.0_02, Xerces v2.6.2 (loaded manually, so it 
overrides any Xerces version packaged with the JDK)
    Reporter: Darien Kindlund
    Priority: Minor


Xerces rejects the following schema:
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
 <xs:element name="test">
  <xs:simpleType>
   <xs:restriction base="xs:string">
    <xs:pattern value="$?[0-9]+\.[0-9]{2}" />
   </xs:restriction>
  </xs:simpleType>
 </xs:element>
</xs:schema>

The code within org.apache.xerces.impl.xpath.regex.RegexParser throws a parse 
exception on the "$?" sequence unless the "$" character is escaped. For 
example, this works:

    <xs:pattern value="\$?[0-9]+\.[0-9]{2}" />
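
For anyone who wants to check this against a particular parser build, here is 
a minimal sketch of a reproduction harness. It assumes the standard JAXP 
javax.xml.validation API; the class and method names (SchemaPatternCheck, 
schemaFor, compiles) are hypothetical and are not part of Xerces. Whichever 
schema implementation JAXP resolves at run time does the work, so the result 
depends on the version in use and is deliberately not stated here.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of Xerces: compiles the schema from this
// report with a given xs:pattern value and reports whether it was accepted.
class SchemaPatternCheck {

    // Wraps the pattern in the schema from the report. Assumption: the
    // pattern contains no quote, '&' or '<' characters, so it can be placed
    // in a single-quoted attribute without XML escaping.
    static String schemaFor(String pattern) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
             + "<xs:element name='test'><xs:simpleType>"
             + "<xs:restriction base='xs:string'>"
             + "<xs:pattern value='" + pattern + "'/>"
             + "</xs:restriction></xs:simpleType></xs:element>"
             + "</xs:schema>";
    }

    // Returns true if the schema compiles, false if the schema processor
    // rejects it (for example because of the pattern facet).
    static boolean compiles(String pattern) {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            factory.newSchema(
                new StreamSource(new StringReader(schemaFor(pattern))));
            return true;
        } catch (SAXException e) {
            System.out.println("rejected: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Unescaped form from the report, then the escaped workaround.
        System.out.println("unescaped: " + compiles("$?[0-9]+\\.[0-9]{2}"));
        System.out.println("escaped:   " + compiles("\\$?[0-9]+\\.[0-9]{2}"));
    }
}
```

To exercise a specific Xerces release rather than the copy bundled with the 
JDK, that release would have to take precedence on the classpath, as 
described in the Environment field above.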

The fundamental problem is that the Xerces RegexParser code does NOT follow the 
XML Schema specification, as defined by this URL:
http://www.w3.org/TR/2000/WD-xmlschema-2-20000407/#dt-metac

Specifically, the XML Schema specification gives no special meaning to the 
"$" and "^" characters, whereas the RegexParser code appears to treat them as 
the conventional UNIX "end-of-line" and "start-of-line" anchors, respectively.
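
To illustrate what the anchor interpretation means in practice, the sketch 
below runs the same two patterns through java.util.regex, which follows the 
conventional (Perl-style) rules. This is only an illustration of anchor 
semantics: java.util.regex is neither the Xerces engine nor an XML Schema 
regex implementation, and the class name AnchorSemanticsDemo is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: shows how a Perl-style engine reads the two
// patterns from this report. It does not call into Xerces.
class AnchorSemanticsDemo {

    // Pattern from the report. Under java.util.regex rules the unescaped
    // '$' is an end-of-input anchor, made optional by '?'.
    static final Pattern UNESCAPED = Pattern.compile("$?[0-9]+\\.[0-9]{2}");

    // Workaround form: '\$' is a literal dollar sign, made optional by '?'.
    static final Pattern ESCAPED = Pattern.compile("\\$?[0-9]+\\.[0-9]{2}");

    // True only if the pattern matches the entire input string.
    static boolean matchesWhole(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String input : new String[] {"$12.34", "12.34"}) {
            System.out.println(input
                + "  unescaped=" + matchesWhole(UNESCAPED, input)
                + "  escaped=" + matchesWhole(ESCAPED, input));
        }
    }
}
```

Under these rules the unescaped form should not accept "$12.34", because an 
anchor consumes no input and the leading "$" is then left unmatched, while 
the escaped form accepts it. Under the XML Schema reading described above, 
both forms would be expected to describe a literal, optional "$".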

Regards,
--
Darien Kindlund
The MITRE Corporation
InfoSec Engr / Scientist, Sr.
kindlund@xxxxxxxxx

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
If you want more information on JIRA, or have a bug to report see:
   http://www.atlassian.com/software/jira


