<! attlist := ATTLIST
character buffer, for names
The library and compiler parsers had the interesting distinction of different behavior for nextch (a function for which there are a total of two plausible behaviors, so we know the design space was fully explored.) One of them returned the value of nextch before the increment and one of them the new value. So to unify code we have to at least temporarily abstract over the nextchs.
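The two behaviors can be put side by side in a minimal sketch. The class `CharCursor` and the method names `nextchBefore` and `nextchAfter` are hypothetical, chosen only for illustration; they are not the library's or the compiler's actual code.

```scala
// Hypothetical sketch: CharCursor, nextchBefore and nextchAfter are illustrative names.
final class CharCursor(input: String) {
  private var idx = 0

  /** The current character, or '\u0000' once the input is exhausted. */
  var ch: Char = if (input.isEmpty) '\u0000' else input.charAt(0)

  private def advance(): Unit = {
    idx += 1
    ch = if (idx < input.length) input.charAt(idx) else '\u0000'
  }

  /** Advances and returns the character that was current BEFORE the increment. */
  def nextchBefore(): Char = { val old = ch; advance(); old }

  /** Advances and returns the character that is current AFTER the increment. */
  def nextchAfter(): Char = { advance(); ch }
}
```

Code shared between the two parsers can then rely only on `ch` after the call and ignore the return value, which is the same in both variants.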
content1 ::= '<' content1 | '&' charref ...
'<' content1 ::= ...
[22] prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?
[23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
[24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
[25] Eq ::= S? '=' S?
[26] VersionNum ::= '1.0'
[27] Misc ::= Comment | PI | S
'<' element ::= xmlTag1 '>' { xmlExpr | '{' simpleExpr '}' } ETag | xmlTag1 '/' '>'
<! element := ELEMENT
externalID ::= SYSTEM S syslit | PUBLIC S pubid S syslit
As the current code requires you to call nextch once manually after construction, this method formalizes that suboptimal reality.
stack of inputs
"rec-xml/#ExtSubset" pe references may not occur within markup declarations
These are 99% sure to be redundant but refactoring on the safe side.
Name ::= ( Letter | '_' ) (NameChar)*
See [5] of XML 1.0 specification.
NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar | Extender
See [4] and Appendix B of XML 1.0 specification.
NameStart ::= ( Letter | '_' )
where Letter means in one of the Unicode general categories { Ll, Lu, Lo, Lt, Nl }.
We do not allow a name to start with ':'.
See [3] and Appendix B of XML 1.0 specification.
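The name rules above can be sketched with `java.lang.Character.getType`. The object name `NameRules` is hypothetical, and `isNameChar` is deliberately simplified: it accepts letters, digits, '.', '-', '_' and ':' but leaves out CombiningChar and Extender.

```scala
// Hypothetical sketch; NameRules is an illustrative name, not the library's object.
object NameRules {
  // Unicode general categories Ll, Lu, Lo, Lt, Nl, as listed for Letter.
  private val letterTypes: Set[Int] = Set(
    Character.LOWERCASE_LETTER,
    Character.UPPERCASE_LETTER,
    Character.OTHER_LETTER,
    Character.TITLECASE_LETTER,
    Character.LETTER_NUMBER
  ).map(_.toInt)

  /** NameStart ::= ( Letter | '_' ); a leading ':' is rejected. */
  def isNameStart(ch: Char): Boolean =
    ch == '_' || letterTypes(Character.getType(ch))

  /** Simplified NameChar: omits CombiningChar and Extender (an assumption of this sketch). */
  def isNameChar(ch: Char): Boolean =
    isNameStart(ch) || Character.isDigit(ch) || ".-:".indexOf(ch.toInt) >= 0

  /** Name ::= NameStart (NameChar)* */
  def isName(s: String): Boolean =
    s.nonEmpty && isNameStart(s.head) && s.tail.forall(isNameChar)
}
```

With these definitions `NameRules.isName("foo:bar")` holds, while `NameRules.isName(":foo")` does not.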
(#x20 | #x9 | #xD | #xA)+
(#x20 | #x9 | #xD | #xA)
Returns true if the encoding name is a valid IANA encoding.
This method does not verify that there is a decoder available
for this encoding, only that the characters are valid for an
IANA encoding name.
The IANA encoding name.
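A purely syntactic check of this kind might look as follows. This is a sketch that assumes the character rule of the XML 1.0 EncName production (an ASCII letter followed by ASCII letters, digits, '.', '_' or '-'); the names `EncodingNames` and `looksLikeIANAEncoding` are hypothetical.

```scala
// Hypothetical sketch; assumes the EncName rule: [A-Za-z] ([A-Za-z0-9._] | '-')*
object EncodingNames {
  private def isAsciiAlpha(c: Char): Boolean =
    (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')

  private def isAsciiDigit(c: Char): Boolean = c >= '0' && c <= '9'

  /** Checks characters only; it does not look up whether a decoder exists. */
  def looksLikeIANAEncoding(name: String): Boolean =
    name.nonEmpty &&
      isAsciiAlpha(name.head) &&
      name.tail.forall(c => isAsciiAlpha(c) || isAsciiDigit(c) || "._-".indexOf(c.toInt) >= 0)
}
```

Note that a well-formed but unknown name such as "X-made-up" passes this check, matching the caveat above about decoder availability.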
Create a lookahead reader which does not influence the input
holds the next character
this method tells ch to get the next character when next called
'N' notationDecl ::= "OTATION"
parses document type declaration and assigns it to instance variable dtd.
<! parseDTD ::= DOCTYPE name ... >
holds the position in the source file
<? prolog ::= xml S?
// this is a bit more lenient than necessary...
[12] PubidLiteral ::= '"' PubidChar* '"' | "'" (PubidChar - "'")* "'"
append Unicode character to name buffer
Apply a function and return the passed value
Execute body with a variable saved and restored after execution
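Both helpers are small enough to sketch in full. The object name `ParserUtil` and the parameter names are assumptions made for illustration; only the described behavior (apply and return the value, save and restore a variable) is taken from the descriptions above.

```scala
// Hypothetical sketch; ParserUtil and the parameter names are illustrative.
object ParserUtil {
  /** Applies f to x for its side effect, then returns x unchanged. */
  def returning[T](x: T)(f: T => Unit): T = { f(x); x }

  /** Runs body, then restores the variable to its value on entry, even if body throws. */
  def saving[A, B](current: A, restore: A => Unit)(body: => B): B = {
    val saved = current
    try body
    finally restore(saved)
  }
}
```

A call such as `ParserUtil.saving[Int, String](depth, (v: Int) => { depth = v }) { ... }` lets the body reassign a `var depth` freely; the value held on entry is written back afterwards.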
attribute value, terminated by either ' or ". value may not contain <.
AttValue ::= `'` { _ } `'`
| `"` { _ } `"`
prolog, but without standalone
holds temporary values of pos
attribute value, terminated by either ' or ". value may not contain <.
either ' or "
parse attribute and create namespace scope, metadata
[41] Attributes ::= { S Name Eq AttValue }
'<!' CharData ::= '[CDATA[' ( {char} - {char} "]]>" {char} ) ']]>' see [15]
CharRef ::= "&#" '0'..'9' {'0'..'9'} ";" | "&#x" '0'..'9'|'A'..'F'|'a'..'f' { hexdigit } ";"
see [66]
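As an illustration of the CharRef production, here is a minimal sketch that decodes a complete reference held in a string. The names `CharRefs` and `decodeCharRef` are hypothetical; the real parser reads character by character from its input rather than from a finished string.

```scala
// Hypothetical sketch; decodes a whole reference such as "&#65;" or "&#x41;".
object CharRefs {
  def decodeCharRef(ref: String): Option[String] =
    if (!ref.startsWith("&#") || !ref.endsWith(";")) None
    else {
      // Text between "&#" and ";".
      val body = ref.substring(2, ref.length - 1)
      val (digits, radix) =
        if (body.startsWith("x")) (body.drop(1), 16) else (body, 10)
      val wellFormed =
        digits.nonEmpty && digits.forall(c => Character.digit(c, radix) >= 0)
      if (!wellFormed) None
      else
        scala.util.Try(Integer.parseInt(digits, radix)).toOption
          .filter(cp => Character.isValidCodePoint(cp))
          .map(cp => new String(Character.toChars(cp)))
    }
}
```

The result is a `String` rather than a `Char` because code points above U+FFFF need two UTF-16 code units.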
Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->' see [15]
scan [S] '=' [S]
[42] '<' xmlEndTag ::= '<' '/' Name S? '>'
entity value, terminated by either ' or ". value may not contain <.
AttValue ::= `'` { _ } `'`
| `"` { _ } `"`
actually, Name ::= (Letter | '_' | ':') (NameChar)*
but starting with ':' cannot happen, so
Name ::= (Letter | '_') (NameChar)*
see [5] of XML 1.0 specification
pre-condition: ch != ':' // assured by definition of XMLSTART token
post-condition: name neither starts nor ends with ':'
'<?' ProcInstr ::= Name [S ({Char} - ({Char} '?>' {Char}))] '?>'
see [15]
scan [3] S ::= (#x20 | #x9 | #xD | #xA)+
skip optional space S?
parse a start or empty tag.
[40] STag ::= '<' Name { S Attribute } [S]
[44] EmptyElemTag ::= '<' Name { S Attribute } [S]
Take characters from input stream until given String "until" is seen. Once seen, the accumulated characters are passed along with the current Position to the supplied handler function.
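The accumulate-until-terminator idea can be sketched as below. This is a simplified variant under stated assumptions: the names `TakeUntil` and `takeUntil` are hypothetical, the handler receives only the accumulated text (no Position), and running out of input yields `None` instead of reporting an error.

```scala
// Hypothetical sketch; omits the Position argument and error reporting.
object TakeUntil {
  def takeUntil[T](input: Iterator[Char], until: String)(handler: String => T): Option[T] = {
    val sb = new java.lang.StringBuilder
    var result: Option[T] = None
    while (result.isEmpty && input.hasNext) {
      sb.append(input.next())
      // Index where the terminator would start if the buffer ends with it.
      val cut = sb.length() - until.length
      if (cut >= 0 && sb.substring(cut) == until)
        result = Some(handler(sb.substring(0, cut)))
    }
    result
  }
}
```

The iterator is left positioned just past the terminator, so the caller can continue parsing from there.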
<? prolog ::= xml S ... ?>
An XML parser.
Parses XML 1.0, invokes callback methods of a MarkupHandler and returns whatever the markup handler returns. Use ConstructingParser if you just want to parse XML to construct instances of scala.xml.Node.
While XML elements are returned, DTD declarations, if handled, are collected using side-effects.
1.0