
Whitespace handling for XML.load and XML.loadString

3 replies
Per Halvor Tryg...
In Scala 2.7.7, XML.load and XML.loadString omit significant whitespace:

scala> scala.xml.XML.loadString("<div><span class=\"article\">a</span> <span class=\"noun\">boat</span></div>")
res1: scala.xml.Elem = <div><span class="article">a</span><span class="noun">boat</span></div>

The space between the two span elements disappears after XML.loadString.

Fortunately this is corrected in Scala 2.8, but we're still dependent on 2.7.7 in our application. Is there any way to work around this in Scala 2.7.7, e.g. through settings for the underlying parsers?

Per Halvor

Anthony B. Coates
Re: Whitespace handling for XML.load and XML.loadString

I haven't looked at the code, but this is ignorable whitespace, which a
parser can legitimately remove by default. You could try setting

xml:space="preserve"

on the <div>. Alternatively, put a <span> around the space.

Cheers, Tony.

On Tue, 16 Mar 2010 08:17:24 -0000, Per Halvor Tryggeseth
wrote:

> In Scala 2.7.7 XML.load and XML.loadString omits significant whitespace:
>
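> scala> scala.xml.XML.loadString("<div><span class=\"article\">a</span> <span class=\"noun\">boat</span></div>")
> res1: scala.xml.Elem = <div><span class="article">a</span><span class="noun">boat</span></div>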
>
> The space between the two span elements disappears after XML.loadString.
>
> Fortunately this is corrected in Scala 2.8, but we're still dependent on
> 2.7.7 in our application. Is there any way to work around this in Scala
> 2.7.7, e.g. by settings for underlying parsers or something?
>
> Per Halvor

milessabin
Re: Whitespace handling for XML.load and XML.loadString

On Tue, Mar 16, 2010 at 9:35 PM, Anthony B. Coates (Londata)
wrote:
> I haven't looked at the code, but this is ignorable whitespace, which a
> parser can legitimately remove by default.

Strictly speaking whitespace is only ignorable if there's a content
model which says that the element in question has element-content
only.

Cheers,

Miles
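
The point about content models can be tried out with the JDK's own DOM parser. This is only a minimal sketch, not the code path that XML.loadString takes in any Scala version: it assumes the JAXP parser bundled with the JDK, and the object name, the childCount helper and the inline DTD are made up for illustration.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource

// Hypothetical demo object; none of these names come from the thread.
object IgnorableWhitespaceDemo {
  // The markup from the original post.
  val body: String =
    "<div><span class=\"article\">a</span> <span class=\"noun\">boat</span></div>"

  // Illustrative internal DTD that gives div an element-only content model.
  val dtd: String =
    "<!DOCTYPE div [" +
      "<!ELEMENT div (span, span)>" +
      "<!ELEMENT span (#PCDATA)>" +
      "<!ATTLIST span class CDATA #IMPLIED>" +
    "]>"

  // Counts the child nodes (elements and text nodes) of the root element.
  // dropIgnorable turns on validation and asks the parser to discard
  // whitespace that the DTD marks as element-content whitespace.
  def childCount(xml: String, dropIgnorable: Boolean): Int = {
    val factory = DocumentBuilderFactory.newInstance()
    factory.setValidating(dropIgnorable)
    factory.setIgnoringElementContentWhitespace(dropIgnorable)
    val builder = factory.newDocumentBuilder()
    val doc = builder.parse(new InputSource(new StringReader(xml)))
    doc.getDocumentElement.getChildNodes.getLength
  }
}
```

Without a DTD the parser has no content model to consult, so IgnorableWhitespaceDemo.childCount(body, false) counts the whitespace text node along with the two span elements. With the DTD prepended and dropIgnorable set to true, the parser is told that div has element content only and is expected to drop that text node.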

Per Halvor Tryg...
Re: Whitespace handling for XML.load and XML.loadString

Setting xml:space="preserve" on the <div> does not have the desired effect. The space between the two span elements is still removed:

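scala> scala.xml.XML.loadString("<div xml:space=\"preserve\"><span class=\"article\">a</span> <span class=\"noun\">boat</span></div>")
res2: scala.xml.Elem = <div xml:space="preserve"><span class="article">a</span><span class="noun">boat</span></div>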

Even if I put a span around the space it is ignored:

scala> scala.xml.XML.loadString("a boat")
res3: scala.xml.Elem = aboat

I also think that section 2.10 of the XML spec (http://www.w3.org/TR/REC-xml/#sec-white-space) supports what Miles says.

Luckily, though, as I said, Scala 2.8 does not have this behaviour, but we're still stuck with 2.7.7 for some time.

Cheers,
Per Halvor

-----Original Message-----
From: Miles Sabin [mailto:miles@milessabin.com]
Sent: 16 March 2010 21:53
To: scala-xml@listes.epfl.ch
Subject: Re: [scala-xml] Whitespace handling for XML.load and XML.loadString

On Tue, Mar 16, 2010 at 9:35 PM, Anthony B. Coates (Londata) wrote:
> I haven't looked at the code, but this is ignorable whitespace, which
> a parser can legitimately remove by default.

Strictly speaking whitespace is only ignorable if there's a content model which says that the element in question has element-content only.

Cheers,

Miles


Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland