
XML pull parser ignores CDATA?

2 replies
Johann Petrak

I am using Scala version 2.8.0.final with Java 1.6.0_21 on Linux 32 bit.

When I tried to use the pull parser (scala.xml.pull._) to read a rather
large XML file into Scala, I was surprised to get no content from the XML
at all.
As it turned out, all of the content was embedded in CDATA sections, and the
pull parser seems to ignore those completely:

new XMLEventReader(Source.fromString("<tag><![CDATA[some text]]></tag>")).foreach(println)
gives
EvElemStart(null,tag,,)
EvElemEnd(null,tag)

but
new XMLEventReader(Source.fromString("<tag>some text</tag>")).foreach(println)
gives
EvElemStart(null,tag,,)
EvText(some text)
EvElemEnd(null,tag)

when mixing:
new XMLEventReader(Source.fromString("outside")).foreach(println)
gives
EvElemStart(null,tag,,)
EvText(outside)
EvElemEnd(null,tag)

Am I missing something here or is this broken?

I am not fixated on the pull parser; it just seems to be the only
Scala-esque way to read a large XML file, i.e. the only way that does not
amount to using the Java libraries the Java way from within Scala.

Cheers,
Johann

huynhjl
Re: XML pull parser ignores CDATA?

It seems like a bug to me.

Line 329 of
http://lampsvn.epfl.ch/trac/scala/browser/scala/trunk/src/library/scala/...
does not contain a callback to handle.text.

If I recompile a version of the file with:
def mkResult(pos: Int, s: String): NodeSeq = {
  handle.text(pos, s); PCData(s)
}

I then get:
scala> :load Test.scala
Loading Test.scala...
import io.Source
import xml.pull._
EvElemStart(null,tag,,)
EvText(some text)
EvElemEnd(null,tag)

I tested by just downloading that one file, editing it, compiling it with
scalac, making a jar of the classes (00MarkupParser.jar) and copying it to
the lib directory of the distribution.
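
To make the effect of the missing callback concrete, here is a toy model of it. This is only a sketch: RecordingHandler, cdataWithoutCallback and cdataWithCallback are hypothetical names made up for illustration, not part of scala.xml, and a plain String stands in for PCData.

```scala
import scala.collection.mutable.ListBuffer

object CallbackSketch {
  // Hypothetical stand-in for the parser's handler: records every text callback it receives.
  class RecordingHandler {
    val events = ListBuffer.empty[String]
    def text(pos: Int, s: String): Unit = {
      events += ("EvText(" + s + ")")
      ()
    }
  }

  // Models the reported behaviour: the CDATA content is returned,
  // but the handler is never notified, so no event is produced.
  def cdataWithoutCallback(handle: RecordingHandler, pos: Int, s: String): String = s

  // Models the patched mkResult: notify the handler first, then return the content.
  def cdataWithCallback(handle: RecordingHandler, pos: Int, s: String): String = {
    handle.text(pos, s); s
  }
}
```

In this model a consumer that only looks at the handler's events never sees the CDATA text in the first variant, which matches what the event reader printed.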

huynhjl
Re: XML pull parser ignores CDATA?

I've created the following ticket:
https://lampsvn.epfl.ch/trac/scala/ticket/3720
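
Until the ticket is resolved, one possible workaround is the JDK's own pull parser (StAX, javax.xml.stream), which does report CDATA content. The sketch below is not a fix for scala.xml.pull; StaxCdataSketch and textEvents are names invented here, and the sample input is assumed. It collects all character data from a document, and wrapping the reader loop in an Iterator would make it feel more Scala-like.

```scala
import java.io.StringReader
import javax.xml.stream.{XMLInputFactory, XMLStreamConstants}
import scala.collection.mutable.ListBuffer

object StaxCdataSketch {
  // Returns the character data (plain text and CDATA alike) found in `xml`, in document order.
  def textEvents(xml: String): List[String] = {
    val factory = XMLInputFactory.newInstance()
    // Ask the parser to merge adjacent text and CDATA sections into single events.
    factory.setProperty(XMLInputFactory.IS_COALESCING, java.lang.Boolean.TRUE)
    val reader = factory.createXMLStreamReader(new StringReader(xml))
    val out = ListBuffer.empty[String]
    try {
      while (reader.hasNext) {
        reader.next() match {
          // Depending on parser settings, CDATA may arrive as CHARACTERS or as CDATA.
          case XMLStreamConstants.CHARACTERS | XMLStreamConstants.CDATA =>
            out += reader.getText
          case _ => // element starts, element ends, comments etc. are skipped here
        }
      }
    } finally {
      reader.close()
    }
    out.toList
  }
}
```

For example, StaxCdataSketch.textEvents("<tag><![CDATA[some text]]></tag>").foreach(println) should print the CDATA content. For a large file, createXMLStreamReader also accepts a java.io.InputStream, so the document does not have to be held in memory as a String.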
