XhtmlParser (or ConstructingParser) Usage
Tue, 2010-05-25, 22:33
Hi Folks,
I am exploring the scala.xml.parsing package and am trying to load some HTML in the REPL. Here’s the command I am executing (it’s long, but complete):
scala.xml.parsing.XhtmlParser(scala.io.Source.fromURL(java.net.URI.create("http://www.java.net/").toURL))
When I do it, I get a whole bunch of stuff kind of like this (what seems like hundreds of lines of it):
:17:24: '/' expected instead of '' ^
:17:24: name expected, but char '' cannot start a name ^
:17:24: '>' expected instead of '' ^
The final result is this:
res1: scala.xml.NodeSeq = Document(<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>< The=""><><><><><><><><><><><></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></></>)
Clearly, something isn’t being parsed correctly. Is there something that I am doing incorrectly? Any tips on what I should be doing?
I get similar results when I try to use the ConstructingParser.
Thanks,
Mark
Wed, 2010-05-26, 14:47
#2
Re: XhtmlParser (or ConstructingParser) Usage
Is there any canned functionality in the API for handling HTML? Just wondering.
On 5/25/10 5:20 PM, "David Pollak" <feeder.of.the.bears@gmail.com> wrote:
Java.net is not an XHTML site, it's HTML... which is not well formed XML. It's not going to parse as XML. Sorry.
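The distinction can be seen on a small scale. The following is only a sketch, assuming the scala-xml module is on the classpath (it shipped with the standard library at the time of this thread); the object name, the helper `parses`, and both sample strings are made up for illustration. It uses `scala.xml.XML.loadString` rather than XhtmlParser, so it shows the well-formedness requirement in general, not the exact errors quoted above.

```scala
import scala.util.Try
import scala.xml.XML

object WellFormedDemo {
  // Well-formed XHTML: every element is closed, including <br/>.
  val xhtml = "<html><body><p>Hello<br/>world</p></body></html>"

  // Typical HTML "tag soup": <br> and <p> are never closed.
  val html = "<html><body><p>Hello<br>world</body></html>"

  // Hypothetical helper: true if the markup parses as XML.
  // XML.loadString throws a SAXParseException on malformed input,
  // which Try turns into a Failure.
  def parses(markup: String): Boolean =
    Try(XML.loadString(markup)).isSuccess
}
```

Here `WellFormedDemo.parses(WellFormedDemo.xhtml)` should succeed, while the same call on `WellFormedDemo.html` should fail, because an XML parser rejects the unclosed `<br>`.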
Wed, 2010-05-26, 16:07
#3
Re: XhtmlParser (or ConstructingParser) Usage
On Wed, May 26, 2010 at 6:43 AM, Bastian, Mark <mbastia@sandia.gov> wrote:
Is there any canned functionality in the API for handling HTML? Just wondering.
No. There are some Java libraries (you can Google for them) that parse HTML and that you can use from Scala.
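As one illustration of calling a lenient Java HTML parser from Scala, here is a minimal sketch. The reply does not name a library; this uses the JDK's own Swing HTML parser (`javax.swing.text.html.parser.ParserDelegator`) only because it needs no extra dependency, and it is an assumption that it suits any particular page. The object and method names are hypothetical.

```scala
import java.io.StringReader
import javax.swing.text.html.HTMLEditorKit.ParserCallback
import javax.swing.text.html.parser.ParserDelegator
import scala.collection.mutable.ListBuffer

object LenientHtmlDemo {
  // Hypothetical helper: collects the text chunks the parser reports,
  // tolerating markup that is not well-formed XML.
  def extractText(html: String): List[String] = {
    val chunks = ListBuffer.empty[String]
    val callback = new ParserCallback {
      // Invoked by the parser for each run of character data.
      override def handleText(data: Array[Char], pos: Int): Unit =
        chunks += new String(data)
    }
    // The third argument (ignoreCharSet = true) tells the parser to
    // ignore any charset declared inside the document.
    new ParserDelegator().parse(new StringReader(html), callback, true)
    chunks.toList
  }
}
```

For example, `LenientHtmlDemo.extractText("<html><body><p>Hello<br>world</body></html>")` should return the text around the unclosed `<br>` instead of failing. A dedicated third-party HTML parser would likely give a richer document model; this only shows the general pattern of wrapping a Java callback API.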
--
Lift, the simply functional web framework http://liftweb.net
Beginning Scala http://www.apress.com/book/view/1430219890
Follow me: http://twitter.com/dpp
Surf the harmonics