Re: Re : the mayor of xmltown

4 replies

Tue, 2009-12-01, 10:37

Kevin Wright

Joined: 2009-06-09,

+1. It can't stay in its current condition.

On Tue, Dec 1, 2009 at 9:34 AM, Viktor Klang <viktor.klang@gmail.com> wrote:

+1, it's now or never.

On Tue, Dec 1, 2009 at 10:28 AM, Eric Torreborre <etorreborre@yahoo.com> wrote:

+1. Now is the right time to do so.
----------------------------------------------
Eric TORREBORRE
T +61 411 707 402
E etorreborre@yahoo.com
B http://etorreborre.blogspot.com
P http://specs.googlecode.com
----------------------------------------------

From: David Pollak <feeder.of.the.bears@gmail.com>
To: Paul Phillips <paulp@improving.org>
Cc: scala-internals@listes.epfl.ch
Sent: Tue 1 December 2009, 16:04:35
Subject: Re: [scala-internals] the mayor of xmltown

Fix it now. I'm all for API breakage if we can get this issue fixed.

On Mon, Nov 30, 2009 at 8:54 PM, Paul Phillips <paulp@improving.org> wrote:

We're so close to having consistent equality, it seems like a shame to
ship 2.8 with a whole segment of the standard library which violates not
only the equals/hashCode contract, but our own collections equality
contracts. Unfortunately this looks inevitable unless I can get the OK
to smash up XML a bit.

Here is the irresolvable issue (which has been nothing but pain from the
word go) which cannot be worked around: most every class in the XML lib
descends from Seq[Node]. I'm sure this seemed like a good idea at the
time, but when a Node IS a Seq[Node], there is no hope. Here is an
actual inheritance line:

Seq[Node]
NodeSeq
Node
SpecialNode
Atom
Text

Now the XML lib seems to depend on things flipflopping between Seq[Node]
and NodeSeq as the phase of the moon warrants, but it has other
expectations - let's look at NodeSeq's equals method:

  override def equals(x: Any): Boolean = x match {
    case z: Node   => (length == 1) && z == apply(0)
    case z: Seq[_] => sameElements(z)
    case z: String => text == z
    case _         => false
  }

Ha ha, right.
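
To make the problem concrete, here is a minimal sketch of why the String case alone already breaks symmetry. TextSeq is a hypothetical stand-in written for this illustration, not the real scala.xml.NodeSeq; only the shape of its equals method is borrowed from the code above.

```scala
object AsymmetryDemo {
  // Hypothetical stand-in for a NodeSeq-like class; not the real scala.xml code.
  final class TextSeq(val text: String) {
    override def equals(x: Any): Boolean = x match {
      case s: String  => text == s       // mirrors the `case z: String` clause
      case t: TextSeq => text == t.text
      case _          => false
    }
    override def hashCode: Int = text.hashCode
  }

  // Typed as Any so the compiler does not warn about comparing unrelated types.
  val seq: Any = new TextSeq("hello")
  val str: Any = "hello"

  val forward: Boolean  = seq == str   // TextSeq.equals accepts a String
  val backward: Boolean = str == seq   // String.equals rejects a TextSeq
}
```

Here forward is true and backward is false: java.lang.String will never agree that it equals anything but another String, so no equals method on the other side can make the relation symmetric.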

Hypothetically speaking, let's say we removed all the nutty
irreconcilable equals methods from the XML classes. At that point
nothing would work at all, because the first time you tried to compare
two nodes you'd see this:

scala> new Atom(42) == new Atom(42)
java.lang.StackOverflowError
at scala.xml.Node.theSeq(Node.scala:147)
at scala.xml.NodeSeq.iterator(NodeSeq.scala:54)
at scala.collection.IterableLike$class.sameElements(IterableLike.scala:325)
at scala.xml.NodeSeq.sameElements(NodeSeq.scala:46)
...

This makes perfect sense, because of course two freaking Atoms are
sequences which should have their individual elements compared. What
are those elements? Well let's see, there's just this one Atom -- but
hey that's a sequence... the infinite regression is avoided right now
through a delicate and totally inconsistent set of overrides.
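
A toy model of that regress, assuming nothing about the real scala.xml classes: SelfSeq is a hypothetical class whose only element is itself, and a depth guard (added for this sketch) stands in for the StackOverflowError so the outcome is deterministic.

```scala
object RegressDemo {
  private var depth = 0

  // Hypothetical class, not scala.xml.Atom: a "sequence" whose sole element is itself.
  final class SelfSeq(val data: Int) {
    def elements: List[SelfSeq] = List(this)

    override def equals(x: Any): Boolean = x match {
      case that: SelfSeq =>
        depth += 1
        // Guard added for this sketch; without it the JVM stack overflows.
        if (depth > 100) throw new IllegalStateException("equality never bottoms out")
        // List equality compares elements with ==, which re-enters this method.
        elements == that.elements
      case _ => false
    }
    override def hashCode: Int = data
  }

  // Reports "regress" when the comparison fails to terminate on its own.
  def outcome: String =
    try (new SelfSeq(42) == new SelfSeq(42)).toString
    catch { case _: IllegalStateException => "regress" }
    finally depth = 0
}
```

Comparing two distinct instances asks whether their elements are equal, and the elements are the instances themselves, so the question never gets any smaller.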

I could fix all this, I think. I might even be able to keep 2.7 code
working via implicits, just different ones. It's not what I'm looking
to do at this point, but neither is shipping the XML lib in its current
form. But I don't want to touch another line of XML unless I can first
rip Seq out of the object hierarchy and send it out of XMLtown on a
rail. Or, maybe I should just chmod 0000 scala/xml and filter all the
bug reports to /dev/null, so I can stop being tempted.

--
Paul Phillips | Simplicity and elegance are unpopular because
Vivid | they require hard work and discipline to achieve
Empiricist | and education to be appreciated.
slap pi uphill! | -- Dijkstra

--
Lift, the simply functional web framework http://liftweb.net
Beginning Scala http://www.apress.com/book/view/1430219890
Follow me: http://twitter.com/dpp
Surf the harmonics

--
Viktor Klang
| "A complex system that works is invariably
| found to have evolved from a simple system
| that worked." - John Gall

Blog: klangism.blogspot.com
Twttr: twitter.com/viktorklang
Code: github.com/viktorklang

Tue, 2009-12-01, 18:47

extempore

Joined: 2008-12-17,

Re: the mayor of xmltown

On Tue, Dec 01, 2009 at 02:06:31PM +0100, Burak Emir wrote:
> It's a long time that I have become a silent observer

You are not kidding! I have spent so much time reading Scala source and
documents over the last year or two (much of it concentrated in areas where
you worked) that to hear you speak for the first time is like watching
someone step out of the pages of a book. It's spooooky.

> Now XPath-like ops are operational: think (node \ "a" \ "b") == ((node
> \ "a") \ "b") and you will see that NodeSeq has to define \.

Sure. But \ and \\ don't have to use == to compare nodes. That is the
very first change I made, and the minimum imaginable -- to use a new
xml_== instead of ==.

> Now why have Node also be a NodeSeq?
> * again in (node \ "a") we need \, why not inherit it from NodeSeq, and
> * when including a single Node in an embedded syntax e.g. <a>{ myNode }</a>,
> we can then handle just NodeSeq and be done. See xml parser code in the
> Scala parser.

If necessary this can easily be handled by having an implicit from Node
to NodeSeq. It doesn't require Node to BE a NodeSeq.

As Tony Morris would likely point out right now, implicits and
inheritance are the same thing, except inheritance is much less
flexible. The relationship

Node => NodeSeq

need not (and I am saying, should not) be modelled via inheritance.
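
A rough sketch of what that could look like, under loud assumptions: Node, NodeSeq, the simplified child-selecting \ and the conversion name nodeToNodeSeq below are all hypothetical toy definitions, not the scala.xml API and not the design that was actually proposed.

```scala
import scala.language.implicitConversions

object ConversionDemo {
  // Toy types for illustration only. Node does NOT extend NodeSeq.
  final case class Node(label: String, children: List[Node] = Nil)

  final case class NodeSeq(nodes: List[Node]) {
    // Simplified projection: direct children with the given label.
    def \(name: String): NodeSeq =
      NodeSeq(nodes.flatMap(_.children).filter(_.label == name))
  }

  // The Node => NodeSeq relationship, expressed as a conversion.
  implicit def nodeToNodeSeq(n: Node): NodeSeq = NodeSeq(List(n))

  val root: Node = Node("root", List(Node("a"), Node("b")))

  // Node has no \ method, so the compiler applies nodeToNodeSeq here.
  val selected: NodeSeq = root \ "a"

  // A Node is still not a NodeSeq, so its equals can stay simple.
  val nodeIsSeq: Boolean = (root: Any).isInstanceOf[NodeSeq]
}
```

With this arrangement a single Node can be used wherever \ is wanted, while Node and NodeSeq keep separate, non-overlapping notions of equality.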

> 2) Equality does not seem to follow a mathematical rule, I won't claim
> it does, but it makes writing xml programs very convenient.

I do not consider any of these relationships convenient:

(a == b) != (b == a)
m(a) = b ; m(a) == c ; c != b
(a == b && b == c) != (a == c)

All of those are implied by the current equality implementation. None
of them are necessary to implement the convenient parts.

> If you go for change, I think a good route would be if you ran your
> beta scala.xml 2.8 by David Pollak and gave him a week to assess what
> he likes and dislikes about your new organization of scala.xml.

I'll tell you what: I won't dream of checking in any major change to XML
unless David Pollak is literally foaming at the mouth in his enthusiasm
for it. I'm pretty sure I can prove by construction that there is a
better architecture, preserving all the conveniences and adding a
consistent foundation -- the only obstacle is the many other things I
have to do as well.

> If I were to decide, I would leave most things as they are. Maybe I
> don't get the benefit of "consistent equality" in presence of
> subtyping.

Subtyping is not an obstacle to consistent equality when you structure
your equals method to account for it. Observe Seq's equals method:

  override def equals(that: Any): Boolean = that match {
    case that: Seq[_] => (that canEqual this) && (this sameElements that)
    case _            => false
  }

But the benefit of consistent equality is fundamental. If you can't
expect that a == b implies b == a then you are needlessly hamstrung in
your ability to reason about code. Now in fact you can't assume that
about arbitrary objects from arbitrary sources, because anyone can write
an asymmetric or otherwise broken equals method: but people sure ought
to be able to assume that about code in the Scala standard library.
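
For readers unfamiliar with the canEqual idiom used in Seq's equals above, here is a minimal sketch on a two-level hierarchy. Point and ColoredPoint are hypothetical classes chosen for illustration; they do not appear anywhere in the XML library.

```scala
object CanEqualDemo {
  class Point(val x: Int, val y: Int) {
    // Each class states which objects it is willing to be compared with.
    def canEqual(other: Any): Boolean = other.isInstanceOf[Point]

    override def equals(other: Any): Boolean = other match {
      case that: Point => (that canEqual this) && x == that.x && y == that.y
      case _           => false
    }
    override def hashCode: Int = 31 * x + y
  }

  class ColoredPoint(x: Int, y: Int, val color: String) extends Point(x, y) {
    // A subclass that adds state refuses comparison with plain Points.
    override def canEqual(other: Any): Boolean = other.isInstanceOf[ColoredPoint]

    override def equals(other: Any): Boolean = other match {
      case that: ColoredPoint =>
        (that canEqual this) && super.equals(that) && color == that.color
      case _ => false
    }
    override def hashCode: Int = 31 * super.hashCode + color.hashCode
  }

  val p: Point  = new Point(1, 2)
  val cp: Point = new ColoredPoint(1, 2, "red")

  val pEqualsCp: Boolean = p == cp   // false: cp.canEqual(p) says no
  val cpEqualsP: Boolean = cp == p   // false: p is not a ColoredPoint
  val sameColored: Boolean = cp == new ColoredPoint(1, 2, "red")
}
```

Both cross-class comparisons return false, so the answer does not depend on which side the subclass sits on, while two ColoredPoints with the same fields still compare equal.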

Tue, 2009-12-01, 18:47

David Pollak

Joined: 2008-12-16,

Re: the mayor of xmltown

On Tue, Dec 1, 2009 at 9:37 AM, Paul Phillips <paulp@improving.org> wrote:

I'll tell you what: I won't dream of checking in any major change to XML
unless david pollak is literally foaming at the mouth in his enthusiasm
for it. I'm pretty sure I can prove by construction that there is a
better architecture, preserving all the conveniences and adding a
consistent foundation -- the only obstacle is the many other things I
have to do as well.

Paul and I discussed these changes over sushi at one of my favorite sushi restaurants. I was literally drooling over both the sushi and Paul's proposed changes. I'm not sure I'm at the "foaming at the mouth" phase of getting these changes, but I'm definitely drooling.

I'll be happy to work for a day with Paul at his place or mine working on a branch of Lift and some real world Lift apps to see how his changes make things better.

--
Lift, the simply functional web framework http://liftweb.net
Beginning Scala http://www.apress.com/book/view/1430219890
Follow me: http://twitter.com/dpp
Surf the harmonics

Sat, 2009-12-05, 04:47

extempore

Joined: 2008-12-17,

Re: the mayor of xmltown

On Fri, Dec 04, 2009 at 11:09:44AM +0100, Burak Emir wrote:
> Yeah, but you also do not have any way of ensuring that people who
> implement Node will preserve these.

Guarantees in life are hard to come by, but we don't let it stop us from
trying to do things right.

> I would just urge you to not introduce backwards incompatibility when
> you can avoid it.

Have a little faith! This goes without saying.

Sat, 2009-12-05, 17:17

lex@lexspoon.org

Joined: 2009-12-05,

Re: the mayor of xmltown

On Tue, Dec 1, 2009 at 12:37 PM, Paul Phillips <paulp@improving.org> wrote:

Sure. But \ and \\ don't have to use == to compare nodes. That is the
very first change I made, and the minimum imaginable -- to use a new
xml_== instead of ==.

Sounds great.
One thing seems worth considering while cleaning it up. XPath is weird, but it's also well known, well considered, and widely implemented. It would be really great if Scala could continue to claim that it supports the real XPath, including the weird blurring of nodes versus sequences of nodes.
Lex
