Re: default charsets and Source.from behavior
Fri, 2009-04-24, 19:11
>>>>> "Paul" == Paul Phillips writes:
Paul> I can see no reason why Apple JDK should default to
Paul> MacRoman...it's simply wrong.
People complain bitterly about this on the Apple java-dev mailing list
and have been complaining about it for years and years. It is regarded
by all as a great blunder on Apple's part.
Paul> Combining all that with what I think is the majority opinion that
Paul> in the absence of a specified and non-platform-default file
Paul> encoding we should always use UTF-8 anyway, then that is the
Paul> logic I intend to follow.
+1 on defaulting to UTF-8. (actually about +1,000,000)
Paul> No matter what we do, all bets are going to be off for people who
Paul> directly call into java libs without specifying an encoding. So
Paul> I'll document that issue, and focus on making the scala I/O
Paul> routines internally consistent with sane defaults.
Yes, let people dealing directly with ancient, poorly designed Java
APIs suffer. If I'm only dealing with a Scala API I shouldn't have to
suffer for no very good reason.
Sat, 2009-04-25, 15:07
#2
Re: default charsets and Source.from behavior
>>>>> "martin" == martin odersky writes:
martin> I'm still not convinced. The fact that we could change the
martin> global behavior globally 15 months ago without a user revolt
martin> indicates for me that code that relies on encodings simply was
martin> not that common.
I don't think we should draw that conclusion. Before 15 months ago,
Scala used Java's horribly broken defaults. When Scala changed to a
much better default, I bet the real story went like this: anyone who was
aware of encodings at all either:
1) was already explicitly specifying encodings in their code, either
because they had particular needs or just because Java's defaults
are so horrible, and so didn't notice the change.
2) noticed the change and approved of it because Java's defaults are
so horrible and Scala was finally doing something more sensible.
Personally I fell into category 2.
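
For category 1, explicitly specifying the encoding might look roughly
like the sketch below. It assumes the standard library's
Source.fromFile(file)(codec) overload and Codec.UTF8; the
ExplicitEncoding object and the readText helper are hypothetical names
made up for illustration, not part of any Scala API.

```scala
import java.nio.file.Path
import scala.io.{Codec, Source}

// Hypothetical helper object, for illustration only.
object ExplicitEncoding {
  // Read a whole file as text, decoding with an explicitly chosen codec
  // (UTF-8 unless the caller says otherwise) instead of relying on
  // whatever the platform default happens to be.
  def readText(path: Path, codec: Codec = Codec.UTF8): String = {
    val source = Source.fromFile(path.toFile)(codec)
    try source.mkString
    finally source.close()
  }
}
```

Code written this way reads the same bytes the same way on every
platform, which is why a change in the default would go unnoticed.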
martin> And Scala io is really rudimentary and not all that useful.
I disagree strongly. I think it's worth stopping and really thinking
about this point. It may be rudimentary, but it is extremely, extremely
useful. It's really important for Scala to get basics right. 95% of
the time io.Source is all you need when you want to read a file. It's
something everybody does: beginners want to do it almost immediately
upon learning the language, and it's something you do pretty much every
time you write a script (do we care if Scala is good for scripting or
not? I think we should care, if for no other reason than because
scripting is how new languages become popular these days). The Java
alternatives for I/O are ugly and cumbersome (even apart from the
encoding issue). Java veterans already know how to use the Java
alternatives, but others don't (for example Ruby users, who seem to be
most of the new people asking questions on #scala these days).
People often ask on the scala mailing lists how to read a file and often
they get told to use scalax or use the Java stuff. (io.Source is much
less buggy these days than it used to be, but the collective memory
forgets slowly.) I think it's a pretty sad situation if we have to tell
new users who only want to read a simple text file they either need to
download and install an optional library, or learn some ancient, crufty
Java API that is roundly hated by all.
Just because Scala lets you use Java libraries doesn't mean that you
should have to in order to accomplish elementary tasks. Those of us who
have been doing Java for many years hardly notice when we have to drop
down to Java stuff in our Scala code, but when that happens to beginners
it's extremely noticeable and disconcerting, and it's also extremely
noticeable and disconcerting to people who are not beginners but whose
heads aren't stuffed with Java arcana like ours are.
martin> In the end the question is: Do we want to make it easy to move
martin> files between Java (or other System programs) and Scala, or do
martin> we want to make it easy to move files between Scala programs
martin> running on different systems? My vote is on the former.
The former *sounds* appealing, but -- very unfortunately! -- using
Java's broken encoding defaults does not actually accomplish the former.
martin> So what is the right thing? I agree that utf8 is a nice,
martin> universal, modern standard. But there are really powerful
martin> arguments to do the same as Java here. We are talking about
martin> I/O! So files are bound to be written by one program and read
martin> by another. It would be annoying if Scala and Java used
martin> different conventions which then hampered interoperability.
This argument simply doesn't work on Macs, where the default Java
encoding is MacRoman which no one with even a mild awareness of
encodings ever uses for *anything*.
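
A minimal sketch of what goes wrong when writer and reader disagree
about the encoding. ISO-8859-1 stands in here for "a platform default
that is not UTF-8", because it is one of the charsets every JVM is
required to provide; a MacRoman charset is not assumed to be available.
The EncodingMismatch object and roundTrip are hypothetical names.

```scala
import java.nio.charset.Charset

// Hypothetical helper object, for illustration only.
object EncodingMismatch {
  // Encode text to bytes with one charset, then decode those same
  // bytes with another, as happens when one program writes a file
  // and a second program reads it under a different default.
  def roundTrip(text: String, writeAs: Charset, readAs: Charset): String =
    new String(text.getBytes(writeAs), readAs)
}
```

Writing "é" as UTF-8 and reading it back as ISO-8859-1 yields the two
characters "Ã©"; with UTF-8 on both sides the text survives intact.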
I'm mostly a Mac guy, so my views on this whole issue are probably
somewhat conditioned by that. Even if y'all decide to go with the Java
defaults on other operating systems -- and despite this long,
impassioned message I've written I have to admit that good arguments
have been made on the other side, too -- please, please don't let Scala
default to MacRoman on Macs. Doing so benefits no one.
Sat, 2009-04-25, 15:17
#3
Re: default charsets and Source.from behavior
>>>>> "Carl-Eric" == Carl-Eric Menzel writes:
Carl-Eric> Thus I think that once you have to interface with Java or
Carl-Eric> any other legacy system, you are screwed anyway and *have*
Carl-Eric> to check your encoding.
Exactly, exactly, exactly. That is why simply falling back on the Java
defaults *seems* like it would be a good idea, but isn't.
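
Checking what the JVM would use when no encoding is given can be
sketched as follows, using java.nio.charset.Charset.defaultCharset().
The DefaultCharsetCheck object and its members are hypothetical names
for illustration.

```scala
import java.nio.charset.{Charset, StandardCharsets}

// Hypothetical helper object, for illustration only.
object DefaultCharsetCheck {
  // The charset Java APIs fall back to when the caller names none.
  def platformDefault: Charset = Charset.defaultCharset()

  // True only if that fallback happens to be UTF-8 on this machine.
  def defaultIsUtf8: Boolean = platformDefault == StandardCharsets.UTF_8
}
```

The answer varies by platform and JVM settings, which is the reason
code crossing that boundary has to name its encoding explicitly.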
On Sat, Apr 25, 2009 at 5:51 AM, Carl-Eric Menzel <cm.scala-ml@users.bitforce.com> wrote:
I agree 100%
From the "web developer" perspective, everything is UTF-8 unless otherwise specified. Most web developers doing I18N work set their editors to spit out UTF-8. UTF-8 is supported everywhere. I'd like to see Scala do UTF-8 unless otherwise specified.
--
Lift, the simply functional web framework http://liftweb.net
Beginning Scala http://www.apress.com/book/view/1430219890
Follow me: http://twitter.com/dpp
Git some: http://github.com/dpp