feeling sort of out of sorts
Tue, 2010-02-02, 20:44
I couldn't take the sorting situation for another minute so I went in
and did some stuff in r20771. I hope it's to everyone's liking -- it
looks like an easy winner to me -- but let me know if not.
One thing that process reminded me of is that removeDuplicates has to be
among the worst-named methods ever. I didn't know about it for the
first many moons I used Scala because the letters "uniq" never appear in
it, but that aside, all that verbosity in the name serves only to
mislead. The word "remove" pretty much throughout the rest of the
library refers to mutating a collection in place, the opposite of what
removeDuplicates actually does.
I boldly propose we deprecate that method and rename it "unique".
Everyone knows what unique means. Then in combination with r20771 you
can say
xs.unique.sorted
instead of the current
xs.removeDuplicates.sortWith(_ < _)
Home run.
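For readers on a recent Scala release, the naming complaint above can be illustrated with a minimal sketch. It assumes the operation is spelled distinct (the name current standard libraries use, rather than the unique proposed in this post); the object and value names are made up for illustration.

```scala
import scala.collection.mutable.ListBuffer

object RemoveVsDistinct {
  // "remove" on a mutable buffer changes it in place and returns the removed element.
  val buf = ListBuffer(3, 1, 3, 2)
  val removed = buf.remove(0)

  // distinct builds a new collection and leaves the original untouched.
  val xs = List(3, 1, 3, 2)
  val deduped = xs.distinct
  val dedupedSorted = xs.distinct.sorted
}
```

After this runs, buf has lost its first element while xs still holds all four, which is the mismatch between "remove" and "removeDuplicates" described above.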
Tue, 2010-02-02, 22:07
#2
Re: feeling sort of out of sorts
On Tuesday February 2 2010, Paul Phillips wrote:
> ...
>
> I boldly propose we deprecate that method and rename it "unique".
> Everyone knows what unique means. Then in combination with r20771
> you can say
Wouldn't "distinct" be more accurate?
> ...
Randall Schulz
Tue, 2010-02-02, 22:27
#3
Re: feeling sort of out of sorts
On Tue, Feb 02, 2010 at 01:02:56PM -0800, Randall R Schulz wrote:
> Wouldn't "distinct" be more accurate?
From the standpoint of what words mean, yes. From the standpoint of
what things are usually called, no. I would gladly take either over
removeDuplicates.
Tue, 2010-02-02, 22:57
#4
Re: feeling sort of out of sorts
On Tue, Feb 2, 2010 at 10:17 PM, Paul Phillips wrote:
> On Tue, Feb 02, 2010 at 01:02:56PM -0800, Randall R Schulz wrote:
>> Wouldn't "distinct" be more accurate?
>
> From the standpoint of what words mean, yes. From the standpoint of
> what things are usually called, no. I would gladly take either over
> removeDuplicates.
>
I think in SQL the analogous function is called distinct. I agree I
would take either over removeDuplicates.
Cheers
Tue, 2010-02-02, 23:47
#5
Re: feeling sort of out of sorts
>>>>> "Paul" == Paul Phillips writes:
Paul> I boldly propose we deprecate that method and rename it "unique".
Paul> Everyone knows what unique means. Then in combination with
Paul> r20771 you can say
Paul> xs.unique.sorted
Ahhhhh... therein lies the nub.
Wed, 2010-02-03, 19:47
#6
Re: feeling sort of out of sorts
On Tue, Feb 2, 2010 at 10:17 PM, Paul Phillips wrote:
> On Tue, Feb 02, 2010 at 01:02:56PM -0800, Randall R Schulz wrote:
>> Wouldn't "distinct" be more accurate?
>
> From the standpoint of what words mean, yes. From the standpoint of
> what things are usually called, no. I would gladly take either over
> removeDuplicates.
>
After having signed off on the code review I am getting doubts. We
agree that distinct has the closer connotation. SQL uses it. Who uses
unique?
Cheers
Wed, 2010-02-03, 19:47
#7
Re: feeling sort of out of sorts
On Wed, Feb 03, 2010 at 07:22:02PM +0100, martin odersky wrote:
> After having signed off on the code review I am getting doubts. We
> agree that distinct has the closer connotation. SQL uses it. Who uses
> unique?
I think most of them spell it without all the fancy vowels (and I'd have
proposed uniq if I thought it would fly) but Unix, Perl, Ruby... I don't
know that many languages.
The page linked below isn't that helpful because it mostly implements it
without telling us whether a built-in way exists. However I'd say it's quite
telling that almost every sample implementation uses the term "unique"
in so doing. I realize one could conjure up any number of explanations
for that, but I still think that "unique" more than any other word is
what people think of first.
http://rosettacode.org/wiki/Create_a_Sequence_of_unique_elements
Wed, 2010-02-03, 20:07
#8
RE: feeling sort of out of sorts
SQL uses both distinct and unique.
Distinct is used in queries to remove duplicate records from results.
Unique is used in column definitions to specify that the values in a column or combination of columns must be unique and that the database should reject operations that would introduce duplicates.
Wed, 2010-02-03, 20:17
#9
Re: feeling sort of out of sorts
> After having signed off on the code review I am getting doubts. We
> agree that distinct has the closer connotation. SQL uses it. Who uses
> unique?
The Unix tool for removing duplicates from a (sorted) list of lines is
called 'uniq'. I'd vote for 'distinct' in Scala myself.
Regards,
Maciek
Wed, 2010-02-03, 20:27
#10
Re: feeling sort of out of sorts
What's in a name...
but unique / distinct both work quite well.
For what it's worth, a similar method in .NET LINQ is called Distinct:
http://msdn.microsoft.com/en-us/library/cc716801.aspx
I suppose what you're actually getting is a set, but Set has a specific meaning in the API. As a method name, distinct would be clear and in line with both SQL and .NET LINQ.
cheers, Louis
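The difference between "getting a Set" and a distinct-style method on a sequence can be sketched as follows. This assumes a recent Scala release where the method is named distinct; the object and value names are illustrative only.

```scala
object DistinctVsSet {
  val xs = List(3, 1, 3, 2, 1)

  // distinct keeps the sequence type and the order of first occurrences.
  val asSeq: List[Int] = xs.distinct

  // toSet also drops duplicates, but the result is a Set with no promised order.
  val asSet: Set[Int] = xs.toSet
}
```

Both results contain the same three elements; only the sequence version says anything about their order.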
--
Web: www.chillipower.com
Blog: http://louisbotterill.blogspot.com/
Twitter: http://twitter.com/BinaryJunkie
LinkedIn: http://uk.linkedin.com/pub/louis-botterill/10/3b2/265
Wed, 2010-02-03, 20:37
#11
Re: feeling sort of out of sorts
Out of curiosity:
Distinct (C#)
nub (Haskell)
uniq (IDL)
remove-duplicates & delete-duplicates (Lisp)
remdup (Logo)
prune (Factor)
DeleteDuplicates (Mathematica)
cull (Nial)
uniq (Perl)
array_unique (PHP)
sort-object -unique (powershell)
unique (R)
unique (Rebol)
unique (Raven)
uniq (Ruby)
removeDuplicates (Scala)
remove-duplicates (Scheme)
lsort -unique (Tcl)
uniq (Unix command)
The use of "unique" as a variable name in the other languages' examples was really prevalent, though the question itself uses that word. Oz defines a function called "nub", but as far as exceptions worth noting go, that's it.
The first time *I* searched for it, I tried "distinct" first, "uniq" second. I think it was paulp who pointed me at removeDuplicates. Paul's objection to removeDuplicates resonates with me. It describes an action, which one usually associates with methods that produce mutation. On the other hand, "distinct" or "unique" describe properties, which sound better for operations that don't mutate. Furthermore, one can think of "isDistinct" or "isUnique", which is definitely not the case for "removeDuplicates".
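The "property" reading can be made concrete with a small sketch. isDistinct is not a standard library method; it is the hypothetical predicate named in the paragraph above, written here on top of distinct as found in recent Scala releases.

```scala
object DistinctProperty {
  // Hypothetical predicate: true when no element occurs more than once.
  def isDistinct[A](xs: Seq[A]): Boolean =
    xs.distinct.length == xs.length
}
```

For example, DistinctProperty.isDistinct(Seq(1, 2, 3)) holds, while DistinctProperty.isDistinct(Seq(1, 2, 1)) does not.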
--
Daniel C. Sobral
I travel to the future all the time.
Wed, 2010-02-03, 20:47
#12
Re: feeling sort of out of sorts
+1 for distinct.
I like the SQL differentiation between distinct and unique -- "distinct"
to specify what an action does; "unique" to specify a structural
requirement.
Dave
On 4/02/10 7:52 AM, Grand, Mark wrote:
> SQL uses both distinct and unique.
>
> Distinct is used in queries to remove duplicate records from results.
>
> Unique is used in column definitions to specify that the values in column or combination of columns must be unique and that the database should reject operations that would introduce duplicates.
>
> -----Original Message-----
> From: odersky@gmail.com [mailto:odersky@gmail.com] On Behalf Of martin odersky
> Sent: Wednesday, February 03, 2010 1:22 PM
> To: Paul Phillips
> Cc: Randall R Schulz; scala-internals@listes.epfl.ch
> Subject: Re: [scala-internals] feeling sort of out of sorts
>
> On Tue, Feb 2, 2010 at 10:17 PM, Paul Phillips wrote:
>
>> On Tue, Feb 02, 2010 at 01:02:56PM -0800, Randall R Schulz wrote:
>>
>>> Wouldn't "distinct" be more accurate?
>>>
>> From the standpoint of what words mean, yes. From the standpoint of
>> what things are usually called, no. I would gladly take either over
>> removeDuplicates.
>>
>>
> After having signed off on the code review I am getting doubts. We
> agree that distinct has the closer connotation. SQL uses it. Who uses
> unique?
>
> Cheers
>
> -- Martin
Wed, 2010-02-03, 21:57
#13
Re: feeling sort of out of sorts
On Tue, Feb 2, 2010 at 1:44 PM, Paul Phillips <paulp@improving.org> wrote:
> xs.unique.sorted
> instead of the current
> xs.removeDuplicates.sortWith(_ < _)
Interestingly, a seq with a low ratio of duplicates would be both faster and more memory efficient to sort first, then filter.
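A minimal sketch of the sort-first idea, for illustration only: once the elements are sorted, equal ones sit next to each other, so a single pass that skips repeats of the previous element is enough. The helper name sortedThenDistinct is made up, no performance claim is tested here, and the approach only applies when an Ordering exists and the original order does not matter.

```scala
object SortThenFilter {
  // Hypothetical helper: sort, then drop each element equal to its predecessor.
  def sortedThenDistinct[A](xs: Seq[A])(implicit ord: Ordering[A]): Vector[A] = {
    val sorted = xs.sorted
    val out = Vector.newBuilder[A]
    var last: Option[A] = None
    for (x <- sorted) {
      // Keep x only if it differs from the element seen just before it.
      if (!last.exists(ord.equiv(_, x))) out += x
      last = Some(x)
    }
    out.result()
  }
}
```

Unlike a general duplicate filter, this needs no set of previously seen elements, only the one most recent value.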
Wed, 2010-02-03, 22:07
#14
Re: feeling sort of out of sorts
Please use distinct as LINQ does.
—Mohamed
Wed, 2010-02-03, 22:27
#15
Re: feeling sort of out of sorts
What about "unsimilar" or the catchy "oneOfEach" ;)
Thu, 2010-02-04, 10:17
#16
Re: feeling sort of out of sorts
I am more and more convinced it should be "distinct". We want
for-comprehensions to be easily mappable to SQL and LINQ. So having
distinct instead of unique removes one small hurdle for that.
Besides, unique seems to vary in meaning a lot (SQL: require uniqueness;
Unix: remove identical adjacent lines; Ruby: same as
removeDuplicates). Distinct seems to be more consistent in its
meaning.
Cheers
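The difference between the Unix and Ruby meanings mentioned above can be sketched in Scala. uniqAdjacent is a hypothetical helper imitating Unix uniq (it only collapses runs of equal neighbours); distinct is the standard method in recent Scala releases and removes every later repeat.

```scala
object UniqVsDistinct {
  // Hypothetical helper mimicking Unix uniq: drops an element only when it
  // equals the element immediately before it.
  def uniqAdjacent[A](xs: Seq[A]): List[A] =
    xs.foldLeft(List.empty[A]) {
      case (acc @ (prev :: _), x) if prev == x => acc
      case (acc, x)                            => x :: acc
    }.reverse

  val input = Seq(1, 1, 2, 1, 1)
  val adjacentOnly = uniqAdjacent(input) // the second run of 1s survives
  val global = input.distinct            // every repeated value is dropped
}
```

On unsorted input the two give different answers, which is why the Unix tool is normally run on sorted lines.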
Thu, 2010-02-04, 13:27
#17
Re: feeling sort of out of sorts
I'm an easy sell on this one. Distinct it is.
On Tue, Feb 2, 2010 at 11:44 AM, Paul Phillips <paulp@improving.org> wrote:
Looks almost Ruby-like in its cleanliness.
--
Lift, the simply functional web framework http://liftweb.net
Beginning Scala http://www.apress.com/book/view/1430219890
Follow me: http://twitter.com/dpp
Surf the harmonics