This page is no longer maintained — Please continue to the home page at www.scala-lang.org

Some thoughts on BitSets

1 reply
Stefan Zeiger
Joined: 2008-12-21,
User offline. Last seen 27 weeks 3 days ago.

Hi,

I've been looking into the current BitSet implementation (because we
want to use bit sets as a basis for Enumeration value sets) and I'd
welcome your opinions on some issues I discovered:

* BitSet.fromArray and the BitSet.BitSetN constuctor are public and take
an Array[Long] argument which becomes part of the internal
representation of the BitSet, thus allowing it to be modified from the
outside:

Welcome to Scala version 2.9.1.final (Java HotSpot(TM) 64-Bit Server VM,
Java 1.6.0_27).
Type in expressions to have them evaluated.
Type :help for more information.

scala> val a = Array(1l, 1l, 1l)
a: Array[Long] = Array(1, 1, 1)

scala> val s = collection.immutable.BitSet.fromArray(a)
s: scala.collection.immutable.BitSet = BitSet(0, 64, 128)

scala> a(0) = 2l

scala> s
res1: scala.collection.immutable.BitSet = BitSet(1, 64, 128)

I suppose it makes sense to keep this implementation around for
performance reasons but I'd prefer to hide it better. We could let
BitSet.fromArray make a copy of the data and keep the BitSetN
constructor public (with an appropriate warning in the doc comment) for
the cases where you don't want the copying.

* There's no way to get the Array[Long] back from an arbitrary BitSet.
I'd like to add BitSet#toBitMask as a counterpart for BitSet.fromArray.
This method would copy the element data into a new Array[Long]. If you
want to get the data without copying, you can match on the type of the
implementation, and in case of a BitSetN, access the public elems array
directly.

* The name fromArray is misleading. It looks like the opposite of a
hypothetical toArray (or the actual copyToArray) method. I'd expect
toArray to create an Array of Int values representing the individual
bits in the set (like copyToArray does), and fromArray to do the
opposite. The current fromArray method should be renamed to fromBitMask
(and the old name deprecated).

* The Scala compiler has its own nsc.util.BitSet but it seems to be
unused. Delete?

* BitSet should extend OrderedSet. I now have an implementation for this
which compiles and passes partest.

Cheers,
Stefan

odersky
Joined: 2008-07-29,
User offline. Last seen 45 weeks 6 days ago.
Re: Some thoughts on BitSets
Hi Stefan,

I think all these changes are definite improvements. So I am in favor of all of them. - Martin


On Wed, Nov 30, 2011 at 2:59 PM, Stefan Zeiger <szeiger@novocode.com> wrote:
Hi,

I've been looking into the current BitSet implementation (because we want to use bit sets as a basis for Enumeration value sets) and I'd welcome your opinions on some issues I discovered:

* BitSet.fromArray and the BitSet.BitSetN constuctor are public and take an Array[Long] argument which becomes part of the internal representation of the BitSet, thus allowing it to be modified from the outside:

Welcome to Scala version 2.9.1.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_27).
Type in expressions to have them evaluated.
Type :help for more information.

scala> val a = Array(1l, 1l, 1l)
a: Array[Long] = Array(1, 1, 1)

scala> val s = collection.immutable.BitSet.fromArray(a)
s: scala.collection.immutable.BitSet = BitSet(0, 64, 128)

scala> a(0) = 2l

scala> s
res1: scala.collection.immutable.BitSet = BitSet(1, 64, 128)

I suppose it makes sense to keep this implementation around for performance reasons but I'd prefer to hide it better. We could let BitSet.fromArray make a copy of the data and keep the BitSetN constructor public (with an appropriate warning in the doc comment) for the cases where you don't want the copying.

* There's no way to get the Array[Long] back from an arbitrary BitSet. I'd like to add BitSet#toBitMask as a counterpart for BitSet.fromArray. This method would copy the element data into a new Array[Long]. If you want to get the data without copying, you can match on the type of the implementation, and in case of a BitSetN, access the public elems array directly.

* The name fromArray is misleading. It looks like the opposite of a hypothetical toArray (or the actual copyToArray) method. I'd expect toArray to create an Array of Int values representing the individual bits in the set (like copyToArray does), and fromArray to do the opposite. The current fromArray method should be renamed to fromBitMask (and the old name deprecated).

* The Scala compiler has its own nsc.util.BitSet but it seems to be unused. Delete?

* BitSet should extend OrderedSet. I now have an implementation for this which compiles and passes partest.

Cheers,
Stefan



--
Martin Odersky
Prof., EPFL and Chairman, Typesafe
PSED, 1015 Lausanne, Switzerland
Tel. EPFL: +41 21 693 6863
Tel. Typesafe: +41 21 691 4967

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland