Improving binary compatibility of lazy values

11 replies

Tue, 2011-10-18, 14:12

Mirco Dotta 2

Joined: 2011-10-18,

Hi folks,

At Scala Lift Off 2011 I discussed how lazy values can bite you hard
with unexpected semantic binary incompatibilities. The main issue
is that, while changing an eager value into a lazy one is a perfectly
fine transformation, the other way around can potentially break the
semantics of lazy values, simply because the bitmap that tracks field
initialization is shared across subclasses.
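
To make the failure mode concrete, here is a hand-written Java model of a bitmap shared across a class hierarchy. It is only a sketch under stated assumptions: the class names, the `bitmap` field and the bit assignments are hypothetical, the synchronization a real encoding needs is left out, and it is not the code scalac emits. It models a subclass that was compiled while bit 0 was still free and is then run against a parent that now claims bit 0 for a lazy value of its own.

```java
// Hand-written model of a shared lazy-val bitmap; not actual scalac output.
// Synchronization is omitted to keep the sketch short.
class Parent {
    protected int bitmap;      // one bitmap shared by the whole hierarchy
    private String greeting;

    // Newer Parent release: this lazy value claims bit 0.
    String greeting() {
        if ((bitmap & 1) == 0) {
            greeting = "hello";
            bitmap |= 1;
        }
        return greeting;
    }
}

class Child extends Parent {
    private String name;

    // Compiled against an older Parent that used no bits,
    // so bit 0 was hard-coded here as well.
    String name() {
        if ((bitmap & 1) == 0) {
            name = "child";
            bitmap |= 1;
        }
        return name;
    }
}

class SharedBitmapDemo {
    // What Child.name() yields when Parent.greeting() was forced first.
    static String nameAfterGreeting() {
        Child c = new Child();
        c.greeting();      // sets bit 0
        return c.name();   // bit 0 already set: the initializer is skipped
    }
}
```

In this model `SharedBitmapDemo.nameAfterGreeting()` returns `null`: nothing fails at link time, the subclass's lazy value is simply never initialized.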

This is a particularly bad issue wrt binary compatibility, and I
believe there is room for improvement, but since I'm no compiler guru
I'd like to know what you guys think. Maybe I'm overlooking something.

I believe there is an easy solution to this problem, which is to have
one bitmap per class (avoid sharing, problem solved). This has the
drawback of increasing the memory footprint, but it may still be
acceptable given the simplicity of the solution.
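
A minimal Java sketch of the one-bitmap-per-class idea, with hypothetical names and without the synchronization a real encoding would need. Each class keeps a private bitmap, so bit positions chosen in one class cannot collide with those of another, at the price of one extra field per instance for every class in the hierarchy that declares lazy values.

```java
// Hand-written model of "one bitmap per class"; not actual scalac output.
class OwnBitmapParent {
    private int parentBitmap;   // visible to OwnBitmapParent only
    private String greeting;

    String greeting() {
        if ((parentBitmap & 1) == 0) {
            greeting = "hello";
            parentBitmap |= 1;
        }
        return greeting;
    }
}

class OwnBitmapChild extends OwnBitmapParent {
    private int childBitmap;    // a second int per instance: the memory cost
    private String name;

    String name() {
        if ((childBitmap & 1) == 0) {
            name = "child";
            childBitmap |= 1;
        }
        return name;
    }
}

class OwnBitmapDemo {
    // Forcing the parent's lazy value first no longer affects the child's.
    static String nameAfterGreeting() {
        OwnBitmapChild c = new OwnBitmapChild();
        c.greeting();
        return c.name();
    }
}
```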

There could also be more elaborate strategies. One that came up
during the conference is to store, for each lazy value, a (class)
field holding the bitmap position that the lazy field should use for
correct initialization. This "bitmap position" field would be
initialized at runtime in the class' static constructor (so the field
lives in the class, not in each instance).

A further elaboration of the above idea is to have a single class
field in each class that contains a lazy value. The (class) field would
hold the first unused position in the bitmap. Since the position
would be computed at runtime in the static class constructor, adding a
new lazy value to a parent class would shift by one the starting
position used by a subclass containing lazy values, thereby eliminating
the risk of having two lazy values pointing to the same bitmap
location.
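
The single-field variant could look roughly like the following Java sketch. All names (`NEXT_FREE_BIT`, `CHILD_FIRST_BIT`, the classes) are made up for illustration, and synchronization is again omitted. The point is that the subclass reads its starting position from the parent when its own class is initialized, rather than having a literal baked in at compile time.

```java
// Hand-written model of runtime-computed bitmap positions; not scalac output.
class OffsetParent {
    // Blank finals assigned in a static initializer are not compile-time
    // constants, so javac does not inline them into dependent classes.
    static final int PARENT_FIRST_BIT;   // where this class's lazy values start
    static final int NEXT_FREE_BIT;      // first position a subclass may use
    static {
        PARENT_FIRST_BIT = 0;                  // no superclass with lazy values
        NEXT_FREE_BIT = PARENT_FIRST_BIT + 1;  // this class declares one lazy value
    }

    protected int bitmap;                // still shared across the hierarchy
    private String greeting;

    String greeting() {
        int mask = 1 << PARENT_FIRST_BIT;
        if ((bitmap & mask) == 0) {
            greeting = "hello";
            bitmap |= mask;
        }
        return greeting;
    }
}

class OffsetChild extends OffsetParent {
    static final int CHILD_FIRST_BIT;
    static {
        // Resolved when OffsetChild is initialized, against whatever
        // OffsetParent is on the classpath at that moment.
        CHILD_FIRST_BIT = OffsetParent.NEXT_FREE_BIT;
    }

    private String name;

    String name() {
        int mask = 1 << CHILD_FIRST_BIT;
        if ((bitmap & mask) == 0) {
            name = "child";
            bitmap |= mask;
        }
        return name;
    }
}

class OffsetDemo {
    static String nameAfterGreeting() {
        OffsetChild c = new OffsetChild();
        c.greeting();
        return c.name();
    }
}
```

If a later parent release adds a second lazy value, only `NEXT_FREE_BIT` changes, and an unrecompiled `OffsetChild` picks up the new starting position the next time it is loaded.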

Cheers,
Mirco

Tue, 2011-10-18, 14:47

odersky

Joined: 2008-07-29,

Re: Improving binary compatibility of lazy values

I like either of the last two solutions. The price is one static lookup instead of a constant in the bit-fiddling operations for lazy vals. I hope that's not too bad, but we need to measure.

-- Martin

Tue, 2011-10-18, 14:57

Viktor Klang

Joined: 2008-12-17,

Re: Improving binary compatibility of lazy values

Hey Mirco,

On Tue, Oct 18, 2011 at 3:12 PM, Mirco Dotta <mirco.dotta@scalasolutions.com> wrote:

> Hi folks,
>
> At Scala Lift Off 2011 I discussed how lazy values can harshly bite
> you with unexpected semantic binary incompatibilities. The main issue
> is that, while changing an eager value into a lazy one is a perfectly
> fine transformation, the other way around can potentially break
> semantic of lazy value, simply because the bitmap that handles fields'
> initialization is shared across subclasses.
>
> This is a particularly bad issue wrt binary compatibility, and I
> believe there is room for improvement, but since I'm no compiler guru
> I'd like to know what you guys think. Maybe I'm overlooking something.
>
> I believe there is an easy solution to this problem, which is to have
> one bitmap for class (avoid sharing, problem solved). This has the
> drawback of increasing the memory footprint, but it may be still
> acceptable giving the actual simplicity of the solution.
>
> Likely, there could be more elaborated strategies. One that came up
> during the conference is to store, for each lazy value, a (class)
> field to know the bitmap's position the lazy field should use for
> correct initialization.

For me this is a no-go, simply because it's expensive.

> This "bitmap's position" field would be
> initialized at runtime in the class' static constructor (so one field
> per class, not per instance).
>
> A further elaboration of the above idea is to have one single class
> field per class that contain a lazy value. The (class) field would
> hold the first unused position in the bitmap . Since the position
> would be computed at runtime in the static class constructor,

You mean in a static initializer block?

> adding a
> new lazy value in a parent class would shift by one the starting
> position used by a subclass containing lazy values, i.e., eliminating
> the risk of having two lazy values pointing to the same bitmap
> location.

This might be a very good solution, but it needs, as Martin says, benching to verify.

Cheers!

√

> Cheers,
> Mirco

--
Viktor Klang

Akka Tech Lead
Typesafe - Enterprise-Grade Scala from the Experts

Twitter: @viktorklang

Tue, 2011-10-18, 15:07

Joshua.Suereth

Joined: 2008-09-02,

Re: Improving binary compatibility of lazy values

I think I like the idea of computing the position at runtime rather than at compile time, as long as it is *stable* for a given class across the entire application. I've always been a bit nervous about lazy vals and serialization since the 2.7 days, when you couldn't be sure what you were sending over the wire.
If I can validate that only *vals* and captured variables in a class are serialized with it and lazy vals get recomputed, it actually makes distributed computing a lot better.
So, between this runtime computation and other binary compatibility constraints, I'd want to make *very* sure this runtime computation is *well ordered* and not tied to one particular JVM classloader; otherwise it's a big no-go.
- Josh
Tue, 2011-10-18, 20:57

dotta

Joined: 2011-10-18,

Re: Improving binary compatibility of lazy values

> Hey Mirco,

Hi Viktor,
comments inline.

> On Tue, Oct 18, 2011 at 3:12 PM, Mirco Dotta <mirco.dotta@scalasolutions.com> wrote:
>> Hi folks,
>>
>> At Scala Lift Off 2011 I discussed how lazy values can harshly bite
>> you with unexpected semantic binary incompatibilities. The main issue
>> is that, while changing an eager value into a lazy one is a perfectly
>> fine transformation, the other way around can potentially break
>> semantic of lazy value, simply because the bitmap that handles fields'
>> initialization is shared across subclasses.
>>
>> This is a particularly bad issue wrt binary compatibility, and I
>> believe there is room for improvement, but since I'm no compiler guru
>> I'd like to know what you guys think. Maybe I'm overlooking something.
>>
>> I believe there is an easy solution to this problem, which is to have
>> one bitmap for class (avoid sharing, problem solved). This has the
>> drawback of increasing the memory footprint, but it may be still
>> acceptable giving the actual simplicity of the solution.
>>
>> Likely, there could be more elaborated strategies. One that came up
>> during the conference is to store, for each lazy value, a (class)
>> field to know the bitmap's position the lazy field should use for
>> correct initialization.
>
> For me this is a no-go, simply because it's expensive.

I wonder why this is too expensive. Can you elaborate?

>> This "bitmap's position" field would be
>> initialized at runtime in the class' static constructor (so one field
>> per class, not per instance).
>>
>> A further elaboration of the above idea is to have one single class
>> field per class that contain a lazy value. The (class) field would
>> hold the first unused position in the bitmap . Since the position
>> would be computed at runtime in the static class constructor,
>
> You mean in a static initializer block?

Yes, sorry for the wrong terminology.

>> adding a
>> new lazy value in a parent class would shift by one the starting
>> position used by a subclass containing lazy values, i.e., eliminating
>> the risk of having two lazy values pointing to the same bitmap
>> location.
>
> This might be a very good solution, but it needs, as Martin says, benching to verify.

This was Iulian's idea, it came up as a further optimization of the one discussed above ;)
Cheers, Mirco

Tue, 2011-10-18, 21:07

Viktor Klang

Joined: 2008-12-17,

Re: Improving binary compatibility of lazy values

On Tue, Oct 18, 2011 at 9:53 PM, Mirco Dotta <mirco.dotta@typesafe.com> wrote:


>> For me this is a no-go, simply because it's expensive.
>
> I wonder why this is too expensive. Can you elaborate?

I might have misread that, did you mean a final static field in the Class?


Tue, 2011-10-18, 21:17

dotta

Joined: 2011-10-18,

Re: Improving binary compatibility of lazy values

> I might have misread that, did you mean a final static field in the Class?

Yes.

Tue, 2011-10-18, 21:27

Viktor Klang

Joined: 2008-12-17,

Re: Improving binary compatibility of lazy values

On Tue, Oct 18, 2011 at 10:06 PM, Mirco Dotta <mirco.dotta@typesafe.com> wrote:

> I might have misread that, did you mean a final static field in the Class?

Yes.

To put things in another light: how would that work with traits with lazy vals in them?

Tue, 2011-10-18, 21:37

dotta

Joined: 2011-10-18,

Re: Improving binary compatibility of lazy values

On Oct 18, 2011, at 10:10 PM, √iktor Ҡlang wrote:

> On Tue, Oct 18, 2011 at 10:06 PM, Mirco Dotta <mirco.dotta@typesafe.com> wrote:
>>> I might have misread that, did you mean a final static field in the Class?
>>
>> Yes.
>
> Put things in another light. How would that work with traits with lazy vals in them?

What do you mean? A trait's concrete members are always injected into the classes that mix in the trait. That would not change, regardless of the strategy used to encode lazy vals.
-- Mirco

Tue, 2011-10-18, 21:47

Viktor Klang

Joined: 2008-12-17,

Re: Improving binary compatibility of lazy values

On Tue, Oct 18, 2011 at 10:27 PM, Mirco Dotta <mirco.dotta@typesafe.com> wrote:

> On Oct 18, 2011, at 10:10 PM, √iktor Ҡlang wrote:
>
>> On Tue, Oct 18, 2011 at 10:06 PM, Mirco Dotta <mirco.dotta@typesafe.com> wrote:
>>
>>>> I might have misread that, did you mean a final static field in the Class?
>>>
>>> Yes.
>>
>> Put things in another light. How would that work with traits with lazy vals in them?
>
> What do you mean? Trait's concrete members are always injected in the classes mixin the trait. That would not change independently of the strategy used to encode lazy vals.

Just pulling your leg mate :-)

> -- Mirco


Wed, 2011-10-19, 07:37

Iulian Dragos

Joined: 2008-12-18,

Re: Improving binary compatibility of lazy values

On Tue, Oct 18, 2011 at 3:57 PM, Josh Suereth wrote:
> I think I like the idea of computing the position at runtime rather than
> compile time, as long as this is *stable* for a given class across the
> entire application. I've always been a bit nervous about lazy vals and
> serialization since the 2.7 days where you couldn't be sure what you were
> sending over the wire.
> If I can validate that only *vals* and captured variables in a class are
> serialized with it and lazy vals get recomputed, it actually make
> distributed computed a lot better.
> So, between this runtime-computation and other binary compatible
> constraints, I'd want to make *very* sure this runtime computation is *well
> ordered* and not tied to one particular JVM classloader, otherwise it's a
> big no-go.

What do you mean by well-ordered?

The computation happens at class-load time, using standard class
initializers, so JVMs should behave consistently.

Following Viktor's observation, I think each class/trait should have a
field that tells how many lazy fields it has. Then each class computes
the first free index in the bitmap by summing all these superclass
fields. Since this is a static initializer, we pay the cost just once
per class.
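
As a rough Java sketch of this counting scheme (hypothetical names; traits are not modeled here, only a chain of classes): every class publishes how many lazy fields it declares, and each class derives its first free index in its static initializer by summing the counts of its superclasses.

```java
// Hand-written model of "count per class, sum in the static initializer".
class CountA {
    static final int LAZY_COUNT;   // lazy fields declared by CountA itself
    static final int FIRST_BIT;    // first bitmap index CountA may use
    static {
        LAZY_COUNT = 1;
        FIRST_BIT = 0;             // nothing above CountA declares lazy fields
    }
}

class CountB extends CountA {
    static final int LAZY_COUNT;
    static final int FIRST_BIT;
    static {
        LAZY_COUNT = 2;
        FIRST_BIT = CountA.LAZY_COUNT;   // sum over the single superclass
    }
}

class CountC extends CountB {
    static final int LAZY_COUNT;
    static final int FIRST_BIT;
    static {
        LAZY_COUNT = 1;
        // Sum over all superclasses; evaluated once, when CountC is initialized.
        FIRST_BIT = CountA.LAZY_COUNT + CountB.LAZY_COUNT;
    }
}
```

Because the fields are assigned in static initializer blocks rather than with constant initializers, javac does not inline them, so a change to `CountA.LAZY_COUNT` is seen by `CountB` and `CountC` without recompiling them.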

What we need to benchmark is whether reading a lazy val becomes too much
slower (since instead of a bit shift by a constant value, we need to
read this static field as well).
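
The two read paths being compared might look like this in Java. It is a sketch with made-up names and a made-up offset, not a benchmark: it only shows what differs per access, and real timings would need a proper harness. A JIT compiler may well treat an initialized static final field as a constant, but that is exactly the kind of thing that has to be measured rather than assumed.

```java
// Hand-written comparison of the two access styles; not scalac output.
class ReadPaths {
    static final int FIRST_BIT;
    static { FIRST_BIT = 3; }      // hypothetical offset, fixed at class-init time

    private int bitmap;
    private int value;

    // Current style: the mask (1 << 3) is a literal baked into the bytecode.
    int readWithConstantMask() {
        if ((bitmap & 8) == 0) {
            value = compute();
            bitmap |= 8;
        }
        return value;
    }

    // Proposed style: one static field read plus a shift on every access.
    int readWithStaticOffset() {
        int mask = 1 << FIRST_BIT;
        if ((bitmap & mask) == 0) {
            value = compute();
            bitmap |= mask;
        }
        return value;
    }

    private int compute() {
        return 42;
    }
}
```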

There's material for a SID here..

cheers,
iulian

Wed, 2011-10-19, 08:57

Viktor Klang

Joined: 2008-12-17,

Re: Improving binary compatibility of lazy values

On Wed, Oct 19, 2011 at 8:34 AM, iulian dragos <jaguarul@gmail.com> wrote:

> There's material for a SID here..

There goes all your spare time... ;)
