The "make" flag
Sun, 2009-02-15, 18:30
Well, several days have passed since my "Tell me not to or I'll commit this" ultimatum, so here we are. :-) As of r17115, scalac now supports the flag "-make", which allows you to select only a subset of the files passed to it to recompile.
This flag currently takes four options with varying degrees of incorrect behaviour:
- changed. Recompile only files which have changed. This is wildly incorrect and will break things. It also happens to be what ant does by default, and you might get away with it sometimes.
- immediate. Recompile only files which have changed and those which depend directly upon them. You'll get away with this most of the time.
- transitive. Recompile all changed files and, transitively, everything that depends on them: if A depends on B and B is recompiled then so is A. This is only very mildly incorrect.
- all (this is the default). Recompile all files. This is correct.
For projects which do not have a lot of cyclic dependencies, I *strongly encourage* you to use transitive. Of the not-ridiculously-slow options it is the most correct by far. Unfortunately for some projects such as scalac this is not an option as it simply pulls in too much of the project.
So, why are all of these (other than "recompile everything") incorrect? Am I just that lousy a programmer? I'm afraid not. Scala semantics are such that it is in fact impossible to correctly recompile a project without either every single source file being scanned or very detailed analysis done up front.
Let's look at why these are incorrect:
- changed
This is incorrect in just about any language: if the signature of any method defined in a file has changed then the files that depend on it must be recompiled to make sure they still work. It's additionally incorrect in Scala because of the implementation of traits.
- immediate
The presence of type inference means that if signatures in a file's dependencies have changed then the signatures of the file itself may also change, forcing you to also recompile the files that depend on it. This is particularly complicated when types in A depend on types in B and vice versa, as you end up with cases where you have two files which must always be compiled together.
Additionally you end up with cases where you have
trait A;
trait B extends A;
class C extends B;
each defined in different files. If A changes and we only recompile B then C can fail to implement some of A's methods.
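To make the type inference point concrete, here is a minimal sketch. The names Lib, Mid and App are made up for illustration, and everything sits in one snippet only so that it runs as-is; picture each object in its own source file.

```scala
// Hypothetical names; imagine each object living in a separate source file.
object Lib { def version = 1 }            // result type inferred as Int
object Mid { def current = Lib.version }  // result type inferred from Lib.version
object App { val n: Int = Mid.current }   // relies on Mid.current being an Int
```

If Lib.version were changed to return a String, the source of Mid would be untouched but its inferred signature would become String, and App would no longer type check. A strategy that stops at Lib's direct dependents recompiles Mid but never looks at App.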
- transitive
Nested packages and resolution of imports screw you up. The problem with transitive is that changes can be made which would introduce a new dependency, but only if that file was recompiled. Example:
file foo/A.scala
package foo;
class A;
file foo/bar/B.scala
package foo.bar;
class B{
val a = new A;
}
So we compile these. foo/bar/B.scala depends on foo/A.scala.
Now we add
file foo/bar/A.scala
package foo.bar;
class A{
}
Now, we should have recompiled B.scala at this point because if we did it would now point to foo.bar.A instead of foo.A, but it's hard to know this here. Things can similarly go wrong when you've got imports of the form foo._ and add another symbol to foo.
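The shadowing effect itself can be reproduced in a single file. The sketch below is only an approximation: it uses nested objects in place of packages, and the names ShadowingDemo, barBefore and barAfter are invented to show the "before" and "after" states side by side. It is not how scalac tracks dependencies, just a demonstration of why the same source text for B can mean two different things.

```scala
object ShadowingDemo {
  object foo {
    class A { override def toString = "foo.A" }

    // Before the change: nothing named A exists in the inner scope,
    // so `new A` resolves to the enclosing foo.A.
    object barBefore {
      class B { val a = new A }
    }

    // After the change: an inner A has been added and shadows foo.A,
    // although the text of B is exactly the same as before.
    object barAfter {
      class A { override def toString = "foo.bar.A" }
      class B { val a = new A }
    }
  }

  def main(args: Array[String]): Unit = {
    println(new foo.barBefore.B().a) // prints "foo.A"
    println(new foo.barAfter.B().a)  // prints "foo.bar.A"
  }
}
```

B's source is identical in both scopes, so a tool that only looks at which existing files B referred to last time has no reason to rebuild it when the new A appears.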
Anyway, many of these problems can be partially resolved by making the behaviour smarter, or turned into compiler warnings when you do something unsafe at your current make level. I'm planning to continue working on this (and I will investigate further the details of the eclipse scheme, though I intend to take a much more conservative approach to correctness than that), but I felt that it was better to get a basic working concept committed first.
My immediate priorities from here are:
a) Clean up a few things, particularly with regard to the file format.
b) Get this working for scalac development so I can get back to working on the pattern matcher without tearing my hair out over the build process. :-) Currently I *think* that -make immediate should be acceptable for scalac development, but you'll run into long recompiles every time you touch Global (include obvious rant about Global here). Consequently I've not yet changed the build scripts to use this option.
David
Sun, 2009-02-15, 21:57
#2
Re: The "make" flag
On Sun, Feb 15, 2009 at 8:19 PM, Miles Sabin <miles@milessabin.com> wrote:
I'd like to do a bit more direct investigation of what's possible before I formalize anything for this. In particular I need to think harder to know what sort of changes and what sort of dependencies allow us to skip following a dependency.
The basic algorithm is really ludicrously simple right now: look for changed files, then at each step pull in anything that depends on files that need recompiling. The different make options (other than "all") just limit the number of steps the process is run for. With changed the limit is 0 (only changed files), with immediate it's 1 (pull in dependents once, then stop), with transitive it's MaxInt (I really hope you don't have enough files to hit this limit).
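A rough sketch of that step-limited process, for illustration only: this is not the code committed to scalac, and MakeSketch, toRecompile, the dependsOn map and the constant names are all hypothetical. It assumes the dependency information is available as a map from each file to the set of files it refers to.

```scala
object MakeSketch {
  // dependsOn(f) = the files that f refers to (hypothetical representation).
  type Deps = Map[String, Set[String]]

  // Step limits corresponding to the options described above.
  val Changed = 0
  val Immediate = 1
  val Transitive = Int.MaxValue

  /** Start from the changed files and, for at most maxSteps rounds,
    * add every file that depends on something already selected. */
  def toRecompile(changed: Set[String], dependsOn: Deps, maxSteps: Int): Set[String] = {
    @annotation.tailrec
    def loop(selected: Set[String], stepsLeft: Int): Set[String] =
      if (stepsLeft <= 0) selected
      else {
        // Files with at least one dependency in the selected set.
        val dependents = dependsOn.collect {
          case (file, deps) if deps.exists(selected) => file
        }.toSet
        val next = selected ++ dependents
        // Stop early once a round adds nothing new.
        if (next == selected) selected else loop(next, stepsLeft - 1)
      }
    loop(changed, maxSteps)
  }

  // Example graph: C.scala depends on B.scala, which depends on A.scala.
  val chain: Deps = Map(
    "B.scala" -> Set("A.scala"),
    "C.scala" -> Set("B.scala")
  )
}
```

With A.scala changed in the example graph, the limit 0 selects only A.scala, the limit 1 selects A.scala and B.scala, and the unbounded limit selects all three files. The gap between the last two is exactly the trait A / trait B / class C situation from the original post.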
On Sun, Feb 15, 2009 at 5:30 PM, David MacIver <david.maciver@gmail.com> wrote:
> So, why are all of these (other than "recompile everything" incorrect? Am I
> just that lousy a programmer? I'm afraid not. Scala semantics are such that
> it is in fact impossible to correctly recompile a project without either
> very single source file being scanned or very detailed analysis done up
> front.
>
> Let's look at why these are incorrect:
<snip/>
Very interesting. It'd be nice if this could be formalized and/or
fleshed out with an algorithm sketch ... something like that would be
a very useful addition to the SLS.
Cheers,
Miles