Dotc's concept of time

Conceptually, the scalac compiler's job is to maintain views of various artifacts associated with source code at all points in time. But what is time for scalac? In fact, it is a combination of compiler runs and compiler phases.

The hours of the compiler's clock are measured in compiler runs. Every run creates a new hour, which follows all the compiler runs (hours) that happened before. scalac is designed to be used as an incremental compiler that can support incremental builds, as well as interactions in an IDE and a REPL. This means that new runs can occur quite frequently. At the extreme, every keystroke in an editor or REPL can launch a new compiler run, so an "hour" of compiler time might take only a fraction of a second in real time.

The minutes of the compiler's clock are measured in phases. At every compiler run, the compiler cycles through a number of phases. The list of phases is defined in the Compiler object. There are currently about 60 phases per run, so the minutes/hours analogy works out roughly. After every phase the view the compiler has of the world changes: trees are transformed, types are gradually simplified from Scala types to JVM types, definitions are rearranged, and so on.

Many pieces of information in the compiler are time-dependent. For instance, a Scala symbol representing a definition has a type, but that type will usually change as one goes from the higher-level Scala view of things to the lower-level JVM view. There are different ways to deal with this. Many compilers change the type of a symbol destructively according to the "current phase". Another, more functional approach might be to have different symbols representing the same definition at different phases, with each symbol carrying a different immutable type. scalac employs yet another scheme, which is inspired by functional reactive programming (FRP): symbols carry not a single type, but a function from compiler phase to type. So the type of a symbol is a time-indexed function, where time ranges over compiler phases.
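To make the "function from phase to type" idea concrete, here is a minimal sketch. It is not how the compiler implements it: Tpe, Sym, infoAt and the phase numbers 1 and 30 are all hypothetical names and values chosen for illustration.

```scala
object PhaseIndexedSketch {
  type PhaseId = Int

  // Hypothetical stand-in for a compiler type: just a printable description.
  final case class Tpe(show: String)

  // A symbol whose type is a function of the phase. The history is a list of
  // (firstPhaseId, type) entries sorted by ascending firstPhaseId.
  final class Sym(val name: String, history: List[(PhaseId, Tpe)]) {
    require(history.nonEmpty, "a symbol needs at least one type entry")

    // The type at `phase` is the latest entry starting at or before `phase`.
    // Phases earlier than the first entry fall back to the first entry.
    def infoAt(phase: PhaseId): Tpe =
      history.takeWhile(_._1 <= phase).lastOption.getOrElse(history.head)._2
  }

  // Made-up phase numbers: the Scala-level type holds from phase 1,
  // and an erased, JVM-level type takes over from phase 30.
  val xs = new Sym("xs", List(1 -> Tpe("List[String]"), 30 -> Tpe("List")))
}
```

Asking for PhaseIndexedSketch.xs.infoAt(5) yields the Scala-level type, while infoAt(45) yields the erased one. Nothing is mutated when the "current phase" changes; the caller simply supplies a different time.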

Typically, the definition of a symbol or other quantity remains stable for a number of phases. This leads us to the concept of a period. A period is an interval of consecutive phases in a given compiler run. Periods are conceptually represented by three pieces of information:

  • the ID of the current run,
  • the ID of the phase starting the period,
  • the number of phases in the period.

All three pieces of information are encoded in a value class over a 32-bit integer. Here's the API for class Period:

class Period(val code: Int) extends AnyVal {
  def runId: RunId            // The run identifier of this period.
  def firstPhaseId: PhaseId   // The first phase of this period
  def lastPhaseId: PhaseId    // The last phase of this period
  def phaseId: PhaseId        // The phase identifier of this single-phase period

  def containsPhaseId(id: PhaseId): Boolean
  def contains(that: Period): Boolean
  def overlaps(that: Period): Boolean

  def & (that: Period): Period
  def | (that: Period): Period
}

We can access the parts of a period using runId, firstPhaseId, lastPhaseId, or using phaseId for periods consisting only of a single phase. They return RunId or PhaseId values, which are aliases of Int. containsPhaseId, contains and overlaps test whether a period contains a phase or a period as a sub-interval, or whether the interval overlaps with another period. Finally, & and | produce the intersection and the union of two period intervals. The union operation | takes as runId the runId of its left operand, as periods spanning different runIds cannot be constructed.
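The interval operations described above can be sketched on a plain, unpacked model. The case class P below is a hypothetical stand-in for Period, not the compiler's class. It stores the run ID and an inclusive phase range directly, and its & returns an Option instead of a Period so that the empty case is explicit. The sketch also assumes that periods from different runs never overlap and that | returns the smallest interval covering both operands.

```scala
object PeriodOpsSketch {
  type RunId = Int
  type PhaseId = Int

  // Hypothetical unpacked model: a run id plus an inclusive phase range.
  final case class P(runId: RunId, firstPhaseId: PhaseId, lastPhaseId: PhaseId) {
    // Does this period include the given phase?
    def containsPhaseId(id: PhaseId): Boolean =
      firstPhaseId <= id && id <= lastPhaseId

    // Is `that` a sub-interval of this period (same run assumed necessary)?
    def contains(that: P): Boolean =
      runId == that.runId &&
        firstPhaseId <= that.firstPhaseId && that.lastPhaseId <= lastPhaseId

    // Do the two periods share at least one phase of the same run?
    def overlaps(that: P): Boolean =
      runId == that.runId &&
        firstPhaseId <= that.lastPhaseId && that.firstPhaseId <= lastPhaseId

    // Intersection; None when the periods have no phase in common.
    def &(that: P): Option[P] =
      if (overlaps(that))
        Some(P(runId,
               math.max(firstPhaseId, that.firstPhaseId),
               math.min(lastPhaseId, that.lastPhaseId)))
      else None

    // Union as the smallest covering interval; the run id comes from the
    // left operand, as stated in the text.
    def |(that: P): P =
      P(runId,
        math.min(firstPhaseId, that.firstPhaseId),
        math.max(lastPhaseId, that.lastPhaseId))
  }
}
```

For example, with a = P(1, 3, 8) and b = P(1, 6, 12), a & b is Some(P(1, 6, 8)) and a | b is P(1, 3, 12).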

Periods are constructed using two apply methods:

object Period {
  /** The single-phase period consisting of given run id and phase id */
  def apply(rid: RunId, pid: PhaseId): Period

  /** The period consisting of given run id, and lo/hi phase ids */
  def apply(rid: RunId, loPid: PhaseId, hiPid: PhaseId): Period
}

As a sentinel value there's Nowhere, a period that is empty.
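As a rough illustration of how a run ID and a phase interval can fit into one Int, here is a sketch of one possible packing. The layout is an assumption made for this example, since the text does not specify one: 7 bits for the last phase ID, 7 bits for the number of phases before the last one, and the remaining high bits for the run ID. The constants PhaseWidth and PhaseMask are likewise illustrative names.

```scala
object PeriodPackingSketch {
  type RunId = Int
  type PhaseId = Int

  // Assumed layout, for illustration only: 7 bits per phase field.
  final val PhaseWidth = 7
  final val PhaseMask = (1 << PhaseWidth) - 1

  // Value class over an Int: bits [0, 7) hold (last - first), bits [7, 14)
  // hold the last phase id, and the bits above hold the run id.
  final class Period(val code: Int) extends AnyVal {
    def runId: RunId = code >>> (PhaseWidth * 2)
    def lastPhaseId: PhaseId = (code >>> PhaseWidth) & PhaseMask
    def firstPhaseId: PhaseId = lastPhaseId - (code & PhaseMask)
  }

  object Period {
    /** The single-phase period consisting of given run id and phase id */
    def apply(rid: RunId, pid: PhaseId): Period =
      new Period(((rid << PhaseWidth) | pid) << PhaseWidth)

    /** The period consisting of given run id, and lo/hi phase ids */
    def apply(rid: RunId, loPid: PhaseId, hiPid: PhaseId): Period =
      new Period((((rid << PhaseWidth) | hiPid) << PhaseWidth) | (hiPid - loPid))
  }

  // Sentinel: the all-zero code. Assuming real run ids and phase ids start
  // at 1, no constructed period can ever have this code.
  val Nowhere: Period = new Period(0)
}
```

Under this layout, Period(3, 5, 9) decodes back to runId 3, firstPhaseId 5 and lastPhaseId 9, and a single-phase period has equal first and last phase IDs. Because Period is a value class, passing periods around costs no more than passing an Int.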