Lazy Evaluation
Lazy evaluation is the strategy of not computing a value at the moment it is defined, but instead delaying the computation until the value is actually needed. The expression is stored as 'work to do later,' and execution happens only when some consumer demands the result. That change in timing is powerful. It lets programs describe infinite sequences, skip unused branches, and avoid paying for the full cost of a pipeline when only part of the output is consumed.
In eager evaluation, work happens up front. If you chain several transformations over a large collection but ultimately read only the first few results, every intermediate step still runs and every intermediate array is still allocated. If the left side of a condition already determines the answer, eagerly computing the expensive right side is pure waste. More fundamentally, eager evaluation cannot naturally represent an infinite sequence because it tries to compute the whole thing immediately. When the consumer's needs are smaller than the declared computation, the execution model itself becomes the source of inefficiency.
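This cost difference can be made concrete with counters. The sketch below (illustrative; the helper `squares` and the counter variables are not from any library) compares an eager array pipeline with a generator, where both keep only the first three results:

```typescript
// Count how many times the expensive step runs under each strategy.
let eagerCalls = 0;
let lazyCalls = 0;

const data = Array.from({ length: 1000 }, (_, i) => i);

// Eager: map runs over all 1000 elements even though we keep only 3.
const eagerResult = data
  .map((x) => {
    eagerCalls += 1;
    return x * x;
  })
  .slice(0, 3);

// Lazy: a generator squares values only as they are pulled.
function* squares(xs: Iterable<number>): Generator<number> {
  for (const x of xs) {
    lazyCalls += 1;
    yield x * x;
  }
}

const lazyResult: number[] = [];
for (const sq of squares(data)) {
  lazyResult.push(sq);
  if (lazyResult.length === 3) break;
}

console.log(eagerCalls); // 1000
console.log(lazyCalls);  // 3
```

Both produce `[0, 1, 4]`, but the eager version also allocates and fills a full intermediate array of 1000 squares.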
Haskell made lazy evaluation famous by adopting it as the default execution model, enabling infinite lists and highly compositional code that still runs only as far as needed. Mainstream languages usually remain eager by default, but they have steadily imported lazy techniques through iterators, generators, streams, and short-circuit operators. As large-scale data processing and reactive UIs became common, evaluation timing stopped being a purely academic concern and became part of API and performance design. Lazy evaluation is now best understood as a way of aligning computation cost with consumption patterns.
Lazy evaluation typically works by wrapping the computation in something like a thunk, iterator, or generator instead of executing it immediately. The wrapper contains the recipe for producing the value, not the value itself. Evaluation is triggered when the consumer calls next(), asks for the head of a sequence, or reaches a branch that genuinely needs the result, and even then only the needed portion must be computed. Some implementations cache the produced value so repeated access does not repeat the work; others recompute on demand. The defining feature is that evaluation time shifts from definition time to consumption time.
A generator that produces values only on demand
function* naturals(): Generator<number> {
  let n = 0;
  while (true) {
    yield n;
    n += 1;
  }
}

const nums = naturals();
nums.next().value; // 0
nums.next().value; // 1
nums.next().value; // 2
// The full sequence is never created up front.
// Each value appears only when next() is called.

The generator does not materialize all natural numbers in advance. It computes one value each time the consumer asks, which is the essence of lazy evaluation.
Lazy evaluation does not compete with transformations like map, filter, and reduce; it changes when those transformations run. The same logical pipeline can be eager or lazy depending on whether each stage executes immediately or waits for demand from the consumer. The distinction affects timing, memory usage, and performance characteristics rather than the meaning of the transformation itself. The main tension is with side effects. When evaluation is delayed, logs, network calls, and exceptions also move later, which can make the behavior harder to predict. Lazy evaluation stays clearest when paired with pure computation and explicit effect boundaries.
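Both points — that laziness changes when a pipeline runs, and that side effects move with it — can be seen in a small generator-based pipeline. The `lazyMap` and `take` helpers below are hand-rolled sketches, not standard library functions:

```typescript
// Lazy map and take as generator helpers (illustrative sketch).
function* lazyMap<A, B>(xs: Iterable<A>, f: (a: A) => B): Generator<B> {
  for (const x of xs) yield f(x);
}

function* take<T>(xs: Iterable<T>, n: number): Generator<T> {
  if (n <= 0) return;
  let i = 0;
  for (const x of xs) {
    yield x;
    i += 1;
    if (i >= n) return; // stop before pulling an extra element
  }
}

const seen: number[] = [];
const pipeline = take(
  lazyMap([1, 2, 3, 4, 5], (x) => {
    seen.push(x); // side effect: records when this stage actually runs
    return x * 10;
  }),
  2
);

// Defining the pipeline ran nothing: seen is still empty.
console.log(seen); // []
const result = [...pipeline]; // consumption triggers evaluation
console.log(result); // [10, 20]
console.log(seen);   // [1, 2] — elements 3, 4, 5 were never touched
```

The same shape with `Array.prototype.map` would push all five elements into `seen` at definition time, which is exactly the timing shift the paragraph describes.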
Lazy evaluation can save substantial work and memory because it computes only what the consumer actually uses, and it enables expressions over infinite or extremely large sequences. The cost is that execution timing becomes less obvious. Bugs may surface later than expected, and deferred computations can hold onto more memory than intended if they remain reachable for too long. This is why lazy systems sometimes suffer from space leaks: the program is correct, but too much deferred work accumulates before it is forced. The technique is most valuable when consumption is partial, sequential, or demand-driven. It is less attractive when the full result is always needed immediately and explicit timing is more important than flexibility.
Lazy evaluation shows up in generator-driven data consumption, iterator pipelines, streaming parsers, and UI flows that prepare only the visible portion of a large dataset. It is also useful for expensive fallback calculations that should run only if a cache miss or branch condition truly requires them. The design question is simple: do we really need the whole value right now? If the answer is no, lazy evaluation becomes a serious candidate because it lets the shape of consumption drive the cost of computation.
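The expensive-fallback case can be sketched by passing the fallback as a function rather than a value, so it is evaluated only on a cache miss. The `getOrCompute` helper and its cache are hypothetical, for illustration:

```typescript
// Hypothetical cache with a lazy fallback: the expensive computation
// runs only when the key is actually missing.
let fallbackRuns = 0;

function getOrCompute<K, V>(cache: Map<K, V>, key: K, fallback: () => V): V {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const value = fallback(); // evaluated only on a miss
  cache.set(key, value);
  return value;
}

const cache = new Map<string, number>([["a", 1]]);

getOrCompute(cache, "a", () => { fallbackRuns += 1; return -1; }); // hit: fallback never runs
getOrCompute(cache, "b", () => { fallbackRuns += 1; return 2; });  // miss: fallback runs once
getOrCompute(cache, "b", () => { fallbackRuns += 1; return 2; });  // now cached: no run

console.log(fallbackRuns); // 1
```

Had the fallback been an eagerly computed argument, it would run on every lookup, hit or miss; the function wrapper is what lets consumption drive the cost.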