
Eliminate dead refs #220

Merged

penelopeysm merged 10 commits into main from py/perf on Mar 25, 2026

Conversation

@penelopeysm (Member) commented Mar 25, 2026

Closes #193 by performing live variable analysis on the BBCode intermediate representation.
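For readers unfamiliar with the technique: live variable analysis walks each block backwards, tracking which variables are still needed at each point. A minimal sketch in plain Julia follows; the `Inst` representation and variable names are invented for illustration and are not Libtask's internals:

```julia
# Each instruction records the variable it defines (if any) and the
# variables it uses. This is a toy encoding for the example only.
struct Inst
    def::Union{Symbol,Nothing}
    uses::Vector{Symbol}
end

# Walk the block backwards: a variable is live at a point if it is
# used later without being redefined first. Returns the live-out set
# of each instruction.
function live_sets(block::Vector{Inst})
    live = Set{Symbol}()
    out = Vector{Set{Symbol}}(undef, length(block))
    for i in length(block):-1:1
        inst = block[i]
        out[i] = copy(live)                      # live-out of instruction i
        inst.def === nothing || delete!(live, inst.def)
        union!(live, inst.uses)
    end
    return out
end

# Mirrors the function f below: `a` is dead as soon as `b = 2a` runs,
# so a ref slot holding it would never be read again.
block = [
    Inst(:a, [:x]),       # a = 2*x
    Inst(:b, [:a]),       # b = 2*a
    Inst(nothing, [:b]),  # produce(b)
    Inst(:c, [:b]),       # c = 3*b
    Inst(nothing, [:c]),  # produce(c)
]
lv = live_sets(block)
```

Note that `b` is live across the first produce (it is used by `c = 3*b` after resumption), so its ref must survive, while `a` and `c` die within their own block.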

Using the example from #193 (but changed to use an input argument, otherwise the entire function gets constant-folded away):

using Libtask

function f(x)
    a = 2*x
    b = 2*a
    produce(b)
    c = 3*b
    produce(c)
    return nothing
end

Libtask.generate_ir(:transformed_bb, f, 1.0)

Libtask.generate_ir(:optimised_bb, f, 1.0)

The BBCode before the ref-elimination pass looks like this; operations that the pass removes are marked with a (*):

BBCode (3 args, 4 blocks)
#32 ─
│   %30 = Libtask.resume_block_is(_1, 15)
│   %31 = Libtask.resume_block_is(_1, 13)
│   %33 = nothing
└── switch %30 => #15, %31 => #13, fallthrough #14
#14 ─
│   %7 = Base.mul_float(2.0, _3)::Float64
│   %16 = Libtask.set_ref_at!(_1, 1, %7)              (*)
│   %17 = Libtask.get_ref_at(_1, 1)                   (*)
│   %8 = Base.mul_float(2.0, %17)::Float64
│   %18 = Libtask.set_ref_at!(_1, 2, %8)
│   %19 = Libtask.set_resume_block!(_1, 15)
│   %20 = Libtask.get_ref_at(_1, 2)                   (*)
│   %21 = Libtask.ProducedValue(%20)
└── return %21
#15 ─
│   %23 = Libtask.get_ref_at(_1, 2)
│   %10 = Base.mul_float(3.0, %23)::Float64
│   %24 = Libtask.set_ref_at!(_1, 3, %10)             (*)
│   %25 = Libtask.set_resume_block!(_1, 13)
│   %26 = Libtask.get_ref_at(_1, 3)                   (*)
│   %27 = Libtask.ProducedValue(%26)
└── return %27
#13 ─
│   %29 = Libtask.set_resume_block!(_1, -1)
└── return Main.nothing

and afterwards it looks like this:

BBCode (3 args, 4 blocks)
#32 ─
│   %30 = Libtask.resume_block_is(_1, 15)
│   %31 = Libtask.resume_block_is(_1, 13)
│   %33 = nothing
└── switch %30 => #15, %31 => #13, fallthrough #14
#14 ─
│   %7 = Base.mul_float(2.0, _3)::Float64
│   %8 = Base.mul_float(2.0, %7)::Float64
│   %18 = Libtask.set_ref_at!(_1, 2, %8)
│   %19 = Libtask.set_resume_block!(_1, 15)
│   %21 = Libtask.ProducedValue(%8)
└── return %21
#15 ─
│   %23 = Libtask.get_ref_at(_1, 2)
│   %10 = Base.mul_float(3.0, %23)::Float64
│   %25 = Libtask.set_resume_block!(_1, 13)
│   %27 = Libtask.ProducedValue(%10)
└── return %27
#13 ─
│   %29 = Libtask.set_resume_block!(_1, -1)
└── return Main.nothing
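The rewrite that gets from the first listing to the second can be viewed as store-to-load forwarding within a block, followed by dead-store elimination for slots no other block reads. A hedged sketch, using an instruction encoding invented for this example (not Libtask's actual data structures):

```julia
# Instruction forms (toy encoding):
#   (:set, slot, val)   - like set_ref_at!(_1, slot, val)
#   (:get, slot, dest)  - like dest = get_ref_at(_1, slot)
#   (:other, name)      - anything else
function eliminate_refs(block, live_at_exit::Set{Int})
    stored = Dict{Int,Symbol}()    # slot => SSA value stored in this block
    subst  = Dict{Symbol,Symbol}() # get's dest => value it forwards to
    out = Any[]
    # Forward each get through a preceding set of the same slot,
    # deleting the get; record the substitution for later use sites.
    for inst in block
        if inst[1] === :set
            stored[inst[2]] = inst[3]
            push!(out, inst)
        elseif inst[1] === :get && haskey(stored, inst[2])
            subst[inst[3]] = stored[inst[2]]
        else
            push!(out, inst)
        end
    end
    # Drop stores to slots that no resume block will ever read.
    filter!(inst -> inst[1] !== :set || inst[2] in live_at_exit, out)
    return out, subst
end

# Block #14 from the listing above: slot 2 is live at exit because
# block #15 reads it; slot 1 is not live anywhere else.
block14 = Any[
    (:other, :mul7),     # %7  = Base.mul_float(2.0, _3)
    (:set, 1, :v7),      # %16 = set_ref_at!(_1, 1, %7)
    (:get, 1, :v17),     # %17 = get_ref_at(_1, 1)
    (:other, :mul8),     # %8  = Base.mul_float(2.0, %17)
    (:set, 2, :v8),      # %18 = set_ref_at!(_1, 2, %8)
    (:other, :resume),   # %19 = set_resume_block!(_1, 15)
    (:get, 2, :v20),     # %20 = get_ref_at(_1, 2)
    (:other, :produce),  # %21 = ProducedValue(%20)
]
out, subst = eliminate_refs(block14, Set([2]))
```

Applied to block #14 this drops the slot-1 set/get pair and the slot-2 get, leaving only the slot-2 store, matching the optimised IR above (with %17 rewritten to %7 and %20 to %8 via `subst`).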

Benchmarks

Running cd benchmarks; julia --project=. benchmark.jl, the changes in time for the taped calls are:

  • rosenbrock: 3.408 ms -> 2.132 ms
  • ackley: 26.259 ms -> 25.703 ms
  • matrix_test: 365.375 μs -> 270.667 μs
  • neural_net: 3.208 μs -> 3.177 μs

so there are very significant gains on the first and third benchmarks, and smaller ones on the others.

With Turing, the performance gains from this PR are probably heavily diluted by the amount of other work done per sample:

using Turing
using LinearAlgebra: I  # identity matrix used in MvNormal below
J = 8
y = [28, 8, -3, 7, -1, 1, 18, 12]
sigma = [15, 10, 16, 11, 9, 11, 10, 18]
@model function eesc(J, y, sigma)
    mu ~ Normal(0, 5)
    tau ~ truncated(Cauchy(0, 5); lower=0)
    theta ~ MvNormal(fill(mu, J), tau^2 * I)
    for i in 1:J
        y[i] ~ Normal(theta[i], sigma[i])
    end
end
model = eesc(J, y, sigma)

@time sample(model, PG(20), 100; chain_type=Any, progress=false);
# main    1.314166 seconds (9.02 M allocations: 1.010 GiB, 6.64% gc time)
# this PR 1.207591 seconds (8.40 M allocations: 1019.048 MiB, 6.38% gc time)

@penelopeysm penelopeysm marked this pull request as ready for review March 25, 2026 15:18
@github-actions

Libtask.jl documentation for PR #220 is available at:
https://TuringLang.github.io/Libtask.jl/previews/PR220/

@penelopeysm penelopeysm merged commit 5320a5a into main Mar 25, 2026
22 checks passed
@penelopeysm penelopeysm deleted the py/perf branch March 25, 2026 15:48


Development

Successfully merging this pull request may close these issues:

  • Improve performance by not keeping unnecessary refs
