Selected Works
IFCP
2025
Dan Plyukhin, Xueying Qin, Fabrizio Montesi
@article{plyukhin2025relax,
author = {Dan Plyukhin and Xueying Qin and Fabrizio Montesi},
title = {Relax! The Semilenient Core of Choreographic Programming (Functional Pearl)},
journal = {Proc. {ACM} Program. Lang.},
volume = {9},
number = {{ICFP}},
article = {269},
year = {2025},
month = {aug},
pages = {269:1--269:27},
doi = {10.1145/3747538},
url = {https://doi.org/10.1145/3747538},
publisher = {{ACM}}
}
The past few years have seen a surge of interest in choreographic
programming, a programming paradigm for concurrent and distributed
systems. The paradigm allows programmers to implement a distributed
interaction protocol with a single high-level program, called a
choreography, and then mechanically project it into
correct implementations of its participating processes. A choreography
can be expressed as a λ-term
parameterized by constructors for creating data “at” a process and for
communicating data between processes. Through this lens, recent work has
shown how one can add choreographies to mainstream languages like Java,
or even embed choreographies as a DSL in languages like Haskell and
Rust. These new choreographic languages allow programmers to write in
applicative style (like in functional programming) and write
higher-order choreographies for better modularity. But the semantics of
functional choreographic languages is not well-understood. Whereas
typical λ-calculi can have
their operational semantics defined with just a few rules, existing
models for choreographic λ-calculi have dozens of
complex rules and no clear or agreed-upon evaluation
strategy.
We show that functional choreographic programming is simple.
Beginning with the Chorλ model
from previous work, we strip away inessential features to produce a
“core” model called λχ. We discover
that underneath Chorλ’s
apparently ad-hoc semantics lies a close connection to non-strict λ-calculi; we call the resulting
evaluation strategy semilenient. Then, inspired by previous
non-strict calculi, we develop a notion of choreographic evaluation
contexts and a special commute rule to simplify and
explain the unusual semantics of functional choreographic languages. The
extra structure leads us to a presentation of λχ with just ten
rules, and a discovery of three missing rules in previous presentations
of Chorλ. We also show how the
extra structure comes with nice properties, which we use to simplify the
correspondence proof between choreographies and their projections. Our
model serves as both a principled foundation for functional
choreographic languages and a good entry point for newcomers.
PLDI
2025
Dan Plyukhin, Gul Agha, Fabrizio Montesi
@article{plyukhin2025crgc,
author = {Plyukhin, Dan and Agha, Gul and Montesi, Fabrizio},
title = {CRGC: Fault-Recovering Actor Garbage Collection in Pekko},
year = {2025},
issue_date = {June 2025},
publisher = {{ACM}},
address = {New York, NY, USA},
volume = {9},
number = {{PLDI}},
url = {https://doi.org/10.1145/3729288},
doi = {10.1145/3729288},
journal = {Proc. ACM Program. Lang.},
month = jun,
articleno = {185},
numpages = {25},
keywords = {actor model, actors, distributed systems, fault tolerance, garbage collection}
}
Actors are lightweight reactive processes that communicate by
asynchronous message-passing. Actors address common problems like
concurrency control and fault tolerance, but resource management remains
challenging: in all four of the most popular actor frameworks (Pekko,
Akka, Erlang, and Elixir) programmers must explicitly kill actors to
free up resources. To simplify resource management, researchers have
devised actor garbage collectors (actor GCs) that monitor the
application and detect when actors are safe to kill. However, existing
actor GCs are impractical for distributed systems where the network is
unreliable and nodes can fail. The simplest actor GCs do not collect
cyclic garbage, whereas more sophisticated actor GCs are not
fault-recovering: dropped messages and crashed nodes can cause
actors to become garbage that never gets collected.
We present Conflict-free Replicated Garbage Collection (CRGC): the
first fault-recovering cyclic actor GC. In CRGC, actors and nodes record
information locally and broadcast updates to the garbage collectors
running on each node. CRGC does not require locks, explicit memory
barriers, or any assumptions about message delivery order, except for
reliable FIFO channels from actors to their local garbage collector.
Moreover, CRGC is simple: we concisely present its operational
semantics, which has been formalized in TLA+, and prove both soundness
(non-garbage actors are never killed) and completeness (all garbage
actors are eventually killed, under reasonable assumptions). We also
present a preliminary implementation in Apache Pekko and measure its
performance using two actor benchmark suites. Our results show the
performance overhead of CRGC is competitive with simpler approaches like
weighted reference counting, while also being much more powerful.
ECOOP
2024
Dan Plyukhin, Marco Peressotti, Fabrizio Montesi
@inproceedings{DBLP:conf/ecoop/PlyukhinPM24,
author = {Dan Plyukhin and
Marco Peressotti and
Fabrizio Montesi},
editor = {Jonathan Aldrich and
Guido Salvaneschi},
title = {Ozone: Fully Out-of-Order Choreographies},
booktitle = {38th European Conference on Object-Oriented Programming, {ECOOP} 2024,
September 16-20, 2024, Vienna, Austria},
series = {LIPIcs},
volume = {313},
pages = {31:1--31:28},
publisher = {Schloss Dagstuhl - Leibniz-Zentrum f{\"{u}}r Informatik},
year = {2024},
url = {https://doi.org/10.4230/LIPIcs.ECOOP.2024.31},
doi = {10.4230/LIPICS.ECOOP.2024.31},
timestamp = {Mon, 03 Mar 2025 21:02:51 +0100},
biburl = {https://dblp.org/rec/conf/ecoop/PlyukhinPM24.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Choreographic programming is a paradigm for writing distributed
applications. It allows programmers to write a single program, called a
choreography, that can be compiled to generate correct implementations
of each process in the application. Although choreographies provide good
static guarantees, they can exhibit high latency when messages or
processes are delayed. This is because processes in a choreography
typically execute in a fixed, deterministic order, and cannot adapt to
the order that messages arrive at runtime. In non-choreographic code,
programmers can address this problem by allowing processes to execute
out of order—for instance by using futures or reactive programming.
However, in choreographic code, out-of-order process execution can lead
to serious and subtle bugs, called communication integrity
violations (CIVs).
In this paper, we develop a model of choreographic programming for
out-of-order processes that guarantees absence of CIVs and deadlocks. As
an application of our approach, we also introduce an API for safe
non-blocking communication via futures in the choreographic programming
language Choral. The API allows processes to execute out of order,
participate in multiple choreographies concurrently, and to handle
unordered data messages. We provide an illustrative evaluation of our
API, showing that out-of-order execution can reduce latency and increase
throughput by overlapping communication with computation.
@phdthesis{plyukhin2024,
author = {Dan Plyukhin},
title = {Fault-Tolerant and Fault-Recovering Garbage Collection
for the Actor Model: A Collage-Based Approach},
school = {University of Illinois Urbana-Champaign, {USA}},
year = {2024}
}
An actor garbage collector (actor GC) is a tool for
automatically identifying actors that are safe to delete, and reclaiming
their resources. Actor GC could be particularly useful in distributed
applications, because programmers have difficulty reclaiming resources
after faults such as crashed nodes or dropped messages. Unfortunately,
faults are a pain point in existing actor GCs: in existing approaches,
an actor on a crashed node with a reference to an actor on a healthy
node will prevent the healthy actor—and its references—from ever being
garbage collected. Moreover, existing GC algorithms have poor
scalability in a distributed systems. This is because of the
synchronization and message overhead they introduce by requiring causal
delivery, or by introducing a large number of control messages. For
these reasons, it has not been practical to add actor GC to popular
frameworks like Akka and Erlang.
This thesis explores an emerging technique for actor GC, dubbed the
collage-based approach. Collage-based GCs are capable of high
performance because they do not dictate when an actor should participate
in garbage collection, and by design they naturally make progress with
only partial information. The thesis presents two collage-based GCs: PRL
and CRGC. Both GCs are provably correct and impose no locks, memory
barriers, or message ordering requirements. PRL uses distributed
reference listing to collect acyclic garbage and allows node-local
garbage collectors to detect distributed cyclic garbage via a
lightweight gossip protocol. We then use insights from PRL to develop
CRGC: the first actor GC capable of recovering from crashed nodes and
dropped messages. We have formalized CRGC in TLA+ and implemented CRGC
in Akka. Preliminary evaluation shows that CRGC imposes little overhead
in practice and is capable of collecting actors that become garbage
caused by crashed nodes.