POPL 2015 Artifact Evaluation

Welcome from the AEC Chairs


It is our pleasure to report on the artifact evaluation process that we ran on behalf of the Program Chair, David Walker. POPL’15 is the first in the long-running POPL conference series to have an Artifact Evaluation Committee (AEC).

Artifact evaluation is concerned with the byproducts of science. An “artifact” is something intended to support the scientific claims made in a paper. For instance, an artifact might be a program’s source code, a data collection, a test suite, a proof, or a model. “Evaluation” is a best-effort attempt to reconcile a paper’s artifacts with the claims made in the paper. A primary goal of the artifact evaluation process is to encourage authors to create artifacts that can be shared and used by others as bases for new activities. The process seeks other benefits as well. These include encouraging authors to be precise in their claims and publicly recognizing efforts toward making high-quality artifacts.

Three months before the POPL’15 paper submission deadline, the AEC chairs invited members of the POPL community to nominate PhD students and postdocs to serve on the AEC. From these nominees, the chairs selected 21 members based on their experience and areas of expertise.

After the decisions for POPL’15 submissions were distributed, the authors of accepted papers were invited to submit artifacts for evaluation. (Thus, by design, the artifact evaluation process had no effect on which papers were chosen to appear at POPL.) Authors had one week, until October 6, to respond to the call for artifacts. The submission guidelines asked that each artifact be packaged so as to make evaluation as easy as possible. Typically, this involved the creation of a virtual machine image, but other means were also accepted. Each artifact was accompanied by the accepted version of its associated paper so that the AEC could evaluate each artifact against its paper’s claims. A total of 29 artifacts were submitted for evaluation.

The AEC had almost three weeks, until October 26, to render judgments. The AEC expected artifacts to be “consistent with the paper; as complete as possible; documented well; and easy to reuse, facilitating further research.” The AEC members bid on artifacts, and the chairs selected two reviewers for each one. Artifact evaluation had two phases. During the first, “installation” phase, the committee simply tried to download, build, and launch the artifacts. The committee reported any errors in an initial review, and authors had an opportunity to reply with solutions. During the second phase, the AEC tried to repeat some or all of the experiments described in the artifact’s paper. AEC members were cognizant that certain results would be difficult to reproduce, e.g., benchmarks that were run on high-performance machines. For four artifacts, the AEC rented Amazon EC2 instances to recreate experimental conditions, at a total cost of $310.10. Other artifacts ran successfully on the committee’s personal computers. After all reviews were submitted, the AEC held an intense online discussion to decide, for each artifact, whether it met, exceeded, or fell below the expectations set by its paper.

Of the 29 submitted artifacts, 27 were judged to meet or exceed expectations. The papers that describe these artifacts can be recognized by the AEC badge they bear (created by Matthias Hauswirth).

We thank the authors of the 29 submitted artifacts for their work in preparing and documenting their research output. We hope that they found the feedback from the AEC to be helpful. We also thank David Walker for his support, and the members of the AEC for their energy and enthusiasm.

Arjun Guha
Jan Vitek

POPL’15 AEC Chairs