May 9, 2017

Reproducibility in the ACM MMSys Conference

Science is a collective endeavor. A researcher aims to write papers that inspire other researchers and eventually help them make progress in turn. We are all part of a large collaborative movement toward a better understanding of how things work. In the case of the Multimedia Systems community, we deal with animated images, more generally with objects that stimulate our senses, and more specifically with how to encode, transport, and process them.

Despite this, the scientific community is driven by competitive processes, which sometimes lead to secrecy and an unwillingness to freely discuss future work. In particular, since exploiting a dataset is a key asset for getting papers accepted, the competitive process may lead researchers to keep a valuable dataset (or a valuable piece of software) to themselves, for fear that others may exploit it better and faster. This (natural) behavior makes science progress more slowly than it would under a collaborative process.

The “open dataset and open software track” is an attempt to fix this issue at the ACM MMSys conference. The track aims to favor and reward researchers who are willing to share. It aims to make science progress faster, still within a competitive process (we accepted only a subset of the submitted datasets for presentation), but with collaboration in mind.

The movement promoting reproducible research is ongoing, and we are very glad to see that the number of submitted open artifacts has increased since 2011 (the first open dataset track in MMSys history). Previous MMSys datasets can be found here. This year, we accepted ten papers describing datasets and software.

To go one step further, we have embraced the new initiative on reproducibility badges launched by the ACM Digital Library. In short, the authors of an accepted paper who let other researchers check the artifacts they used can be rewarded with a badge on their paper. We have implemented badges in two tracks at ACM MMSys 2017.

Badges for the Dataset Track

In the Open Dataset Track, we selected two badges: "Artifacts Evaluated – Functional", which means that the dataset (and the code) has been tested by reviewers, who had no problem executing, testing, and playing with it; and "Artifacts Available", which means that the authors decided to publicly release their dataset and their code.

During the selection process, we acted as is usual for an academic conference. We invited a dozen researchers (whom I know to be committed to more reproducible research) to join the committee. Then we assigned three reviewers to each paper, a paper being the description of a dataset available at a public URL. Reviewing an artifact is not the same experience as reviewing an academic paper. To better capture the experience of engaging with the artifact, we added some unusual questions to the review form, typically the following (sketched as a simple data structure after the list):
  • Relevance and reusability (from 1: an artifact on a niche subject, to 4: a key enabler on a hot topic)
  • Quality of the documentation (from 1: below expectations, to 3: crystal-clear)
  • Artifact behavior (from 1: a bad experience, to 3: everything works well)
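
For illustration only, here is how these scales could be captured as a small data structure; this is my own sketch, not the actual review-form implementation, and the field names are hypothetical:

    # Hypothetical encoding of the review-form scales above; the field
    # names and wording are my own, not the official MMSys review form.
    REVIEW_SCALES = {
        "relevance_and_reusability": {1: "artifact on a niche subject",
                                      4: "a key enabler on a hot topic"},
        "documentation_quality":     {1: "below expectations",
                                      3: "crystal-clear"},
        "artifact_behavior":         {1: "bad experience",
                                      3: "everything works well"},
    }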

Then, as usual in academic conferences, we selected the dataset papers that received the best appreciation from the reviewers. This year, four of them are related to 360° images and video, currently the hottest topic in the multimedia community. Such datasets have been sorely missing so far, so we are very happy to fill this gap. Two artifacts are related to health, two to transport systems, and two to increasingly popular human activities.

Badges for Papers in the Research Track

In parallel, the organizers of the MMSys conference agreed to badge some of the papers accepted in the main "Research Track" of the conference. In this case, the process was different. First, we waited to learn which papers had been accepted. Then, and only then, we contacted the authors of these accepted papers and proposed a deal: to get a badge, you must first release the artifact on a public website and write more detailed documentation on how to use it. Since we knew that this latter requirement could deter authors from applying for the badge, we allowed those who applied extra pages for an appendix in their papers.

The authors gave us access to a preliminary camera-ready version of their papers; then I contacted another member of the program committee, and we both tested the artifact. In this case, we did not have to consider whether the dataset matters to the community or whether it is an enabler. Since the paper had already been accepted, our only mission was to test the dataset and to check whether the documentation is sufficient for any scientist to play with it.

Three papers followed the process to the end, and we are proud to award them the badges.

January 27, 2017

Attending an MPEG meeting as an academic researcher

I recently attended an MPEG meeting for the first time. I am used to attending academic conferences (for better and for worse), but I had never attended a meeting of a standardization group before. Overall, my impression is very positive, and I will probably engage a bit more with the standardization circus in the future (hopefully I will not wait forty years before attending another standards meeting).

I especially appreciated the commitment of researchers during the MPEG sessions. The attendees engage in a "technical/scientific conversation" with the researcher who presents his contribution. It is in no way comparable to the experience of most academic talks. I identified some of the key differences between a working group meeting at MPEG and a typical session at an academic event:
  • The scope of a working group meeting is very narrow. For example, I attended the meeting of the ad-hoc group in charge of discussing projections of 360° videos onto 2D maps (see the sketch after this list for what such a projection involves). Every attendee had good reasons to attend this meeting in particular, so free riders were a minority. In an academic conference, the Program Chairs try to schedule the presentations so that papers sharing a similar topic are gathered, but the objectives of these academic papers are often quite different. In contrast, the contributions during a standardization session share the same objectives, which inevitably invites researchers who are experts in the domain to argue about the pros and cons of every contribution, including their own.
  • The presentation is not the end; it is the beginning of something wider, which is to eventually contribute to a common (unauthored) document. The chairman is in charge of writing a consensus document after the meeting, and a presenter aims to convince attendees that his contribution is worth including in this document without reservation. In an academic conference, the presenter's motivation is often simply to show up so that an accepted paper is not withdrawn from the digital library due to a no-show.
  • When a presenter is invited to introduce his contributions, it is not showtime. He usually stays in his seat and scrolls through the document that every attendee has already opened (most people have looked at the contributions beforehand). There is no talk, no slides, no formality. Only the presenter, his contribution, and engaged attendees. The debate on a contribution can last two minutes or one hour. I found it much livelier than well-formatted slide-based talks.
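
To make the 360° discussion above concrete: the simplest of these sphere-to-plane mappings is the equirectangular projection. Here is a minimal sketch of it in Python, assuming a unit direction vector on the viewing sphere; the function name and conventions are mine, not any particular MPEG contribution:

    import math

    def equirect_project(x, y, z, width, height):
        """Map a unit direction vector (x, y, z) on the viewing sphere
        to pixel coordinates (u, v) in an equirectangular 2D frame."""
        lon = math.atan2(x, z)                    # longitude in [-pi, pi]
        lat = math.asin(y)                        # latitude in [-pi/2, pi/2]
        u = (lon / (2 * math.pi) + 0.5) * width   # 0 (left) .. width (right)
        v = (0.5 - lat / math.pi) * height        # 0 (top) .. height (bottom)
        return u, v

    # Looking straight ahead lands in the middle of a 4K x 2K frame.
    print(equirect_project(0.0, 0.0, 1.0, 3840, 1920))  # (1920.0, 960.0)

Equirectangular frames heavily oversample the poles of the sphere, which is the kind of distortion trade-off that alternative projections aim to improve.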
I also appreciated the feeling of being useful as a "public scientist" in a population mostly composed of private-sector researchers. A scientist has various ways to disseminate the knowledge he is supposed to produce in return for his public funding and salary. Academic conferences are the most common way. Some scientists create start-ups. Some scientists develop strong ties with companies and spend most of their energy collaborating on projects. Good reasons to also disseminate in standardization meetings include:
  • The contribution from a public academic researcher is (usually) not driven by mercantile private interests. We are supposed to provide something that is closer to The Scientific Truth than what researchers from competing companies can claim. I understand that one of the missions of a public researcher attending a standardization meeting is to ensure that what will eventually become a widely used standard is not an aggregation of patented technologies but rather a scientifically solid and open solution.
  • Every scientist hopes that the fruit of his research will eventually be exploited, whether indirectly, by contributing to better knowledge, or directly, by integration into an object that is useful to society. In applied research topics such as computer science, academic conferences are not necessarily the best way to convey ideas to the companies that are in a position to exploit a scientific result. The academic world is mostly fuzzy and closed. A standardization group appears to be a more direct way to enable the exploitation of scientific ideas, without restriction.
Of course, the experience of attending an MPEG meeting also includes annoyances: a lot of time is spent orchestrating the various sub-groups, some individuals can ruin a whole meeting by interfering with every presenter, the circus is full of jargon and bizarre customs that make it hard for a newcomer to join, and political and business games exist... but the advantages are also numerous (including but not restricted to the above). Overall, the balance is, in my opinion, positive.