November 28, 2011

Incremental improvements for CS conferences

Scientists like to debate the general organization of academic life. Lately, some have called for a clean-slate revolution based on open archives. Yet, as with the majority of clean-slate proposals about well-established processes, I am doubtful that such a shift can occur. In the meantime, nothing is done to actually fix the issues of the current process. In particular, I have the feeling that academic conferences in computer science (at least in my communities, which span networking, multimedia and distributed systems) are getting worse, and it seems that nobody cares because the most active researchers in this area are too busy preparing their utopian clean-slate revolution.

So, let me propose below four incremental improvements that every serious conference should implement, for the sake of a better academic life. Two are quite easy:
  • no more deadline extensions. A deadline extension is irrefutable proof that a conference is crappy. Indeed, a deadline extension means that either the conference does not attract enough solid submissions or the scientists who submit to it are unable to finish a work on time. In both cases, it would be a shame to be associated with such a conference. Furthermore, deadline extensions have at least three very negative effects.
    • it creates an unfair gap between the happy few who are in the know and the others. A scientist who knows in July 2011 that the ICC deadline will be Sep 28 has a different schedule than another scientist who naively thinks the deadline is Sep 6.
    • it is now folklore to announce an extension a few hours before the deadline. This is highly disrespectful to the (unaware) authors. A weekend can be ruined meeting a deadline that, you discover on Monday, has been extended by two weeks.
    • the day before a submission is stressful. A late-announced deadline extension doubles the number of deadline-stressful days. Deadline extensions are killing me.
  • a list of accepted papers on the conference webpage on the day of notification. Why is it so hard? An ugly txt-formatted list of accepted papers is just what most scientists ask for. From such a list, it is possible to find a link to an arXiv preprint or a technical report on the webpages of the authors of accepted papers. Moreover, titles are inspiring: the sooner every scientist can read them, the better. And don't forget curiosity, of course. Who made the cut this year?
Two other improvements are less incremental, but I think their impact would be worth the effort.
  • no blindness at all. The debate about single vs. double blind is a classic. But very few scientists discuss the blindness of reviewers. There is, however, a rise in complaints about reviews that are too harsh, scientifically wrong and impolite. It is not hard to believe that if reviews were signed by their authors, they would be written more carefully. Some argue that this would create desires for revenge among scientists. This ridiculous argument assumes that scientists are no better than kids, unable to recognize reasoned criticism and unable to keep their negative thoughts to themselves. If you are not optimistic about human nature, you should notice that research communities are growing. So, the revenge desires of a few bad scientists have very little chance of affecting you, because the probability that these bad guys represent a majority of the reviewers of one of your papers is actually very low. Not to mention that academic revenge-seekers, being stupid people, are probably not on the committees of top conferences, so you have nothing to lose. And if you face a majority of reviewers who want to unfairly reject your papers because of your previous bad reviews, well, it may be time to consider writing better reviews.
  • open access to papers. I have already signed this pledge about open access. I know that academic professional societies (ACM, IEEE and so on) have to re-invent themselves, but we will not wait for them to do it. We cannot degrade the quality of scientific activity just because a few jobs are at stake.
I think it is the role of program committee members to alert their chairs that academic life would be far better if conferences stuck to these simple rules.

November 9, 2011

What's up in networks (3/3): dash

The last post in this mini-series. After openFlow and hetnets, here is dash.

DASH or Dynamic Adaptive Streaming over HTTP
Although it is not exactly what MPEG scientists have promoted for a decade, most of today's video traffic is based on HTTP and TCP (the Netflix player, Microsoft Smooth Streaming and Adobe OSMF). And it works. Video traffic is exploding: adaptive streaming already represents more than one third of Internet traffic at peak time, and it is expected to prevail, even on mobiles. Facing this plebiscite, the MPEG consortium has launched the process of standardizing DASH.

In short, for a given movie, the video server publishes a manifest file in which it declares several video formats. Each format corresponds to a certain encoding, thus a certain quality and a certain bit-rate. All these different versions of the same movie are cut into chunks. A client requesting the movie selects a video format and starts downloading the chunks. Periodically, the client estimates whether this video encoding fits the capacity of the network link between her and the server. If she is not satisfied, she considers switching to another encoding for the next chunks. What is the best chunk size, how to estimate the link capacity, what is the best delay between consecutive estimations, how to react to short-term bandwidth changes, how to switch to another encoding… these are among the questions that have not received much attention from the scientific community, so every DASH client implements some magic parameters without any concern for their potential impact on the network.
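To make the adaptation loop concrete, here is a minimal sketch of a throughput-based DASH client. Everything in it is illustrative, not taken from the standard or from any real player: the four bit-rates stand in for a hypothetical manifest, and the chunk duration and safety margin are exactly the kind of "magic parameters" mentioned above.

```python
# Hypothetical sketch of a DASH client's rate-adaptation logic.
# All names and numbers are invented for illustration.

# Bit-rates (bits/s) declared in the manifest, lowest to highest.
REPRESENTATIONS = [500_000, 1_000_000, 2_500_000, 5_000_000]
CHUNK_SECONDS = 4   # seconds of video per chunk (a "magic" parameter)
SAFETY = 0.8        # only use 80% of the estimated bandwidth (another one)

def estimate_bandwidth(chunk_bytes, download_seconds):
    """Naive estimate: throughput of the last chunk only (bits/s)."""
    return 8 * chunk_bytes / download_seconds

def pick_representation(estimated_bps):
    """Choose the highest bit-rate that fits under the safety budget."""
    budget = SAFETY * estimated_bps
    chosen = REPRESENTATIONS[0]  # fall back to the lowest quality
    for rate in REPRESENTATIONS:
        if rate <= budget:
            chosen = rate
    return chosen
```

For example, a 1 MB chunk downloaded in 8 seconds yields an estimate of 1 Mbit/s, and the client would then request the 500 kbit/s representation (80% of 1 Mbit/s is below 1 Mbit/s). Each of these choices could be made differently, which is precisely the point: nothing constrains them.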

Although the multimedia research community and the video standardization groups are large, lively communities, many research issues related to DASH have not been anticipated or sufficiently addressed. Among them, I highlight:
  • When several concurrent DASH connections share the same bottleneck, the congestion control mechanism of TCP may be compromised. A DASH connection runs over TCP, which implements an adaptive congestion control with proven convergence toward a fair sharing of the bottleneck among concurrent connections. By incessantly adapting the flow bit-rate, DASH may prevent the convergence of TCP. If network bottlenecks are located on links shared by hundreds of concurrent DASH flows, the lack of convergence of the congestion control mechanism is a risk. I may overestimate the impact, but at least understanding the effect of the DASH adaptation policy (which seems to rely on a lot of arbitrary parameter settings) on the eventual convergence of a congestion control policy is an exciting scientific topic.
  • When multiple servers store different video encodings of the same movie, the client may incessantly switch from one video encoding to another. A DASH connection works especially well when the bottleneck is always the same, whatever the chosen video encoding. In this case, the adaptive mechanism converges toward the video encoding that fits the bottleneck capacity. But in today's Internet, content can be located in various distinct places: CDN servers, Internet proxies, and content routers with caching capabilities. If the links toward the different encodings have different congestion levels, the DASH adaptation algorithm may go crazy.
  • A DASH connection does not support swarming. Swarm downloading (one client fetching a large video content from multiple servers) was expected to be enabled by the combination of multiple copies of the same content and the chunk-based video format. But if every chunk comes from a different server, congestion cannot be accurately measured. In fact, DASH cannot implement a consistent behavior when multiple paths are used to retrieve the video chunks.
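The multi-server issue can be illustrated with a toy example (all numbers are made up): if consecutive chunks happen to be served from paths of different capacity, a last-chunk bandwidth estimate whipsaws the rate choice even though nothing on either path ever changed.

```python
# Toy illustration (hypothetical numbers) of the multi-server issue:
# chunks alternate between a fast nearby cache and a congested origin.

RATES = [500_000, 1_000_000, 2_500_000, 5_000_000]  # bits/s in the manifest

def pick(estimated_bps):
    """Highest representation not exceeding the estimated bandwidth."""
    return max((r for r in RATES if r <= estimated_bps), default=RATES[0])

# Bandwidth actually available toward the server that served each chunk.
per_chunk_bandwidth = [6_000_000, 800_000, 6_000_000, 800_000, 6_000_000]

choices = [pick(bw) for bw in per_chunk_bandwidth]
print(choices)  # flips between the highest and lowest representations
```

The client oscillates between the top and bottom representations on every chunk: the estimator is measuring a different path each time, not a changing network.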
By the way, DASH is yet another point in favor of HTTP, which is becoming the de facto narrow waist of the Internet. The motivations for using HTTP include its capacity to traverse firewalls and NATs, its nice human-readable names and its ability to leverage Internet proxies and CDNs. Somehow, DASH adds congestion control and adaptive content, making the HTTP protocol even more powerful. But the gap between its huge utilization over the Internet and the lack of understanding of its behavior at large scale has the potential to scare network operators. I guess that is the way the Internet has always evolved.

November 2, 2011

What's up in networks (2/3): hetnet

Here is the second chapter of the mini-series about some (not-so-fresh) topics in the networking area. After openFlow, hetnet.

Hetnet, or Heterogeneous Cellular Networks:
I am probably not the only one bored by GSM cellular networks: they were created by phone engineers who disliked the Internet, they are full of acronyms, they are controlled by an operator, they just work. But cellular networks are now the most common way to access the Internet. Moreover, the devices using these networks are full-featured computers, managed by owners who install a lot of applications. The number of devices connected to cellular networks is expected to grow dramatically.

Next-generation cellular networks have good chances of differing from our plain old GSM networks. Here are two technologies that may change the game:
  • femto base stations are small, cheap base stations that anybody can buy and install on their own wired Internet connection (for example here). It means that the clients of a wireless service provider pay (base station + landline Internet communication + consumed electric power) to improve the infrastructure of the carrier and to get excellent quality of service at home. Carriers are all jumping on this idea. I still don't understand why a user would prefer to buy a base station and connect to the Internet through 4G when she can use wifi. The main argument is that, the wifi spectrum being free and badly managed, a local network can have poor performance because too many wifi access points compete or because too many devices share the pool of wireless channels. The 4G spectrum is licensed and managed by the operator, so some wireless channels can be "reserved" for a user. But if everybody has their own femtocell at home, licensed channels will become scarce too, and nobody will tolerate paying for a femtocell that interferes with the neighbors' ones. To tackle this issue, nearby femto base stations should collaborate to share the wireless spectrum and react to changes in the radio environment (especially when neighbors decide to turn their femto base stations on or off). All scientists interested in peer-to-peer and ad-hoc networks will have fun with the channel allocation problem: end-users form the infrastructure, ensuring a fair sharing of scarce resources is a challenging objective, clever distributed algorithms should solve the problem, and incentives to turn femtocells on or off should be taken into account. As shown in this article, both the deployment and the management of femto hetnets are still unclear. Those who are not afraid of acronym orgies can look at these slides for a summary of the 3GPP standard and a nice telco-oriented overview of the research problems.
  • direct device-to-device wifi communication is a long-awaited feature. Hurrah, WiFi Direct, the official name of this feature in the WiFi Alliance, is included in the new Android OS version. At last, direct wifi transmission between devices is becoming a reality, which means that the thousands of academic papers about mesh networks and hybrid ad-hoc cellular networks are suddenly worth reading. However, things have changed. Extending the coverage area of base stations, which was the most frequent motivation in previous work on mesh networks, is no longer the main concern of mobile carriers. It is now all about mobile data offloading, that is, avoiding communication via the macro base station. In this context, network operators may combine WiFi Direct and data caching in devices to reduce the number of requests sent to the Internet. In other words, strategies related to information-centric networks may turn out to be useful in the wireless world.
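In its simplest centralized form, the femtocell channel-allocation problem above can be viewed as graph coloring: interfering base stations must not use the same channel. The sketch below is a minimal, made-up illustration of that idea (the interference graph and the greedy visiting order are invented); a real deployment would need a distributed, adaptive version that reacts to femtocells being turned on and off.

```python
# Minimal sketch: femtocell channel allocation as greedy graph coloring.
# Interfering (neighboring) cells must end up on different channels.
# The graph, channels and greedy order are hypothetical.

def assign_channels(interference, channels):
    """Give each femtocell the first channel not used by any
    already-assigned interfering neighbor."""
    assignment = {}
    for cell in interference:
        taken = {assignment[n] for n in interference[cell] if n in assignment}
        for ch in channels:
            if ch not in taken:
                assignment[cell] = ch
                break
        else:
            raise RuntimeError(f"no interference-free channel for {cell}")
    return assignment

# Three houses in a row: A interferes with B, B with C, but not A with C.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
print(assign_channels(graph, channels=[1, 2]))  # A and C can reuse a channel
```

Even this toy version shows why the problem is interesting: spatial reuse (A and C sharing a channel) is what makes a small licensed spectrum go a long way, and doing it without a central coordinator is exactly the kind of problem the peer-to-peer and ad-hoc communities like.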
In a broader perspective, the heavy use of wireless networks for accessing the Internet highlights an interesting paradox: wireless transmission is inherently broadcast (all devices near the wireless router may hear all messages) while Internet applications are usually designed for unicast communication (a message has only one destination). The capacity of a mobile carrier to leverage the broadcast nature of the base stations in its cellular network may become a key asset.