November 28, 2011

Incremental improvements for CS conferences

Scientists like to debate about the general organization of academic life. Lately, some have called for a clean-slate revolution based on open archives. Yet, as with most clean-slate proposals for well-established processes, I doubt that such a shift can occur. And in the meantime, nothing is done to actually fix the issues of the current process. In particular, I have the feeling that academic conferences in computer science (at least in my communities, which span networking, multimedia and distributed systems) are getting worse, and it seems that nobody cares because the most active researchers in this area are too busy preparing their utopian clean-slate revolution.

So, let me try to give below four incremental improvements that every serious conference should implement, for the sake of a better academic life. Two are quite easy:
  • no more deadline extensions. A deadline extension is the irrefutable proof that a conference is crappy. Indeed, a deadline extension means that either the conference does not attract enough solid submissions or the scientists who submit to this conference are unable to finish their work on time. In both cases, it would be a shame to be associated with such a conference. Furthermore, deadline extensions have at least three very negative effects.
    • they create an unfair gap between the happy few who are in the know and the others. A scientist who knows in July 2011 that the ICC deadline will be Sep 28 has a different schedule than the scientist who naively thinks the deadline is Sep 6.
    • it is now folklore to announce an extension a few hours before the deadline. This is highly disrespectful to the (unaware) authors. Weekends can be ruined to meet a deadline, which you discover on Monday has been extended by two weeks.
    • the day before a submission is stressful. A (late-announced) deadline extension doubles the number of stressful deadline days. Deadline extensions are killing me.
  • a list of accepted papers on the conference webpage on the day of the notification. Why is it so hard? An ugly txt-formatted list of accepted papers is just what most scientists want. From such a list, it is possible to find a link to an arXiv preprint or a technical report on the webpages of the authors of accepted papers. Moreover, titles are inspiring: the sooner every scientist can read the titles, the more inspiring it is. And don't forget curiosity, of course. Who made the cut this year?
Two other improvements are less incremental, but I think their impact would be worth it.
  • no blindness at all. The debate about single vs. double blind is a classic. But very few scientists discuss the blindness of reviewers. There is however a rise in complaints about reviews that are too harsh, scientifically wrong and impolite. It is not hard to believe that if the reviews were signed by their authors, they would be written more carefully. Some argue that this would stir up desires for revenge among scientists. This ridiculous argument assumes that scientists are no better than kids, unable to recognize well-argued criticism and unable to restrain their negative thoughts. If you are not optimistic about human nature, you should notice that research communities are growing. So, the revenge desires of a few bad scientists have very little chance of affecting you because the probability that these bad guys represent a majority of the reviewers for one of your papers is actually very low. Not to mention that, academic avengers being stupid people, they are probably not on the committees of top conferences, so you have nothing to lose. And if you face a majority of reviewers who want to unfairly reject your papers because of your previous bad reviews, well, it may be time to consider writing better reviews.
  • open access to papers. I have already signed this pledge about open access. I know that academic professional societies (ACM, IEEE and so on) have to re-invent themselves, but we will not wait for them to do it. We cannot degrade the quality of scientific activity just because a few jobs are at stake.
I think it is the role of program committee members to alert their chairmen that academic life would be far better if conferences stuck to these simple rules.

November 9, 2011

What's up in networks (3/3): dash

The last post in this mini-series. After OpenFlow and hetnets, here is DASH.

DASH or Dynamic Adaptive Streaming over HTTP
Although it is not exactly what the MPEG scientists have promoted for a decade, most of today's video traffic is based on HTTP and TCP (Netflix player, Microsoft Smooth Streaming and Adobe OSMF). And it works. The video traffic is exploding: adaptive streaming already represents more than one third of the Internet traffic at peak time, and it is expected to prevail, even on mobiles. Facing this plebiscite, the MPEG consortium has launched the process of standardizing DASH into MPEG.

In short, for a given movie, the video server publishes a manifest file in which it declares several video formats. Each format corresponds to a certain encoding, hence a certain quality and a certain bit-rate. All these different videos of the same movie are cut into chunks. A client requesting a movie selects a given video format and then starts downloading the chunks. Periodically, the client tries to estimate whether this video encoding fits the capacity of the network link between her and the server. If she is not satisfied, she considers switching to another encoding for the next chunks. What is the best chunk size, how to estimate the link capacity, what is the best delay between consecutive estimations, how to react to short-term bandwidth changes, how to switch to another encoding… are among the questions that have not received the attention of the scientific community, so every DASH client implements some magic parameters without any concern for potential impacts on the network.
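
To make the client loop described above concrete, here is a minimal sketch in Python. It is not taken from any real DASH player: the function names, the smoothing factor `alpha` and the safety margin `safety` are all assumptions of mine, and they are exactly the kind of "magic parameters" mentioned above. The network is faked by a list giving the link capacity seen while downloading each chunk.

```python
from typing import List, Optional


def select_bitrate(available_bps: List[int], estimated_bps: float,
                   safety: float = 0.8) -> int:
    """Pick the highest declared bit-rate that fits within a fraction
    (`safety`, an arbitrary magic parameter) of the estimated capacity.
    Fall back to the lowest bit-rate when nothing fits."""
    budget = safety * estimated_bps
    candidates = [b for b in sorted(available_bps) if b <= budget]
    return candidates[-1] if candidates else min(available_bps)


def smooth(previous: Optional[float], sample: float, alpha: float = 0.5) -> float:
    """Exponentially weighted moving average of throughput samples.
    `alpha` is another magic parameter."""
    if previous is None:
        return sample
    return alpha * sample + (1 - alpha) * previous


def simulate(manifest_bps: List[int], chunk_seconds: float,
             link_bps_per_chunk: List[float]) -> List[int]:
    """Return the bit-rate chosen for each chunk of a toy session."""
    estimate: Optional[float] = None
    current = min(manifest_bps)      # start cautiously with the lowest quality
    chosen = []
    for link_bps in link_bps_per_chunk:
        chosen.append(current)
        chunk_bits = current * chunk_seconds
        download_seconds = chunk_bits / link_bps      # faked download
        sample = chunk_bits / download_seconds        # measured throughput
        estimate = smooth(estimate, sample)
        current = select_bitrate(manifest_bps, estimate)  # choice for next chunk
    return chosen


if __name__ == "__main__":
    manifest = [500_000, 1_000_000, 3_000_000]   # bit-rates declared in the manifest
    link = [4e6, 4e6, 4e6, 1e6, 1e6]             # capacity drops after three chunks
    print(simulate(manifest, chunk_seconds=2.0, link_bps_per_chunk=link))
```

Changing `alpha`, `safety` or the chunk duration changes how fast the client reacts to the capacity drop, which is the whole point: none of these values is justified by anything other than taste.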

Although the multimedia scientific community and the video standardization group are large, lively communities, many research issues related to DASH have not been anticipated or sufficiently addressed. Among them, I highlight:
  • When several concurrent DASH connections share the same bottleneck, the congestion control mechanism of TCP may be compromised. In fact, a DASH connection is based on TCP, which implements an adaptive congestion control with proven convergence toward a fair sharing of the bottleneck among concurrent connections. By incessantly adapting the flow bit-rate, DASH may prevent the convergence of TCP. If network bottlenecks are located on links that are shared by hundreds of concurrent DASH flows, the lack of convergence of the congestion control mechanism is a risk. I may overestimate the impact, but at least understanding the impact of the DASH adaptive policy (which seems to use a lot of random parameter settings) on the eventual convergence of a congestion control policy is an exciting scientific topic.
  • When multiple servers store different video encodings of the same movie, the client may incessantly switch from one video encoding to another. A DASH connection works especially well when the bottleneck is always the same, whatever the chosen video encoding. In this case, the adaptive mechanism converges toward the video encoding that fits the bottleneck capacity. But in today's Internet, the content can be located in various distinct locations: CDN servers, Internet proxies, and content routers with caching capabilities. If the links toward the different encodings have different congestion levels, the DASH adaptive algorithm may go crazy.
  • A DASH connection does not support swarming. Swarm downloading (one client fetching a large video content from multiple servers) was expected to be enabled by both the multiple copies of the same content and the chunk-based video format. If every chunk comes from a different server, the congestion cannot be accurately measured. In fact, DASH cannot implement a consistent behavior when multiple paths are used to retrieve the video chunks. 
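
For readers who have never seen what "convergence toward a fair sharing" means in the first issue above, here is a toy sketch. It is the textbook additive-increase/multiplicative-decrease (AIMD) model with synchronized flows, not a faithful TCP implementation, and the function name and parameter values are my own assumptions. Every decrease halves the gap between the flows, while every increase leaves it unchanged, so the rates end up equal.

```python
from typing import List


def aimd(rates: List[float], capacity: float, steps: int,
         increase: float = 1.0, decrease: float = 0.5) -> List[float]:
    """Toy synchronized AIMD: all flows add `increase` while the bottleneck
    is not overloaded, and all multiply their rate by `decrease` when it is."""
    rates = list(rates)
    for _ in range(steps):
        if sum(rates) > capacity:
            rates = [r * decrease for r in rates]   # congestion: back off
        else:
            rates = [r + increase for r in rates]   # no congestion: probe
    return rates


if __name__ == "__main__":
    # Two flows starting from a very unfair allocation of a 100-unit bottleneck.
    print(aimd([1.0, 80.0], capacity=100.0, steps=1000))
```

The open question raised above is what happens to this nice property when an application-level adaptation loop, with its own timers and thresholds, keeps changing how much data each flow actually has to send.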
By the way, DASH is yet another point in favor of HTTP, which is becoming the de facto narrow waist of the Internet. The motivations for using HTTP include its capacity to traverse firewalls and NATs, its nice human-readable names and its capacity to leverage Internet proxies and CDNs. Somehow, DASH adds congestion control and adaptive content, making the HTTP protocol even more powerful. But the gap between its huge utilization over the Internet and the lack of understanding of its behavior at large scale has the potential to scare network operators. I guess it is the way the Internet has always evolved.

November 2, 2011

What's up in networks (2/3): hetnet

Here is the second chapter of the mini-series about some (not-so-fresh) topics in the networking area. After OpenFlow, hetnet.

Hetnet, or the Heterogeneous Cellular Networks:
I am probably not the only one to get bored by GSM cellular networks: they have been created by phone engineers who disliked the Internet, they are full of acronyms, they are controlled by an operator, they just work. But cellular networks are now the most common way to access the Internet. Moreover, the devices using these networks are full-featured computers, which are managed by owners who install a lot of applications. The number of devices connected to cellular networks is expected to grow dramatically.

Next-generation cellular networks have a good chance of differing from our plain old GSM networks. Here are two technologies that may change the game:
  • femto base stations are small and cheap base stations that anybody can buy and install on their own wired Internet connection (for example here). It means that the clients of a wireless service provider pay (base stations + landline Internet communication + consumed electric power) to improve the infrastructure of the carrier and to have an excellent quality of service at home. Carriers are all jumping on this idea. I still don't understand why a user would prefer to buy a base station and connect to the Internet through 4G when she can use wifi. The main argument is that, the wifi wireless spectrum being free and badly managed, a local network can perform poorly because too many wifi access points compete or because too many devices share the pool of wireless channels. The 4G spectrum is licensed and managed by the operator, so some wireless channels can be "reserved" for a user. But if everybody has their own femtocell at home, licensed channels will become scarce too, and nobody will tolerate paying for a femtocell that interferes with the neighbors' ones. In order to tackle this issue, nearby femto base stations should collaborate to share the wireless spectrum and react to changes in the radio environment (especially when neighbors decide to turn on/off their femto base stations). All scientists interested in peer-to-peer and ad-hoc networks will have fun with the problem of channel allocation: end-users form the infrastructure, ensuring a fair sharing of scarce resources is a challenging objective, clever distributed algorithms should solve the problem, and incentives to turn on/off the femtocells should be taken into account. As shown in this article, both deployment and management of femto hetnets are still unclear. Those who are not afraid of acronym orgies can look at these slides for a summary of the 3GPP standard and a nice telco-oriented overview of the research problems.
  • direct device-to-device wifi communication is a long-awaited feature. Hurrah, WiFi Direct, which is the official name of this feature in the WiFi Alliance, is included in the new Android OS version. At last, direct wifi transmission between devices is becoming a reality, which means that the thousands of academic papers about mesh networks and hybrid ad-hoc cellular networks are suddenly worth reading. However, things have changed. Extending the coverage area of base stations, which has been the most frequent motivation in previous works related to mesh networks, is no longer the main concern of mobile carriers. It is now all about mobile data offloading, that is, avoiding communication via the macro base station. In this context, network operators may combine WiFi Direct and data caching in devices to reduce the number of requests sent to the Internet. In other words, strategies related to information-centric networks may turn out to be useful in the wireless world.
In a broader perspective, the over-utilization of wireless networks for accessing the Internet highlights an interesting paradox: wireless transmission is inherently broadcast (all devices near the wireless router may hear all messages) whereas Internet applications are usually designed for unicast communication (a message has only one destination). The capacity of mobile carriers to leverage the broadcast nature of base stations in their cellular networks may become a key asset.

October 31, 2011

What's up in networks (1/3): openflow

I found time to go a bit deeper into several (not-that-fresh) topics. I hope this quick summary will be of interest for those who did not. First of this mini-series: OpenFlow

OpenFlow, or the Software-Defined Networks:
Thanks to OpenFlow, I now understand the "control plane vs. data plane" idea, which I used to take for mysterious magic words allowing telco engineers to recognize one another. In the OpenFlow world, there are some dumb switches that forward packets according to a flow table, and there is a clever controller, which orchestrates these switches. Switch-controller communication uses the OpenFlow protocol.
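
Here is a toy sketch of that split between a dumb switch and a clever controller. To be clear, this is not the OpenFlow protocol or any real controller API: the class and method names (`Switch`, `Controller`, `packet_in`, `install_rule`) are hypothetical, and real rules match on many header fields rather than on a single destination string. The data plane is a plain table lookup; the control plane is the only place where a decision is made.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class Controller:
    """Control plane: knows, for each switch, which port leads to which
    destination (a hypothetical, hard-coded network-wide policy)."""

    def __init__(self, policy: Dict[str, Dict[str, int]]):
        self.policy = policy

    def packet_in(self, switch: "Switch", dst: str) -> Optional[int]:
        """Called by a switch that has no rule for `dst`."""
        port = self.policy.get(switch.name, {}).get(dst)
        if port is not None:
            switch.install_rule(dst, port)   # push the rule down to the switch
        return port


@dataclass
class Switch:
    """Data plane: forwards according to its table, asks the controller otherwise."""
    name: str
    controller: Controller
    flow_table: Dict[str, int] = field(default_factory=dict)

    def install_rule(self, dst: str, port: int) -> None:
        self.flow_table[dst] = port

    def forward(self, dst: str) -> Optional[int]:
        """Return the output port for `dst`, or None if the packet is dropped."""
        if dst not in self.flow_table:       # table miss: defer to the controller
            return self.controller.packet_in(self, dst)
        return self.flow_table[dst]


if __name__ == "__main__":
    ctrl = Controller({"s1": {"10.0.0.2": 3}})
    s1 = Switch("s1", ctrl)
    print(s1.forward("10.0.0.2"))   # first packet goes through the controller
    print(s1.flow_table)            # the rule is now cached in the switch
```

The first packet toward an unknown destination pays a round trip to the controller; the following ones are handled by the switch alone.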

The first novelty is that the OpenFlow protocol has been designed at Stanford, therefore (i) it is cool, (ii) software engineers have heard about it, and (iii) it is endorsed by a buzz concept, namely software-defined networking. The second novelty, and a noteworthy one, is that the main network equipment vendors integrate the OpenFlow API in their switches (at least Juniper and Cisco). So, it is becoming real: software developers will really be able to control a network remotely.

OpenFlow is both networks and software:
  • In the network area, there is only one truth: every new concept is something already done twenty years ago. Good news for OpenFlow: it looks like MPLS. Therefore OpenFlow is a networking concept. \qed
  • Computer scientists are driven by vaporous concepts like model abstraction, composition and semantics. Guess what? OpenFlow designers dangerously embrace them. Even worse, network scientists have started publishing in POPL and ICFP.
More seriously, OpenFlow meets a demand. More and more "independent" networks have specific needs that cannot be addressed by router vendors, for example the network in a data-center. Private enterprise networks and even next-generation home networks are also complex networks, which would work better if they could be managed according to the wishes of their owner. OpenFlow provides the friendly interface that allows anybody (provided (s)he knows how to program) to become the network operator for any such network. Needless to say, this perspective brings a lot of excitement and uncertainty (see for example here and here).

October 21, 2011

Was P2P live streaming an academic bubble?

Or is the academic community just disconnected from the reality?

In brief, the motivation for peer-to-peer live streaming is that servers are unable to deliver a live video at large scale. I know, it sounds crazy in a YouTube world. In a peer-to-peer system, clients should help the poor video provider broadcast its video, without much delay or quality degradation. To have more fun, no server at all is authorized.

Believe it or not, Google finds more than 50,000 scientific documents dealing with this issue or one of its variants. Today, only a handful of systems based on a peer-to-peer architecture are used, mostly to illegally broadcast sport events. As far as I know, these systems (released before the crazy scientific boom on the topic) do not implement one thousandth of the scientific proposals described in these 50,000 articles. It seems that the small teams of developers behind these programs haven't found the time to download/read/understand any of these articles.

Was this abundant scientific production useless? Probably not. First, scientists made some practical achievements. For example, the P2P-Next project has released under LGPL tons of code implementing state-of-the-art solutions, including the multiparty swift protocol. A protocol is also in the standardization process at the IETF. Consequently, the next generation of peer-to-peer programs should be able to cut down the TV media industry as it did the music industry. Second, these studies have produced interesting scientific results beyond P2P streaming applications, for example on the robustness of randomized processes for broadcasting data flows in networks. It reminds me of the golden era of ad-hoc networks (2000-2005), when scientists had a lot of fun playing with graphs and information, even if only the military found these protocols useful. We do understand networks better now!

But did it deserve 50,000 articles? Of course not. Under-the-spotlights start-ups (Joost) and publicly-funded pioneering companies (BBC) switched back to a centralized architecture four years ago although they had a decisive technological lead. It looks like there is no bankable application out there. Maybe it was for the beauty of science, but whoever has funded these research works can only hope that randomized processes in networks will eventually find a way to improve human conditions in the world. Or maybe it was just a good idea to occupy people?

So, yes, P2P live streaming was a bubble. Here are three quick observations, which would deserve a more accurate analysis:
  • An academic bubble starts like a financial bubble. In the latter, no company can take the risk of not investing in an area if all competitors do. In an academic bubble, neither funding agencies nor program committees can challenge an abrupt growth in the number of papers in a given area. Therefore scientists obtain quick funding, publications, and citations, which fuel the bubble. However the academic bubble differs from the financial one because there is no critical damage when the number of papers abruptly drops. The bubble does not hurt when it bursts. So, nobody tries to understand what went wrong. In other words, this bubbling trend can only grow, and the next bubble (content-centric networking?) has a good chance of being even bigger.
  • Tracking the next bubble is attractive. Scientists are rewarded for their impact on the community. In this context, the authors of seminal works in this area, for example Chord (nearly 9,000 citations although distributed hash tables have found little practical use) or SplitStream (more than 1,000 citations for a system relying on a video encoding that has only been used by academics), are rock-stars. Anticipating the sheepish behavior of scientists has become a key academic skill.
  • Scientists are still incapable of focusing their energy on the right clients, who were the aforementioned small teams of hackers in this case. This is yet another motivation for revamping the way scientific results are delivered in computer science. Giving free access to papers, releasing the code that has been used in the paper, participating in non-academic events or finding echoes in other communities are among the solutions. Not only to be meaningful, but also to prevent bubbles.
Just an idea: when the bubble is officially there, would it be possible to officially forbid the bullshit motivation paragraphs in the paper? I wish authors would admit that they just want to have fun developing a new model in a useless bubbling scenario.

August 21, 2011

A warm feedback from Sigcomm

The SIGCOMM conference just finished two days ago. Papers, slides, and the video of the talks are online for free. As could be expected, there is no comparison with my experience at ICC. Although video recording prevented presenters from moving around the stage, the talks were excellent: long enough, well prepared, and in perfect English. For every talk, many questions were immediately raised, and people actually debated during the coffee breaks and social events. In brief, Sigcomm is a conference that is worth the price (registration and travel). A series of remarks below:
  • a Sigcomm paper should present "novel results firmly substantiated by experimentation, simulation or analysis." My understanding is that "substantiating ideas" now prevails, and that the novelty has become debatable. Some ideas, which are remarkably substantiated, do not open enough perspectives. For example, deploying wireless antennas on top of data-center racks is a cool idea, but I would not include it in my list of major scientific breakthroughs. Sigcomm program committees are expected to prefer papers that are "exciting but flawed" to the "correct but boring" ones, yet exciting is not always synonymous with inspiring. In this vein, the program includes three papers related to BitTorrent. Come on, we are in 2011! How many scientists are still interested in such an overwhelmingly addressed research area?
  • Europe is back, with six papers. I already mentioned that EU-funded FP7 STREP projects match the characteristics of a competitive Sigcomm paper. This year's program demonstrates the benefits of writing Sigcomm-compatible FP7 project deliverables, as all accepted European papers are (sometimes partially) funded by the FP7 framework. Such funding gives the opportunity to evaluate a well-identified idea through large-scale deployment. The twenty-six other papers come from prestigious American institutions, which are probably the only places that combine a unique skill in the Art of Writing academic papers and the capacity to substantiate any idea with a bunch of outstanding experiments.
  • I am not really into measurements, and I undoubtedly never will be. That's probably why I struggle to identify the scientific point behind the six papers that deal with measurement in the program. Indeed, it seems that the main contribution is the result of the measurement, not the way these measurements have been obtained. They do not present a novel super-approach to make a brand new measurement set. Rather, the idea is that these measurements provide key insights into the behavior of a particular application. I agree, but does it deserve a 14-page LaTeX-written paper? Measurement papers would probably better fit an infographic (like this one), wouldn't they?
  • I enjoyed some presentations, especially the controversial model that explains the evolution of protocol adoptions, the scheduling of network flows in data-centers, the synchronization of multiple distant data-centers, and the reduction of redundant data transfer.

    July 8, 2011

    Leveraging collaborative projects to produce better academic research

    Opposing the industrial and academic research worlds is a classic discussion. Academics have recently been suspected of addressing unmotivated problems because they do not manipulate the technologies that are at the core of their research activities. The importance of having an "industrial motivation" behind academic research is reflected by a statistic: papers authored by at least one industrial researcher represent approximately half of the accepted papers in the best conferences in operating systems (15 out of 32 for OSDI'11) and networking (16 out of 32 for Sigcomm'11). These papers monopolize the technical sessions related to new trends, especially datacenter and production network for OSDI, cloud computing and user measurement for Sigcomm.

    In these applied science areas, the best conferences accept papers addressing industry-relevant problems if and only if (i) authors demonstrate the timeliness and relevance of the problem, and (ii) authors carefully evaluate their proposed solutions.
    • problem motivation: a scientist who is only reading papers about a technology can hardly formulate a relevant, important problem related to this technology. In order to have an accurate view of the problems faced by companies, a first idea is to spend time there as a visiting researcher, as is promoted at Google. Another idea is to work with industrial partners in projects such as FP7 STREP projects. I mean, actually work together, and not pretend to work together.
    • solution evaluation: an NS2 simulation is no longer enough for a Sigcomm paper. Nowadays, some large-scale infrastructures give free access to scientists (for example Open Cirrus for a large data-center, Planet Lab for an Internet-scale network, Grid 5000 for a grid, Imagin'Lab for a 4G/LTE cellular network). There is no excuse for not testing solutions over real infrastructures. However, access to infrastructure is not sufficient: evaluations should also be based on realistic user patterns. The author of the excellent Hints and tips for Sigcomm authors insists: "use realistic traffic models"! Besides using available real traces (for example the amazing network traces from Caida), the idea is again to leverage a project collaboration with industrial partners that are able either to deploy a prototype on real clients, or to provide exclusive traces of their real clients.
    Hence, short-term focused collaborative projects are ideal if one wants to write well-motivated, well-evaluated, industry-relevant papers. But, in this case, why have I never been in a position to submit a competitive paper to Sigcomm although I participated in many collaborative projects? Probably because:
    • some of my industrial partners were not really industrial. In large companies, R&D labs are frequently disconnected from the real operational teams, so researchers in these labs are unable to provide substantiated arguments about the criticality of the project, to successfully deploy a prototype, and even to obtain traces from their real clients.
    • in a consortium, every partner has its own agenda. Receiving funding while minimizing efforts may be the only point all partners agree on. I rarely feel that all partners share a strong commitment to make the project actually work. More frequently, the funding acceptance is considered as the final positive outcome, the project itself being only a pain.
    • the project work-plan does not include the writing of a scientific paper. Scientific production is usually seen as a dissemination activity, under the responsibility of an academic partner, although writing a top-class paper requires a precise planning of the contributions of every partner (including milestones and deliverables).
    Now that I understand why successful collaborative projects are critical and why my recent projects have (relatively) failed, I hope I will be able to leverage collaboration with industrial partners to do better research (a.k.a. write better papers).

    July 4, 2011

    My (disappointing) experience of attending a large conference

    Last month, I attended ICC in Kyoto. ICC is the kind of large-scale academic conference where more than 1,400 people are expected to meet and collaborate for the sake of networking science. In the meantime, several other major conferences were held as part of the so-called Federated Conferences in San Jose, which gathered approximately the same number of researchers. Several academic scientists have already reported their enthusiasm about such big events (or raised many positive thoughts).

    On my side, my experience was fairly negative. Technical and scientific discussions were rare, mostly because the conference scope is so large that the probability of chatting with people sharing your scientific concerns is low. Actually, I wondered what I was doing there during three quarters of the conference time. In the end, I saw three reasons to attend big academic events:

    • rewarding scientists: I think that scientists have been excellent students, that their commitment to excellence has risen again once they embraced an academic career, and that they are still not paid accordingly. Scientists do not receive bonuses in cash, however they frequently travel to wonderful places, with great banquets and rooms in palaces. Conferencing is a reward, which can typically be offered to a worthy PhD student. Similarly, professors do not hesitate to reward themselves with a fully paid one-week conference (grants and funding allow traveling a lot, let's enjoy it). For those who like big hotels and international cities, big conferences are perfect.
    • meeting people from your local community: in a crowded, amazingly large banquet, people first tend to cluster by language or institution. French chat with French, Chinese with Chinese, and members of University X with other members of University X. These "local cluster conversations" are easy to start (what plane did you take, how bad is the food in your hotel, how jetlagged are you, etc.). They have at least one benefit: you have time to chat with people whom you usually meet at local events without ever getting a chance to really talk to them. Therefore, a meeting in Japan is the place where you enhance your social network with people who live less than 200 kilometers from your office, which is 10,000 kilometers away from Japan.
    • grenouilling during hallway conversations: it seems that the best translation of grenouiller is to plot. In most multi-track conferences, many people prefer to stay in the lobby and do not attend talks. They are not wrong, because many scientific talks are actually bad, and I don't think that a series of talks is the best way to foster scientific conversations. However I am afraid that conversations in the lobby are not worth a trip of thousands of kilometers either. A large part of the conversations I heard or participated in were related to research administration: what will be the next event to attend, am I on the Program Committee of the next big event, what is the latest news about the next national funding call, where could my post-doc find a decent position, if I invite you onto my steering committee, would you include me in yours, what are the latest transfers in the academic world, etc.
    Probably because I expected some scientific enlightenment from meeting so many smart people, I have been disappointed. In particular, I definitely disagree with the scientists who argue for more maxi conferences. Next month, I will attend Sigcomm, which is a single-track, reasonably crowded (500 people) conference. Let's see if middle-size conferences are worth degrading my carbon footprint.

    April 28, 2011

    On the attractiveness of research institutions

    The job market for tenured research positions is changing. First, as emphasized in a recent Nature issue, the number of PhD graduates is exploding. However the overall number of accepted papers in top-ranked conferences is still desperately low. Hence, the vast majority of PhD graduates have a few minor publications and an h-index below 5. These young ambitious researchers are in the long tail of the scientists. Second, as illustrated by the closing of Intel Labs, the number of scientific jobs in private companies is dangerously decreasing. It seems that the future of research in private companies is about maintaining partnerships with universities and about outsourcing scientific studies to the right experts. In the meantime, the number of job openings in academic institutions is relatively stable. Third, academic institutions now hire scientists from all over the world. One of the consequences of the Shanghai ranking is that institutions no longer focus on native candidates, and that many scientists look for jobs outside their native country.

    Please forget the most prestigious universities and the best young scientists. Both know how to match each other. Let's rather observe the second league: the thousands of more or less famous ambitious institutions, which want to attract the best scientists, and the thousands of more or less unknown ambitious scientists who want to join the best institutions. We are in a typical assignment problem, where institutions and scientists of similar ranking should agree.

    Many indicators have been proposed for the comparison of scientists. Likewise, many indicators and classifications exist for ranking institutions for undergraduate students. But I don't know how to measure the attractiveness of an academic institution for candidates to a tenure-track position.

    Of course, three criteria prevail:
    • the salary. The romantic vision of the scientist who does not care about money is wrong. Scientists are humans living in a capitalist world.
    • the location. It includes the weather, the likelihood that family members will enjoy the place (employment, schools, etc.), cultural life, and so on.
    • the prestige of the institution. A scientist builds a career: her resume should maximize the number of famous entries and minimize unknown ones.
    Now, some more specific criteria include:
    • the number of free PhD students. Here, free means that the scientist does not have to make any effort to be guaranteed that this number of PhD students will work under her supervision in her lab.
    • the volume of teaching. It should not be too high because teaching must have no impact at all on paper productivity. But it should not be too low because teaching is also a way to meet future PhD students.
    • the quality of students. Every scientist knows that bad students can be a significant waste of time, whereas great students can boost productivity without much effort.
    I am quite suspicious about the importance of having top-class colleagues within the same area. Ambitious scientists have their own research area, and a majority of them have their own agenda, without regard to other scientists. This thought leads me to another criterion:
    • the autonomy. From a scientific point of view, a researcher prefers to define her own research axis, and to write her own research proposals, without having to justify anything. From a more practical point of view, a researcher is likely to receive grants for her research, but this money goes first through her institution, which can constrain the expenses. Consequently, a scientist might not be free to buy her own equipment, to travel as she wishes, to set the salary of a post-doc, ...
    Are there any other criteria? Of course, every researcher can put her own weights on these criteria, depending on her personal priorities.
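
    The idea of personal weights can be sketched in a few lines of Python. This is only an illustration: the criterion names, the rating scale and the function attractiveness are my own assumptions, and the ratings in the usage note are made up for a hypothetical institution.

```python
# Criterion names are assumptions, loosely taken from the list above.
CRITERIA = [
    "salary", "location", "prestige", "free_phd_students",
    "teaching_volume", "student_quality", "autonomy",
]


def attractiveness(ratings, weights):
    """Weighted average of per-criterion ratings.

    ratings: one rating per criterion, all on the same scale (e.g. 0 to 5).
    weights: the researcher's personal weights; a missing criterion counts as 0.
    """
    total_weight = sum(weights.get(c, 0.0) for c in CRITERIA)
    if total_weight <= 0:
        raise ValueError("at least one criterion needs a positive weight")
    weighted_sum = sum(weights.get(c, 0.0) * ratings[c] for c in CRITERIA)
    return weighted_sum / total_weight


# Made-up ratings for a hypothetical institution, on a 0-5 scale.
institution_a = {c: 3.0 for c in CRITERIA}
institution_a["autonomy"] = 5.0
institution_a["salary"] = 1.0

# Two hypothetical researchers with different priorities.
cares_about_autonomy = {"autonomy": 3.0, "salary": 1.0}
cares_about_salary = {"autonomy": 1.0, "salary": 3.0}

print(attractiveness(institution_a, cares_about_autonomy))
print(attractiveness(institution_a, cares_about_salary))
```

    The same institution gets a different score from each researcher, which is the point: a single ranking of institutions cannot fit every candidate.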

    It is now easy to analyze institutions. Typically, my current employer, Telecom Bretagne (member of Institut Telecom):

    • Salary: good when you join, but it increases slowly
    • Location: France is great, and I do recommend Brest for a family with kids, but it is considered a rather unattractive place within France
    • Prestige: I am afraid that recent branding operations have significantly affected the reputation of the institution; nobody knows Institut Telecom
    • Number of free PhD students: none at all, but it is not hard to obtain funding for one PhD student from the institution
    • Volume of teaching: highly negotiable with your colleagues; you can be involved in research-oriented project management rather than formal, time-wasting courses
    • Quality of students: very good engineering students, but very few are interested in research
    • Autonomy: you are definitely free to do what you want, and you are almost free to manage your own budget (personnel, travel, equipment)

    And Orange Labs, Issy-les-Moulineaux:

    • Salary: good when you join, with good opportunities for increases
    • Location: Paris is one of the most attractive cities in the world
    • Prestige: France Telecom used to be a strong research player and Orange is a well-known brand, but the research center is no longer a key academic player
    • Number of free PhD students: none at all, and it is very hard to obtain one without significant effort
    • Volume of teaching: no teaching at all, but it is not difficult to find some courses in nearby universities
    • Quality of students: no contact with students, but some students (not the best, though) might be interested in research internships
    • Autonomy: not free at all. You have to justify your research axis, and, worse, you have to justify any expense, even for funded projects

    Now, we should create a website to gather ratings from many scientists, a kind of TripAdvisor for scientific institutions.

    April 20, 2011

    inside the FP7 evaluation (last part): how to build funded proposals

    Don't expect any miracle from this post, and don't expect anything from posts with similar titles: there is no unique recipe for a winning proposal. However, I can sketch a few tips that I will at least try to apply to my own proposals:
    • In the specific case of the FP7 STREP, every proposal is read by only three reviewers (no more, no less) randomly picked among a set of very diverse reviewers, from experienced academics to young industrials (and vice versa) from various countries. Contrary to the funding process of Google, there is no single target reviewer. Therefore, proposals should cover several "reading styles". Fortunately, there is no page limit, so don't hesitate to explain things several times in several ways.
    • Criterion 1 ("Scientific and Technical Quality") can kill a proposal, but it can hardly make a winner. Your objective is to give reviewers no opportunity to criticize, so don't waste your energy there, just do a clean job. Recall that STREP is not about new scientific breakthroughs, it is about incremental but sure progress beyond the state of the art. No reviewer can argue against a series of incremental, loosely consistent advances in several domains. A bad score in Criterion 1 is more often due to a faulty or unconvincing workplan. Don't try to produce a super-clever, ambitious workplan, but describe things (scientific ideas and, more importantly, processes) that have a 100% chance of being implemented. Revise your workplan a lot because some reviewers harshly track inconsistencies.
    • You have to differentiate, and the best place is Criterion 3 ("Exploitation and Dissemination"). The majority of proposals do not provide any market analysis, only claim standard dissemination plans, and describe very vague exploitation plans. The best proposals include real-world experiments during the project (which is actually appreciated), contain some partners that are actually involved in standardization processes, or have already identified some third-party companies to include in the proposal as external partners (or in an associated committee). The exploitation plan should preferably be written at the beginning of the project, because it has a direct impact on the partners in the consortium, on the perimeter of the required scientific progress and on the workplan (e.g. a real-world experiment requires a specific work package and an early prototype from technical work packages). When the exploitation plan is strong, the whole project is fully consistent.
    I know that the usual process is definitely not this one. We usually work a lot on the scientific breakthroughs, trying to copy-paste endless bibliographic works written by students, to incorporate old scientific friends and to make the whole thing as consistent as possible. Then, we let every partner write its own work package, and we argue a lot about the number of man-months and the funding. Finally, we have a few hours to write some crappy paragraphs about the exploitation. And we have wasted at least one month because the resulting proposal is anything but a winner...

    March 14, 2011

    Inside the FP7 evaluation (part 2): the process

    The European Commission and committees of major academic conferences face the same challenge: select in a fair manner a dozen proposals out of 100 (including 90 serious candidates). The process consists of three stages:
    1. reviewing the proposals. As I explained in my previous post, each proposal is a 100-page document, which is written by "artists". Three criteria are evaluated: one about the scientific soundness, one about the consortium quality and one about the exploitation. Every criterion is evaluated on a scale ranging from 0 to 5 where half-marks are accepted.
    2. reaching a consensus. For each proposal, five people (the three reviewers, one recorder and a moderator) meet for one hour. The goal is to reach a consensus, which should result in a unified text and a final score for every criterion. The role of the recorder is crucial. She has not read the proposal, but she has looked at the reviews, so she knows the main trends. From the meeting discussion, she tries to extract some statements, then her text is revised "live" by the reviewers (and sometimes by the moderator). Wording is considered important, so some sentences require up to 15 minutes to be accepted by every reviewer. In general, meetings are lively because some reviewers disagree, and it is common that reviewers actually argue. A consensus is reached in most meetings, but frequently in an unpleasant way because an enthusiastic reviewer has little chance of convincing the other two reviewers, and a positive-but-not-that-much consensus does not produce a winning project. In case of unreachable consensus, additional reviewers are invited to read the proposal. Eventually, a score is voted.
    3. deciding. The panel committee meeting is like a program committee meeting except that a ranking is produced (even rejected proposals are ranked). The overall score ranges from 0 to 15, but of course, most proposals are between 8.5 and 13.5. There is a critical tie on a high score (around 13) because only a fraction of the proposals having this score can be funded. A specific algorithm is used to break ties. In our case, proposals are ranked based on:
      1. the highest score in the Exploitation criterion, then, if still tied,
      2. the highest score in the Scientific criterion
      3. the largest ratio of industries
      4. the largest ratio of SMEs
      5. the largest ratio of partners from new member states of EU
      The role of the reviewers is actually marginal during this meeting: checking the consistency between the final text and the final score for every proposal. Downgrading (or upgrading) a proposal after a quick cross-reading is very rare, and requires a long discussion before the panel agrees.
    The overall process suffers from a drawback: reviewers spend a lot of time on bad proposals. Every proposal, even the worst one, requires one hour of consensus meeting. Saving this time could let reviewers read a subset of the best proposals, and increase the quality of the final choice. In the panel meeting, a long time is also wasted on revising the text for every proposal, even the ones that will not be funded, while panelists do not have sufficient time to discuss the borderline proposals.
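
    The ranking with tie-breaks described above can be sketched as a plain lexicographic sort. This is only my reading of the rule, not the Commission's actual tool: the Proposal class, its field names and the numbers in the usage note are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    # All field names are hypothetical; scores are on the 0-5 scale with half-marks.
    name: str
    scientific: float
    consortium: float
    exploitation: float
    industry_ratio: float     # share of industrial partners in the consortium
    sme_ratio: float          # share of SMEs
    new_member_ratio: float   # share of partners from new EU member states

    @property
    def total(self) -> float:
        """Overall score, from 0 to 15."""
        return self.scientific + self.consortium + self.exploitation


def rank(proposals):
    """Sort by overall score, then by the five tie-break keys, best first."""
    return sorted(
        proposals,
        key=lambda p: (
            p.total,
            p.exploitation,
            p.scientific,
            p.industry_ratio,
            p.sme_ratio,
            p.new_member_ratio,
        ),
        reverse=True,
    )


# Made-up proposals: A, B and C are all tied at 13.0.
proposals = [
    Proposal("A", 4.5, 4.5, 4.0, 0.5, 0.2, 0.0),
    Proposal("B", 4.0, 4.5, 4.5, 0.3, 0.2, 0.1),
    Proposal("C", 4.0, 4.5, 4.5, 0.5, 0.1, 0.0),
    Proposal("D", 4.5, 4.5, 4.5, 0.2, 0.0, 0.0),
]
print([p.name for p in rank(proposals)])
```

    In this made-up example, D comes first with 13.5. B and C beat A thanks to their Exploitation score, and C beats B only on the ratio of industries, which shows how far down the list a decision between tied proposals can go.
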

    The consensus part is funny. The overall result of the consensus meetings is rarely a "blind union" of three independent reviews (as is done in most conferences, for example by averaging three scores). For example, three reviewers each gave a 4.0 for the scientific part (which is a very good score) but they reached a consensus with a score of 2.5 (which is below the threshold) because the flaws they identified were complementary, or because they discovered that they shared an overall lack of excitement about the proposal, so they took time to detect actual flaws justifying the rejection.

    The overall process is fair, and there is no way to express any subjective opinion, like "this topic is funnier than the other", "these guys should be assisted because their country is going bankrupt", or "I don't like this crappy acronym".

    March 1, 2011

    inside the FP7 evaluation (part 1?)

    I am involved in the process of project evaluation for the European Commission for the first time. Some selected remarks:
    • there is an art of writing proposals. Scientists know that the art of writing academic papers has become a key skill in the modern science battlefield. The art of writing proposals is also widely acknowledged, but I had never faced it. Now I know. The top proposals I reviewed have a lot in common, including the approach and the style. Generally, these projects are about an exciting but obscure concept (vaporware?), supported by very standard research from high-h-ranked scientists (business as usual), in cooperation with fresh SMEs (old students), and the classic large company (whose role is... hmm, well, to be there). These proposals were probably PowerPoint slides in a previous life: objectives are presented in an emphatic bullet-point style, the proposal is terse (so less risk of inconsistency), and every page contains a figure or a table. As can be expected, the workpackage organization is perfect, with an ideal balance of man-months by workpackage and by partner. It is difficult to know the future of such projects. The academic work is probably already under submission. Several web revolutions will occur before the end of the project. The final software will probably fail, because it will be coded by unmanaged students at universities. However, in the evaluation form of the European Commission, such a proposal deserves a "check" for every critical parameter, so in the end, it has a good chance of winning.
    • it is innovation, it is not about research. For those who had doubts. The best proposals are ambitious. Most of them include some attractive real-world experiments, which require committing a lot of developers. As the overall funding is constrained, the research-oriented demand is minimal. Moreover, as previously said, no inconsistency is tolerable, therefore every dozen claimed man-months should be justified, and related to the rest of the project, which is very short-term. Therefore, the scientific topic of every participating "researcher" is approximately defined in advance. Is it research? Of course not, it is the so-called innovation by research. I tend to be in favor of such early development projects, however the other funding agencies (local and national) follow the same objective, so is there still a way to do research without a predefined purpose? And why the hell are there so few innovations by research from Europe although it is already the seventh similar program?
    • the reviewing process is very short, but well paid. I detected one bad consequence of being a paid reviewer: reviewers have incentives to evaluate more projects, although they have no time to evaluate them carefully. Deadlines are very tight: I had seven projects to review in less than twelve days, each project being a hundred-page document, which details a three-year study by a consortium ranging from six to twelve partners. Moreover, the quality of the proposals was excellent, even for the worst one. Hence, half a day for a complete review is a minimum for a (slow?) young scientist like me. The risk is to produce a quick evaluation based on the strategy of "killing a project for any small detail". My personal reviewing strategy (I saw several people doing the same) was to first make a very quick pass over the whole document, then to go into details once the overall concept was clear. In this context, no inconsistency is tolerable, so it is better to have only one proposal writer, preferably someone who... now come back to the first point of this post.
    I will probably have more to say after my week in Brussels for the final decision.

    January 23, 2011

    Computer science and engineers: the (bad) French exception

    French engineers are famous in all domains but computer science. This is especially a shame in the 21st century, isn't it?

    French engineers come from grandes écoles, a selective French educational track that runs from the undergraduate level to the graduate diploma. Unfortunately, many reasons explain why grandes écoles are unable to produce good engineers in computer science. Among others,
    • students do not know anything about computer science when they join a grande école. Neither programming, nor algorithms... neither logic, nor graph theory... How can you learn a scientific domain from scratch in only three years? You cannot teach English literature to people who don't know English in the first place.
    • most grandes écoles are generalist. The broad extent of the French engineering culture is an advantage, but also a major drawback in the case of computer science. Actually, teachers have just enough time either to give an overview of the fundamentals, or to look at the first steps in software development, which corresponds to what millions of developers know.
    • each grande école is linked to a few successful French companies that massively hire their students. Unfortunately, in France, no successful company exists in computer science. There is neither IBM nor Microsoft out there. Admittedly, Orange is a leading company in Information Technology, but, from my experience working there for a while, I can claim that this company does not deal with computer science at all. Engineers involved in software at Orange just coordinate outsourced workers. Most of them would be unable to develop a quarter of what they outsource. As a matter of fact, Orange has never understood software innovations.
    • the engineering education system is self-perpetuating. Directors of grandes écoles and people in government are the same people who received this engineering education and who have absolutely no skill in computer science. Therefore, computer science suffers from a dramatic lack of understanding. I ignore those who confuse computer science with Microsoft Office, and I struggle with those who equate software engineering with low-level programming outsourced to India.
    As a matter of fact, the prestigious grandes écoles that are the closest to computer science and software engineering (Institut Telecom (which is my current employer), Supelec, Ensimag and Enseeiht) focus on producing so-called "project managers", who are expected to drive low-level developers, and who consider software development as a dirty task. They are probably ideal people for French sub-innovative companies, which passively survive. But, as it is said here and there, they can forget working at Facebook or Google. And they will not be able to boost the rare French innovative companies, e.g. Dailymotion or XWiki.

    We need more geeks. Actually. In the Internet era, this under-representation of geeks among engineers severely penalizes France. A first idea would be to dramatically increase courses related to computer science and software engineering at the undergraduate level. Another idea would be to radically transform the mission of institutions like mine (Telecom Bretagne) so that computer science becomes the top priority. Finally, I wish bureaucrats in government (especially in education and industry) were not from old-fashioned grandes écoles.