ICLR 2020 Online: a first review

As Zeta Alpha, we joined the ICLR conference with (almost) the whole team, which illustrates one of the upsides of going online. A team of a dozen people could participate in the online version of the conference for a budget that would otherwise easily be spent on a single person traveling to a physical event. That is a great opportunity not only for smaller companies, but probably even more so for students and members of under-funded institutions! Here is a personal review of the ICLR 2020 conference, along with some quantitative data analysis.

Fully Online, a First

The 2020 edition of ICLR was the first to take place fully online. It was not the very first machine learning conference forced into that move, but it was still a pioneer in this area. The organizers did an amazing job building a unique online conference platform in record time, with video presentations of posters and keynotes for asynchronous access, visualization, browsing and search across all accepted papers, video chats with authors on Zoom, and a group chat platform on Rocket Chat. So how did this work out for the participants? Was it as good as the real thing?


After all, conferences are not only about presentations of state-of-the-art papers, but also an occasion for socialising in the community. You can meet the people behind these amazing new works, ask them to clarify things you would like to understand in more depth, and have fun together. How would that work online? While mingling seems much harder at an online event, it might actually be easier for some.

Approaching others in person can be challenging at a large conference. People are busy, and whether you manage to find that one author whose brain you wanted to pick, and to seize the chance to talk, often depends on whether you both happen to be in the same place at the same time. If not, that is bad luck, and a missed opportunity for both.


As data enthusiasts at Zeta Alpha, we found the text-based part of the online communication beneficial in another way: using the public chat rooms for some quantitative sociological analysis of the conference promised fun for nerds, but also some genuinely interesting insights!

Before ICLR 2020 started, the largest edition ever in terms of participants and accepted papers, we used our platform to find interesting papers. We identified already famous and influential papers up front, and used insights from our semantic search engine to approximate the relevance of papers from different angles.

Now that the conference is over, its chat system provides another perspective, and we are happy to get our hands dirty digging into it for more insights.


For starters, each conference poster had a dedicated chat room in which the authors could be found, certainly during their assigned time slots, but often outside of these as well. Like in real life, discussions sometimes went on even after the authors had left. To see which posters were most actively discussed, we looked at the activity in each of these rooms. To the particular pleasure of the data scientist in us, the activity per room follows a nice Zipf distribution.
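A minimal sketch of how such a check could look, assuming the chat logs are available as (room, message) pairs. The function names `rank_rooms` and `zipf_exponent` and the toy data are purely illustrative; they are not taken from the actual ICLR chat export or from the analysis described here. The idea is to rank rooms by message count and fit a straight line to log(count) against log(rank): a slope close to -1 is the classic Zipf pattern.

```python
import math
from collections import Counter


def rank_rooms(messages):
    """Count messages per room and return (room, count) pairs, busiest first."""
    counts = Counter(room for room, _text in messages)
    return counts.most_common()


def zipf_exponent(counts):
    """Estimate the Zipf exponent from counts sorted in descending order.

    Fits log(count) against log(rank) by ordinary least squares and returns
    the negated slope; a value near 1 is the classic Zipf case.
    Needs at least two counts, all of them positive.
    """
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var


# Hypothetical toy data, constructed so that count = 120 / rank.
messages = (
    [("poster-a", "hi")] * 120
    + [("poster-b", "hi")] * 60
    + [("poster-c", "hi")] * 40
    + [("poster-d", "hi")] * 30
)

ranked = rank_rooms(messages)
exponent = zipf_exponent([count for _room, count in ranked])
print(ranked)
print(round(exponent, 2))
```

On real chat data the fit will be noisier than in this constructed example, and a log-log line fit is only a rough indicator, not a rigorous test of a power law.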

Here are the twenty most actively discussed papers in the ICLR chat fora:

  1. Generalization through Memorization: Nearest Neighbor Language Models by Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis

  2. Your classifier is secretly an energy based model and you should treat it like one by Will Grathwohl, Kuan-Chieh Wang, Joern-Henrik Jacobsen, David Duvenaud, Mohammad Norouzi, Kevin Swersky

  3. Deep Double Descent: Where Bigger Models and More Data Hurt by Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever

  4. Drawing Early-Bird Tickets: Toward More Efficient Training of Deep Networks by Haoran You, Chaojian Li, Pengfei Xu, Yonggan Fu, Yue Wang, Xiaohan Chen, Richard G. Baraniuk, Zhangyang Wang, Yingyan Lin

  5. StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding by Wei Wang, Bin Bi, Ming Yan, Chen Wu, Jiangnan Xia, Zuyi Bao, Liwei Peng, Luo Si

  6. Self-labelling via simultaneous clustering and representation learning by Yuki M. Asano, Christian Rupprecht, Andrea Vedaldi

  7. VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning by Luisa Zintgraf, Kyriacos Shiarlis, Maximilian Igl, Sebastian Schulze, Yarin Gal, Katja Hofmann, Shimon Whiteson

  8. Behaviour Suite for Reinforcement Learning by Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvari, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, Hado Van Hasselt

  9. Learning from Rules Generalizing Labeled Exemplars by Abhijeet Awasthi, Sabyasachi Ghosh, Rasna Goyal, Sunita Sarawagi

  10. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators by Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning

  11. Online and stochastic optimization beyond Lipschitz continuity: A Riemannian approach by Kimon Antonakopoulos, E. Veronica Belmega, Panayotis Mertikopoulos

  12. Plug and Play Language Models: A Simple Approach to Controlled Text Generation by Sumanth Dathathri, Andrea Madotto, Janice Lan, Jane Hung, Eric Frank, Piero Molino, Jason Yosinski, Rosanne Liu

  13. Identity Crisis: Memorization and Generalization Under Extreme Overparameterization by Chiyuan Zhang, Samy Bengio, Moritz Hardt, Michael C. Mozer, Yoram Singer

  14. Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML by Aniruddh Raghu, Maithra Raghu, Samy Bengio, Oriol Vinyals

  15. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut

  16. Hamiltonian Generative Networks by Aleksandar Botev, Irina Higgins, Andrew Jaegle, Sebastian Racaniere, Danilo J. Rezende, Peter Toth

  17. Mogrifier LSTM by Gábor Melis, Tomáš Kočiský, Phil Blunsom

  18. On Identifiability in Transformers by Gino Brunner, Yang Liu, Damian Pascual, Oliver Richter, Massimiliano Ciaramita, Roger Wattenhofer

  19. Jelly Bean World: A Testbed for Never-Ending Learning by Emmanouil Antonios Platanios, Abulhair Saparov, Tom Mitchell

  20. Multiplicative Interactions and Where to Find Them by Siddhant M. Jayakumar, Wojciech M. Czarnecki, Jacob Menick, Jonathan Schwarz, Jack Rae, Simon Osindero, Yee Whye Teh, Tim Harley, Razvan Pascanu

Interestingly, the most discussed paper, by Khandelwal et al., was not in our earlier list of the 20 most-cited papers, and it was not the only new contribution to attract a lot of attention. ALBERT, on the other hand, a much-cited contender in the lightweight class of the BERT model league, apparently did not require as much discussion as some others during the conference.

Not having done any qualitative analysis, we leave further conclusions up to the reader. Feel free to share your own thoughts with us!

Social Events

Like the poster sessions, the social events had their own chat rooms. Discussions took place throughout the conference, and we looked at which communities were most active. The 35 “social” channels were formed around various dimensions such as specific subfields, geography, and social, ethical, and political topics. The most active ones were:

  1. Topics in Language Research

  2. Open source tools and practices in state-of-the-art DL research

  3. BlackInAI Meet-Up

  4. The RL Social

  5. The Bitter Lesson for AI

This list suggests that NLP is, maybe not surprisingly given the surge of attention to this field, the main topic at ICLR. Open-source software development is the norm in both academic and industrial research. Furthermore, we see that topics around diversity, most prominently represented by the “BlackInAI” channel, and meta-topics (“The Bitter Lesson for AI”) were discussed very actively. Reinforcement learning remains a hot topic, as illustrated not only by the number of conference papers, but also by the chat activity.

Online: Pros and Cons

Summing up, and looking back at the first virtual ICLR conference as a whole, here is our summary of the advantages and disadvantages.


Pros:

  • Lower hurdle for participation (financial and visa-related)

  • No traveling: less environmentally harmful and less individual stress

  • (Mostly) asynchronous: little individual adaptation to schedule required

  • Amazing technical platform; thanks for the great organization! This alone is a great addition to the regular conference experience.


Cons:

  • Still no major Machine Learning/NLP conference in Africa

  • Harder to socialize, at least for some (see above)

  • Harder to fully focus on the conference while physically at your normal workplace

I’m curious what other people think, so feel free to discuss with us on Twitter.

Also, if you participated in ICLR 2020, don’t forget to take the survey. This will help the organizers learn and improve for next year's ICLR conference. While it’s unclear whether that can be a physical event again, and whether we would even want it to be, at least the programme chairs are already known.

We are totally excited for next year's ICLR! Hopefully it will be a good mix of in-person meetings and the best of this year's awesome online experience.
