Updated: Jan 21, 2021
Looking forward to 2021, what are the key developments in the field of AI and ML? What are the trends that will carry forward into next year? Distilling the work of an entire field in a blog post is straight up impossible, but we think some pieces will stick with us.
Despite all pandemic setbacks, AI is one of the lucky fields where most work can be done from anywhere with a computer and an internet connection. Research output remained strong this year, which ended with close to 44 thousand AI-related papers on arXiv. We predict that 2021 will bring us close to 60 thousand!
And what do these thousands of publications tell us? Among the many contributions from all this literature, there are some broad themes we believe emerged in 2020 and are going to stay with us: Transformers, extravagant compute, biomedical AI and much more!
1. Transformers beyond language
The Transformer revolution shows no signs of slowing down. If 2019 was the year of BERT, a key trend from 2020 has been that of Transformers beyond language models. Any problem that can benefit from large amounts of self-supervised pre-training will probably benefit from using some huge variant of a Transformer architecture. Whether that makes sense computationally is a different story… but with a flood of more efficient Transformer variants like Longformer, Reformer and Big Bird, compute cost is unlikely to stop progress in this area. The key idea behind these variants is that fully connected attention layers are not necessary...
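To make the point about attention concrete, here is a minimal sketch of a local (sliding-window) attention pattern, loosely in the spirit of the windowed attention used by models such as Longformer. It is a toy illustration, not the implementation of any of the models named above: the function names, the sequence length and the window size are all assumptions chosen for the example.

```python
def full_attention_pairs(n):
    """Number of (query, key) pairs scored by fully connected attention."""
    return n * n


def sliding_window_mask(n, window):
    """Build a boolean mask where mask[i][j] is True if token i may attend to token j.

    Assumption for this sketch: each token only sees neighbours at
    distance <= window, instead of every other token in the sequence.
    """
    return [[abs(i - j) <= window for j in range(n)] for i in range(n)]


def count_pairs(mask):
    """Count how many (query, key) pairs the mask allows."""
    return sum(sum(row) for row in mask)


n, window = 16, 2  # hypothetical sequence length and window size
mask = sliding_window_mask(n, window)
print("full attention pairs: ", full_attention_pairs(n))
print("windowed attention pairs:", count_pairs(mask))
```

Fully connected attention grows quadratically with sequence length, while the windowed pattern grows roughly linearly for a fixed window, which is why such patterns make longer inputs affordable.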
This year we’ve seen the Transformer architecture applied in vision, such as in An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, on graphs, such as in Graph Transformer Networks, and on 3D point clouds, such as in Point Transformer.
As Yoshua Bengio describes it, the Transformer architecture gives neural nets the ability to reason about sets of objects and the relationships between them. The spread of this architecture to such a wide range of problems has been impressive, and we can expect it to continue throughout next year.
2. Compute with a capital C
Throughout 2019 and the first half of 2020, it seemed like a record on model size was broken every other month. GPT-3 capped this escalation with its staggering 175 billion parameters and a training/development cost running into the millions of euros. There’s no way around it: sheer compute power continues to surprise, even winning best paper awards at NeurIPS, and the performance ceiling does not seem near. With the new GShard framework from Google and DeepSpeed from Microsoft, there is a clear path towards training models with a trillion or more parameters in 2021.
However, we also see a movement to reverse this trend, as even the largest companies are struggling with model size. For instance, OpenAI infamously did not retrain GPT-3 after finding a bug in their training dataset, due to cost considerations. Much energy needs to be devoted to more efficient models, software and hardware. This should continue to drive AI adoption in applications and ensure broad access for the less affluent. Remember that a human brain only uses around 20 watts on average...
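A quick back-of-the-envelope calculation shows why models of this size need frameworks that split them across many devices. The sketch below only counts the memory needed to store the weights. The 2 and 4 bytes per parameter (16-bit and 32-bit floats) and the 32 GB device size are assumptions for illustration, and real training needs considerably more memory for gradients, optimizer state and activations.

```python
import math


def weight_memory_gb(num_params, bytes_per_param):
    """Memory needed just to store the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def min_devices(num_params, bytes_per_param, device_memory_gb):
    """Lower bound on the number of devices needed to hold the weights alone."""
    return math.ceil(weight_memory_gb(num_params, bytes_per_param) / device_memory_gb)


GPT3_PARAMS = 175e9      # 175 billion parameters, as quoted in the text
DEVICE_MEMORY_GB = 32    # hypothetical accelerator memory, for illustration only

for label, nbytes in [("16-bit", 2), ("32-bit", 4)]:
    print(label,
          weight_memory_gb(GPT3_PARAMS, nbytes), "GB of weights,",
          "at least", min_devices(GPT3_PARAMS, nbytes, DEVICE_MEMORY_GB), "devices")
```

Even under these optimistic assumptions, the weights alone do not fit on a single device, so the model has to be sharded.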
3. A commoditization of architectures
After its introduction in 2014, the Adam optimizer became the default go-to choice for many. Nowadays, the choice of optimizer occupies exactly one short sentence in papers: “we use ADAM with scheduled learning rate […]”. Is it always the best optimizer? Certainly not. Is it almost certainly good enough? Most probably. Paying attention to the optimizer in use has largely become uninteresting in comparison to paying attention to other research questions.
Similarly, the choice of architecture is becoming more of an uninteresting choice to make: “we choose an off-the-shelf ResNet-X”, “Transformer with N layers, D embedding”. Increasingly often, a model’s architecture takes a back seat, conceding the spotlight of research interest to other aspects, such as a new loss function, a data augmentation technique, or the ablations with respect to an interesting parameter. While papers presenting architectural innovations will not stop anytime soon, they will probably become rarer as most of the low-hanging fruit in that department has been collected. And as we've said before, "The Future will be Self-Supervised": we believe our ability to set up learning that benefits from unlabeled data will be more important than the details of the network architecture.
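For readers who have only ever met Adam as that one sentence, here is a minimal sketch of its update rule on a single scalar parameter. The hyperparameter defaults follow the commonly used values from the Adam paper, except the learning rate, which is raised to 0.1 for this toy problem. The toy objective, the function name and the constant learning rate (no schedule) are assumptions made for the example.

```python
import math


def adam_minimize(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a one-dimensional function with Adam, given its gradient."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g        # running average of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running average of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero init
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)  # should end up close to the minimum at x = 3
```

The per-parameter scaling by the squared-gradient average is what makes the method fairly robust to the choice of learning rate, which is a large part of why it became the default.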
4. Graph Neural Networks (GNNs) gain momentum
Many systems such as networks are best represented as graphs; that is, by entities and their relationships, and GNNs make it possible to learn from such structured data at scale.
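The core operation behind most GNNs is message passing: each node updates its representation from those of its neighbours. Below is a minimal sketch of one such step using plain neighbour averaging. Real GNN layers add learned weights and non-linearities on top of this, and the tiny graph, the scalar features and the function name are assumptions made for illustration.

```python
def mean_aggregate(features, edges):
    """One round of neighbour averaging on an undirected graph.

    features: dict mapping node -> scalar feature
    edges:    list of (node, node) pairs
    Each node's new feature is the mean over itself and its neighbours.
    """
    neighbours = {node: [node] for node in features}  # include a self-loop
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    return {
        node: sum(features[n] for n in nbrs) / len(nbrs)
        for node, nbrs in neighbours.items()
    }


# A three-node chain a - b - c where only node "a" starts with a signal.
features = {"a": 1.0, "b": 0.0, "c": 0.0}
edges = [("a", "b"), ("b", "c")]
updated = mean_aggregate(features, edges)
print(updated)
```

After one step the signal from "a" has reached "b" but not yet "c"; stacking more steps lets information travel further across the graph.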
Interestingly, graphs are one of the few areas of AI where logic and ML based approaches can coexist. Check out the fantastic comprehensive survey from 2020 on Knowledge Graphs. The world is full of highly structured data to learn from. Unlocking the possibilities behind it is exciting, and GNNs are an excellent example of how to do so. Ambitious graph benchmarks were also released, such as the Open Graph Benchmark, which will be a key ingredient for the area in the near future.
5. Biomedical applications
The world of ML applied to biology is blossoming: medical imaging, protein folding, drug discovery, etc. One of the most prominent milestones of the year is AlphaFold from DeepMind, which represents a huge leap in the state of the art for predicting what physical shape a protein will fold into given its amino acid sequence.
A lot of work in AI has also been focused on contributing to the battle with COVID-19: by providing better access to the medical literature, through image and signal processing for diagnostic purposes, and in drug discovery, with dedicated workshops or activities in top tier conferences such as EMNLP or NeurIPS. We see this application domain growing stronger in the next year. As the recently minted saying goes, “AI will not replace doctors, but doctors with AI tools will replace doctors without them”.
6. A spotlight on the ethics of AI
As algorithmic predictions and decisions gain influence on our daily lives, public discussions about their societal implications are for sure heating up. We can really recommend watching The Social Dilemma as a highly accessible starting point for these discussions. The use of AI technology by authoritarian governments to monitor and control people is also a topic worthy of ample public debate. In the R&D community itself, especially after the controversial exit of Timnit Gebru from Google, discussions about the ethics of AI, academic freedom in industrial research labs, inclusion and fairness, and our field’s contribution to the greater good of society are also gaining momentum. The old hand-waving argument that problems in AI come from bias or noise in the data, or from evil people applying neutral technology, and not from the algorithms or systems themselves, seems like yesterday’s position and is no longer defensible.
At NeurIPS — arguably the most influential ML conference — the discussion about how the field should tackle broader societal impacts of AI was more prominent than ever, and we hope this leads to more productive and responsible research in the years to come.
While investment in AI startups and scaleups is still very strong, and some notable AI companies such as C3.ai, Palantir and Scale.AI have had very successful IPOs this year, AI has really become mainstream, and needs to make business sense like any other technology. This is positive for the many companies that prove they can solve real problems with AI, but it has also popped a few balloons at the party.
Self-driving vehicles are an example: while miles driven by self-driving vehicles keep increasing, and companies like Tesla keep pushing driving assistance into the mainstream, it’s now clear that ubiquitous fully self-driving cars might still take some time to appear, and will continue to require very serious investments. This, for instance, led Uber, which had earlier bet heavily on self-driving cars, to completely stop these efforts and announce they’d be selling the unit to Aurora. Similar needs for continued large investments have led OpenAI into a closer tie-up with Microsoft on GPT-3, Element AI into an acquisition by ServiceNow, and Google to assert that it will continue to pour money into DeepMind. Hey, who said reaching AGI would be cheap?
The continued strong development of the open-source landscape in AI in 2020 is also worth mentioning, with frameworks such as Hugging Face, PyTorch, TensorFlow, OpenMined and many others. These are lowering the barrier to entry for people all around the world to work in AI and to directly benefit from and build on the work of the huge Big Tech R&D labs. For the progress of the field, we certainly hope that this open access model will continue to flourish in 2021.
We hope you enjoyed the 2020 roller coaster ride in AI. We expect all of the above trends to persist strongly in 2021, but as the field continues to evolve so rapidly, with progress measured in months and weeks rather than years, we are also quite certain to be surprised by what 2021 will offer. Stay tuned, and follow us on @ZetaVector.
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. Alexey Dosovitskiy et al. 2020
Graph Transformer Networks. Seongjun Yun et al. 2019
Point Transformer. Hengshuang Zhao et al. 2020
Language Models are Few-Shot Learners. Tom B. Brown et al. 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding. Dmitry Lepikhin et al. 2020
Adam: A Method for Stochastic Optimization. Diederik P. Kingma et al. 2014
A Survey on Knowledge Graphs: Representation, Acquisition and Applications. Shaoxiong Ji et al. 2020
Open Graph Benchmark: Datasets for Machine Learning on Graphs. Weihua Hu et al. 2020