Transformers at Work

Zeta Alpha is hosting a Deep Learning for NLP Workshop on the 17th of January 2020 at the Event Space ('Night') of the Startup Village at Science Park 608 in Amsterdam (see travel information below). The workshop is open to pre-registered attendees only. We are completely full, and on-site registration is not possible.

Workshop Program:

14:30-15:00 Arrival and Registration

15:00-15:30 Welcome, Short overview: Jakub Zavrel (Zeta Alpha)

15:30-16:00 Thomas Wolf (Hugging Face)
"Transfer Learning in NLP: Concepts, Tools and Trends"

Over the last two years, the field of Natural Language Processing (NLP) has witnessed the emergence of transfer learning methods and architectures that have significantly improved upon the state of the art on nearly every NLP task. The wide availability and ease of integration of these transfer learning models are strong indicators that these methods will become a common tool in the NLP landscape as well as a major research direction. In this talk, I'll present a quick overview of modern transfer learning methods in NLP and review examples and case studies on how these models can be integrated and adapted in downstream NLP tasks, focusing on open-source solutions.

----

16:00-16:30 Angela Fan (Facebook)

"Long Form Question Answering"

We will discuss long-form question answering, a task requiring elaborate and in-depth answers to open-ended questions. The dataset comprises 270K threads from the Reddit forum "Explain Like I'm Five" (ELI5), where an online community provides answers to questions that are comprehensible to five-year-olds. Compared to existing datasets, ELI5 comprises diverse questions requiring multi-sentence answers. We provide a large set of web documents to help answer each question. Automatic and human evaluations show that an abstractive model trained with a multi-task objective outperforms conventional Seq2Seq and language modeling approaches, as well as a strong extractive baseline. However, our best model is still far from human performance, since raters prefer gold responses in over 86% of cases, leaving ample opportunity for future improvement. In subsequent work, we propose constructing a local graph-structured knowledge base for each query, which compresses the web search information and reduces redundancy. We show that by linearizing the graph into a structured input sequence, models can encode the graph representations within a standard Sequence-to-Sequence setting. We apply this approach to long-form question answering. By feeding graph representations as input, we can achieve better performance than by using retrieved text portions.
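
To make the linearization step in the abstract above more concrete, here is a minimal Python sketch. It assumes the local knowledge graph is stored as (subject, relation, object) triples; the marker tokens `<sub>`, `<rel>`, and `<obj>` and the function name are hypothetical, since the abstract does not specify the input format actually used in the work.

```python
def linearize_graph(triples):
    """Flatten (subject, relation, object) triples into one token sequence.

    Triples sharing a subject are grouped so the subject is written once.
    This grouping is an assumption made for illustration, as one simple
    way of reducing redundancy in the input.
    """
    grouped = {}
    for subj, rel, obj in triples:
        grouped.setdefault(subj, []).append((rel, obj))

    tokens = []
    for subj, edges in grouped.items():
        tokens += ["<sub>", subj]  # hypothetical marker tokens
        for rel, obj in edges:
            tokens += ["<rel>", rel, "<obj>", obj]
    return " ".join(tokens)


# Toy graph built from retrieved evidence for a single question.
triples = [
    ("sky", "appears", "blue"),
    ("sky", "scatters", "sunlight"),
    ("sunlight", "contains", "all colors"),
]
linearized = linearize_graph(triples)
```

The resulting string could then be tokenized and passed to a sequence-to-sequence encoder in place of raw retrieved text.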

16:30-16:50 Coffee Break

16:50-17:20 Marzieh Fadaee (Zeta Alpha)

"What's in a context? The effects of data on deep learning language models"

Deep learning models have achieved substantial improvements in many NLP tasks in recent years. The performance of these models depends heavily on the availability of relevant and sizeable data. The last decade has been marked by the explosive growth of data as well as an increasing appreciation of data as a valuable resource. An interesting question is how deep learning models use data to learn, and which linguistic properties they still fail to capture. In this talk, I look into the learning behaviour of deep learning models and discuss how different properties of the data influence the learning process. Specifically, I examine how these models address different challenges in language understanding, such as the translation of rare words. I present several data alteration and augmentation techniques to address these challenges and help elevate the learning capabilities of these models.

----

17:20-17:50 Mostafa Dehghani (Google)

"Moving Beyond Translation with Universal Transformers"

In this talk, we will discuss the Universal Transformer model. The Universal Transformer is an extension of the Transformer model that combines the parallelizability and global receptive field of the Transformer with the recurrent inductive bias of RNNs, which seems to be better suited to a range of algorithmic and natural language understanding sequence-to-sequence problems. Moreover, as the name implies, in contrast to the standard Transformer, under certain assumptions the Universal Transformer can be shown to be computationally universal.
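
As a rough illustration of the idea in the abstract above, the following Python sketch applies one shared step repeatedly to all positions at once, so depth behaves like recurrence. It is a toy under strong assumptions, not the published model: there are no learned projections, layer normalization, position or timestep embeddings, or adaptive halting, and the single scalar `scale` is a hypothetical stand-in for the shared parameters.

```python
import math


def self_attention(states):
    """Every position attends to every position (global receptive field)."""
    dim = len(states[0])
    out = []
    for query in states:
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in states
        ]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # stable softmax
        total = sum(weights)
        out.append([
            sum(w / total * value[i] for w, value in zip(weights, states))
            for i in range(dim)
        ])
    return out


def shared_step(states, scale):
    """One refinement step: attention, a toy transition, and a residual."""
    attended = self_attention(states)
    return [
        [x + math.tanh(scale * a) for x, a in zip(state, att)]
        for state, att in zip(states, attended)
    ]


def universal_transformer_encode(states, scale, num_steps):
    """Apply the SAME step (same parameters) num_steps times."""
    for _ in range(num_steps):
        states = shared_step(states, scale)
    return states


# Three positions with two features each; values are arbitrary.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
encoded = universal_transformer_encode(tokens, scale=0.5, num_steps=4)
```

A standard Transformer would instead use a separate set of parameters for each layer; here the parameters are reused at every step, which is the recurrent inductive bias the abstract refers to.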

----

17:50-18:00 Panel Q&A and Wrap Up

18:00-19:00 Drinks and Food

19:00-20:00 Brass Rave Unit:
Live acoustic set from one of Amsterdam's hottest brass bands.

20:00-21:30 DJ KREMLIN DISKO

This young Amsterdam-based DJ is ready to show the crowd what old-school disco is about.

22:00 End

Location and getting there:

The Startup Village at Science Park 608 in Amsterdam is easy to reach by bike from the city center, as well as by public transport and by car. The closest train station is Amsterdam Amstel, and from there bus 40 takes you to a stop right in front of the Startup Village containers on the south side of the University of Amsterdam campus.

If you come by car, take exit S113 from the Amsterdam ring and follow the signs for Science Park. The nearest parking (paid, and unfortunately quite pricey) is P7. Yes, biking or public transport really is the much better option!

The workshop and party are in the 'Night' room of the Startup Village event space. This is the red double-high stack of containers with the tent in front of it, next to the tall Equinix data center.