
NLP Seminar

Online

The NLP Seminar Series at AI Sweden is a bi-weekly forum for people who work with or are interested in Natural Language Processing (NLP).

Arranged by: 
AI Sweden and RISE NLP Group

By practitioners, for practitioners

Each seminar features a 45-minute presentation by practitioners or researchers, followed by a discussion. The NLP Seminar Series is organized by AI Sweden and the RISE NLP Group, and moderated by Magnus Sahlgren, Head of Research, Natural Language Understanding at AI Sweden, and Joakim Nivre, Head of Natural Language Processing at RISE.

Proposals for presentations can be sent to magnus.sahlgren@ai.se and francisca.hoyer@ai.se.

 

Speaker February 2: Irina Rish (Mila): Scaling Laws for Foundation Models in Deep Learning

Modern AI systems have achieved impressive results, from superhuman performance in image and speech recognition to beating world champions in chess, poker and Go. However, until recently, such systems remained “narrow specialists” incapable of generalizing to a wide range of diverse tasks without being specifically trained on them.

This situation has started to change with recent advances in very large-scale self-supervised models (also called "foundation models") pretrained on large amounts of diverse data, such as the 175B-parameter GPT-3 language model, the multimodal CLIP, DALL-E, and others. Scaling such models leads to the emergence of impressive few-shot generalization capabilities across broad sets of novel tasks. These advances, considered by many a breakthrough that can pave the way towards the ultimate goal of AI, Artificial General Intelligence, became possible due to a significant increase in the compute available to industrial research labs such as OpenAI, Google, and others. Furthermore, recently discovered empirical neural scaling laws predict these performance improvements as power-law functions of model size, dataset size, and compute.

While these developments are truly exciting, they also pose a new challenge to academic and non-profit AI research organizations, which may not have access to the amount of compute that industry has. This motivated us, a rapidly growing international collaboration across several universities and non-profit organizations (including Mila/University of Montreal, EleutherAI, LAION, AI Sweden, and others), to join forces and initiate an effort to develop common objectives and tools for building large-scale foundation models, and to explore computing resources available to non-profits and academia, such as national supercomputers.
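
To make the power-law claim concrete, here is a minimal sketch of the kind of scaling relation reported in the neural scaling laws literature. The functional form L(N) = (N_c / N)^alpha follows Kaplan et al. (2020), and the constants below are approximate published values used purely for illustration; they are not results from this talk.

# Minimal sketch of an empirical neural scaling law: test loss as a
# power-law function of model size N (non-embedding parameters).
# Form and approximate constants follow Kaplan et al. (2020); they
# are illustrative placeholders, not results from this talk.

def predicted_loss(n_params: float, n_c: float = 8.8e13, alpha: float = 0.076) -> float:
    """L(N) = (N_c / N) ** alpha: loss falls smoothly as a power law in model size."""
    return (n_c / n_params) ** alpha

for n in [1.3e9, 13e9, 175e9]:  # GPT-3 family sizes, for scale
    print(f"N = {n:9.2e} parameters -> predicted loss ~ {predicted_loss(n):.3f}")

The exact constants depend on the dataset and tokenization; the point is only that performance improves smoothly and predictably as model size, data, and compute grow, which is what makes planning large training runs tractable.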

Our long-term, overarching goal is to develop a broad international collaboration united by the objective of building foundation models that are increasingly powerful while remaining safe, robust, and aligned with human values. Such models aim to serve as a foundation for numerous AI applications of great societal value, from industry to healthcare to scientific discovery.

 

Sign me up!