The problem of understanding language from text alone: an overview and potential solutions.
Abstract: We have recently seen the development of complex language models capable of everything from writing news articles to generating stories about unicorns. These models excel at many natural language understanding tasks, yet they cannot answer basic questions about multiplication, how to get a table through a doorway, or what color a dove is. Recent arguments claim that this is to be expected: a model that has only ever seen text cannot be expected to understand the meaning behind the words it sees, since that meaning lies somewhere outside the text. Consequently, these powerful language models have recently been referred to as “stochastic parrots”.
One proposed way to mitigate this issue is to connect the model to more information than text, such as images, sounds, and the effects of actions, which may provide the meaning behind the text. This is also referred to as “grounding the model”. The proposed setup is not entirely dissimilar to how humans can read and write but also interact with the outside world, e.g. by seeing and hearing. In my presentation, I will give an overview of this problem and the proposed solutions. I will also talk about my contributions to this research area.
Speaker: Lovisa Hagström, Chalmers University of Technology