First interdisciplinary symposium on Information-seeking, curiosity and attention

oudeyer · November 26, 2014, 9:58am

First interdisciplinary symposium on Information-seeking, curiosity and attention (Neurocuriosity 2014)

Date and Location

6-7 Nov. 2014, Inria Bordeaux Sud-Ouest, France

Topics

The past few years have seen a surge of interest in the mechanisms of active learning, curiosity and information seeking, and this body of work has highlighted a number of highly significant questions regarding higher cognition and its development (for a recent review, see Tics13). One question is how subjects explore to build explanatory models of their environment, and how these models further constrain the sampling of additional information. A related question is how the brain generates the intrinsic motivation to seek information when physical rewards are absent or unknown, and how this impacts cognitive development in the long term. Our goal is to stimulate discussion on these and related topics and foster further research in this nascent and complex field.

Organizers:

Pierre-Yves Oudeyer (Inria, Bordeaux, France)
Jacqueline Gottlieb (Columbia University, NY, USA)
Manuel Lopes (Inria, Bordeaux, France)

Funding

This workshop was partially funded by the Inria Associated Team Neurocuriosity grant and the ERC Explorers 240007 grant.

Video Presentations and slides

Full Youtube playlist here

Exploration in Active Tasks

Michael Frank, Brown University, US
Go to abstract, video and slides:
Probing for informativeness on latent states during reinforcement learning

Sam Gershman, MIT, US
Go to abstract, video and slides:
Novelty and inductive generalization in human reinforcement learning

Kevin Gurney, Univ. Sheffield, UK
Go to abstract, video and slides:
Computational models of action discovery in animals

Jacqueline Gottlieb, Columbia University, NY, USA
co-author: Manuel Lopes, Inria, Bordeaux, France
Go to abstract, video and slides:
Parietal neurons identify informative steps in sequential actions

We thank Olivier Mangin and Thibault Munzer for the video recording and editing work.

oudeyer · November 26, 2014, 10:22am

Michael Frank, Brown University, US
Probing for informativeness on latent states during reinforcement learning

Abstract: Learning and action selection are complicated by the fact that observed events (e.g., your flight departing on time) are often determined in part by unobserved processes (e.g., the weather conditions en route). The advantage of considering latent processes is evident when events are predictable if considered in concert with their latent influences, but appear random otherwise. Recent studies have suggested that ‘exploratory’ behaviour can be conceptualized as targeting the reduction of various forms of uncertainty. However, little work has examined whether organisms actively select actions that would reduce uncertainty about the state of latent processes. We propose a model that defines action selection in terms of both expected value and mutual information shared between action outcomes and latent state value. Critically, the tradeoff between the two components is a function of belief state uncertainty, quantified by the belief state’s entropy. We investigated this hypothesis using a latent structure reinforcement learning task. Participants were asked to repeatedly pick amongst cards that could be drawn from one of two decks. Participants were blind to the deck in play on each trial, but knew how the payoffs varied across decks. Results reveal an exploitative strategy when inferred belief state uncertainty was low, and an increased probability of foregoing possible reward in favour of cards with informative outcomes when belief state uncertainty was high. This pattern was observed across a broad range of task parameters, and was even observed when the hidden state value had no bearing on the optimal policy (e.g., the optimal policy was identical across decks). These results suggest that belief state uncertainty predicts the prioritization of information relative to reward, and that the reduction of uncertainty may be valued in its own right. I will present preliminary EEG data testing neural correlates of our model.
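
Editor's illustration: the abstract describes scoring actions by both expected value and the mutual information between outcomes and the latent state, with the balance set by belief entropy. The sketch below is one possible reading of that idea, not the authors' model. The linear mixing rule in `card_score`, the two example cards and their payoff probabilities are all assumptions made for illustration.

```python
import math

def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def expected_value(belief_a, p_reward):
    """Expected reward of a card.

    belief_a is P(deck A is in play); p_reward is the pair
    (P(reward | deck A), P(reward | deck B)).
    """
    return belief_a * p_reward[0] + (1 - belief_a) * p_reward[1]

def mutual_information(belief_a, p_reward):
    """I(outcome; deck) = H(outcome) - H(outcome | deck), in bits."""
    marginal = expected_value(belief_a, p_reward)
    conditional = (belief_a * binary_entropy(p_reward[0])
                   + (1 - belief_a) * binary_entropy(p_reward[1]))
    return binary_entropy(marginal) - conditional

def card_score(belief_a, p_reward):
    """Hypothetical linear mix: weight on information grows with belief entropy."""
    w = binary_entropy(belief_a)  # 0 when the deck is known, 1 at belief 0.5
    return ((1 - w) * expected_value(belief_a, p_reward)
            + w * mutual_information(belief_a, p_reward))

def choose_card(belief_a, cards):
    """Pick the card with the highest mixed score."""
    return max(cards, key=lambda name: card_score(belief_a, cards[name]))

# Assumed example cards, not taken from the task described in the talk.
cards = {
    "safe": (0.7, 0.7),        # same payoff under both decks: uninformative
    "diagnostic": (0.9, 0.1),  # payoff depends on the deck: informative
}
```

Under these assumed numbers, `choose_card(0.5, cards)` favours the diagnostic card even though its expected reward is lower, while `choose_card(0.02, cards)` falls back on the safe card, which mirrors the qualitative pattern reported in the abstract.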

Slides: https://www.dropbox.com/s/z1df1hkb6fbthor/Frank_Bordeaux_probing.pdf?dl=0

Back to the symposium’s table of contents

oudeyer · November 26, 2014, 10:52am

Sam Gershman, MIT, US
Novelty and inductive generalization in human reinforcement learning

Abstract: In reinforcement learning, a decision maker searching for the most rewarding option is often faced with the question: what is the value of an option that has never been tried before? One way to frame this question is as an inductive problem: how can I generalize my previous experience with one set of options to a novel option? I show how hierarchical Bayesian inference can be used to solve this problem, and describe an equivalence between the Bayesian model and temporal difference learning algorithms that have been proposed as models of reinforcement learning in humans and animals. According to this view, the search for the best option is guided by abstract knowledge about the relationships between different options in an environment, resulting in greater search efficiency compared to traditional reinforcement learning algorithms previously applied to human cognition. In two behavioral experiments, I test several predictions of the model, providing evidence that humans learn and exploit structured inductive knowledge to make predictions about novel options. In light of this model, I suggest a new interpretation of dopaminergic responses to novelty.

Slides: https://www.dropbox.com/s/rofd1fe2x0a5lam/Gershman_Neurocuriosity_Nov14.pdf?dl=0

Back to the symposium’s table of contents

oudeyer · November 26, 2014, 10:54am

Kevin Gurney, Univ. Sheffield, UK
Computational models of action discovery in animals

Abstract: How can animals acquire a repertoire of actions enabling the achievement of their goals? Moreover, how can this be done spontaneously without the animal being instructed, or without having some overt, primary reward assigned to successful learning? The relations between actions and outcomes are presumed to be held in internal models, encoded in associative neural networks. In order for these associations to be learned, representations of the motor action, sensory context, and the sensory outcome must be repeatedly activated in the relevant neural systems. This requires a transient change in the action selection policy of the agent, so that the to-be-learned action is selected more often than other competing actions; we dub this policy change ‘repetition bias’. A key component in this scheme is a set of sub-cortical nuclei, the basal ganglia. There is evidence to suggest the basal ganglia may be subject to reinforcement learning, with phasic activity in midbrain dopamine neurons constituting a reinforcement signal. We propose that this signal encodes a sensory prediction error, thereby suggesting how learning may be intrinsically motivated by exploration of the environment. These ideas have recently been quantified in a model of intrinsically motivated action learning in basal ganglia, and tested in a simple autonomous agent whose behaviour is constrained to mimic that of rats in an in vivo experiment. The model can account for much of the in vivo data, and shows a complex interplay of mechanisms that we believe are responsible for repetition bias and biological action discovery.

Slides: https://www.dropbox.com/s/zg5fghkp4lnj1v9/Gurney_Bordeaux_2014.pdf?dl=0

Back to the symposium’s table of contents

oudeyer · November 26, 2014, 10:55am

Jacqueline Gottlieb, Columbia University, NY, USA
co-author: Manuel Lopes, Inria, Bordeaux, France
Parietal neurons identify informative steps in sequential actions

Abstract: Economic analysis has long recognized that information has value because it allows individuals to make choices that yield higher payoffs than would be obtained in the absence of the information. This logic is evident in complex behaviors such as reading or hiring a consultant before making a decision; it is also evident in more mundane behaviors such as orienting a sensory receptor to sample task-relevant information (e.g., looking at the traffic before crossing a street). However, a distinction can be made based on the fact that obtaining information requires cognitive engagement (to discriminate and interpret the information) which need not be required for all rewarded steps. We show that parietal cortical neurons implicated in eye movement control honor this distinction. The neurons have predictive responses that encode the gains in information expected after a saccade, in ways that cannot be explained by the cumulative future rewards or reward prediction errors associated with the saccade. The findings indicate that the brain distinguishes informative relative to rewarded steps and may preemptively recruit cognitive resources to process these steps.
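
Editor's illustration: the economic notion of information value mentioned at the start of the abstract can be written as the expected payoff when the state is known before acting, minus the best expected payoff without that knowledge. The sketch below works through the street-crossing example; the belief and payoff numbers are assumptions chosen for illustration and do not come from the talk.

```python
def best_expected_payoff(belief, payoffs):
    """Expected payoff of the best single action under a belief over states.

    payoffs[action][state] is the payoff of taking `action` when `state` holds.
    """
    return max(sum(belief[s] * row[s] for s in belief)
               for row in payoffs.values())

def value_of_information(belief, payoffs):
    """Expected gain from learning the true state before acting."""
    informed = sum(belief[s] * max(row[s] for row in payoffs.values())
                   for s in belief)
    return informed - best_expected_payoff(belief, payoffs)

# Assumed numbers for the "look at the traffic before crossing" example.
belief = {"road_clear": 0.8, "car_coming": 0.2}
payoffs = {
    "cross": {"road_clear": 1.0, "car_coming": -10.0},
    "wait":  {"road_clear": 0.0, "car_coming": 0.0},
}
```

With these numbers, looking first is worth having because it lets the agent cross only when the road is clear; when the belief is already certain, `value_of_information` returns zero.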

Slides: https://www.dropbox.com/s/zpskomt8cact2m5/2014_Bordeaux_GottliebLopes.pdf?dl=0

Back to the symposium’s table of contents

oudeyer · November 26, 2014, 10:58am

Teodora Gliga, Birkbeck College, London
co-authors: Katarina Begus and Victoria Southgate
The ontogeny of human curiosity: a few baby steps

Abstract: I will first give a brief overview of our research programme, which broadly addresses the following questions: (1) what are the earliest means through which infants seek information from others, (2) what neural mechanisms underlie the drive for information (or the lack of) and how do they affect learning, (3) what are infants curious about and (4) what contributes to the emergence of differences in trait curiosity. I will then go on to describe the experimental approaches we took to answer the first two of these questions. In a first series of studies we demonstrated that pointing is one of the earliest expressions of human curiosity, by showing that it is driven by a desire for information and leads to better learning. To understand the neural mechanisms accompanying information seeking, in a second series of studies we measured EEG theta band activity while infants were expecting information. Theta band oscillations have been previously related to expectations and exploration and they are believed to reflect the tuning of cortical processing mechanisms to the incoming information. Less theta activity was measured in anticipation of non-informative communication, suggesting active tuning down of the information uptake. In a different study we showed that the power of theta band measured during object exploration correlated with subsequent measures of learning. I will conclude by placing these findings in the perspective of studies on the neural bases of information processing, attention and motivation and propose new avenues for the investigation of neuro-devo-curiosity.

Slides: https://www.dropbox.com/s/3hzl6h5qj8wt6wi/Gliga_bordeaux_14.pdf?dl=0

Back to the symposium’s table of contents

oudeyer · November 26, 2014, 11:01am

Pierre-Yves Oudeyer, Inria, Bordeaux, France
co-author: Linda Smith, Univ. Indiana, US
The impact of curiosity-driven learning on the self-organization of developmental process: robotic models

Abstract: Infants’ own activities create and actively select their learning experiences. I will review here recent models of embodied curiosity-driven learning and information-seeking, and show that these mechanisms have deep implications for development and evolution. First, I will discuss how they can self-organize epigenesis with emergent ordered behavioral and cognitive developmental stages. I will outline a robotic experiment studying the hypothesis that progress in learning in and for itself generates intrinsic rewards: the robot learner probabilistically selects experiences according to their potential for reducing uncertainty. We show that a learning curriculum adapted to the current constraints of the learning system automatically forms, and at the same time constrains learning and shapes the developmental trajectory, sharing many properties with infant development, with a mixture of regularities and diversities in the developmental patterns. In particular, it leads the learner to successively discover object affordances and vocal interaction with its peers. I will also present an experiment with a model of vocal development in the young infant, which shows how the interaction between curiosity-driven exploration of vocalization and imitation of speech sounds produced by social peers self-organizes important stages of vocal development. Finally, I will argue that such emergent developmental structures can guide and constrain evolution. In particular, they constitute a reservoir of behavioral and cognitive innovations that can be recruited later for functions not yet anticipated, including primitive forms of language.
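
Editor's illustration: the idea that progress in learning generates intrinsic reward, with experiences selected probabilistically, can be sketched as follows. This is a simplified stand-in and not the robotic architecture presented in the talk. The windowed definition of learning progress, the `epsilon` floor and the three example activities are assumptions made for illustration.

```python
import random

def learning_progress(errors, window=5):
    """Drop in mean prediction error between the previous and the latest window."""
    if len(errors) < 2 * window:
        return 0.0
    older = errors[-2 * window:-window]
    recent = errors[-window:]
    return sum(older) / window - sum(recent) / window

def selection_probabilities(error_histories, window=5, epsilon=0.05):
    """Probability of picking each activity, proportional to positive progress.

    epsilon keeps a small chance of revisiting activities with no measured progress.
    """
    scores = {name: max(learning_progress(errs, window), 0.0) + epsilon
              for name, errs in error_histories.items()}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}

def choose_activity(error_histories, rng=random, window=5, epsilon=0.05):
    """Sample one activity according to the selection probabilities."""
    probs = selection_probabilities(error_histories, window, epsilon)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]

# Assumed prediction-error histories for three hypothetical activities.
histories = {
    "too_easy":  [0.05] * 10,  # already mastered: no progress
    "learnable": [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05],
    "too_hard":  [0.9] * 10,   # error never falls: no progress
}
```

In this toy setting most of the probability mass goes to the activity whose error is currently falling, so the learner drifts away from tasks that are either mastered or unlearnable. That is one simple way a curriculum can form without being specified in advance.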

Slides: https://www.dropbox.com/s/xk2m426hg4rt9fr/NeurocuriosityWorkshopOudeyer14.pptx?dl=0

Selected publications associated with this talk:

How Evolution may work through Curiosity-driven Developmental Process
Oudeyer, P-Y. and Smith, L. (in press)
Topics in Cognitive Science.

Information Seeking, Curiosity and Attention: Computational and Neural Mechanisms
Gottlieb, J., Oudeyer, P-Y., Lopes, M., Baranes, A. (2013)
Trends in Cognitive Sciences, 17(11), pp. 585-596.

Self-organization of early vocal development in infants and machines: the role of intrinsic motivation
Moulin-Frier, C., Nguyen, S.M., Oudeyer, P-Y. (2014)
Frontiers in Psychology (Cognitive Science), 4(1006).

What is intrinsic motivation? A typology of computational approaches
Oudeyer P-Y. and Kaplan F. (2007)
Frontiers in Neurorobotics, 1:6.

Intrinsic Motivation Systems for Autonomous Mental Development
Oudeyer, P-Y., Kaplan, F. and Hafner, V. (2007)
IEEE Transactions on Evolutionary Computation, 11(2), pp. 265–286.

Back to the symposium’s table of contents

oudeyer · November 26, 2014, 11:03am

Celeste Kidd, University of Rochester, US
Rational approaches to learning and development

Abstract: Good decision-making requires the decision-maker to generate accurate expectations about what is likely to happen in the future. Adults’ decisions, especially those pertaining to attention and learning, are guided by their substantial experience in the world. Very young children, however, possess far less data. In this talk, I will discuss work that explores the mechanisms that guide young children’s early attentional decisions and subsequent learning. I present eye-tracking experiments that combine behavioral methods and computational modeling in order to test competing theories of attentional choice. I present evidence that young learners rely on rational utility maximization both to build complex models of the world starting from very little knowledge and, more generally, to guide their behavior. I will also discuss recent results from related on-going projects about learning and attention in macaque learners, as well as some data on other sorts of decision-making processes in children.

Slides: https://www.dropbox.com/s/mw2yz6cusjhc969/CKidd_InriaCuriosity_6Nov2014.pdf?dl=0

Back to the symposium’s table of contents

oudeyer · November 26, 2014, 11:04am

Gert Westermann, Univ. Lancaster, UK
Some thoughts on curiosity in infants and neural network models

Abstract: Studies on young infants’ learning of objects and categories often present infants with a sequence of individual stimuli in a fixed or randomized order. However, according to the formalization of curiosity-based learning, infants should select stimuli systematically on the basis of their prior knowledge in order to optimize learning. It is therefore possible that the results from experimental studies represent an epiphenomenon of the underlying curiosity-based learning process. Likewise, computational models of infant learning which aim to reveal the mechanisms underlying the learning process rely on sequential presentation of stimuli in a way similar to the experimental work and might therefore be unable to capture curiosity as a driver for learning.
Here I will discuss how infants’ display of learning in traditional tasks relates to potential curiosity-based learning. I will also discuss how the formalization of curiosity relates to learning in neural network models, and I hope to be able to present some pilot data from curiosity-based modelling.

Slides: https://www.dropbox.com/s/4070ruw01hyqrj9/Westermann_Bordeaux_talk.pdf?dl=0

Back to the symposium’s table of contents

oudeyer · November 26, 2014, 11:05am

Laura Schulz, MIT, USA
Curiosity, intrinsic motivation and learning

Abstract: to come

Slides: to come

Back to the symposium’s table of contents

oudeyer · November 26, 2014, 11:07am

Manuel Lopes, Inria Flowers, France
An overview of active learning approaches in machine learning

Abstract: In this survey we present different approaches that allow an intelligent agent to autonomously explore its environment to gather information and learn multiple tasks. Different communities have proposed different solutions that are, in many cases, similar and/or complementary. These solutions include active learning, exploration/exploitation, online learning and social learning. The common aspect of all these approaches is that it is the agent that selects and decides what information to gather next. Applications for these approaches already include tutoring systems, autonomous grasping learning, navigation and mapping, and human-robot interaction. We discuss how these approaches are related, explaining their similarities and their differences in terms of problem assumptions and metrics of success. We consider that such an integrated discussion will improve inter-disciplinary research and applications.

Slides: https://www.dropbox.com/s/mzu3una3jehwl6n/Lopes_14activelearningexploration.pdf?dl=0

Back to the symposium’s table of contents

oudeyer · November 26, 2014, 11:09am

Jochen Triesch, Frankfurt Institute for Advanced Studies, Germany
Active efficient coding

Abstract: The goal of perceptual systems is to provide useful knowledge about the environment and to encode this information efficiently. As such, perception is an active process that often involves the movement of sense organs such as the eyes. This active nature of perception has typically been neglected in popular theories describing how nervous systems learn sensory representations. Here we present an approach for intrinsically motivated learning during active perception that treats the learning of sensory representations and the learning of movements of the sense organs in an integrated manner. In this approach, a generative model learns to encode the sensory data while a reinforcement learner directs the sense organs so as to make the generative model work as efficiently as possible. To this end, the reinforcement learner receives an intrinsic reward signal that measures the encoding quality currently obtained by the generative model. In the context of binocular vision, the approach is shown to lead to a self-calibrating stereo vision system that learns a representation for binocular disparity while at the same time learning proper vergence eye movements to fixate objects. The approach is quite general and can be applied to other types of eye movements such as smooth pursuit and may be extended to different sensory modalities. Somewhat surprisingly, the approach also offers a new perspective on the development of imitation abilities.

Slides: to come

Back to the symposium’s table of contents

oudeyer · November 26, 2014, 11:09am

Friedrich Sommer, UC Berkeley, US
Information-theory based policies for exploratory learning in closed sensori-motor loops

Abstract: Over the last two decades great progress has been made in understanding how sensory representations are learned in the brain driven by the principle of efficient coding. In contrast, we are still lacking theories of how motor output is guided for optimizing learning in closed sensori-motor loops. My talk will first review foundational work that defined information gain and proposed it for guiding optimal experimental design and for driving learning in action-perception loops. Second, I will present our recent work on exploratory learning of agents in unknown environments, whose actions optimize information gain within a multi-step time horizon. Finally, I will discuss the extension of this work to the exploration of unbounded state spaces.
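
Editor's illustration: information gain is commonly measured as the Kullback-Leibler divergence between the belief after an observation and the belief before it. The sketch below scores each action by the information gain it is expected to yield in a single step, using a count-based estimate of outcome probabilities. It is a one-step simplification under stated assumptions and not the multi-step method presented in the talk. The symmetric Dirichlet prior `alpha` and the example counts are assumptions.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits for two discrete distributions; q must be positive."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def estimate(counts, alpha=1.0):
    """Posterior mean of outcome probabilities under a symmetric Dirichlet prior."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]

def predicted_information_gain(counts, alpha=1.0):
    """Expected KL between the updated and the current estimate after one more trial."""
    current = estimate(counts, alpha)
    gain = 0.0
    for outcome, prob in enumerate(current):
        updated_counts = list(counts)
        updated_counts[outcome] += 1  # imagine observing this outcome
        gain += prob * kl_divergence(estimate(updated_counts, alpha), current)
    return gain

def most_informative_action(transition_counts, alpha=1.0):
    """Pick the action whose next outcome is expected to change the model most."""
    return max(transition_counts,
               key=lambda a: predicted_information_gain(transition_counts[a], alpha))

# Assumed outcome counts for two hypothetical actions in one state.
transition_counts = {
    "well_explored": [40, 10, 0],
    "rarely_tried":  [1, 0, 0],
}
```

Here the rarely tried action scores higher because one more observation shifts its estimate far more than it shifts the estimate for the action already tried fifty times.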

Slides: https://www.dropbox.com/s/ek2ckslw57gtp7r/Sommer_bordeaux14.pdf?dl=0

Back to the symposium’s table of contents

oudeyer · November 26, 2014, 11:11am

Axel Kacelnik, University of Oxford, UK
Truthful information can induce irrational choice, but only when it cannot be used

Abstract: Previous research has shown that if two sources of probabilistic, delayed rewards differ in that one signals forthcoming outcomes immediately after being chosen and the other does not, pigeons and starlings prefer the informative option even if its reward probability is much lower. This is striking because they seem to be hungry for information that is given after a choice and cannot be used to alter outcomes, and is thus useless. The preference is extremely robust and is evidenced both as choice in simultaneous encounters and differential response times in sequential encounters. The effect is consistent with aversion to length of time under uncertainty: when this time is shortened, preference switches towards the higher probability option. We show that the observed preferences would maximize gains if subjects computed expectations as if post-choice information were usable, as when predators abandon a chase when sure of the prey escaping, neglecting temporal costs common to the alternatives. Associative learning can account for these phenomena if attention mechanisms ensure that predictable time costs that are avoidable in nature are edited out, and not credited to stimuli preceding the choice.

Slides: to come

Back to the symposium’s table of contents

oudeyer · November 26, 2014, 11:12am

Jacqueline Gottlieb (talk no 2), Columbia University, NY, USA
Sampling useless information: promises and challenges in current research

Abstract: In an active task context, subjects sample information in order to improve the chance of success of their future actions, creating a close association between gains in information and gains in rewards. To break this association, recent studies have turned to lottery-like tasks where subjects may choose to obtain or forego information but they cannot use that information to increase their future rewards. I will review studies using this approach in monkeys and humans, including a published study and ongoing experiments in our laboratory, and discuss some of its strengths and limitations.

Slides: https://www.dropbox.com/s/nki12hnfuyz1zql/2014_Bordeaux_Gottlieb.pdf?dl=0

Back to the symposium’s table of contents

forestier · December 22, 2014, 9:11pm

Hi,
I’ve written some notes about this workshop here.
Best,
Sébastien Forestier
