Selected Publications

Intelligent behaviour is fundamentally tied to the ability of the brain to make decisions in uncertain and dynamic environments. In neuroscience, the generative framework of Bayesian Decision Theory has emerged as a principled way to predict how the brain acts in the face of uncertainty. In the first part of my thesis, I study how humans learn to perform a visual object categorisation task. I present a novel experimental paradigm to assess whether people use generative Bayesian principles as a general strategy. We found that humans do behave in a generative manner, but resort to approximate inference when faced with complex computations. In the second part, I consider how one would build a Bayesian ideal observer model of human haptic perception and object recognition, using MuJoCo as an environment. Using only noisy contact-point information on the surface of the hand and noisy hand proprioception, our model simultaneously infers the shape of simple objects and estimates the true hand pose in space. This is implemented using a recursive Bayesian estimation algorithm, inspired by simultaneous localisation and mapping (SLAM) methods in robotics, which can operate on computer-based physical simulations as well as on experimental data from human subjects.
Thesis

Dynamic tactile exploration enables humans to seamlessly estimate the shape of objects and distinguish them from one another in the complete absence of visual information. Such blind tactile exploration integrates information about hand pose with contacts on the skin to form a coherent representation of the object shape. A principled way to understand the underlying neural computations of human haptic perception is through normative modelling. We propose a Bayesian perceptual model for recursive integration of noisy proprioceptive hand pose with noisy skin–object contacts. The model simultaneously forms an optimal estimate of the true hand pose and a representation of the explored shape in an object-centred coordinate system. A classification algorithm can thus be applied to distinguish among different objects based solely on the similarity of their representations. This enables the comparison, in real time, of the shape of an object identified by human subjects with the shape of the same object predicted by our model using motion capture data. Our work therefore provides a framework for a principled study of human haptic exploration of complex objects.
In Haptics: Perception, Devices, Control, and Applications. Lecture Notes in Computer Science (Finalist for best paper at EuroHaptics 2016).
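
To make "recursive integration" of noisy readings concrete, here is a minimal sketch, assuming a single scalar quantity (say, one hand coordinate) with Gaussian noise. It is not the model from the paper, which infers full hand pose and object shape jointly; the names `Gaussian`, `bayes_update` and `recursive_estimate`, and all the numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    """A 1-D Gaussian belief over a scalar quantity (hypothetical helper)."""
    mean: float
    var: float


def bayes_update(prior: Gaussian, obs: float, obs_var: float) -> Gaussian:
    """Fuse a Gaussian prior with one noisy observation of the same quantity."""
    # The gain weights the observation by how uncertain the prior is.
    gain = prior.var / (prior.var + obs_var)
    mean = prior.mean + gain * (obs - prior.mean)
    var = (1.0 - gain) * prior.var
    return Gaussian(mean, var)


def recursive_estimate(prior: Gaussian, observations) -> Gaussian:
    """Apply the update once per observation; each posterior is the next prior."""
    belief = prior
    for obs, obs_var in observations:
        belief = bayes_update(belief, obs, obs_var)
    return belief


# Made-up readings: (value, noise variance). The first stands in for a coarse
# proprioceptive reading, the others for more precise contact-derived readings.
prior = Gaussian(mean=0.0, var=100.0)
readings = [(10.2, 4.0), (9.7, 1.0), (10.1, 1.0)]
posterior = recursive_estimate(prior, readings)
print(posterior)
```

Each update shrinks the variance, and the precisions (inverse variances) of the prior and the readings simply add up, which is why the more reliable readings dominate the final estimate.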

Recent Publications

  • Learning from Demonstration in the Wild (under review)

  • Human Visual classification reflects Bayesian generative representations (under submission)

  • Dynamics of uncertainty in sensorimotor estimation across time (under submission)

  • Reverse-Engineering Human Visual and Haptic Perceptual Algorithms

  • Haptic SLAM: An Ideal Observer Model for Bayesian Inference of Object Shape and Hand Pose from Contact Dynamics

  • The emergence of decision boundaries is predicted by second order statistics of stimuli in visual categorization

  • Haptic SLAM for Context-Aware Robotic Hand Prosthetics - Simultaneous Inference of Hand Pose and Object Shape Using Particle Filters

  • Visual categorization reflects second order generative statistics

Recent Talks

Projects

End-to-end control: A3C-MuJoCo

Applying end-to-end learning to pixel-driven control, where the agent learns with the Asynchronous Advantage Actor-Critic (A3C) method from sparse rewards.

Recent Posts

Successor representations were introduced by Dayan in 1993 as a way to represent a state by the temporal sequence of states that can be reached from it, so that states count as “similar” for TD learning when they lead to similar futures. Dayan derived it in the tabular case, but let’s do it assuming a feature vector $\phi$. We assume that the reward function can be factorised linearly: $$r(s) = \phi(s) \cdot w$$
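
Under the linear reward assumption, the value of a state under a fixed policy factorises the same way: $V(s) = \psi(s) \cdot w$, where $\psi(s)$ is the expected discounted sum of future feature vectors. The snippet below is a small numerical check of that identity, not the derivation from the post. The 3-state chain, the features `PHI`, the weights `W` and the discount `GAMMA` are all made up for illustration.

```python
GAMMA = 0.9

# Hypothetical Markov chain under a fixed policy: 0 -> 1 -> 2, state 2 absorbing.
P = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
]
# Made-up 2-D features phi(s) and reward weights w, so r(s) = phi(s) . w.
PHI = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
W = [0.0, 1.0]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def successor_features(P, PHI, gamma, sweeps=500):
    """Iterate psi(s) = phi(s) + gamma * sum_s' P[s][s'] * psi(s')."""
    n, d = len(PHI), len(PHI[0])
    psi = [[0.0] * d for _ in range(n)]
    for _ in range(sweeps):
        psi = [
            [PHI[s][k] + gamma * sum(P[s][t] * psi[t][k] for t in range(n))
             for k in range(d)]
            for s in range(n)
        ]
    return psi


def state_values(P, rewards, gamma, sweeps=500):
    """Iterate V(s) = r(s) + gamma * sum_s' P[s][s'] * V(s') for comparison."""
    n = len(rewards)
    v = [0.0] * n
    for _ in range(sweeps):
        v = [rewards[s] + gamma * dot(P[s], v) for s in range(n)]
    return v


psi = successor_features(P, PHI, GAMMA)
v_from_psi = [dot(psi[s], W) for s in range(len(PHI))]
v_direct = state_values(P, [dot(phi, W) for phi in PHI], GAMMA)
print(v_from_psi, v_direct)
```

The two value estimates agree because $\psi$ depends only on the dynamics and the features. Changing `W` changes the values without recomputing `psi`. With one-hot features, `psi` reduces to the tabular successor representation.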

Now

What I’m doing now

(This is a now page!)

I am currently working on end-to-end deep reinforcement learning, applied to controlling a Jaco robotic arm from pixel observations. After teaching a Jaco arm to reach target positions from pixels, I’m now trying to make it grasp and lift objects. It’s surprisingly hard to find a good cost function!