Action selection performance of a reconfigurable basal ganglia inspired model with Hebbian-Bayesian Go-NoGo connectivity
Stockholm University, Faculty of Science, Numerical Analysis and Computer Science (NADA). Royal Institute of Technology, Sweden.
2012 (English). In: Frontiers in Behavioral Neuroscience, ISSN 1662-5153, E-ISSN 1662-5153, Vol. 6, 65. Article in journal (Refereed), Published
Abstract [en]

Several studies have shown a strong involvement of the basal ganglia (BG) in action selection and dopamine-dependent learning. The dopaminergic signal to striatum, the input stage of the BG, has been commonly described as coding a reward prediction error (RPE), i.e., the difference between the predicted and actual reward. The RPE has been hypothesized to be critical in the modulation of the synaptic plasticity in cortico-striatal synapses in the direct and indirect pathway. We developed an abstract computational model of the BG, with a dual pathway structure functionally corresponding to the direct and indirect pathways, and compared its behavior to biological data as well as other reinforcement learning models. The computations in our model are inspired by Bayesian inference, and the synaptic plasticity changes depend on a three-factor Hebbian-Bayesian learning rule based on co-activation of pre- and post-synaptic units and on the value of the RPE. The model builds on a modified Actor-Critic architecture and implements the direct (Go) and the indirect (NoGo) pathway, as well as the reward prediction (RP) system, acting in a complementary fashion. We investigated the performance of the model system when different configurations of the Go, NoGo, and RP system were utilized, e.g., using only the Go, NoGo, or RP system, or combinations of those. Learning performance was investigated in several types of learning paradigms, such as learning-relearning, successive learning, stochastic learning, reversal learning and a two-choice task. The RPE and the activity of the model during learning were similar to monkey electrophysiological and behavioral data. Our results, however, show that there is no unique best configuration of this BG model that handles all the tested learning paradigms well. We thus suggest that an agent might dynamically configure its action selection mode, possibly depending on task characteristics and also on how much time is available.
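The reward prediction error described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the sign convention (actual minus predicted reward), the running-average update of the prediction, and the names `reward_prediction_error`, `update_prediction`, `simulate`, and `learning_rate` are assumptions chosen for illustration only.

```python
def reward_prediction_error(actual_reward, predicted_reward):
    """RPE as the difference between actual and predicted reward.

    Assumption: positive RPE means the outcome was better than expected.
    """
    return actual_reward - predicted_reward


def update_prediction(predicted_reward, rpe, learning_rate=0.1):
    """Move the reward prediction a fraction of the way toward the outcome.

    Assumption: a simple delta-rule update, not the rule used in the paper.
    """
    return predicted_reward + learning_rate * rpe


def simulate(rewards, learning_rate=0.1, initial_prediction=0.0):
    """Run the update over a sequence of rewards; return final prediction and RPEs."""
    prediction = initial_prediction
    rpes = []
    for reward in rewards:
        rpe = reward_prediction_error(reward, prediction)
        rpes.append(rpe)
        prediction = update_prediction(prediction, rpe, learning_rate)
    return prediction, rpes
```

With a constant reward, the RPE in this sketch shrinks on every trial as the prediction approaches the reward, which is the qualitative behavior usually associated with an RPE signal.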

Place, publisher, year, edition, pages
2012. Vol. 6, 65
Keyword [en]
basal ganglia, behavior selection, reinforcement learning, Hebbian-Bayesian plasticity, Bayesian inference, BCPNN, direct-indirect pathway, dopamine
National Category
Computer Science; Neurosciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:su:diva-82966
DOI: 10.3389/fnbeh.2012.00065
ISI: 000310727200001
OAI: oai:DiVA.org:su-82966
DiVA: diva2:576764
Note

AuthorCount:3;

Available from: 2012-12-13 Created: 2012-12-03 Last updated: 2017-12-06 Bibliographically approved
In thesis
1. Computational Modeling of the Basal Ganglia: Functional Pathways and Reinforcement Learning
2015 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

We perceive the environment via sensor arrays and interact with it through motor outputs. The work of this thesis concerns how the brain selects actions given the information about the perceived state of the world and how it learns and adapts these selections to changes in this environment. Reinforcement learning theories suggest that an action will be more or less likely to be selected if the outcome has been better or worse than expected. A group of subcortical structures, the basal ganglia (BG), is critically involved in both the selection and the reward prediction.

We developed and investigated a computational model of the BG. We implemented a Bayesian-Hebbian learning rule, which computes the weights between two units based on the probability of their activations. We were able to test how various configurations of the represented pathways impacted the performance in several reinforcement learning and conditioning tasks. Then, following the development of a more biologically plausible version with spiking neurons, we simulated lesions in the different pathways and assessed how they affected learning and selection.

We observed that the evolution of the weights and the performance of the models qualitatively resembled experimental data. The absence of a unique best way to configure the model over all the learning paradigms tested indicates that an agent could dynamically configure its action selection mode, mainly by including or excluding the reward prediction values in the selection process. We present hypotheses on possible biological substrates for the reward prediction pathway. We base these on the functional requirements for successful learning and on an analysis of the experimental data. We further simulate a loss of dopaminergic neurons similar to that reported in Parkinson’s disease. We suggest that the associated motor symptoms are mostly caused by an impairment of the pathway promoting actions, while the pathway suppressing them seems to remain functional.
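The idea of computing a weight between two units from the probability of their activations can be sketched as follows. The log-ratio form is the one commonly given for BCPNN-style weights; the exponential-average traces, the scaling of the update rate by the magnitude of the RPE, and all function and parameter names are assumptions for illustration, not the thesis implementation.

```python
import math


def update_trace(p, x, rate):
    """Exponential moving average of an activation (estimate of a probability)."""
    return p + rate * (x - p)


def bayesian_hebbian_weight(p_pre, p_post, p_joint, eps=1e-6):
    """Weight as log(P(pre, post) / (P(pre) * P(post))).

    Zero when the units are statistically independent, positive when they
    are co-active more often than chance, negative otherwise.
    eps is a hypothetical floor that keeps the logarithm finite.
    """
    p_pre = max(p_pre, eps)
    p_post = max(p_post, eps)
    p_joint = max(p_joint, eps * eps)
    return math.log(p_joint / (p_pre * p_post))


def three_factor_step(traces, pre, post, rpe, base_rate=0.1):
    """Update (p_pre, p_post, p_joint) with a rate gated by the RPE.

    Assumption: the third factor scales the update rate by |RPE|, so an
    outcome that matches the prediction (RPE = 0) leaves the traces unchanged.
    """
    rate = min(1.0, base_rate * abs(rpe))
    p_pre, p_post, p_joint = traces
    return (
        update_trace(p_pre, pre, rate),
        update_trace(p_post, post, rate),
        update_trace(p_joint, pre * post, rate),
    )
```

In this sketch, repeated co-activation of a pre- and post-synaptic unit under non-zero RPE raises the joint trace relative to the product of the individual traces, which pushes the weight upward.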

Place, publisher, year, edition, pages
Stockholm: Numerical Analysis and Computer Science (NADA), Stockholm University, 2015. 134 p.
Series
Trita-CSC-A, ISSN 1653-5723
Keyword
computational neuroscience, modelisation, reinforcement learning, basal ganglia, dopamine
National Category
Bioinformatics (Computational Biology)
Research subject
Computer Science
Identifiers
urn:nbn:se:su:diva-123747 (URN)
978-91-7649-184-3 (ISBN)
Public defence
2016-01-25, F3, Sing Sing, KTH Campus, Lindstedtsvägen 26, Stockholm, 14:00 (English)
Funder
EU, FP7, Seventh Framework Programme, FP7-237955
Note

At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 3: Manuscript.

 

Available from: 2015-12-29 Created: 2015-12-04 Last updated: 2016-01-25 Bibliographically approved

By author/editor
Berthet, Pierre; Lansner, Anders