Brains excel at robust decision-making and data-efficient learning. Understanding the
architectures and dynamics underlying these capabilities can inform inductive biases for
deep learning.
We present a multi-region brain model that explores the normative role of structured memory circuits in a spatially embedded binary decision-making task from neuroscience. We counterfactually compare the learning performance and neural representations of reinforcement learning (RL) agents whose brain models implement different interaction architectures between grid cells in the entorhinal cortex and place cells in the hippocampus, each coupled to an action-selection cortical recurrent neural network. We demonstrate that a specific architecture, in which grid cells receive and jointly encode self-movement velocity signals and decision-evidence increments, optimizes learning efficiency and best reproduces experimental observations relative to alternative architectures.
Our findings thus suggest brain-inspired structured architectures for efficient RL.
Importantly, the models make novel, testable predictions about the organization and information flow within the entorhinal-hippocampal-neocortical circuit: we predict that grid cells must conjunctively encode position and evidence for effective spatial decision-making, directly motivating new neurophysiological experiments.
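To make this architecture concrete, the following is a minimal, hypothetical sketch rather than the model used in the study: a toy grid-like module that path-integrates self-movement velocity into a position estimate, accumulates evidence increments into a running count, and emits a conjunctive position-by-evidence code. The class name, tuning curves, and parameters are illustrative assumptions.

```python
# Illustrative sketch only (not the study's code): a toy grid-like module whose
# units are tuned conjunctively to position, integrated from velocity, and to a
# running evidence count, integrated from evidence increments. Names, tuning
# shapes, and parameters are assumptions for exposition.
import numpy as np

class ConjunctiveGridModule:
    def __init__(self, n_pos=20, n_ev=9, spatial_period=50.0, ev_range=4.0):
        self.pos_prefs = np.linspace(0.0, spatial_period, n_pos, endpoint=False)
        self.ev_prefs = np.linspace(-ev_range, ev_range, n_ev)
        self.spatial_period = spatial_period
        self.position = 0.0   # path-integrated from self-movement velocity
        self.evidence = 0.0   # accumulated from evidence increments

    def step(self, velocity, evidence_increment, dt=1.0):
        """Integrate inputs, then return a conjunctive position-by-evidence code."""
        self.position += velocity * dt
        self.evidence += evidence_increment
        # Periodic (grid-like) tuning to position along the track...
        phase = 2.0 * np.pi * (self.position - self.pos_prefs) / self.spatial_period
        pos_tuning = 0.5 * (1.0 + np.cos(phase))                         # (n_pos,)
        # ...multiplied by Gaussian tuning to the accumulated evidence.
        ev_tuning = np.exp(-0.5 * (self.evidence - self.ev_prefs) ** 2)  # (n_ev,)
        return np.outer(pos_tuning, ev_tuning).ravel()                   # (n_pos * n_ev,)
```

In such a scheme, a downstream action-selection network reading this joint code has both quantities available at every step, whereas a purely spatial code would discard the evidence dimension.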
Everyday choices, like deciding when to turn down a hallway, rely on two skills: knowing where we are and adding up clues about what to do next. Yet how the brain merges these streams of information across multiple brain regions to make decisions in physical space remains a mystery.
We built a “virtual brain” inside a reinforcement‑learning agent that links three key brain
circuits:
grid cells that track location, the hippocampus that stores memories, and a small cortical
network that
picks actions. The agent practises a classic mouse task: walking down a T‑shaped maze lined
with visual
towers ("evidence") on both sides, then turning at the end toward the side with more towers
(“decision”).
After testing all the possible brain circuit designs, we found that learning was fastest—and
runs were
shortest—when each grid cell encoded both position and a running tower
count. This
“dual‑coding” design also reproduced the unusual firing patterns recorded in the real mouse
hippocampus
during the same task, whereas other designs did not.
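For a concrete picture of the task structure, the toy sketch below mirrors the tower task in miniature; it is not the environment or agent used in the study, and the class, parameters, and hand-coded counting policy are assumptions.

```python
# Illustrative sketch only: a minimal version of the tower task described above.
# The environment, reward scheme, and count-based policy are assumptions for
# exposition, not the study's implementation.
import random

class TowerTask:
    """Toy T-maze: towers appear on either side of the stem; reward for turning
    toward the side that showed more towers."""
    def __init__(self, stem_length=10, tower_prob=0.3, seed=None):
        self.rng = random.Random(seed)
        self.stem_length = stem_length
        self.tower_prob = tower_prob

    def run(self, policy):
        """policy maps a list of (left, right) tower observations to 'left'/'right'."""
        observations = []
        for _ in range(self.stem_length):             # one step down the stem
            left = int(self.rng.random() < self.tower_prob)
            right = int(self.rng.random() < self.tower_prob)
            observations.append((left, right))
        left_total = sum(l for l, _ in observations)
        right_total = sum(r for _, r in observations)
        correct = "right" if right_total >= left_total else "left"
        choice = policy(observations)
        return 1.0 if choice == correct else 0.0       # reward for the correct turn

# A hand-coded "counting" policy that a successful agent must effectively learn:
count_policy = lambda obs: "right" if sum(r - l for l, r in obs) >= 0 else "left"
print(TowerTask(seed=0).run(count_policy))             # -> 1.0
```

A learning agent must discover this counting behaviour from reward alone, which is the step the dual-coding grid design is suggested to accelerate.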
For neuroscience, our results predict that biological grid cells combine “Where am I?” with
“How much
evidence have I gathered?”—a hypothesis that future neurophysiology experiments can directly
test. For
machine learning, adding such structured memory maps to learning agents can reduce training
demands,
enabling lighter, more efficient AI that makes decisions in the real world.