Breadcrumbs Section. Click here to navigate to respective pages.

Knowledge Representation

Knowledge Representation

DOI link for Knowledge Representation

Get Citation

Knowledge representation is fundamental to the study of mind. All theories of psychological processing are rooted in assumptions about how information is stored. These assumptions, in turn, influence the explanatory power of theories. This book fills a gap in the existing literature by providing an overview of types of knowledge representation techniques and their use in cognitive models. Organized around types of representations, this book begins with a discussion of the foundations of knowledge representation, then presents discussions of different ways that knowledge representation has been used. Both symbolic and connectionist approaches to representation are discussed and a set of recommendations about the way representations should be used is presented. This work can be used as the basis for a course on knowledge representation or can be read independently. It will be useful to students of psychology as well as people in related disciplines--computer science, philosophy, anthropology, and linguistics--who want an introduction to techniques for knowledge representation.

TABLE OF CONTENTS

Chapter 1 | 26  pages, foundations, chapter 2 | 32  pages, spatial representations, chapter 3 | 30  pages, featural representations, chapter 4 | 29  pages, network models, chapter 5 | 36  pages, structured representations, chapter 6 | 34  pages, structure in perceptual representations, chapter 7 | 29  pages, structured concept representations, chapter 8 | 31  pages, general and specific information in representations, chapter 9 | 29  pages, mental models, chapter 10 | 24  pages, using representation.

  • Privacy Policy
  • Terms & Conditions
  • Cookie Policy
  • Taylor & Francis Online
  • Taylor & Francis Group
  • Students/Researchers
  • Librarians/Institutions

Connect with us

Registered in England & Wales No. 3099067 5 Howick Place | London | SW1P 1WG © 2024 Informa UK Limited

U.S. flag

An official website of the United States government

The .gov means it’s official. Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

The site is secure. The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

  • Publications
  • Account settings
  • My Bibliography
  • Collections
  • Citation manager

Save citation to file

Email citation, add to collections.

  • Create a new collection
  • Add to an existing collection

Add to My Bibliography

Your saved search, create a file for external citation management software, your rss feed.

  • Search in PubMed
  • Search in NLM Catalog
  • Add to Search

Conceptual knowledge representation: A cross-section of current research

Affiliations.

  • 1 a Department of Psychology , University of Wisconsin , Madison , WI , USA.
  • 2 b Intelligent Systems Center , Johns Hopkins University Applied Physics Laboratory , Laurel , MD , USA.
  • PMID: 27454108
  • DOI: 10.1080/02643294.2016.1188066

How is conceptual knowledge encoded in the brain? This special issue of Cognitive Neuropsychology takes stock of current efforts to answer this question through a variety of methods and perspectives. Across this work, three questions recur, each fundamental to knowledge representation in the mind and brain. First, what are the elements of conceptual representation? Second, to what extent are conceptual representations embodied in sensory and motor systems? Third, how are conceptual representations shaped by context, especially linguistic context? In this introductory article we provide relevant background on these themes and introduce how they are addressed by our contributing authors.

Keywords: Semantic memory; cognitive neuroscience; concept representation; embodied cognition.

PubMed Disclaimer

  • Toward a brain-based componential semantic representation. Binder JR, Conant LL, Humphries CJ, Fernandino L, Simons SB, Aguilar M, Desai RH. Binder JR, et al. Cogn Neuropsychol. 2016 May-Jun;33(3-4):130-74. doi: 10.1080/02643294.2016.1147426. Epub 2016 Jun 16. Cogn Neuropsychol. 2016. PMID: 27310469

Similar articles

  • Linking somatic and symbolic representation in semantic memory: the dynamic multilevel reactivation framework. Reilly J, Peelle JE, Garcia A, Crutch SJ. Reilly J, et al. Psychon Bull Rev. 2016 Aug;23(4):1002-14. doi: 10.3758/s13423-015-0824-5. Psychon Bull Rev. 2016. PMID: 27294419 Free PMC article. Review.
  • Building semantic memory from embodied and distributional language experience. Davis CP, Yee E. Davis CP, et al. Wiley Interdiscip Rev Cogn Sci. 2021 Sep;12(5):e1555. doi: 10.1002/wcs.1555. Epub 2021 Feb 2. Wiley Interdiscip Rev Cogn Sci. 2021. PMID: 33533205 Review.
  • For a cognitive neuroscience of concepts: Moving beyond the grounding issue. Leshinskaya A, Caramazza A. Leshinskaya A, et al. Psychon Bull Rev. 2016 Aug;23(4):991-1001. doi: 10.3758/s13423-015-0870-z. Psychon Bull Rev. 2016. PMID: 27294420 Review.
  • Bridging the Chasm Between Cognitive Representations and Formal Structures of Linguistic Meanings. Mondal P. Mondal P. Cogn Sci. 2024 May;48(5):e13456. doi: 10.1111/cogs.13456. Cogn Sci. 2024. PMID: 38804002
  • The Puzzling Chasm Between Cognitive Representations and Formal Structures of Linguistic Meanings. Mondal P. Mondal P. Cogn Sci. 2022 Sep;46(9):e13200. doi: 10.1111/cogs.13200. Cogn Sci. 2022. PMID: 36083283

Publication types

  • Search in MeSH

LinkOut - more resources

Full text sources.

  • Taylor & Francis

Other Literature Sources

  • scite Smart Citations

full text provider logo

  • Citation Manager

NCBI Literature Resources

MeSH PMC Bookshelf Disclaimer

The PubMed wordmark and PubMed logo are registered trademarks of the U.S. Department of Health and Human Services (HHS). Unauthorized use of these marks is strictly prohibited.

representation of knowledge in psychology

1st Edition

Knowledge Representation

VitalSource Logo

  • Taylor & Francis eBooks (Institutional Purchase) Opens in new tab or window

Description

Knowledge representation is fundamental to the study of mind. All theories of psychological processing are rooted in assumptions about how information is stored. These assumptions, in turn, influence the explanatory power of theories. This book fills a gap in the existing literature by providing an overview of types of knowledge representation techniques and their use in cognitive models. Organized around types of representations, this book begins with a discussion of the foundations of knowledge representation, then presents discussions of different ways that knowledge representation has been used. Both symbolic and connectionist approaches to representation are discussed and a set of recommendations about the way representations should be used is presented. This work can be used as the basis for a course on knowledge representation or can be read independently. It will be useful to students of psychology as well as people in related disciplines--computer science, philosophy, anthropology, and linguistics--who want an introduction to techniques for knowledge representation.

Table of Contents

Arthur B. Markman

Critics' Reviews

"Markman is honest from the beginning about the goals of the book. He states in the Preface that he wants to provide a 'Michelin guide to knowledge representation.' At this, Markman succeeds admirably. The book is comprehensive and very well-writen." — Contemporary Psychology "...the book is clearly written and would be useful for the advanced undergraduate or beginning graduate student in psychology." — Journal of Mathematical Psychology "... Knowledge Representation is well-written and interesting. The book covers a very wide range of topics in order to analyze the forms of representation they use and to identify the advantages and disadvantages of each form. It is unique in both its coverage and its perspective, which make for a fresh approach to the study of how knowledge is represented and processed. It seems likely to become the standard text for courses on this topic." — Gregory Murphy, PhD University of Illinois, Urbana-Champaign "Markman's book does an admirable job of explaining the ideas of representation. It provides a rare perspective, focusing on what kinds of representations have been used in different areas of research, and why. The properties and tradeoffs involved in various types of representations are well illustrated with examples drawn from several areas of cognitive science. Anyone interested in cognitive science will find this book an excellent, enlightening resource." — Kenneth D. Forbus, PhD Northwestern University

About VitalSource eBooks

VitalSource is a leading provider of eBooks.

  • Access your materials anywhere, at anytime.
  • Customer preferences like text size, font type, page color and more.
  • Take annotations in line as you read.

Multiple eBook Copies

This eBook is already in your shopping cart. If you would like to replace it with a different purchasing option please remove the current eBook option from your cart.

Book Preview

representation of knowledge in psychology

The country you have selected will result in the following:

  • Product pricing will be adjusted to match the corresponding currency.
  • The title Perception will be removed from your cart because it is not available in this region.
This paper appeared as R. Davis, H. Shrobe, and P. Szolovits. What is a Knowledge Representation? AI Magazine , 14(1):17-33, 1993. A better formatted version is available in postscript .
Although knowledge representation is one of the central and in some ways most familiar concepts in AI, the most fundamental question about it--What is it?--has rarely been answered directly. Numerous papers have lobbied for one or another variety of representation, other papers have argued for various properties a representation should have, while still others have focused on properties that are important to the notion of representation in general. In this paper we go back to basics to address the question directly. We believe that the answer can best be understood in terms of five important and distinctly different roles that a representation plays, each of which places different and at times conflicting demands on the properties a representation should have. We argue that keeping in mind all five of these roles provides a usefully broad perspective that sheds light on some longstanding disputes and can invigorate both research and practice in the field.

IEEE Account

  • Change Username/Password
  • Update Address

Purchase Details

  • Payment Options
  • Order History
  • View Purchased Documents

Profile Information

  • Communications Preferences
  • Profession and Education
  • Technical Interests
  • US & Canada: +1 800 678 4333
  • Worldwide: +1 732 981 0060
  • Contact & Support
  • About IEEE Xplore
  • Accessibility
  • Terms of Use
  • Nondiscrimination Policy
  • Privacy & Opting Out of Cookies

A not-for-profit organization, IEEE is the world's largest technical professional organization dedicated to advancing technology for the benefit of humanity. © Copyright 2024 IEEE - All rights reserved. Use of this web site signifies your agreement to the terms and conditions.

Cognition and Representation

An overview of knowledge representation issues in cognitive science

  • Conference paper
  • Cite this conference paper

representation of knowledge in psychology

  • Christian Freksa 2 ,
  • Ulrich Furbach 3 &
  • Gerhard Dirlich 4  

Part of the book series: Informatik Fachberichte ((2252,volume 103))

71 Accesses

1 Citations

Central topics and scope of cognitive science are outlined. The role of representations in cognitive systems and metaphors for cognition are discussed. A representation-theoretical approach to knowledge representation is proposed. The notion of a representation system is applied as a framework for specifying, comparing, and transforming representations. We show how this framework can be used to study properties of representations and to make some of the controversies in knowledge representation appear less controversial. Advantages and difficulties of a representation-theoretical approach to cognition are presented.

Research supported, in part, by NASA cooperative agreement NCC2–275 and by ESPRIT LOKI pilot project 107.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save.

  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Unable to display preview.  Download preview PDF.

Similar content being viewed by others

representation of knowledge in psychology

Arguing about representation

Must cognition be representational, references and selected bibliography.

ANDERSON, J.A., HINTON, G.E. 1981. Models of information processing in the brain. In: Hinton, G.E., Anderson, J.A. (1981).

Google Scholar  

ANDERSON, J.R. 1978. Argument concerning representations for mental imagery. Psych. Review 85, 249–277.

Article   Google Scholar  

ANDERSON, J.R. 1983. The architecture of cognition. Harvard, Cambridge, Mass.

BADDELEY, A.D. 1976. The psychology of memory. Basic Books, New York.

BARR, A., FEIGENBAUM, E. 1981. The handbook of artificial intelligence. Vol. 1, Heuristech Press, Stanford.

BERKELEY, G. 1965. Philosophical writings. Macmillan, New York.

BIBEL, W. 1982. Automated theorem proving. Vieweg, Braunschweig.

MATH   Google Scholar  

BLACK, M. 1972. How do pictures represent? In Art, perception and reality, E.H. Gombrich, J. Hochberg, H. Black (eds.), Baltimore.

BLOCK, N.J., ed. 1981. Imagery. HIT Press, Cambridge, Hass.

BOBROW, D.G., COLLINS, A. 1975. Representation and understanding. Academic Press, New York.

BRACHHAN, R.J. 1983. What is a is and isn’t: an analysis of taxonomic links in semantic networks. IEEE Computer 16, 10 (Special issue on knowledge representation).

BRACHHAN, R.J., SMITH, B.C., eds. 1980. Special issue on knowledge representation. SIGART Newsletter No. 70.

CHAFE, W.L. 1975. Creativity in verbalization as evidence for analogic knowledge. TINLAP-1, 158–159.

CLONES, M.B. 1971. On seeing things. Artificial Intelligence 2, 79–116.

COLLINS, A.M., LOFTUS, E.F. 1975. A spreading activation theory of semantic processing. Psych. Review 82, 407–428.

DIRLICH, G., FREKSA, C., FURBACH, U. 1983. A central problem in representing human knowledge in artificial systems: the transformation of intrinsic into extrinsic representations. Proc. 5th Cognitive Science Conference, Rochester, New York.

DREYFUS, H.L. 1979. What computers can’t do¡ a critique of artificial reason. 2nd ed. Harper & Row, New York.

EVANS, T.G. 1968. A program for the solution of a class of geometric-analogy intelligence-test questions. In: Hinsky (1968).

FEIGENBAUH, E.A., FELDHAN, J. 1963. Computers and thought. McGraw Hill.

FREGE, G. 1879. Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. L. Nebert, Halle.

FREGE, G. 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik, vol. 100, 25–50, Leipzig.

FUNT, B.V. 1980. Problem-solving with diagrammatic representations. Artificial Intelligence 13, 201–230.

FUNT, B.V. 1983. Analogical modes of reasoning and process modeling. IEEE Computer 16, 10, 99–105.

FURBACH, U., DIRLICH, G., FREKSA, C. 1984. Towards a theory of knowledge representation systems. To appear in: Bibel, W. & Petkoff, B. (eds.) Artificial intelligence methodology systems applications — AIHSA ‘84. North-Holland, Amsterdam.

GELERNTER, H. 1963. Realisation of a geometry-theorem proving machine. In: Feigenbaum & Feldman (1963), 153–163.

HAUGELAND, J., ed. 1981. Hind design. HIT Press, Cambridge, Hass.

HAYES, P.J. 1974. Some problems and non-problems in representation theory. Proc. AISB Summer Conference, Univ. of Sussex, 63–79.

HINTON, G.E. 1980. Some demonstrations of the effects of structural descriptions in mental imagery. University of California, San Diego.

HINTON, G.E., ANDERSON, J.A., eds. 1981. Parallel models of associative menory. Erlbaum, Hillsdale, N.J.

HOBBS, J. 1980. In: Brachman & Smith (1980), p.44.

HUNT, M. 1982. The universe within — a new science explores the human sind. Simon and Schuster, New York.

JOHNSON-LAIRD, P.N., WASON, P.C., eds. 1977. Thinking. Cambridge University Press.

JOHNSON-LAIRD, P.N. 1980. Mental models in cognitive science. Cognitive Science 4, 1, 71–115.

KAY, P. 1981. Color perception and the meaning of color words. University of California, Berkeley.

KOSSLYN, S.U. 1975. Information representation in visual images. Cognitive Psychology 7, 341–370.

KOSSLYN, S.U. 1980. Image and mind. Harvard University Press.

KOSSLYN, S.M., BALL, T.M., REISER, B.J. 1978. Visual images preserve metric spatial information: evidence from studies of image scanning. J. Exp. Psych.: Human perception and performance 4, 46–60.

KOWALSKI, R. 1974. Predicate logic as a programming language. IFIP-74, 569–574, North-Holland, Amsterdam.

LUCE, R.D., BUSH, R.R., GALANTER, E., eds. 1963. Handbook of mathematical psychology. Wiley, New York.

MARR, D. 1982. Vision. Freeman, San Francisco.

MCCARTHY, J., HAYES, P. 1969. Some philosophical problems from the standpoint of artificial intelligence. Machine Intelligence 4, 463–502.

MILLER, G.A. 1956. The magical number seven, plus or minus two. Psych. Review 63, 81–97.

MINSKY, M.L. 1970. Form und content in computer science. ACM Turing lecture. JACM 17, 2, 197–215, 1970.

Article   MATH   Google Scholar  

MINSKY, M.L. 1975. A framework for representing knowledge. In: Winston, P. (ed.) The psychology of computer vision. McGraw Hill, New York.

MYLOPOULOS, J., LEVESQUE, H. 1983. An overview of knowledge representation. GWAI-83, 143–157. Springer, Heidelberg.

NEISSER, U. 1967. Cognitive Psychology. Appleton-Century-Crofts, New York.

NEWELL, A. 1980. Physical symbol systems. Cognitive Science 4, 2, 135–183.

NEWELL, A., SIMON, H. 1972. Human problem solving. Prentice Hall.

NORMAN, D.A., RUMELHART, D.E., eds. 1975. Explorations in cognition. Freeman, San Francisco.

PALMER, S.E. 1975. The nature of perceptual representation: an examination of the analog/propositional controversy. TINLAP-1, 165–173.

PALMER, S.E. 1978. Fundamental aspects of cognitive representation. In: Rosch, E. & Lloyd, S.S. (eds.) Cognition and categorization. Lawrence Erlbaum, Hillsdale.

PYLYSHYN, Z. 1973. What the mind’s eye tells the mind’s brain: a critique of mental imagery. Psych. Bull. 80, 1, 1–24.

PYLYSHYN, Z. 1975. Do we need images and analogs? TINLAP-1, 174–177.

PYLYSHYN, Z. 1978. Imagery and artificial intelligence. In: Savage, C.W. (ed.), Perception and cognition: issues in the foundation of psychology. University of Minneapolis Press.

PYLYSHYN, Z.W. 1981. The imagery debate: analog media versus tacit knowledge. In: Block (1981).

PYLYSHYN, Z. 1984. Computation and cognition. MIT Press, Cambridge, Mass.

RUMELHART, D.E., NORMAN, D.A. 1980. Analogical processes in learning. CHIP 97, University of California, San Diego.

RUMELHART, D.E., NORMAN, D.A. 1983. Representation in memory. CHIP 116, University of California, San Diego.

RUSSELL, B. 1905. On denoting. Mind N.S. 14, 479–493.

SEARLE, J. 1980. Minds, brains, and programs. The Behavioral and Brain Sciences 3, 3, 417–457. Also in: Haugeland (1981).

SHEPARD, R.N., COOPER, L.A. 1982. Mental images and their transformations. MIT Press, Cambridge.

SIMON, H.A., NEWELL, A. 1956. Models: their use and limitations. In: L.D. White (ed.) The state of the social sciences, 66–83, University of Chicago Press.

SLOMAN, A. 1971. Interactions between philosophy and artificial intelligence: the role of intuition and non-logical reasoning in intelligence. IJCAI-2, 270–278. London. Also in: Artificial Intelligence 2, 209–225.

SLOMAN, A. 1975. Afterthoughts on analogical representations. TINLAP-1, 178–182.

SOWA, J.F. 1983. Conceptual structures. Information processing in mind and machine. Addison-Wesley.

STICH, S. 1984. Folk psychology and cognitive science: the case against belief. MIT Press, Cambridge, Mass.

WEYHRAUCH, R.W. 1980. Prolegomena to a theory of mechanized formal reasoning. Artificial Intelligence 13, 133–170.

Article   MathSciNet   MATH   Google Scholar  

WINOGRAD, T. 1975. Frames and the declarative-procedural controversy. In: Bobrow and Collins (1975).

WINOGRAD, T. 1983. Language as a cognitive process. Vol. 1 Syntax. Addison-Wesley.

Download references

Author information

Authors and affiliations.

Institut für Informatik, TU München, Arcisstr. 21, 8000, München 2, Germany

Christian Freksa

Fachbereich Informatik, HSBw München, Werner-Heisenberg-Weg 39, 8014, Neubiberg, Germany

Ulrich Furbach

Abteilung Biostatistik, MPI für Psychiatrie, Kraepelinstr. 2, 8000, München 40, Germany

Gerhard Dirlich

You can also search for this author in PubMed   Google Scholar

Editor information

Editors and affiliations.

Institut für Informatik, Universität Stuttgart, Herdweg 51, D-7000, Stuttgart 1, Germany

Joachim Laubsch

Rights and permissions

Reprints and permissions

Copyright information

© 1985 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper.

Freksa, C., Furbach, U., Dirlich, G. (1985). Cognition and Representation. In: Laubsch, J. (eds) GWAI-84. Informatik Fachberichte, vol 103. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-46546-8_10

Download citation

DOI : https://doi.org/10.1007/978-3-642-46546-8_10

Publisher Name : Springer, Berlin, Heidelberg

Print ISBN : 978-3-540-15669-7

Online ISBN : 978-3-642-46546-8

eBook Packages : Springer Book Archive

Share this paper

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

  • Publish with us

Policies and ethics

  • Find a journal
  • Track your research

Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

  • View all journals
  • Explore content
  • About the journal
  • Publish with us
  • Sign up for alerts
  • Open access
  • Published: 14 August 2024

Abstract representations emerge in human hippocampal neurons during inference

  • Hristos S. Courellis   ORCID: orcid.org/0000-0001-5963-679X 1 , 2   na1 ,
  • Juri Minxha 1 , 2 , 3   na1 ,
  • Araceli R. Cardenas 4 ,
  • Daniel L. Kimmel   ORCID: orcid.org/0000-0002-3914-6061 3 , 5 ,
  • Chrystal M. Reed   ORCID: orcid.org/0000-0002-7157-3645 6 ,
  • Taufik A. Valiante   ORCID: orcid.org/0000-0002-3443-3790 4 ,
  • C. Daniel Salzman 3 , 5 , 7 , 8 , 9 ,
  • Adam N. Mamelak 1 ,
  • Stefano Fusi   ORCID: orcid.org/0000-0002-3035-6652 3 , 8 , 9   na2 &
  • Ueli Rutishauser   ORCID: orcid.org/0000-0002-9207-7069 1 , 2 , 6 , 10   na2  

Nature volume  632 ,  pages 841–849 ( 2024 ) Cite this article

19k Accesses

423 Altmetric

Metrics details

  • Cognitive neuroscience
  • Hippocampus
  • Neural decoding

Humans have the remarkable cognitive capacity to rapidly adapt to changing environments. Central to this capacity is the ability to form high-level, abstract representations that take advantage of regularities in the world to support generalization 1 . However, little is known about how these representations are encoded in populations of neurons, how they emerge through learning and how they relate to behaviour 2 , 3 . Here we characterized the representational geometry of populations of neurons (single units) recorded in the hippocampus, amygdala, medial frontal cortex and ventral temporal cortex of neurosurgical patients performing an inferential reasoning task. We found that only the neural representations formed in the hippocampus simultaneously encode several task variables in an abstract, or disentangled, format. This representational geometry is uniquely observed after patients learn to perform inference, and consists of disentangled directly observable and discovered latent task variables. Learning to perform inference by trial and error or through verbal instructions led to the formation of hippocampal representations with similar geometric properties. The observed relation between representational format and inference behaviour suggests that abstract and disentangled representational geometries are important for complex cognition.

Similar content being viewed by others

representation of knowledge in psychology

Emergence of abstract rules in the primate brain

representation of knowledge in psychology

The generative grammar of the brain: a critique of internally generated representations

representation of knowledge in psychology

Geometry of abstract learned knowledge in the hippocampus

Humans have a remarkable capacity to make inferences about hidden states that describe their environment 3 , 4 , 5 and use this information to adjust their behaviour. One core cognitive function that enables us to perform inference is the construction of abstract representations of the environment 5 , 6 , 7 . Abstraction is a process through which relevant shared structure in the environment is compressed and summarized, while superfluous details are discarded or represented so that they do not interfere with the relevant ones 8 , 9 . This process often leads to the discovery of latent variables that parsimoniously describe the environment. By performing inference on the value of these variables, frequently from partial information, the appropriate actions for a given context can rapidly be deployed 5 , 10 , thereby generalizing from past experience to new situations.

What would be the signature of an abstract neural representation that enables this kind of adaptive behaviour? The simplest form of abstraction is one in which all the irrelevant information is discarded—for example, when the representations of pedestrian crossings in left-driving (UK) and right-driving (USA) nations are two unique and distinct patterns of neural activity that do not depend on sensory details (such as whether the crossing is in the city or countryside) (Fig. 1a ). Looking left or right before crossing (two actions separated by a plane in the activity space that represents a linear readout) would readily generalize to the countryside after visiting a city. However, this kind of invariance is rarely observed in the brain. A more general geometric definition of an abstract representation has been proposed 11 ; consider the non-trivial geometrical arrangement in Fig. 1b , in which the geographical area (city or countryside) and the nation of a crossing are represented along two orthogonal directions (the two variables are disentangled). The activity projected along one of these directions is invariant with respect to the value of the other variable. This type of invariance has important computational properties: it allows a simple linear readout to generalize to new situations. We therefore use this property as the defining characteristic of an abstract representation: a representation of a particular variable is abstract if a linear decoder trained to report the value of that variable can generalize to new conditions. The novel conditions are defined by the values of other variables. Representations with these properties have been observed in monkeys 11 , 12 , 13 , in rodents 14 , 15 and in artificial neural networks 11 , 16 , 17 . Are these abstract representations also observed in the human brain? How are they formed as a function of learning, and do they matter for behaviour? The hippocampus is thought to be critical for the implementation of abstraction and inference-related computations 10 , 11 , 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , but it remains unknown whether abstract representations can emerge in the hippocampus in the timescales needed for rapid learning.

figure 1

a , b , Possible definitions of abstraction as clustering ( a ) or generalization ( b ). In the latter, the two variables are orthogonal to each other and preserved, whereas one of the variables (geographic area) is discarded in the former. c , Task and example trial. Blocks of trials alternated between the two contexts. In each trial, the stimulus remained on the screen until participants pressed a button, followed by the outcome. d , e , Task structure. d , Each stimulus (A–D) is associated with a single correct response and results in either a high or low reward if the correct response is given. e , Stimulus–response relationships are inverted between contexts 1 and 2. f , Behaviour. Accuracy is shown separately for inference present ( n  = 22) and absent ( n  = 14) sessions for the last trial before the context switch, the first trial after the context switch and for the remaining three inference trials averaged over all trials in each session (mean ± s.e.m. across sessions). The dashed line marks chance. The black box indicates inference trial 1. ** P  < 0.005 for rank-sum inference sbsent versus present over sessions. g , Electrode locations. Each dot denotes a microwire bundle. Locations are shown on the same hemisphere (right) for visualization purposes only. h – j , Example neurons that encode response ( h ), context ( i ) and mixtures of stimulus ID (indicated by A–D) and context (indicated by 1 or 2) ( j ). Error bars are ± s.e.m. across trials. t  = 0 is stimulus onset. Black points indicate P  < 0.05 of one-way ANOVA of plotted variables. k , Number of units recorded in each brain area. l , Number of single units across all brain areas showing significant main or interaction effects to at least one variable ( n -way ANOVA, P  < 0.05, Methods ). Variables tested: response (R), context (C), outcome (O), and stimulus ID (S). Brain areas assessed: amygdala (AMY), dorsal anterior cingulate cortex (dACC), hippocampu (HPC), presupplementary motor area (preSMA), and ventromedial prefrontal cortex (vmPFC).

Task and behaviour

We recorded the activity of populations of neurons in the brains of patients with epilepsy while they learned to perform a reversal learning task (17 patients, 42 sessions; Supplementary Table 1 ). Patients viewed a sequence of images and indicated for each whether they thought that the associated action was a ‘left’ or ‘right’ response (Fig. 1c ). Participants discovered from the feedback provided after each response what the correct response was for a given image. There were two fixed mappings (stimulus–response maps) between each of the four stimuli: the associated correct response and reward given for a correct response (Fig. 1e ). Which of the two fixed mappings was active changed at random times (Fig. 1c ), requiring participants to infer when a context switch occurred from the feedback received. The two stimulus–response maps were systematically related: all stimulus–response pairings were inverted between the two contexts (Fig. 1e ). Therefore, the participants who had learned the stimulus–response maps could make a mistake immediately after the switch but then, following one instance of negative feedback, they could infer that the context had changed and update their stimulus–response associations according to the other map. Therefore, if participants were performing inference, they could respond accurately to stimuli that they had not yet experienced in the new context. We refer to the trials in which a given stimulus was encountered for the first time following a context switch as inference trials (excluding the first trial that resulted in the negative feedback) and to the remaining trials as non-inference trials. Patients performed with high accuracy in non-inference trials in 36 of the recorded sessions (Extended Data Fig. 1a,b and Methods state exclusion criteria). Each of the 36 included sessions was classified as either an ‘inference present’ or ‘inference absent’ session depending on the accuracy by which patients responded to the first inference trial (Fig. 1f (timepoint 2) and  Methods ).

Single-neuron recordings

Neural data recorded in the 36 included sessions yielded 2,694 (of 3,124) well-isolated single units, henceforth neurons, distributed across the hippocampus (494 neurons), amygdala (889 neurons), presupplementary motor area (269 neurons), dorsal anterior cingulate cortex (310 neurons), ventromedial prefrontal cortex (463 neurons) and ventral temporal cortex (VTC, 269 neurons) (Fig. 1g,k ). Action potentials discharged by neurons were counted during two 1 s long trial epochs: during the baseline period (base, −1s to 0 s before stimulus onset), and during the stimulus period (stim, 0.2 to 1.2 s after stimulus onset).

Single-neuron responses during the two analysis periods were heterogeneous. During the stimulus period, some neurons showed selectivity to one or several of the four variables stimulus identity, response, (predicted) outcome and context (Fig. 1h–j and Extended Data Fig. 1j show example neurons tuned to response and context). Other neurons were modulated by combinations of these variables (Fig. 1j , example neuron tuned to conjunction of stimulus and context). Across all brain areas, 54% of units (1,447 out of 2,694) were tuned to one or several task variables, with 26% of units (706 out of 2,694) showing only interaction effects, 17% (449 out of 2,694) showing only main effects and 11% (292 out of 2,694) showing both when fitting a three-way analysis of variance (ANOVA) for response, context and outcome (Fig. 1l , RCO column, chance was 135 out of 2,694 units, factor significance at P  < 0.05 and Extended Data Fig. 1t shows each brain area separately). These findings indicate diverse tuning to many task variables simultaneously across all brain regions (Extended Data Fig. 1k–p,t–u ).

Geometric analysis approach

We analysed neural pseudo-populations constructed by pooling all recorded neurons across patients ( Methods ). In our task, the geometry of a representation was defined by the arrangement in the activity space of the eight points that represented the population responses in different experimental conditions (Fig. 2d ). Low-dimensional disentangled geometries would be abstract because they confer on a linear readout the ability to cross-generalize. Consider a simplified situation with three neurons (the axes) and two stimuli in two contexts (Fig. 2a–c ). Imagine that the four points (two per context) are arranged on a relatively low-dimensional square (the maximal dimensionality for four points is three), with the context encoded along one side and stimulus along the two orthogonal sides (Fig. 2a ). Then, a linear decoder for stimulus (A versus B), trained only on context 1 conditions, can readily generalize to context 2 (Fig. 2b ) and the stimulus is said to be abstract. This ability to generalize is due to the parallelism of the stimulus coding directions in the two contexts (Fig. 2c ). Moreover, context and stimulus are represented in orthogonal subspaces, and hence, they are called disentangled variables 24 , 25 . This means that context is also abstract.

figure 2

a , Example neural state space formed by three neurons. Points represent the response patterns in various task conditions. Black arrows mark coding vectors. b , CCGP. A decoder is trained to differentiate between stimulus A and B in context 1 and evaluated in context 2. If context is represented in an abstract format, then the decoder generalizes, yielding high CCGP for context. c , PS. In disentangled representations, the coding vectors (arrows) are parallel. d , Illustration of the dichotomies (variables) context, stim pair and parity with class labels indicated. See Extended Data Fig. 2 for all dichotomies. e , f , Neural representation during stimulus period in hippocampus. Context and stim pair are decodable in inference present sessions ( e ) and are encoded in an abstract format ( f ). Each dot shows one of the 35 dichotomies. The horizontal black line shows shattering dimensionality. Grey bars denote the 5th–95th percentile of the null distribution. Stars denote named dichotomies that are above chance in inference present sessions and are significantly different from their corresponding inference absent value ( P RS  < 0.05/35, two-sided rank-sum test, Bonferroni corrected for multiple comparisons across all dichotomies). g , Decodability of all dichotomies for the other brain areas. AMY, amygdala. See e for notation. h , i , Neural representation during baseline period in hippocampus is decodable in inference present ( h ) and encoded in an abstract format ( i ). Trials are labelled according to the previous trial. See e , f for notation. Context differed significantly between present and absent ( P  = 1.1 × 10 −33 and P  = 2.4 × 10 −34 , respectively). j , Hippocampal population response during the stimulus period in inference absent and present sessions shown using MDS ( Methods ). Points correspond to stimuli and context combinations, black lines show hypothetical hyperplanes for context and stimulus pair decoders. In all panels, neuron counts are balanced between inference absent and inference present sessions for every brain area to make values comparable. * P  < 0.05.

We use two metrics to assess whether information was represented in this way in the neural data: cross-condition generalization performance (CCGP), which assesses the ability of a linear decoder to generalize across conditions, and the parallelism score (PS), which measures the cosine similarity of different coding vectors. High CCGP and PS are defining characteristics of an abstract representation of a variable. We considered the representational geometry of all 35 possible variables. Each variable corresponds to one possible balanced split (dichotomy) of the eight task conditions into two groups of four conditions each (Fig. 1b,d and Extended Data Fig. 2 ). We highlight in Fig. 2d the interpretable variables that turned out to be important in the analysis: context, behaviourally relevant stimulus grouping (stim pair) and parity (which measures the degree of nonlinear interactions of variables in the neural population). Last, we refer to the average decodability across all possible variables as shattering dimensionality 11 , 26 , a metric that assesses the dimensionality of the representation.

Context is abstract in hippocampus

We first compared the decodability of variables between inference present and absent sessions in the hippocampus. Following stimulus onset, shattering dimensionality was larger in inference present sessions (Fig. 2e , inference absent versus present, 0.57 versus 0.62, P RS  = 2.7 × 10 −3 , rank-sum over dichotomies). The two variables that increased the most in decodability were latent context (Fig. 2e , red, inference absent versus present, P RS  = 2.9 × 10 −27 , P Absent  = 0.12, P Present  = 5.1 × 10 −5 ; P Absent and P Present are non-parametric significance tests versus chance based on the empirically estimated null distribution and P RS is a pair-wise comparison using a two-tailed rank-sum test) and stim pair (Fig. 2e , purple, inference absent versus present, P RS  = 5.0 × 10 −27 , P Absent  = 0.015, P Present  = 7.9 × 10 −7 , t). A third dichotomy also became more decodable: parity (Fig. 2d,e , orange; P RS  = 1.5 × 10 −21 , P Absent  = 0.27, P Present  = 0.0055). The parity dichotomy is an indicator of the expressiveness of a neural representation because it probes for nonlinear interactions. Generalizing this finding, dividing different dichotomies into increasing levels of ‘difficulty’ reveals that average decoding accuracy is highest for the most difficult dichotomies in the hippocampus (Extended Data Fig. 5 ).

We next examined the format of the dichotomies context, stim pair and parity in the hippocampus. During the stim period, CCGP (Fig. 2f and Extended Data Fig. 3d ) was significantly elevated for both the context (Fig. 2f , red, inference absent versus present, P RS  = 2.0 × 10 −28 , P Absent  = 0.51, P Present  = 0.02) and stim pair (Fig. 2f , purple, inference absent versus present, P RS  = 2.0 × 10 −28 , P Absent  = 0.17, P Present  = 0.0011) variables in inference present but not in inference absent sessions. Similarly, during the prestimulus baseline period, context alone was encoded in an abstract format only in sessions in which participants could perform inference (Fig. 2h,i and Supplementary Note  1 ).

This difference in representation between inference absent and inference present sessions was unique to the hippocampus. No other recorded region showed a significant change in shattering dimensionality (Fig. 2g , black line, all P  > 0.05) or decodability of the variable context or parity (Fig. 2g , red and orange). Although other task variables were also represented in an abstract format in other brain regions, only the hippocampus simultaneously represented the two variables context and stim pair in an abstract format (Extended Data Fig. 3a,b and Supplementary Results ). These two variables are thus represented in roughly orthogonal subspaces (Fig. 2j shows a summary of this roughly disentangled neural geometry).

Context is absent in error trials

We next examined error trials to test whether the presence of context as an abstract variable in the hippocampus was associated with trial-level performance. Contrasting correct with error trials in inference present sessions revealed that decodability and format of the relevant dichotomies in error trials was similar to that in inference absent sessions both during the stimulus period (Extended Data Fig. 3e,f ) and during the baseline period (Extended Data Fig. 3i,j ). This includes, in particular, the context and parity dichotomy (Extended Data Fig. 3e,f , see legends for statistics). These findings demonstrate that both the content and format of the hippocampal neural representation are correlated with behaviour on a trial-by-trial basis.

Stimulus and context are abstract

Many individual hippocampal neurons in humans encode the identity of visual stimuli 27 , 28 . In our data, 109 out of 494 (22%) of neurons in the hippocampus were tuned to stimulus identity (Fig. 3a and Extended Data Fig. 6g,h show examples). We therefore next asked how the variable context interacted with stimulus identity and how this interaction changed with the ability to perform inference. As the four visual stimuli do not share any apparent structure, we do not expect to observe any structured geometry within each context. For this reason, we studied the geometry of pairs of stimuli (for example, stimulus A versus B) in the two contexts. To contrast with the hippocampal results, we examined the responses in the VTC, in which 195 out of 269 (73%) of neurons (Fig. 3d and Extended Data Fig. 6i,j show examples) were modulated by stimulus identity. At the population level, VTC neurons encoded stimulus identity-related balanced dichotomies in an abstract format (Extended Data Fig. 3u–y , purple, brown, pink, P Absent/Present  < 10 −10 ). The dichotomy context, however, was not decodable in the VTC in both inference present and absent sessions (baseline period, Extended Data Fig. 3u , red; compare with Fig. 2h and stimulus period, Extended Data Fig. 3v , red, compare with Fig. 2e ). Furthermore, error trial analysis showed that stimulus-related dichotomies were still decodable during errors in VTC (Extended Data Fig. 3y , purple, brown, pink, P Present(Error)  < 10 −10 ) but not the hippocampus (Extended Data Fig. 3e , stim pair dichotomy). Context was therefore encoded as an abstract variable in the hippocampus but not in the VTC in correct trials. In error trials, VTC but not hippocampus represented stimulus identity. This contrast provides us with an opportunity to examine what changes in the hippocampus specifically when the behaviourally relevant variable context was represented in an abstract format.

figure 3

a – f , Encoding of stimulus identity across contexts. a – c , Responses in hippocampus (HPC) following stimulus onset carry information about stimulus identity. a , Example hippocampal neuron encoding stimulus identity. b , c , Representational geometry of stimulus identity across contexts. Analysis is conducted over pairs of stimuli in each context (legend). Significance of differences is tested using a two-sided rank-sum test comparing inference absent and present over all stimulus pairs (* P  < 0.05, NS otherwise). All other conventions identical to those in Fig. 2 . b , c , CCGP ( P RS  = 0.041) ( b ) and PS ( P RS  = 0.040) ( c ) for stimulus coding across contexts significantly increased in inference present compared to inference absent sessions. d – f , Responses in VTC following stimulus onset carry information about stimulus identity. d , Example VTC neuron encoding stimulus identity. e , f , CCGP ( P RS  = 0.15) ( e ) and PS ( P RS  = 0.39) ( f ) for stimulus coding across contexts does not differ significantly between inference absent and inference present sessions. g , h , Same analysis as in a – f , but for encoding of context across stimulus pairs for hippocampus (see b , c for plotting conventions). CCGP ( P RS  = 0.012) ( g ) and PS for context coding vectors between pairs of stimuli ( P RS  = 0.015) ( h ) both significantly increase from inference absent to inference present sessions. i , Summary of changes in neural geometry in hippocampus. Shown is the MDS of condition-averaged responses of all recorded neurons shown for inference absent and present sessions. Points are average population vector responses to combinations of stimuli and context. Lines connect the same stimuli across context. Abstract coding of stimulus across contexts (solid arrows) and context across stimuli (dashed arrows) are highlighted for one pair of stimuli (C and D). The data in this plot are identical to those of Fig. 2j . Error bars in a , d are ±s.e.m. across trials. All P RS values are from a two-sided rank-sum test.

We next conducted a geometric stimulus-pair analysis to study the interaction of stimulus identity and context coding in the same neural population. The stimulus-pair analysis was designed to detect the presence of simultaneous abstract coding of stimulus identity across contexts and abstract coding of context across stimuli (see Extended Data Fig. 6a–f for an illustration).

The average stimulus decoding accuracy across all individual stimulus pairs in the hippocampus did not differ significantly between inference absent and inference present sessions (0.73 versus 0.76; Extended Data Fig. 6m , P RS  = 0.13, rank-sum over stimulus pairs). By contrast, the geometry of the stimulus representation changed: it became disentangled from context as indicated by significant increases in stimulus CCGP (Fig. 3b , P RS  = 0.041) and stimulus PS (Fig. 3c , P RS  = 0.040) in the inference present sessions. This finding suggests that the representation of stimulus identity was reorganized with respect to the emerging context variable. Note that context was not decodable in inference absent sessions as a balanced dichotomy (Fig. 2e , red). Nevertheless, stimulus decoders did not generalize well across the two contexts in inference absent sessions. This result indicates that context did modulate stimulus representations in the hippocampus, but in a way that was entangled with stimulus identity. This effect was specific to the hippocampus: in VTC, the neural population geometry was unchanged, as indicated by no significant differences in stimulus decodability (Extended Data Fig. 6n , P RS  = 0.15), stimulus CCGP (Fig. 3e , P RS  = 0.15), stimulus CCGP (Fig. 3e , P RS  = 0.15) and stimulus PS (Fig. 3f , P RS  = 0.39).

The presence of abstract coding for one variable (stimulus identity) does not necessarily imply that the other variable (context) is also present in an abstract format. Therefore, we next examined the variable context separately for each pair of stimuli. In the hippocampus, context was decodable for individual pairs of stimuli both during inference absent and inference present sessions, without a significant difference between the two (Extended Data Fig. 7a ; 0.63 versus 0.67; P RS  = 0.065). However, in inference present sessions, the format of the variable context changed so that it was abstract across stimulus pairs as indicated by increases in context CCGP (Fig. 3g , P RS  = 0.012) and context PS (Fig. 3h , P RS  = 0.015) relative to the inference absent group. By contrast, in the VTC, whereas context was decodable from some stimulus pairs (Extended Data Fig. 7b , see legend for statistics), the format of the representation did not change in the way that would be expected for the formation of an abstract variable. Rather, there was a significant decrease in context CCGP (Extended Data Fig. 7c , P RS  = 0.026) and no significant difference in context PS (Extended Data Fig. 7d , P RS  = 0.39).

These findings indicate that the emergence of context as an abstract variable in the hippocampus when patients can perform inference is coupled with the reorganization of stimulus representations so they are also more disentangled, thereby forming a jointly abstracted code for stimuli and context. This transformation of the representation is visible directly in a reduced dimensionality visualization of the data (Fig. 3i , Extended Data Fig. 8 and Supplementary Video  1 ). By contrast, we found no systematic reorganization of stimulus representations in VTC.

Explaining the geometrical changes

We next examined what aspects of neuronal activity changed in the hippocampus to give rise to abstract variables. We considered the following non-mutually exclusive possibilities: (1) increase in distances between conditions (Fig. 4a,b ), (2) decrease in variance of the population response along the coding direction (Fig. 4c ) and (3) increase in parallelism of coding directions (Fig. 4d ).

figure 4

a – d , Changes that could give rise to abstract variables. Shaded circles represent variability, and grey arrows signify changes between inference absent and present.  a , Original, when variable is not abstract. b , Increase in distance. c , Decrease in variance. d , Increase in parallelism. e , Firing rates of hippocampal neurons during stimulus period decreased ( P RS  = 8.3 × 10 −5 , two-sided rank-sum over conditions). Colour indicates task state, with coding indicating identity (for example, task condition C 1 − L describes stimulus C , context 1, outcome \(-\) , response L ). f , Fano factor was not significantly different between inference present and absent sessions (two-sided rank-sum test, P RS  = 0.99). g , Population distance between centroids for all 35 balanced dichotomies. Average distances decrease from inference absent to present ( P RS  = 2.9 × 10 −8 , rank-sum over dichotomies). Grey bars indicate the 5th–95th percentile of the geometric null distribution. h , Context alone is the only dichotomy whose distance significantly increases from inference absent to present (red, P ΔDist  = 0.040, against geometric null of difference). HPC, hippocampus. i , Average trial-by-trial variance projected along the coding direction decreased on average between inference absent and inference present sessions ( P RS  = 6.5 × 10 −13 , rank-sum test). j , k , Same as g , h , but for spike counts during the baseline period. Trials are grouped by identify the previous trial. Distance was significantly reduced across all dichotomies ( j , P RS  = 6.4 × 10 −13 , rank-sum over dichotomies) and context alone shows a distance reduction that is smaller than would be expected by chance ( k , red, P ΔDist  = 0.027, against geometric null of difference). l , Stimulus-tuned neurons in the hippocampus were modulated by context more consistently in inference present sessions ( P RS  = 0.0039) during the stimulus period ( n  = 63, error bars are ±s.e.m. across neurons). m , Illustration of changes in neural state space. Context dichotomy distance increased, variance decreased and consistency of stimulus modulation across contexts increased. In all panels, P RS values are from a two-sided rank-sum test and grey bars indicate the 5th–95th percentile of the geometric null distribution.

We first examined whether mean firing rates across all recorded neurons differed between inference absent and inference present sessions. The firing rate across conditions decreased from 3.37 ± 0.13 to 1.36 ± 0.03 Hz (±s.e.m.): a 60% reduction on average during the stimulus period (Fig. 4e , P RS  = 8.3 × 10 −5 ). Firing rates were also reduced during the baseline period (3.29 ± 0.09 to 1.38 ± 0.02 Hz, 58% reduction, Extended Data Fig. 9q ). This firing-rate reduction was unique to the hippocampus (Extended Data Fig. 9c,r ). The firing-rate reduction led to a decrease in the average distance between class centroids across all dichotomies in inference present sessions except one (5.77 ± 0.22 to 4.17 ± 0.07 Hz, P RS  = 2.9 × 10 −8 , Fig. 4g ). The lone exception was the context dichotomy, for which distance increased (4.3 versus 5.0 Hz, P Absent  = 0.87, P Present  = 0.076, P ΔDist  = 0.040, Fig. 4g,h and Extended Data Fig. 9h ). Indeed, context was the dichotomy with the largest change in distance in firing-rate space when comparing the inference present and inference absent conditions (Fig. 4h ). This isolated significant rise in context separability was not seen in any of the other recorded brain areas during the stimulus period (Extended Data Fig. 9a,b ). Similarly, during the baseline period, the distance between context centroids decreased the least in the hippocampus (5.6 versus 5.0 Hz, P Absent  = 0.68, P Present  = 0.0007, P ΔDist  = 0.027, Fig. 4j,k ) despite the significant decrease in distance over all dichotomies that was also observed here due to the firing-rate reduction (5.85 ± 0.08 to 4.25 ± 0.04 Hz, P RS  = 6.5 × 10 −13 , Fig. 4j ).

Next, we assessed changes in the variability of the population response along the coding direction of each dichotomy. The variance along the coding direction of neuronal responses in the hippocampus decreased for all dichotomies in inference present when compared to inference absent sessions during both the stimulus period (2.51 ± 0.16 versus 1.53 ± 0.06, P RS  = 6.5 × 10 −13 , Fig. 4i ) and the baseline period (2.49 ± 0.09 versus 1.58 ± 0.02, P RS  = 6.5 × 10 −13 , Extended Data Fig. 9k,l ). However, this decrease could be a consequence of the reduction in firing rates under the assumption of Poisson statistics. We conducted a condition-wise fano-factor analysis to assess whether the variance reduction was beyond that expected for the reduction in firing rates. This analysis revealed no significant differences in fano factors between inference absent and inference present sessions during the stimulus period (1.39 ± 0.22 versus 1.36 ± 0.14, P RS  = 0.99, Fig. 4f ) and the baseline period (1.61 ± 0.26 versus 1.45 ± 0.11, P RS  = 0.19). Together, these two findings suggest that the decrease in variance along dichotomy coding directions is explained by the decreases in firing rate.

Did changes in tuning of individual neurons give rise to the increases in parallelism for context across stimuli (see Extended Data Fig. 7e–h for examples)? A stimulus-tuned neuron also modulated by context could do so consistently across all stimuli (for example, firing rate increased for all stimuli), or inconsistently (for example, firing rate increased for some stimuli and decreased for others). We quantified the consistency of context modulation across stimuli for each individual neuron ( Methods ). The consistency of context modulation in the hippocampus increased significantly in inference present sessions (Fig. 4l and Extended Data Fig. 7i , 1.8 ± 0.2 versus 2.9 ± 0.3, P RS  = 0.0049). This effect was specific to hippocampus: in VTC, the same metric decreased significantly (Extended Data Fig. 7i , 2.6 ± 0.3 versus 1.6 ± 0.2, P RS  = 0.0039).

These data indicate that the geometric changes seen in the hippocampus were due to the following (Fig. 4m ): (1) an increase in separation between condition average representations of the two contexts despite relaxing towards the origin (decrease in firing rate), (2) decreases in variance along the coding direction, and (3) neurons becoming increasingly consistent (parallel coding directions) in their modulation across stimulus and context dimensions.

Effect of verbal instructions

Did the format of the representation differ between participants who discovered the underlying latent variable context by themselves and those who only did so after receiving verbal instructions? We provided all patients with verbal instructions detailing the latent task structure after session 1 (Fig. 5 , inset), allowing us to examine this question. We divided patients into three types on the basis of their behaviour: those who showed inference behaviour in the first session (pre-instruction inference, three patients, six sessions; Extended Data Fig. 1i ); those who showed inference behavior after being given verbal instructions (post-instruction inference, five patients, ten sessions; Extended Data Fig. 1g ); and those who did not perform inference even after being provided with verbal instructions (inference ‘not exhibited’, four patients, eight sessions; Extended Data Fig. 1h ). Only patients who performed accurately in non-inference trials in both sessions one and two were included in one of these three groups (Extended Data Fig. 1g–i , ‘last’; five patients excluded, Supplementary Table 1 ). The principal difference between the post-instruction (Fig. 5a and Extended Data Fig. 10a,b ) and inference not exhibited (Fig. 5a and Extended Data Fig. 10h,i ) groups is their ability to perform inference following the verbal instructions, with both groups performing the task accurately otherwise. The pre-instruction inference group, on the other hand, showed above-chance inference performance during both sessions (Fig. 5a and Extended Data Fig. 10o,p ).

figure 5

a , Top, behavioural performance on the first inference trial for patients that performed inference after instructions ( n  = 10 sessions, post-instruction), those that did not perform inference even after instructions ( n  = 8 sessions, not exhibited) and those that performed inference already before instructions ( n  = 6 sessions, pre-instruction). Error bars are ±s.e.m. across sessions and P values are rank-sum session 1 versus 2. Bottom, schematic of the experiment. Session before and after high-level instructions are referred to as sessions 1 and 2, respectively. b – d , Encoding of context during the stimulus period in different groups of patients. The first trial following a switch is excluded from this analysis. * P  < 0.05 against null in any column of a given geometric measure plot estimated empirically from the null distribution. b , Post-instruction group. Context was significantly decodable in session 2 correct but not error trials and also not in session 1 ( P 1  = 0.17, P 1(correct)  = 0.016, P RS  = 3.1 × 10 −19 , P 2(error)  = 0.99). c , Not exhibited group. Context was not significantly decodable ( P 1  = 0.44, P 2  = 0.42). d , Pre-instruction group. Context was decodable in session 1 ( P 1  = 0.014, P Two  = 0.17). e , Summary of changes due in instructions based on the PS for context. Neuron counts are equalized across groups by subsampling. Context PS increases significantly from session 1 to 2 in the post-instruction group ( P Postinstruction,i  = 0.20, P Postinstruction,2  = 0.0028). Context PS is not significantly different from chance for the not exhibited group ( P Not exhibited,1/2  < 0.5) and is different from chance in both sessions for the pre-instruction inference ( P Pre-instruction,1/2  < 0.005) group. All P values are versus chance and are empirically estimated from the null distribution. f , Example hippocampal neuron with univariate context encoding in the session after (bottom) but not before (top) instructions (one-way ANOVA, P One  = 0.40, P Two  = 0.010). Error bars are ±s.e.m. across trials.

In the post-instruction inference group, context was decodable in the hippocampus during the stimulus period on correct trials in the session following the verbal instructions (Fig. 5b , P One  = 0.17, P Two  = 0.016, P RS  = 3.1 × 10 −19 ).This representation of context was in an abstract format, as indicated by significant increases in both CCGP (Extended Data Fig. 10c ; P One  = 0.28, P Two  = 0.047, P RS  = 8.4 × 10 −16 and PS (Extended Data Fig. 10d ; P One  = 0.023, P Two  = 1.2 × 10 −6 ). Successful performance in the task was associated with context being represented abstractly in the hippocampus, as both the decodability (Fig. 5b , P Two(error)  = 0.99, session 2 correct versus error, P RS  = 4.3 × 10 −20 ) and PS (Extended Data Fig. 10d , P Two(error)  = 1.1 × 10 −4 ) of context decreased significantly on error trials in session 2. Context was also encoded in an abstract format during the baseline period in the same performance dependent manner as context in the stimulus period (Extended Data Fig. 10e–g ). By contrast, in patients in the inference not exhibited group, context was not encoded by hippocampal neurons during the stimulus (Fig. 5c and Extended Data Fig. 10j,k , all P One/Two  > 0.05) nor the baseline (Extended Data Fig. 10l,n all P One/Two  > 0.05) periods in session 2. Thus, the ability of post-instruction group patients to perform inference following instructions was associated with the rapid emergence of an abstract context variable in their hippocampus.

This effect could also be appreciated at the single-neuron level in the hippocampus. In the instruction successful group, the proportion of neurons that are linearly tuned to context ( P  < 0.05, one-way ANOVA for context) during both the stimulus (8% (6 out of 75 neurons) versus 18% (17 out of 93 neurons), P  = 0.027) and baseline (7% (5 out of 75 neurons) versus 16% (15 out of 93 neurons), P  = 0.029) periods increased in session 2 versus session 1 (Fig. 5f shows an example). By contrast, in the not exhibited group, there was no significant change in tuning to context at the single-neuron level both during the stimulus period (6% session 1 versus 6% session 2, P  = 0.41) and the baseline period (8% session 1 versus 5% session 2, P  = 0.27).

For the pre-instruction inference patient group, context was already decodable during session 1 (Fig. 5d , P One  = 0.014) and the PS was significant and near the top of the dichotomy rank order in sessions 1 and 2 (Extended Data Fig. 10r , P One  = 1.5 × 10 −9 , P Two  = 1.7 × 10 −6 ). A similar trend was observed with the baseline context representation for these patients (Extended Data Fig. 10s–u ). This finding suggests that the context variable these patients learned experientially during session 1 was in an abstract format.

Last, we compared the geometry of the context representations formed by each of the three patient groups (balancing number of neurons, Methods ). Context PS increased significantly in the post-instruction inference group, from levels not different from chance during session 1 ( P One,Post-inst  = 0.20, Fig. 5e ) to a level comparable to the pre-instruction inference group during session 2 ( P Two,Post-inst  = 0.0028, P Two,Pre-inst  = 0.0035, Fig. 5e ). The PS in the pre-instruction inference group, on the other hand, did not change significantly and was already above chance in session 1 (Fig. 5e , see legend for statistics). These findings suggest that hippocampal neurons in the pre-instruction inference group carried an abstract representation of context before receiving verbal instructions, and retained that geometry after receiving instructions. Furthermore, neurons of participants in the post-instruction inference group encode a task representation whose geometry resembled that of the pre-instruction group, indicating that a similar representational geometry can be constructed through either experience or within minutes through instruction to support inference in a new task.

How can a neural or biological network efficiently encode many variables simultaneously 11 , 29 ? One solution is to encode variables in an abstract format so they can be re-used in new situations to facilitate generalization and compositionality 24 , 30 , 31 , 32 , 33 , 34 . Here we show that such an abstract representation emerged in the human hippocampus as a function of learning to perform inference. The format by which latent context and stimulus identity were represented was predictive of the ability to perform behavioural generalizations that rely on contextual inference. Patients performed well on non-inference trials in all sessions included in the analysis, indicating that they understood the task. Therefore, the difference between the inference present and absent sessions was only in whether they performed inference following the covert context switch (Fig. 1f ). For those sessions in which patients did not perform inference, there was no systematic relationship between context coding vectors across stimuli. For sessions in which patients performed inference, there was alignment of the context coding direction across stimuli (making them parallel), indicating that the context variable had been disentangled from the stimulus identity variable in the hippocampi of these patients (Figs. 2j and 3i ). As a result, the two variables became disentangled, thereby allowing generalization. This representation was implemented by the hippocampus using a broadly distributed code as evidenced by the high context PS (Extended Data Fig. 3f,g,j,n ) and the lack of reliance on univariately tuned context neurons to generate the abstract context representation (Extended Data Fig. 4a–j and Supplementary Note 2 ). Thus, the geometry we study here did not trivially arise from classically tuned neurons.

Inferential reasoning is thought to rely on cognitive maps, which have been observed in the hippocampus and other parts of the brain 19 , 35 , 36 , 37 , 38 , 39 . Cognitive maps are thought to underlie inferential reasoning in various complex cognitive and spatial domains 3 , 10 , 35 , 36 , 40 , 41 . However, little is known about how maps for cognitive spaces emerge at the cellular level in the human brain as a function of learning. Here we show that a cognitive map that organizes stimulus identity and latent context in an ordered manner emerges in the hippocampus. The cognitive map emerges because task states in one context, indexed by stimulus identity, become systematically related to the corresponding task states in the other context through a dedicated context coding direction that is disentangled from stimulus identity (Fig. 3b,c,g–i ). Furthermore, the relational codes between task states (stimuli) in each context are preserved across contexts.

Hippocampal cognitive maps observed in other studies are often different from those that we observed because the encoded variables are observed to nonlinearly interact, a signature of high-dimensional representations. These representations are believed to be the result of a decorrelation of the neural representations (recoding) that is aimed at maximizing memory capacity 42 , 43 , 44 . This form of preprocessing leads to widely observed response properties, such as those of place cells 45 . However, there is some evidence of hippocampal neurons that encode one task variable independently of others 15 , 21 , 46 , 47 , 48 , 49 , 50 , 51 . In these studies, no correspondence was shown between different representational geometries in the hippocampus and differences in behaviour. Here the task representations generated when patients cannot perform inference (but can still perform the task) are systematically different from the abstract hippocampal representations of context and stimulus identity that correlate with inference behaviour 11 . Finally, it is important to stress that we also observed an increase in the shattering dimensionality, which has been in shown in other studies to be compatible with the low dimensionality of disentangled representations 11 , 15 .

We found stimulus identity codes in brain regions other than the hippocampus, but these mostly lacked reorganization as a function of learning to perform inference. This code stability is particularly salient in the VTC, a region analogous to macaque IT cortex, in which neurons construct a high-level representation of visual stimuli 52 , 53 , 54 . Some studies conducting unit recordings in this general region in humans show that neurons show strong tuning to stimulus identity 55 . We similarly find that VTC neurons encode visual stimulus identity (Fig. 3d–f and Extended Data Fig. 6n ). However, these responses were not modulated by latent context in a systematic manner. As a result, despite being decodable for some individual stimulus pairs, context was not represented in an abstract format. Rather, in VTC, context was only weakly decodable for a subset of the stimuli, context decodability did not change between inference absent and inference present sessions (Extended Data Fig. 7b,c ), and stimulus identity geometry was not reorganized relative to context in inference present sessions (Fig. 3e,f ). Our study therefore shows that disentangled context-stimulus representations emerged in the hippocampus, but not in the upstream visually responsive region VTC.

In our study, verbal instructions resulted in changes in hippocampal task representations that correlated with behavioural changes. The emergence of this representation in the session immediately following the instructions in the post-instruction inference group is correlated with their newfound ability to perform inference and suggests that hippocampal representations can be modified on the timescale of minutes through verbal instructions (Fig. 5 ). This change in representation is qualitatively different from the standard approach of studying the emergence of a ‘learning set’, wherein a low-dimensional representation of abstract task structure emerges slowly over days through trial-and-error learning 47 , 56 , 57 . Our finding of similar representational structure in the hippocampus in participants who learned spontaneously and those who only learned after receiving verbal instructions suggests that both ways of learning can potentially lead to the same solution in terms of neural representations. In complex, high-dimensional environments, learning abstract representations through trial and error becomes exponentially costly (the curse of dimensionality), and instructions can be used to steer attention towards previously undiscovered latent structure that can be explicitly represented and used for behaviour. Our findings suggest that when high-level instructions successfully alter behaviour, underlying neural representations can be rapidly modified to resemble one learned through experience.

Participants

The study participants were 17 adult patients who were implanted with depth electrodes for seizure monitoring as part of an evaluation for treatment for drug-resistant epilepsy (Supplementary Table 1 ). No statistical methods were used to predetermine the sample size but this number of patients is large relative to other similar studies. All patients provided informed consent and volunteered to participate in this study. Research protocols were approved by the institutional review boards of Cedars-Sinai Medical Center, Toronto Western Hospital and the California Institute of Technology.

Psychophysical task and behaviour

Participants performed a serial reversal learning task. There were two possible static stimulus–response–outcome (SRO) maps, each of which was active in one of the two possible contexts. Context was latent and switches between context were uncued. Each recording session consisted of 280–320 trials grouped into 10–16 blocks of variable size (15–32 trials per block) with block transitions corresponding to a change in the latent context.

Patients completed 42 sessions of the task, typically in pairs of two back-to-back sessions on the same recording day (mean, 2.4 sessions per day, minimum two, maximum four; Supplementary Table 1 ). New stimuli were used in every session, thus requiring patients to re-learn the SRO maps through trial and error at the start of every session.

Each trial consisted of a blank baseline screen, stimulus presentation, speeded response from the participant, followed by feedback after a brief delay (Fig. 1a ). Responses were either left or right in every trial. In each session, stimuli were four unique images, each chosen from a different semantic category (human, macaque, fruit, car). If a patient performed several sessions, new images not seen before by the patient were chosen for each session. The task was implemented in MATLAB (Mathworks) using PsychToolbox v.3.0 (ref. 58 ). Images were presented on a laptop positioned in front of the patient and subtended roughly 10° of visual arc (300 px 2 , 1,024 × 768 screen resolution, 15.6 inch (40 cm) monitor, 50 cm viewing distance). Patients provided responses using a binary response box (RB-844, Cedrus).

Receipt of reward in a given trial was contingent on the accuracy of the response provided. In each trial, either a high or low reward (25 cents (¢) or 5¢) was given if the response was correct, and no reward (0¢) if incorrect. Whether a given trial resulted in high or low reward if the response was correct was determined by the fixed SRO map (Fig. 1c ). Stimulus–response associations were constructed such that two out of four images (randomly selected) were assigned one response and the other two images were assigned the other (for example, human and fruit, left; macaque and car, right). Thus, in each context, each stimulus was uniquely specified by a combination of its correct response (left or right) and reward value (high or low). Crucially, the SRO maps of the two possible contexts were constructed so that they were the opposite of each other from the point of view of the associated response (Fig. 1c ). To fully orthogonalize also associated reward, half of the reward values stayed the same and the others switched. This structured relationship of stimuli across contexts led to the full orthogonalization of the response, context and reward variables (Fig. 1b,c ). Crucially, the stimulus–response map inversion across contexts provided the opportunity for patients to perform inferential reasoning about the current state of the SRO map, and therefore the latent context.

As rewards were provided deterministically, participants could infer that a context switch had occurred on receiving a single error to the first stimulus they encounter in the new context, and immediately respond correctly to the rest of the stimuli the first time they encounter them in the new context. The behavioural signature of inferential reasoning was thus the accuracy on the trials that occurred immediately after the first error trial. Specifically, we took a participant’s performance on the first instance of the other stimuli encountered in the new context as a measure of that participant’s inference capabilities (Extended Data Fig. 1a ; note that although there are three inference trials after every context switch, each corresponding to a different stimulus, only the first inference trial was used to determine whether a session was in the inference present or absent group: below).

Patients completed several sessions of the task, in each of which new stimuli were chosen. After completion of the first session, the experimenter provided a standardized description of the latent contexts and SRO reversal to the patient ( Supplementary Methods ). These instructions were given regardless of how well the patient performed in the immediately preceding session. Each session took roughly 30 min (mean 1,154 s, range 898–1,900 s), and the inter-session break during which instructions were provided lasted roughly 4 min (Fig. 5a , bottom; mean duration 241 s, range 102–524 s). After this brief interlude, the participants completed the task again with a new set of four stimuli.

Behavioural control

We administered a control version of the task identical to the ‘first session’ described above to n  = 49 participants recruited on Amazon Mechanical Turk (MTurk), who provided informed consent under a protocol approved by the institutional review board of Cedars-Sinai Medical Center (exempt study). No statistical methods were used to predetermine the sample size. We then used this data to calibrate the difficulty of the task. Most (roughly 75%) of the control participants demonstrated proper inference performance, and the remaining 25% demonstrating slow updating of SROs after a context switch, consistent with a behavioural strategy in which each stimulus is updated independently (Extended Data Fig. 1a ).

Analysis of behaviour

Six of the 42 sessions were excluded due to at-chance performance in non-inference trials (binomial test, P  > 0.05). A session was classified as ‘inference present’ if performance on the first of the three possible inference trials that occurred after the context switches was significantly above chance (timepoint 2 in Fig. 1f , binomial test on inference trial 1, P  < 0.05) and as ‘inference absent’ ( n  = 14 sessions, P  > 0.05, binomial test on inference trial 1) otherwise.

Electrophysiology

Extracellular electrophysiological recordings were conducted using microwires embedded within hybrid depth electrodes (AdTech Medical). The patients we recruited for this study had electrodes implanted in at least the hippocampus, as well as in addition subsets of amygdala, dorsal anterior cingulate cortex, upplementary motor area, ventromedial prefrontal cortex and VTC as determined by clinical needs (Supplementary Table 1 ). Implant locations were often bilateral, but some patients only had unilateral implants as indicated by clinical needs. Broadband potentials (0.1 Hz–9 kHz) were recorded continuously from every microwire at a sampling rate of 32 kHz (ATLAS system, Neuralynx). All patients included in the study had well-isolated single neuron(s) in at least one of the brain areas of interest.

Electrode localization was conducted using a combination of pre-operative magnetic resonance imaging and postoperative computed tomography using standard alignment procedures as previously described (using freesurfer v.5.3.0 and v.7.4.1) 59 , 60 . Electrode locations were coregistered to the to the MNI152-aligned CIT168 probabilistic atlas 61 for standardized location reporting and visualization using Advanced Normalization Tools v.2.1 (refs. 59 , 60 ). Placement of electrodes in grey matter was confirmed through visual inspection of participant-specific computed tomography and magnetic resonance imaging alignment, and not through visualization on the atlas.

Spike detection and sorting

Raw electric potentials were filtered with a zero-phase lag filter with a 300 Hz–3 kHz passband. Spikes were detected and sorted using the OSort software package v.4.1 (ref. 62 ). All spike sorting outcomes were manually inspected and putative single units were isolated and used in all subsequent analyses. We evaluated the quality of isolated neurons quantitatively using our standard set of metrics 63 , 64 , 65 , including the proportion of inter-spike interval violations shorter than 3 ms, signal-to-noise ratio of the waveform, projection distance between pairs of isolated clusters and isolation distance of each cluster relative to all other detected spikes. Only well-isolated neurons as assessed by these spike sorting quality metrics were included.

Selection of neurons, trials and analysis periods

Activity of neurons was considered during two epochs throughout each trial: the baseline period (base), defined as −1 to 0 s preceding stimulus onset on each trial and the stimulus period (stim), defined as 0.2 to 1.2 s following stimulus onset on each trial. Spikes were counted for every neuron on every trial during each of these two analysis periods. The resulting firing-rate vectors were used for all encoding and decoding analyses. For the stimulus period, because patients would sometimes respond before 1.2 s (reaction time 1.08 ± 0.04 s over sessions), we determined that 75.15% of all spikes occurred before a response was provided across all recorded neurons, indicating that analyses performed with these spike counts predominantly, but not exclusively, reflect predecision processing. Tests of single-neuron selectivity were conducted using N -way ANOVAs with significance at P  < 0.05, where N was either 2 for models of stim ID (A, B, C, D) and context (1, 2), or 3 for models that included outcome (high, low), response (left, right) and context (1, 2). All variables were categorical, and all models were fit with all available interaction terms included. In Fig. 1l , a unit is marked as linearly tuned if it has at least one significant main effect, and nonlinearly tuned if it has at least one significant interaction term in the ANOVA model.

Population decoding analysis

Single-trial population decoding analysis was performed on pseudo-populations of neurons assembled across all neurons recorded across all patients. We pooled across sessions within each anatomically specified recording area as described previously 59 , 66 . We aggregated neurons across participants into a pseudo-population that consists of all neurons recorded in a given brain area, which allows us to examine populations of several hundred neurons in humans despite inability to record this many neurons simultaneously. This analysis approach is possible because all participants performed the same task, so that conditions could be matched across all relevant variables for a given trial in the pseudo-population (for example, trial 1 might be context 1, correct response, stimulus A, response right, outcome high). The justification for using this approach is threefold. First, independent population codes, in which the information that each neuron provides can be characterized by its own tuning curve, can be understood by recording one neuron at a time and aggregating them for analysis 67 . This is the type of code we are examining. Second, we seek to establish the content and structure of information that is reliably present in a given brain area across participants. This can only be achieved by recording in many participants. Third, in most instances, decoding from pseudo-populations yields the same results from simultaneously recorded neurons 68 , 69 . Results between the two approaches can differ when noise correlations are considered, which can have complex effects on the geometry of the underlying representation 67 . Here noise correlations are not the topic of interest. Noise correlations are present for the subgroups of neurons in the pseudo-population that were recorded simultaneously. To avoid potential effects of these remaining noise correlations, we removed them by randomly scrambling the order of trials for every neuron included in the pseudo-population (as we have described before 59 , 66 ).

Decoding was conducted using support vector machines with a linear kernel and L2 regularization as implemented in MATLAB’s fitcsvm function. No hyperparameter optimization was performed. All decoding accuracies are reported for decoding accuracy for individual trials. Decoding accuracy is estimated out of sample using fivefold cross-validation unless otherwise specified (for example, cross-condition generalization). Many of the decoding analyses in this work consist of grouping sets of distinct task conditions into classes, then training a support vector machine to discriminate between those two groups of conditions. Neurons included in the analysis were required to have at least K correct trials of every unique condition to be included in the analysis ( K  = 15 trials unless otherwise stated). To construct the pseudo-population, we then randomly sampled K trials from every unique condition and divided those trials into the groups required for the current decoding analysis for every neuron independently. Randomly sampling correct trials in this way allowed us to destroy noise correlations that might create locally correlated subspaces from neurons recorded in the same area and session 59 .

To account for the variance in decoding performance that arose from this random subsampling procedure, all reported decoding accuracies are the average resulting from 1,000 iterations of subsampling and decoder evaluation. A similar trial balancing and subsampling procedure was conducted for all analyses that report decoding accuracy on incorrect trials, but with K  = 1 trial or condition required as incorrect for the neuron to be included in analysis. Various other analyses conducted throughout this work, including representation geometry measures, centroid distances and coding direction variances, all rely on this procedure of balanced correct and incorrect trial subsampling, and averaging across 1,000 iterations of the computed metric to study the relationships between task conditions in an unbiased manner. All reported values have been computed with this approach unless otherwise stated.

Balanced dichotomies

Our task has eight possible states (Fig. 1b ). We characterized how neurons represented this task space by assessing how a decoder could differentiate between all possible ‘balanced dichotomies’ of these eight task conditions (Fig. 1b ).

The set of all possible balanced dichotomies is defined by all possible ways by which the eight unique conditions can be split into two groups containing four of the conditions each (for example, four points in context 1 versus 4 points in context 2 is the context dichotomy). There are 35 possible balanced dichotomies (nchoosek(8,4)/2). We considered all 35 possible dichotomies to perform our analysis in an unbiased manner (Supplementary Table 2 shows all dichotomies). Some of the possible balanced dichotomies are easily interpretable because they correspond to variables that were manipulated in the task. We refer to these balanced dichotomies as the ‘named dichotomies’, which are: context, response, outcome, stimulus pair (stim pair) and parity. These dichotomies are shown individually in Extended Data Fig. 2 . The stim pair dichotomy corresponds to the grouping of stimuli for which the response is the same in either context (A and C versus D and B; Fig. 2d ). The parity dichotomy is the balanced dichotomy with the maximal nonlinear interaction between the task variables (Extended Data Fig. 2 ).

For decoding balanced dichotomies during the prestimulus baseline, the task states are defined by the values of the previous trial (not the upcoming, the identity of which is unknown to the participant). The reason for doing so is to examine persistent representations of the previous trial.

Defining decoding difficulty of dichotomies

We quantify the relative degree of nonlinear variable interactions needed by a neural population to classify a given dichotomy using a difficulty metric that rates dichotomies that require proximal task conditions to be placed on opposite sides of the decision boundary as more difficult. Note that proximity of task conditions in task space here is defined with respect to the variables that were manipulated to construct the task space. The conditions corresponding to (response L , outcome low, context 1) and (response L , outcome low, context 2) are proximal because their task specifications differ by a single variable (hamming distance 1) whereas (response L , outcome low, context 1) and (response R , outcome high, context 2) are distal as their task specifications differ by all three variables (Hamming distance 3). With this perspective, we can systematically grade the degree of nonlinearity required to decode a given dichotomy with high accuracy as a function of the number of adjacent task conditions that are on opposite sides of the classification boundary for that dichotomy. For a set of eight conditions specified by three binary variables, this corresponds to the number of adjacent vertices on the cube defined by the variables that are in opposing classes (Extended Data Fig. 5a ). We define this number as the difficulty for a given dichotomy, and can compute it directly for every one of the 35 balanced dichotomies. The smallest realizable dichotomy difficulty is 4, and corresponds only to named dichotomies that align with the axis of one of the three binary variables used to specify the task space. The largest realizable dichotomy is 12, and this corresponds to the parity dichotomy because the dichotomy difficulty (number of adjacent conditions with opposing class membership) is maximized in this dichotomy by definition. All remaining dichotomies lie between these two extremes in difficulty, and computing average decoding accuracy over dichotomies of increasing difficulty gives a sensitive readout of the degree of nonlinear task variable interaction present in a neural population.

Geometric analysis of balanced dichotomies

We used three measures to quantify the geometric structure of the neural representation 11 : shattering dimensionality, CCGP and PS. A high CCGP and PS for a variable indicates that the variable is represented in an abstract format.

Shattering dimensionality is defined as the average decoding accuracy across all balanced dichotomies. It is an index of the expressiveness of a representation, as representations with higher shattering dimensionality allow more dichotomies to be decoded. The content of a representation is assessed by considering which balanced dichotomies are individually decodable better than expected by chance.

CCGP assesses the extent to which training a decoder on one set of conditions generalized to decoding a separate set of conditions. High CCGP for a given variable indicates that the representation of that variable is disentangled from other variables. CCGP is reported in a cross-validated manner by training and testing decoders on single trials. Note that to compute CCGP, all trials from a set of conditions are held out from the training data, which is different from the typical ‘leave-one-out’ type decoding. The remaining held-in conditions are used to train the decoder, and performance is then evaluated on the held-out conditions (trial-by-trial performance). The CCGP for a given balanced dichotomy is the average over all possible 16 combinations of held-out conditions on either side of the dichotomy boundary. One of the four conditions on each side of the dichotomy are used for testing, whereas the remaining three on each side of the dichotomy are used for training. For each of the 16 possible train and test splits, the decoder is trained on all correct trials from the remaining six conditions, and performance is evaluated on the two held-out conditions.

PS assesses how coding directions for one variable are related to each other across values of other variables in a decoder agnostic manner. The PS is defined for every balanced dichotomy as the cosine of the angle between two coding vectors pointing from conditions in one class to conditions in the other for a given dichotomy. The coding directions are estimated using the average activity for each condition. Note that the PS is a direct geometrical measure that focuses on the structure of the representation, whereas the CCGP also depends on the noise and its shape because it is based on single trials. Coding vectors are computed by selecting four conditions (two on either side of the dichotomy), computing the normalized vector difference between the mean population response for each of the two pairs, then computing the cosine between said coding vectors. This procedure is repeated for all possible pairs of coding vectors, and the average over all cosines is reported. As the correct way of pairing conditions on either side of the dichotomy is not known a priori, we compute the cosine average for all possible configurations of pairing conditions on either side of the dichotomy, then report the PS as the maximum average cosine value over configurations.

Null distribution for geometric measures

We used two approaches to construct null distributions for significance testing of the geometric measures shattering dimensionality, CCGP and PS.

For the shattering dimensionality and decoding accuracy of individual dichotomies, the null distribution was constructed by shuffling trial labels between the two classes on either side of each dichotomy before training and testing the decoder. After shuffling the order of the trial labels, the identical procedures for training and testing were used. This way of constructing the null distribution destroys the information content of the neural population while preserving single-neuron properties such as mean firing rate and variance.

For the CCGP and PA, we used a geometric null distribution 11 . Before training, we randomly swapped the responses of pairs of neurons within a given condition. For example, for one task condition, all of neuron 1’s responses are assigned to neuron 2 and all of neuron 2’s responses are assigned to neuron 1; for another task condition, all of neuron 1’s responses are assigned to neuron 3, and so on). This way of randomly shuffling entire condition responses leads to the situation in which neural population responses by condition are held constant, but the systematic cross-condition relationships that exist for a given neuron are destroyed. This way of shuffling creates a maximally high-dimensional representation, thereby establishing a conservative null distribution for the geometric measures CCGP and PS.

All null distributions are constructed from 1,000 iterations of shuffled trial-resampling using either trial-label shuffling (shuffle null) or random rotations designed to destroy low-dimensional structure (geometric null).

Neural geometry alignment analysis

To answer the question of whether the geometry of a variable was common across different groups of sessions, we aligned representations between two neural state spaces. Each state space is formed by non-overlapping sets of neurons, and the two spaces are aligned using subsets of task conditions. A cross-session-group PS was then computed by applying the same alignment to a pair of held-out conditions, one on either side of the current dichotomy boundary. Alignment and cross-group comparisons were performed in a space derived using dimensionality reduction (six dimensions). For a given dichotomy, two groups of sessions with N and M neurons were aligned by applying singular value decomposition to the firing-rate normalized condition averages of all but two of the eight task conditions, one on either side of the dichotomy boundary. The top six singular vectors corresponding to the non-zero singular values from each session group were then used as projection matrices to embed the condition averages from each session group in a six-dimensional space. Alignment between the two groups of sessions, in the six-dimensional space, was then performed by computing the average coding vector crossing the dichotomy boundary for each session group, with the vector difference between these two coding vectors defining the ‘transformation’ between the two embedding spaces. To compare whether coding directions generalize between the two groups of sessions, we then used the data from the two remaining held-out conditions (in both session groups). We first projected these data points into the same six-dimensional embedding spaces and computed the coding vectors between the two in each embedding space. We then applied the transformation vector to the coding vector in the first embedding space, thereby transforming it into the coordinate system of the second session groups. Within the second session group embedding space, we then computed the cosine similarity between the transformed coding vector from the first session group and the coding vector from the second session group to examine whether the two were parallel (if so, the coding vectors generalize). We repeated this procedure for each of the other three pairs of conditions being the held-out pair, thereby estimating the vector transformation of each pair of conditions independently. The average cosine similarity was then computed over the held-out pairs. All possible configurations of conditions aligned on either side of the dichotomy boundary are considered (24 in this case), and the maximum cosine similarity over configurations is returned as the PS for that dichotomy (plotted as ‘cross-half’ in Extended Data Fig. 3z ). As a control, we also computed the PS for held-out conditions within the same embedding space without performing cross-session alignment (plotted as ‘half-split’ in Extended Data Fig. 3z ). Note that the differences in both the average PS and the null distribution when comparing within-session and across-session parallelism are expected behaviours and arise from the increased expressive power of the cross-session approach due to fitting transformation vectors in a relatively low-dimensional (six) space. This step is not performed for the within-session control because there is no need to align neural activity to its own embedding space.

Low-dimensional visualization of neural state spaces was achieved using multi-dimensional scaling (MDS) performed on matrices of condition-averaged neural responses. Pair-wise distances between condition averages were initially computed in N -dimensional neural state space, where N is the number of neurons used to construct the space. Pair-wise distances were then used to compute either a two- or three-dimensional representation of the condition averages using the ‘mdscale’ method in MATLAB. In figures in which two different MDS plots are shown side-by-side, canonical correlation analysis was used to align the axes of the two-dimensionally reduced neural state spaces. This approach was necessary because, in general, neural state spaces constructed with different sets of neurons were being compared. We note that we use MDS only to summarize and visualizing high-dimensional neural representations. All conclusions drawn are based on geometric measures computed in the original full neural state space.

Analysis of incorrect trials

For determining decoding accuracy for trials in which participants provided an incorrect response (‘error trials’), decoders were trained and evaluated out of sample on all correct trials in inference absent and inference present sessions (denoted as inference absent and inference present trials, respectively). The accuracy of the decoder was then evaluated on the left out error trials in the inference present sessions (denoted as ‘inference present (error)’ trials) that were balanced by task condition. Neurons from sessions without at least one incorrect trial for each of the eight conditions were excluded. We did not estimate CCGP separately for correct and incorrect trials. The PS was estimated using only correct trials for inference present and inference absent. For inference present (error), parallelism was computed using one coding vector (difference between two conditions) from correct trials and one coding vector from incorrect trials. All other aspects of the PS calculation remained as described earlier. The very first trial after a context switch was excluded from analysis (it was incorrect but by design, as the participant cannot know when a context switch occurred).

Stimulus identity geometry analysis

We repeated the geometric analysis described above for subsets of trials to examine specifically how the two variables context and stimulus interact with each other. To do so, we considered each possible pair of stimuli (AB, AC, AD, BC, BD, CD) separately. For each stimulus pair, we then examine the ability to decode and the structure of the underlying representation for two variables: stimulus identity (Supplementary Table 3 ) and context (Supplementary Table 4 ) (Fig. 3 ).

For stimulus identity, what is decoded is whether the stimulus identity is the first or second possible identity in each pair (that is, ‘A versus B’ for the AB pair). Stimulus CCGP (Fig. 3b,e ) is calculated by training a decoder to decide A versus B in context 1 and testing the decoder in context 2 and vice versa (the CCGP is the average between these two decoders). Stimulus PS (Fig. 3c,f ) is the angle between the two coding vectors A versus B in context 1 and 2.

For context, decoding accuracy is estimated by training two decoders to decide context 1 versus context 2 for each of the two stimuli in a stimulus pair. The reported decoding accuracy is the average between these two decoders (Extended Data Fig. 7a,b ). For example, for the stimulus pair AB, one such decoder each is trained for all A trials and all B trials. Context CCGP (Fig. 3g and Extended Data Fig. 7c ) is calculated by training a decoder to differentiate between contexts 1 and 2 based on the trials in the first identity of the pair and tested in the second pair, and vice versa. The reported context CCGP value for a given stimulus pair is the average between the two. Similarly, context PS (Fig. 3h and Extended Data Fig. 7d ) is the angle between the two coding vectors context 1 versus context 2 estimated separately for the first and second stimulus in a pair.

Distance and variance analysis

We computed a series of metrics to quantify aspects of the population response that changed between inference absent and inference present sessions. We used (1) the firing rate, (2) distance in neural state space between classes for balanced dichotomies and stimulus dichotomies (dichotomy distance), (3) variance of neural spiking projected along the coding directions for those dichotomies (coding direction variance) and (4) the condition-wise fano factor (Fig. 4 ).

Firing rate (Fig. 4e ) was the mean firing rate averaged across all neurons during the stimulus period, reported separately for correct trials of every unique task condition. Values reported during the baseline (Extended Data Fig. 9q,r ) are computed with an identical procedure using firing rates from before 1 s before stimulus onset.

Dichotomy distance (Fig. 4g,h,j,k ) was defined as the Euclidean distance in neural state space between the centroids of the two classes on either side of the decision boundary for that dichotomy. Centroids were computed by constructing the average response vector for each class using a balanced number of correct trials from every condition included in each class through a resampling procedure (described below). Null distributions reported for dichotomy distances are geometric null distributions.

Coding direction variance (Fig. 4i ) was computed for a given balanced dichotomy by projecting individual held-out trials onto the coding vector of the decoder trained to differentiate between the two groups of the balanced dichotomy being evaluated. The coding direction was estimated by training a linear decoder on all trials except eight (one from each condition either side of the dichotomy). The vector of weights estimated by the decoder (one for each neuron) was normalized to unit magnitude to estimate the coding vector. The projection of the left out trial onto this coding vector was then calculated using the dot product. This process was repeated 1,000 times, generating a distribution of single-trial projections onto the coding vector for each dichotomy. The variance of the distribution of 1,000 projected data points was then computed and reported as the variance for a given balanced dichotomy (Fig. 4i ).

The condition-wise fano factor (Fig. 4f ) was computed separately for each neuron. We used all correct trials for a given balanced dichotomy to estimate the mean firing rate and standard deviation and then took the ratio between the two to calculate the fano factor for each neuron. Reported fano factors are the average of all fano factors across all neurons from that area and/or behavioural condition. Fano factors are computed by condition because grouping trials across conditions could lead to task variable coding (signal) contaminating the fano-factor measurement, which should ideally only reflect trial-by-trial variation around the mean for roughly Poisson-distributed firing rates.

The context-modulation consistency (Fig. 4l ) was also computed separately for each neuron. Context-modulation consistency is the tendency for a neuron’s firing rate to shift consistently (increase or decrease) to encode context across stimuli. For each neuron, it was computed by determining the sign of the difference (±) between the mean firing rate for a given stimulus between the two contexts, and summing the number of stimuli that show the same modulation (either increase or decrease) across the two contexts. This consistency can take on values between 0 (increase in firing rate to encode context for half of the stimuli, decrease in firing rate for the other half) and 4 (either increase or decrease in firing rate for all four stimuli).

Bootstrap resampled estimation of measures and null distributions

All the measures described in the preceding sections were estimated using a trial and neuron-based resampling method. This resampling strategy was used to assure that every measure reported was comparable between a set of conditions by assuring that the same number of neurons and data points were used to train and test classifiers. Metrics were recomputed 1,000 times with resampling and all null distributions were computed with 1,000 iterations of shuffling and recomputing. Plotted boundaries of null distributions correspond to the fifth and 95th percentiles as estimated from the 1,000 repetitions.

A single iteration of the resampling estimation procedure proceeds as follows. For all analyses that involved a comparison of a metric between two behavioural conditions (inference absent versus inference present or session 1 versus session 2), the same number of neurons was included in both conditions by on a region-by-region basis. For a neuron to be included, at least 15 correct trials for each of the eight unique task conditions had to exist (120 correct trials total). Across patients, the number of correct trials per condition varied: minimum 10.9 ± 1.3 trials per condition, mean 25.0 ± 0.6 trials per condition, maximum 39.6 ± 1.2 trials per condition (mean ± s.e.m.). After identifying the neurons that met this inclusion criteria, an equal number were randomly sampled from both behavioural conditions. The number of considered neurons was set to the number of neurons available in the smallest group.

When constructing feature matrices for decoding, 15 trials were randomly selected from each unique condition that was included in the given analysis. Trial order was shuffled independently for every neuron within a condition to destroy potential noise correlations between neurons that were simultaneously recorded. For decoding and shattering dimensionality, out-of-sample accuracy was estimated with fivefold cross-validation. For generalization analyses (CCGP), all trials were used in training as performance was evaluated on entirely held-out conditions. For vector-based measures (dichotomy distance, variance, PS), all trials in relevant conditions were used to compute condition centroids. In the case of variance estimation, all trials except one on either side of the dichotomy boundary were used to learn the coding axis, then the held-out trials were projected onto the coding axis. As previously stated, these procedures were repeated 1,000 times with independent random seeds to ensure independent random sampling of neurons and trials across iterations.

All significance values ( P values) in the paper are estimated as following unless stated otherwise. P values of decodability, CCGP or PS versus chance for the absent and present group are labelled as P Absent or P Present , respectively, and are estimated using a one-sided non-parametric bootstrap test based on the empirically estimated null distribution as described above. The P value of the non-parametric boostrap test is equal to the number of iterations in the null distribution that are larger or equal than the observed value (one-sided), divided by the number of iterations. P values for comparing decodability, CCGP or PS between the inference absent and present conditions are performed using a two-sided Wilcoxon rank-sum test and labelled as P RS . Differences between decodability, CCGP or PS between two conditions are tested using the empirically estimated empirically based on the null distribution estimated as described above and labelled as P ΔVariable .

Reporting summary

Further information on research design is available in the  Nature Portfolio Reporting Summary linked to this article.

Data availability

Data used in this study are publicly available at OSF ( https://doi.org/10.17605/OSF.IO/QPT8F ) 70 .

Code availability

Example code to reproduce the results is available as part of the data release (see ‘Data availability’).

Tolman, E. C. Cognitive maps in rats and men. Psychol. Rev. 55 , 189–208 (1948).

Article   CAS   PubMed   Google Scholar  

Chung, S. & Abbott, L. F. Neural population geometry: an approach for understanding biological and artificial neural networks. Curr. Opin. Neurobiol. 70 , 137–144 (2021).

Whittington, J. C. R., McCaffary, D., Bakermans, J. J. W. & Behrens, T. E. J. How to build a cognitive map. Nat. Neurosci. 25 , 1257–1272 (2022).

Tenenbaum, J. B., Kemp, C., Griffiths, T. L. & Goodman, N. D. How to grow a mind: statistics, structure, and abstraction. Science 331 , 1279–1285 (2011).

Article   ADS   MathSciNet   CAS   PubMed   Google Scholar  

Kemp, C. & Tenenbaum, J. B. Structured statistical models of inductive reasoning. Psychol. Rev. 116 , 20–58 (2009).

Article   PubMed   Google Scholar  

McClelland, J. L. et al. Letting structure emerge: connectionist and dynamical systems approaches to cognition. Trends Cogn. Sci. 14 , 348–356 (2010).

Griffiths, T. L., Chater, N., Kemp, C., Perfors, A. & Tenenbaum, J. B. Probabilistic models of cognition: exploring representations and inductive biases. Trends Cogn. Sci. 14 , 357–364 (2010).

Ho, M. K., Abel, D., Griffiths, T. L. & Littman, M. L. The value of abstraction. Curr. Opin. Behav. Sci. 29 , 111–116 (2019).

Article   Google Scholar  

Konidaris, G. On the necessity of abstraction. Curr. Opin. Behav. Sci. 29 , 1–7 (2019).

Vaidya, A. R., Jones, H. M., Castillo, J. & Badre, D. Neural representation of abstract task structure during generalization. eLife 10 , e63226 (2021).

Article   CAS   PubMed   PubMed Central   Google Scholar  

Bernardi, S. et al. The geometry of abstraction in the hippocampus and prefrontal cortex. Cell 183 , 954–967.e21 (2020).

Chang, L. & Tsao, D. Y. The code for facial identity in the primate brain. Cell 169 , 1013–1028.e14 (2017).

She, L., Benna, M. K., Shi, Y., Fusi, S. & Tsao, D. Y. Temporal multiplexing of perception and memory codes in IT cortex. Nature 629 , 861–868 (2024).

Nogueira, R., Rodgers, C. C., Bruno, R. M. & Fusi, S. The geometry of cortical representations of touch in rodents. Nat. Neurosci. 26 , 239–250 (2023).

Boyle, L. M., Posani, L., Irfan, S., Siegelbaum, S. A. & Fusi, S. Tuned geometries of hippocampal representations meet the computational demands of social memory. Neuron 112 , 1358–1371.e9 (2024).

Gershman, S. J. & Niv, Y. Learning latent structure: carving nature at its joints. Curr. Opin. Neurobiol. 20 , 251–256 (2010).

Scholz, R., Villringer, A. & Martins, M. J. D. Distinct hippocampal and cortical contributions in the representation of hierarchies. eLife 12 , RP87075 (2023).

Constantinescu, A. O., O’Reilly, J. X. & Behrens, T. E. J. Organizing conceptual knowledge in humans with a gridlike code. Science 352 , 1464–1468 (2016).

Article   ADS   CAS   PubMed   PubMed Central   Google Scholar  

Knudsen, E. B. & Wallis, J. D. Hippocampal neurons construct a map of an abstract value space. Cell 184 , 4640–4650.e10 (2021).

Aronov, D., Nevers, R. & Tank, D. W. Mapping of a non-spatial dimension by the hippocampal–entorhinal circuit. Nature 543 , 719–722 (2017).

Nieh, E. H. et al. Geometry of abstract learned knowledge in the hippocampus. Nature 595 , 80–84 (2021).

Courellis, H. S. et al. Spatial encoding in primate hippocampus during free navigation. PLoS Biol. 17 , e3000546 (2019).

Moore, J. J., Cushman, J. D., Acharya, L., Popeney, B. & Mehta, M. R. Linking hippocampal multiplexed tuning, Hebbian plasticity and navigation. Nature 599 , 442–448 (2021).

Article   ADS   CAS   PubMed   Google Scholar  

Higgins, I. et al. Unsupervised deep learning identifies semantic disentanglement in single inferotemporal face patch neurons. Nat. Commun. 12 , 6456 (2021).

Higgins, I. et al. Towards a definition of disentangled representations. Preprint at https://doi.org/10.48550/arXiv.1812.02230 (2018).

Rigotti, M. et al. The importance of mixed selectivity in complex cognitive tasks. Nature 497 , 585–590 (2013).

Kreiman, G., Koch, C. & Fried, I. Category-specific visual responses of single neurons in the human medial temporal lobe. Nat. Neurosci. 3 , 946–953 (2000).

Fried, I., Rutishauser, U., Cerf, M. & Kreiman, G. Single Neuron Studies of the Human Brain: Probing Cognition (MIT, 2014).

Bengio, Y., Courville, A. & Vincent, P. Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35 , 1798–1828 (2013).

Lake, B. M., Ullman, T. D., Tenenbaum, J. B. & Gershman, S. J. Building machines that learn and think like people. Behav. Brain Sci. 40 , e253 (2017).

Ito, T. et al. Compositional generalization through abstract representations in human and artificial neural networks. Adv. Neural Inf. Process. Syst. 35 , 32225–32239 (2022).

Yang, G. R., Joglekar, M. R., Song, H. F., Newsome, W. T. & Wang, X.-J. Task representations in neural networks trained to perform many cognitive tasks. Nat. Neurosci. 22 , 297–306 (2019).

Johnston, W. J. & Fusi, S. Abstract representations emerge naturally in neural networks trained to perform multiple tasks. Nat. Commun. 14 , 1040 (2023).

Muhle-Karbe, P. S. et al. Goal-seeking compresses neural codes for space in the human hippocampus and orbitofrontal cortex. Neuron 111 , 3885–3899.e6 (2023)

Epstein, R. A., Patai, E. Z., Julian, J. B. & Spiers, H. J. The cognitive map in humans: spatial navigation and beyond. Nat. Neurosci. 20 , 1504–1513 (2017).

Behrens, T. E. J. et al. What Is a cognitive map? Organizing knowledge for flexible behavior. Neuron 100 , 490–509 (2018).

O’Keefe, J. & Nadel, L. The Hippocampus as a Cognitive Map (Oxford Univ. Press, 1978).

Wilson, R. C., Takahashi, Y. K., Schoenbaum, G. & Niv, Y. Orbitofrontal cortex as a cognitive map of task space. Neuron 81 , 267–279 (2014).

Igarashi, K. M., Lee, J. Y. & Jun, H. Reconciling neuronal representations of schema, abstract task structure, and categorization under cognitive maps in the entorhinal-hippocampal-frontal circuits. Curr. Opin. Neurobiol. 77 , 102641 (2022).

Vaidya, A. R. & Badre, D. Abstract task representations for inference and control. Trends Cogn. Sci. 26 , 484–498 (2022).

Article   PubMed   PubMed Central   Google Scholar  

Morton, N. W., Schlichting, M. L. & Preston, A. R. Representations of common event structure in medial temporal lobe and frontoparietal cortex support efficient inference. Proc. Natl Acad. Sci. USA 117 , 29338–29345 (2020).

Marr, D. Simple memory: a theory for archicortex. Philos. Trans. R Soc. Lond. B Biol. Sci. 262 , 23–81 (1971).

McClelland, J. L., McNaughton, B. L. & O’Reilly, R. C. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol. Rev. 102 , 419–457 (1995).

Gluck, M. A. & Myers, C. E. Hippocampal mediation of stimulus representation: a computational theory. Hippocampus 3 , 491–516 (1993).

Benna, M. K. & Fusi, S. Place cells may simply be memory cells: memory compression leads to spatial tuning and history dependence. Proc. Natl Acad. Sci. USA 118 , e2018422118 (2021).

Tang, W., Shin, J. D. & Jadhav, S. P. Geometric transformation of cognitive maps for generalization across hippocampal-prefrontal circuits. Cell Rep. 42 , 112246 (2023).

Samborska, V., Butler, J. L., Walton, M. E., Behrens, T. E. J. & Akam, T. Complementary task representations in hippocampus and prefrontal cortex for generalizing the structure of problems. Nat. Neurosci. 25 , 1314–1326 (2022).

Wood, E. R., Dudchenko, P. A., Robitsek, R. J. & Eichenbaum, H. Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron 27 , 623–633 (2000).

Grieves, R. M., Wood, E. R. & Dudchenko, P. A. Place cells on a maze encode routes rather than destinations. eLife 5 , e15986 (2016).

Frank, L. M., Brown, E. N. & Wilson, M. Trajectory encoding in the hippocampus and entorhinal cortex. Neuron 27 , 169–178 (2000).

Sun, C., Yang, W., Martin, J. & Tonegawa, S. Hippocampal neurons represent events as transferable units of experience. Nat. Neurosci. 23 , 651–663 (2020).

Bao, P., She, L., McGill, M. & Tsao, D. Y. A map of object space in primate inferotemporal cortex. Nature 583 , 103–108 (2020).

Hesse, J. K. & Tsao, D. Y. The macaque face patch system: a turtle’s underbelly for the brain. Nat. Rev. Neurosci. 21 , 695–716 (2020).

Tanaka, K. Inferotemporal cortex and object vision. Ann. Rev. Neurosci. 19 , 109–139 (1996).

Axelrod, V. et al. Face-selective neurons in the vicinity of the human fusiform face area. Neurology 92 , 197–198 (2019).

Zhou, J. et al. Evolving schema representations in orbitofrontal ensembles during learning. Nature 590 , 606–611 (2021).

Zhou, J. et al. Complementary task structure representations in hippocampus and orbitofrontal cortex during an odor sequence task. Curr. Biol. 29 , 3402–3409.e3 (2019).

Brainard, D. H. The psychophysics toolbox. Spatial Vis. 10 , 433–436 (1997).

Article   CAS   Google Scholar  

Minxha, J., Adolphs, R., Fusi, S., Mamelak, A. N. & Rutishauser, U. Flexible recruitment of memory-based choice representations by human medial-frontal cortex. Science 368 , eaba3313 (2020).

Fu, Z. et al. Single-neuron correlates of error monitoring and post-error adjustments in human medial frontal cortex. Neuron 101 , 165–177.e5 (2019).

Tyszka, J. M. & Pauli, W. M. In vivo delineation of subdivisions of the human amygdaloid complex in a high-resolution group template. Hum. Brain Mapp. 37 , 3979–3998 (2016).

Rutishauser, U., Schuman, E. M. & Mamelak, A. N. Online detection and sorting of extracellularly recorded action potentials in human medial temporal lobe recordings, in vivo. J. Neurosci. Methods 154 , 204–224 (2006).

Kamiński, J. et al. Persistently active neurons in human medial frontal and medial temporal lobe support working memory. Nat. Neurosci. 20 , 590–601 (2017).

Hill, D. N., Mehta, S. B. & Kleinfeld, D. Quality metrics to accompany spike sorting of extracellular signals. J. Neurosci. 31 , 8699–8705 (2011).

Courellis, H., Nummela, S., Miller, C. & Cauwenberghs, G. A computational framework for effective isolation of single-unit activity from in-vivo electrophysiological recording. In Proc. 2017 IEEE Biomedical Circuits and Systems Conference (BioCAS) https://doi.org/10.1109/BIOCAS.2017.8325164 (IEEE, 2017).

Fu, Z. et al. The geometry of domain-general performance monitoring in the human medial frontal cortex. Science 376 , eabm9922 (2022).

Panzeri, S., Moroni, M., Safaai, H. & Harvey, C. D. The structures and functions of correlations in neural population codes. Nat. Rev. Neurosci. 23 , 551–567 (2022).

Anderson, B., Sanderson, M. I. & Sheinberg, D. L. Joint decoding of visual stimuli by IT neurons’ spike counts is not improved by simultaneous recording. Exp. Brain Res. 176 , 1–11 (2007).

Meyers, E. M., Freedman, D. J., Kreiman, G., Miller, E. K. & Poggio, T. Dynamic population coding of category information in inferior temporal and prefrontal cortex. J. Neurophysiology 100 , 1407–1419 (2008).

Courellis, H. et al. Data for: Abstract representations emerge in human hippocampal neurons during inference. OSF HOME https://doi.org/10.17605/OSF.IO/QPT8F (2024).

Download references

Acknowledgements

We thank R. Adolphs for advice and support throughout all stages of the project, members of the labs of R. Adolphs, U. Rutishauser and M. Meister for discussion, and C. Katz and K. Patel for help setting up the recording system for single-unit recordings at Toronto Western Hospital. We thank all participants and their families for their participation and the staff and physicians of the Cedars-Sinai and Toronto Western Epilepsy Monitoring Units for their support. This work was supported by the BRAIN Initiative through the US National Institutes of Health Office of the Director (grant no. U01NS117839 to U.R.), the National Institute of Mental Health (NIMH) (grant nos. R01MH110831 to U.R. and R01MH082017 to C.D.S. and S.F.), the Caltech NIMH Conte Center (grant no. P50MH094258 to R.A. and U.R.), the Simons Foundation Collaboration on the Global Brain (to S.F., C.D.S. and U.R.), the Gatsby Foundation (to S.F.), the Swartz Foundation (to S.F.), the Moonshot R&D grant no. JPMJMS2294 (to K. Matsumoto) and by a merit scholarship from the Josephine De Karman Fellowship Trust (to H.S.C.).

Author information

These authors contributed equally: Hristos S. Courellis, Juri Minxha

These authors jointly supervised this work: Stefano Fusi, Ueli Rutishauser

Authors and Affiliations

Department of Neurosurgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA

Hristos S. Courellis, Juri Minxha, Adam N. Mamelak & Ueli Rutishauser

Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA

Hristos S. Courellis, Juri Minxha & Ueli Rutishauser

Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA

Juri Minxha, Daniel L. Kimmel, C. Daniel Salzman & Stefano Fusi

Krembil Research Institute and Division of Neurosurgery, University Health Network (UHN), University of Toronto, Toronto, Ontario, Canada

Araceli R. Cardenas & Taufik A. Valiante

Department of Psychiatry, Columbia University, New York, NY, USA

Daniel L. Kimmel & C. Daniel Salzman

Department of Neurology, Cedars-Sinai Medical Center, Los Angeles, CA, USA

Chrystal M. Reed & Ueli Rutishauser

New York State Psychiatric Institute, New York, NY, USA

C. Daniel Salzman

Department of Neuroscience, Columbia University, New York, NY, USA

C. Daniel Salzman & Stefano Fusi

Kavli Institute for Brain Sciences, Columbia University, New York, NY, USA

Center for Neural Science and Medicine, Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA, USA

Ueli Rutishauser

You can also search for this author in PubMed   Google Scholar

Contributions

Conceptualization was by J.M., U.R., C.D.S. and S.F. Tasks were designed by J.M. and D.L.K. Data were collected by J.M., H.S.C. and A.R.C. Data were analysed by H.S.C. and J.M. H.S.C., U.R. and S.F. wrote the paper. U.R. and S.F. provided supervision. C.M.R. provided clinical care and facilitated experiments. Surgeries were carried out by A.N.M. and T.A.V.

Corresponding authors

Correspondence to Hristos S. Courellis or Ueli Rutishauser .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature thanks Timothy Behrens, Yunzhe Liu and Ila Fiete for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended data fig. 1 task behavior and single-neuron responses across all recorded regions..

(a) Task performance of n = 49 control subjects. Accuracy is reported as an average for each subject over all non-inference trials (left) and all inference trials (right; included are 3 the three trials after every switch in which an image was seen the first time after the switch, i.e. timepoints 2–4 in Fig. 1f ). Chance is 50%. This task variant is equivalent to the first session of the task encountered by patients (before explicit instructions of latent task structure). 46/49 subjects performing above chance on non-inference trials. (b) Performance of patients in non-inference trials. Each dot is a single session. Only sessions where patients exhibited above-chance accuracy on non-inference trials are shown (36/42 sessions, p < 0.05, one-sided Binomial Test on all non-inference trials vs. 0.5). (c) Non-inference performance for context 1 and 2 for the n = 36 sessions included in the analysis. Error bars are SEM over blocks. The reported p-value is a paired two-sided t-test between the mean accuracies for Context 1 and Context 2 across all sessions. (d) Same as (c), but with reaction time (RT), computed as time from stimulus onset to button press for every trial. Mean RT’s are also computed by block. n = 36 sessions. (e-f) Performance as a function of time in the task for the (e) inference absence and (f) inference present groups. Shown is the accuracy for the last non-inference trial before a switch and the first inference trial after a switch. Accuracy is shown block-by-block averaged over a 3-block window (mean ± s.e.m. across sessions). (g-i) Behavioral performance for the subjects in the post-instruction, not-exhibited, and pre-instruction groups, respectively. See Fig. 1f for notation. Plot shows performance on the last trial before the context switch, the first trial after the context switch, and for the first inference trial (Trial 2) averaged over all trials in each session (mean ± s.e.m. across sessions). Dashed line marks chance. The first inference trial performance (block box) was used to classify patients the patients, so significance is not reported for this trial. P-values are a one-way binomial test vs. 0.5. (j) Example hippocampal neuron that encodes stimulus identity. Raster trials are reordered based on stimulus identity, and sorted by reaction time therein (black curves). Stimulus onset occurs at time 0. Black points above PSTH indicate times where 1-way ANOVA over the plotted task variables was significant (p < 0.05). Errorbars are ±s.e.m. across trials. (k-p) Normalized activity for all neurons recorded from the hippocampus (k), amygdala (l), VTC (m), dACC (n), preSMA (o), and vmPFC (p). Each is plotted as a heat map of trial-averaged responses to each unique condition (8 total, specified by unique Response-Context-Outcome combinations). Z-scored firing rates are computed from 0.2 s to 1.2 s after stimulus onset for every trial. Each row of the heat map corresponds to the activity of a single neuron, and columns correspond to each of the 8 conditions. Neurons are ordered such that adjacent rows (neurons) are maximally correlated in 8-dimensional condition response space. This approach would allow for modular tuning to visibly emerge in the heat map if groups of neurons were clustered in their response profiles. (q) Percentage of neurons across all areas that exhibit tuning. Tuning was assessed by fitting a 2 × 2 × 2 (Response-Context-Outcome) ANOVA for every individual neuron’s firing rate during a 1 s window during the stimulus presentation period. Significant neurons were counted as p < 0.05 for main effects or interaction effects involving the stated variables. Significanctly different proportions of tuned neurons between inference present and absent sessions is determined via a two-sided z-test, where “*” indicates p < 0.05, “***” indicates p < 0.005, and “n.s.” indicates “not significant”. (r) Same analysis as (q), but for a 4 × 2 ANOVA for stimulus identity and context. (s) Same analysis as (q), but for a 4 × 2 ANOVA for stimulus identity and response. (t) Percentages of tuned neurons shown separately for each region (compare to Fig. 1j ). Single-neuron tuning is identified using a 3-Way ANOVA (Response × Context × Outcome), corresponding to column 1 (RCO) of Fig. 1j . (u) Same as (t), but single-neuron tuning identified here using 2-Way ANOVA (Stimulus ID × Context), corresponding to column 2 (SC) of Fig. 1j .

Extended Data Fig. 2 Visual representation of all named balanced dichotomies.

Illustration of the named balanced dichotomies that correspond to condition splits that have clearly interpretable meaning with respect to the construction of the task. For example, the context dichotomy (top left), arises from assigning all conditions for context = 1 to one class and all conditions for which context = 2 to the other class. The specific assignment of class labels 1 and 2 is arbitrary, and inverting the labels still corresponds to the same meaning for the dichotomy. All named dichotomies shown here are color coded to reflect their value in all Shattering Dimensionality, CCGP, and Parallelism Score plots, and this color code remains consistent throughout the paper whenever balanced dichotomies are considered.

Extended Data Fig. 3 Additional geometric analysis during stimulus processing and baseline periods.

(a) CCGP for other brain regions during stimulus period. See Fig. 2 for notation. Significant named dichotomies are marked when the dichotomies are above the 95 th percentile of the null distribution in inference present sessions and significantly different between inference absent and present (RankSum p < 0.01/35, Bonferroni corrected for balanced dichotomies). Significant increases were observed in vmPFC for stim pair (purple, p Absent  = 0.45, p Present  = 0.014) and preSMA for response (green, p Absent  = 0.045, p Present  = 0.0010) Stim pair CCGP in AMY was above chance for both inference absent and present sessions (purple, p Absent  = 0.050, p Present  = 0.039). (b) Same as (a), but for PS. PS increased significantly for stim pair in amygdala (purple, p Absent  = 1.3 × 10 −4 , p Present  = 9.0 × 10 −8 ) and context in the dACC (red, p Absent  = 0.99, p Present  = 3.9 × 10 −12 ). (c) Change in decoding accuracy. (d) Same as (c), but for CCGP. (e-f) Error trial analysis for neural response following stimulus onset in the hippocampus. Context (red) is not decodable and not in an abstract format in incorrect trials during inference present sessions. Only correct trials are used in inference absent sessions. Horizontal black bars indicate shattering dimensionality. Stars denote named dichotomies that are above chance in the inference present trials and are significantly different from their corresponding inference absent value (p < 0.05/35, Bonferroni corrected). p Present  = 0.0028, p Present  = 2.0 × 10 −3 , and p Present  = 0.037 for context, stim pair and parity, respectively in panel (e) and p Present  = 1.1 × 10 −16 and p Present  = 0.0030 for context and stim pair in panel (f) . (g) PS for hippocampus. Context PS was significantly larger (red, p Absent  = 0.55, p Present  = 1.4 × 10 −15 ), as was stim pair (purple, p Absent  = 0.17, p Present  = 1.7 × 10 −8 ). (h) Same as (c) , but for PS. (i,j) Error trial analysis for the baseline period in the hippocampus. See (e-f) for notation. p Present  = 0.012 and p Present  = 0 for context in (i) and (j) , respectively. (k-o) Analysis of baseline period for other brain regions (k) and the hippocampus (l-o). Compare to Fig. 2h . (k) Significant increases were observed in dACC for context (red, p Absent  = 0.37, p Present  = 0.049). SD was not different from chance (p RS >0.05 or all areas). (l-m) Change in decoding accuracy and CCGP. (n) PS. Context is the only named dichotomy for which the PS is significantly different from chance in nference present sessions (red, p Absent  = 0.37, p Present  = 1.2 × 10 −10 ). (o) Change is PS shown in (n). (p-t) Analysis of baseline period for the dACC. (p) Context (red, p Absent  = 0.26, p Present  = 0.018) is in an abstract format. (q) Context PS (red, p Absent  = 0.18, p Present  = 0.013) is significant in inference present sessions. (r-s) Change in decoding accuracy (r), CCGP (s), and PS (t). Parity and context PS increase significantly (p = 0.0016 and p = 0.026, respectively). (u-y) Analysis of responses in VTC. (u) Decoding during pre-stimulus baseline. None of the dichotomies are decodable during inference absent or present (p > 0.05 for all dichotomies) and SD does not significantly differ (0.50 vs 0.51, p RS  = 0.34). (v-y) Analysis of stimulus period. (v) Decodability. The stimulus dichotomies are decodable both during inference absent and inference present sessions. SD increased significantly (inference absent vs present, 0.66 vs 0.70, p RS  = 0.0056). Dichotomies: purple, p Absent  = 6.8 × 10 −13 , p Present  = 6.6 × 10 −14 , brown, p Absent  = 2.2 × 10 −9 , p Present  = 6.0 × 10 −14 , pink, p Absent  = 1.1 × 10 −13 , p Present  = 6.7 × 10 −14 . Context is not significantly decodable (red, p Absent  = 0.24, p Present  = 0.38). (w) CCGP. Two stimulus dichotomies are in an abstract format in inference absent and all three are in an abstract format in inference present (purple, p Absent  = 0.0054, p Present  = 0.0036, brown, p Absent  = 0.057, p Present  = 0.0029, pink, p Absent  = 0.0030, p Present  = 0.0032). (x) PS. PS for two of the stimulus dichotomies is above chance in inference absent sessions, and all three are above chance in inference present sessions (purple, p Absent  = 0, p Present  = 4.3 × 10 −13 , brown, p Absent  = 0.73, p Present  = 0, pink, p Absent  = 0, p Present  = 5.9 × 10 −7 ). (y) Error trial analysis. Decoders are trained on correct trials and evaluated on error trials in inference present sessions. All stimulus identity-related dichotomies are decodable during error trials (purple, p Present(error) = 7.8 × 10 −11 , brown, p Present(error) = 1.1 × 10 −13 , pink, p Present(error) = 8.7 × 10 −11 ) and SD does not decrease (black bar, inference present vs present (error), 0.67 vs. 0.66, p RS  = 0.65). (z-ac) Cross-session generalization. (z) PS for context during the stimulus period for random half-splits of the inference present sessions (Left, Middle column, 11 sessions in each half). Cross-half context PS is also computed through cross-session neural geometry alignment (Right Column, see  Methods ). Baseline context PS is significantly above chance within each half and across halves (p Half-Split One  = 0.0081, p Half-Split Two  = 0.0098, p Cross-Half  = 0.033). (aa) Same as (z) , but for the baseline period. Context PS is significantly above chance within each half and across halves (p Half-Split One  = 0.0029, p Half-Split Two  = 0.0022, p Cross-Half  = 0.010). (ab) Same as (z) , but for the inference absent sessions (7 sessions in each half) during the stimulus period. (ac) Same as (ab) , but for the baseline period. In all panels, the gray shaded bar indicates 5 th -95 th percentile of the null distribution and horizontal black lines indicate SD. All p Absent, p Present, p Half-split, and p Cross-Half values stated are estimated empirically based on the null distribution shown. All p RS values stated are a two-way ranksum test.

Extended Data Fig. 4 Additional control analyses for Hippocampal representational geometry after excluding univariantly tuned neurons.

Identical analysis to the main geometric analysis shown in Fig. 2 , except that neurons are excluded from the analysis with the following criteria: in (a-j) , neurons with significant linear tuning for Context, Response, or Outcome (2 × 2 × 2 ANOVA, Any Main Effect p < 0.01), and in (k-m) , neurons with significant linear tuning for Stimulus Identity or Context (4x2 ANOVA, Any Main Effect p < 0.01). 455/494 neurons were retained for the stimulus period analysis (a-e) and 458/494 neurons were retained for the baseline period analysis (f-j). All primary results for changes in hippocampal geometry were recapitulated apart from decodability of the parity dichotomy during the stimulus period (a). (a-e) Stimulus period analysis. (a) Decodability. Context (red, p Absent  = 0.36, p Present  = 0.0001, p RS  = 1.6 × 10 −31 ) and stim pair (purple, p Absent  = 0.078, p Present  = 4.2 × 10 −5 , p RS  = 6.6 × 10 −31 ) was decodable and SD (0.54 vs. 0.58, p RS  = 0.0013) increased. (b) CCGP. Context (red, p Absent  = 0.63, p Present  = 0.0016, p RS  = 5.2 × 10 −34 ) and stim pair (purple, p Absent  = 0.17, p Present  = 0.00095, p RS  = 5.3 × 10 −34 ) increased. (c) PS. Context (red, p Absent  = 0.40, p Present  = 3.7 × 10 −13 ) and stim pair (purple, p Absent  = 0.83, p Present  = 1.2 × 10 −7 ) increased. (d-e) Error trial analysis. (d) Decodability. Context (red, p Absent  = 0.36, p Present  = 0.0029, p Present(error) = 0.64, p RS  = 1.5 × 10 −20 ) and stim pair (purple, p Absent  = 0.071, p Present  = 0.0021, p Present(error) = 0.062, p RS  = 2.0 × 10 −5 ) were decodable only in error trials. SD was not significantly different (inference present vs present (error), 0.56 vs. 0.55, p RS  = 0.62) during the stimulus presentation. (e) PS. Context (red, p Absent  = 0.40, p Present  = 4.6 × 10 −15 , p Present(error) = 0.012) was largerest in correct trials. (f-j) Baseline analysis. (f) Context decodability (red, p Absent  = 0.37, p Present  = 0.013, p RS  = 2.2 × 10 −26 ) and SD (black, 0.50 vs. 0.52, p RS  = 0.036). (g) CCGP. Context (red, p Absent  = 0.31, p Present  = 0.0044, p RS  = 1.9 × 10 −33 ) differed significantly. (h) PS. Context differed significantly (red, p Absent  = 0.12, p Present  = 0.0055). (i-j) Error trial analysis during the baseline. (i) Decodability. Context was elevated but not significantly during correct trials (red, p Absent  = 0.55, p Present  = 0.12, p Present(error) = 0.37). SD increased significantly (black, inference present vs present (error), 0.51 vs. 0.49, p RS  = 0.030). (j) PS. Context increased significantly in correct trials (red, p Absent  = 0.66, p Present  = 8.5 × 10 −9 , p Present(error) = 0.30). (k-m) Same as (a-c), but after removing neurons tuned to stimulus identity using the 2-Way ANOVA during the stimulus period. 412/494 neurons were retained. Context remains in an abstract format. (k) Context decodability (red, p Absent  = 0.38, p Present  = 0.0088, p RS  = 4.1 × 10 −28 ). SD was not significantly different (black, 0.53 vs. 0.53, p RS  = 0.69). (l) CCGP. Context (red, p Absent  = 0.51, p Present  = 6.0 × 10 −4 , p RS  = 2.5 × 10 −34 ) increased significantly. (m) PS. Context (red, p Absent  = 0.77, p Present  = 2.3 × 10 −6 ) increased significantly. (n-s) Seizure onset zone exclusion analysis. Analysis shown is identical to Fig. 2 , except that hippocampal neurons recorded in seizure onset zones were removed. 410/494 neurons were retained for analysis. Results were effectively identical to that reported in Fig. 2 , with every significant named dichotomy increase during stimulus (n-p) and baseline (q-s) periods being recapitulated in the absence of SOZ hippocampal neurons. (t-z) Non-inference performance control analysis. Identical analysis to the main geometric analysis shown in Fig. 2 , except that inference absent and inference present sessions were distribution-matched for non-inference trial performance. Pairs of inference absent and inference present sessions with at most 7.5% difference in non-inference trial performance were selected, prioritizing sessions with more hippocampal neurons. This matching process yielded 10 inference absent sessions (152 neurons) and 10 inference present sessions (187 neurons) whose average non-inference performances did not statistically significantly differ (92.8% v.s. 94.7%, p RS  = 0.58, ranksum over sessions). All main geometric findings were recapitulated for the stimulus (t-v) and baseline (w-y) periods. (z) Distribution-matched behavior. P-values are one-way binominal test vs. 0.5. n = 10 sessions in each group. Error bars are ±s.e.m. across sessions. In all panels, the gray shaded bar indicates 5 th –95 th percentile of the null distribution and horizontal black lines indicate SD. All p Absent, and p Present values stated are estimated empirically based on the null distribution shown. All p RS values stated are a two-way ranksum test.

Extended Data Fig. 5 Effect of inference and errors on shattering dimensionality as a function of dichotomy difficulty.

“Dichotomy difficulty” quantifies the amount of non-linear interaction of task variables needed in a population of neurons to decode a given dichotomy (see  methods ). (a) Example dichotomies of increasing difficulty. The difficulty 4 dichotomy corresponds to context and difficulty 12 dichotomy corresponds to parity (Extended Data Fig. 2 ). (b-g) Decoding accuracy as a function of dichotomy difficulty for different brain regions. Reported values (mean +/− SEM) are computed over dichotomy decoding accuracies, where the average decoding accuracy for each dichotomy is computed with 1000 repetitions of re-sampled estimation (see  methods ). Black dashed lines indicate chance level (50% for binary decoding), horizontal black lines indicate the 5 th and 95 th pctle of the null distribution. P-values are computed by conducting a one-way ANOVA over dichotomies independently for every dichotomy difficulty (Bonferroni multipe comparison corrected). This value is not meaningfully computable for difficulty 12, which contains a single dichotomy (the parity dichotomy), and is therefore not reported. Decoding accuracy from the hippocampus (b) is higher in inference present compared to inference present sessions. In error trials, decoding is at chance. n = 1000 random resamples.

Extended Data Fig. 6 Cross-condition generalization performance for stimulus identity and context defined over stimulus pairs.

(a-f) Illustration of analysis over pairs of stimuli. When considering a pair of stimuli (e.g. A and B) across two contexts (e.g. 1 and 2), there are four possible task conditions (A1, B1, A2, B2). On these points, stimulus (A1A2 vs B1B2) and context (A1B1 vs A2B2) can be decoded in a straightforward manner, but is not informative about the format in which stimulus and context are encoded. Rather, the CCGP for stimulus across contexts (a-c) and for context across stimuli (d-f) provide information about the structure of the two variables and how they interact. (a-c) Illustration of CCGP for assessing whether stimuli are abstract with respect to context. (a) A linear decoder (blue bar) is trained to distinguish between stimuli A and B in context 1 (blue + and – correspond to class labels for training). The decoder is then tested (generalized) on context 2, where stimulus identity is decoded (red bar, + and – for class labels). (b) The training step. (c) The testing step. Arrows show the stimulus and context coding vectors. (d-f) Illustration of CCGP for assessing whether context is abstract with respect to stimulus identity. See (a-c) for notation. (g-j) Example neurons from hippocampus (g,h) and VTC (i,j) with tuning for stimulus identity. Plotting conventions identical to those used in Extended Data Fig. 1j . (k-l) Distances between pairs of stimulus representations in hippocampus (k) and VTPC (l). Color code indicates stimulus pair. Distance is the Euclidean distance between the stimulus centroids, each of which is an N (# of neurons) dimensional vector of average firing rates during stimulus presentation. Neuron counts are balanced between inference absent and inference present sessions. Null distributions are geometric nulls. Significance of the difference is tested by two-sided ranksum test computed over stimulus pairs, and n.s. indicates p > 0.01. p RS  = 0.39, p RS  = 0.40, p RS  = 0.13, and p RS  = 0.026 for panels (k-l), respectively. (m-n) Decodability of stimulus identity for hippocampus (m) and VTC (n). Each datapoint is a binary decoder between the two stimulus identities in a given pair. Significance of the difference between inference absent and inference present decodability is also established by Ranksum test over average decoding accuracies and n.s. indicates p > 0.05.

Extended Data Fig. 7 Additional context CCGP analysis over stimulus pairs for hippocampus and ventral temporal cortex (stimulus period).

(a-b) Context decoding accuracy for individual stimulus pairs in hippocampus (a) and VTC (b). (c-d) Context CCGP and Context PS for individual stimulus pairs for VTC (compare to Fig. 3g,h for hippocampus). n.s. is p > 0.01 of two-tailed ranksum test comparing absent vs. present. p RS  = 0.026 for (c) . (e-h) Example neurons from hippocampus (e-g) and VTC (f-h) that are modulated by both stimulus identity and context. Error bars in PSTH (bottom) are ± s.e.m. across trials. (g,h) Mean ± s.e.m. firing rates during the stimulus period. Black arrows indicate the direction in which the firing rate for a stimulus is modulated by context. n = 120 trials. (i) Change in the consistency of context-modulation for stimuli averaged over stimulus-tuned neurons in VTC (n = 104) and HPC (n = 63). Context modulation consistency is the tendency for a neuron’s firing rate to shift consistently (increase or decrease) to encode context across stimuli (see  methods ). There was a significant interaction between brain area (HPC/VTC) and session type (inference absent/present); 2 × 2 ANOVA, p Area  = 0.36, p Inference  = 0.64, p x  = 4.5 × 10 −5 ), indicating that modulation consistency increased in HPC in inference present sessions, whereas the opposite was the case in VTC.

Extended Data Fig. 8 Hippocampal MDS plots summarizing changes in stimulus and context geometry.

(a-f) 2D MDS plots for individual stimulus pairs. See Fig. 2j for notation. MDS was conducted independently for inference absent and inference present sessions, making individual MDS axes not directly comparable. But note that relative distances are comparable because we matched the number of neurons. Only correct trials are shown. Disentangling of context and stimulus identity is present across most stimulus pairs, with the notable exception of the B/D stimulus pair (e), which is correlated with outcome and therefore cannot be dissociated from outcome using CCGP. The emergence of quadrilaterals with approximately parallel sides for all other stimulus pairs (a-d, f) is a signature of disentangling of stimulus identity and context. (g) Changes in neural geometry. MDS of condition-averaged responses of all recorded HPC neurons shown for inference absent (left) and inference present (right) sessions. All plotting conventions are identical to those in (a-f), except MDS was applied with N dim  = 3, and three stimuli (A,B,D) are plotted simultaneously. Black arrows on the inference present plot highlight parallel coding of stimuli across the two context planes. (h,i) MDS plots of HPC condition-averaged responses shown for context 1 (h) and context 2 (i) separately. Axes are directly comparable here between inference absent and present due to alignment via CCA prior to plotting. Note that the stimulus geometry in each context is a tetrahedral (maximal dimensionality, unstructured) regardless of the presence or absence of inference behavior.

Extended Data Fig. 9 Additional analysis for firing rate property changes that are underlying geometric changes.

(a-j) Stimulus period analysis. (a) Distance between centroids for other brain regions. Plotting conventions are identical to Fig. 4g . Neuron counts were only balanced for each region. Significant change in average dichotomy separation determined by a two-tailed ranksum test, Bonferroni corrected for 5 multiple comparions. (b) Changes in inter-centroid distance for balanced dichotomies. No distances for named dichotomies changed more than would be expected by chance. (c) Mean firing rates for individual task conditions for all regions other than HPC. See Fig. 4e for notation. Significant change in average dichotomy separation determined by a two-tailed ranksum test, Bonferroni corrected for 5 multiple comparions. (d-g) Changes in single-neuron tuning quantified by a 3-way ANOVA (Response, Context, Outcome) with interactions. Significant factors (p < 0.05) were identified for every neuron and averages of both the number of factors per neuron (d,e) and the depth of tuning of those factors quantified by the F-Statistic (e,g) are reported (mean ± s.e.m. across neurons). Significance of difference between inference absent and present sessions was assessed by two-tailed ranksum test over significant neurons between the two groups. n = 58,47,24,22,96,118 for HPC, vmPFC, AMY, dACC, preSMA, and VTC, respectively. (h) Assessment of single trial variability of context coding. For each trial, the population response was projected onto the coding axis for context. Vertical lines indicating the mean. (i-j) Fraction of hippocampal (i) and VTC (j) neurons that exhibit selectivity for a given variable. For every neuron, selectivity is determined with a 4 × 2 ANOVA (Stimulus Identity, Context), with a per-factor significance threshold of p < 0.05. Significant differences in tuned fractions between inference absent and inference present assessed with two-tailed z-test. (k-r) Baseline period analysis for hippocampus (k-l) and dACC (m-p). (k) Average trial-by-trial variance of individual trials projected onto the coding direction for every dichotomy. See Fig. 4i for notation. Average variance along coding directions decreased significantly between inference absent and inference present sessions (p RS = 6.5 × 10 −13 , ranksum over dichotomies). (l) Change in variance for all dichotomies shown in (k). No named dichotomies fell outside the null distribution. (m-n) Same as (a,b) but for the dACC at baseline. See Fig. 4g for plotting conventions. Average distance between dichotomy centroids increased (p RS = 2.9 × 10 −8 , ranksum over dichotomies). Context was significantly separated (p Absent  = 0.48, p Present  = 0.0065). (n) Changes in distance between inference present and inference absent sessions for all dichotomies shown in (m) . Context alone (red, p Δ = 0.047) exhibited a greater increase in distance than expected by chance. (o-p) Same as (k-l), but for he dACC. Average variance along coding directions increased significantly (p RS  = 6.0 × 10 −3 , ranksum over dichotomies). (q) Mean baseline firing rates in hippocampus (p RS  = 1.6 × 10 −4 , ranksum over conditions). See Fig. 4e for plotting conventions. Ranksum test over conditions. (r) Same as (q) but for the other brain areas. Ranksum test over conditions. Note that all brain regions other than AMY exhibit slight but significant increases (p RS  = 0.050, 0.23, 1.6 × 10 −4 , 1.6 × 10 −4 , and 1.6 × 10 −4 for vmPFC, AMY, dACC, preSMA, and VTC, respectively). (s-w) Control analysis for stimulus period after distribution-matching for firing rate. (s) Distribution of mean stimulus firing rates over all hippocampal neurons in the inference absent (gray) and inference present (black) sessions, as well as randomly thinned inference absent firing rates that distribution-match the inference present firing rates (orange). (t) Mean firing rates before and after distribution matching. Ranksum test over conditions. p RS  = 1.6 × 10 −4 for absent vs. absent-match. (u-w) Replication of key results for the set of neurons that are distribution matched. Plotting conventions are those shown in Fig. 2 . No meaningful differences are present between inference absent and distribution-matched inference absent for any dichotomy/metric. (u) p Present  = 1.8 × 10 −6 , p Present  = 6.4 × 10 −6 , and p Present  = 0.016 for context, stim pair, and parity respectively. (v) p Present  = 0.035 and p Present  = 0.0047 for context and stim pair. (w) p Present  = 7.2 × 10 −10 and p Present  = 3.6 × 10 −6 for context and stim pair. (x-ab) Control analysis for stimulus period after excluding high-hippocampal-firing-rate sessions. (x) Distribution of mean hippocampal firing rate over inference absent (gray) and inference present (black) sessions. Each point in the distribution corresponds to the mean hippocampal firing rate over all neurons in a single session. Vertical dashed line indicates 3 Hz threshold. Hippocampal neurons from all inference absent and inference present sessions above this threshold were excluded from analysis shown in (y-ab) . 131/169 inference absent neurons (10/14 sessions) and 318/325 inference present neurons (21/22 sessions) are retained. (y) Same as (t) , but computed using all sessions with mean hippocampal firing rate <3 Hz (p RS  = 1.6 × 10 −4 ). (z-ab) Neural geometry measures re-computed excluding hippocampal neurons from high-firing-rate sessions. No meaningful differences are apparent except the above-chance context PS in inference absent sessions (red, p Absent  = 2.2 × 10 −8 ). In all panels, * indicates p < 0.05 and ns indicates not significant. All p Absent, and p Present values stated are estimated empirically based on the null distribution shown. All p RS values stated are a two-sided ranksum test.

Extended Data Fig. 10 Additional analysis of the effect of instructions on hippocampal neural geometry.

(a-g) Post-instruction inference group. (a-b) Behavior. Identical to Extended Data Fig. 1e,f , except now the session recorded immediately preceding and immediately following verbal instructions are shown. Average performance is computed as a moving average with a 3-block window on the last three trials before a context switch (non-inference) and on the first inference trial after a switch (inference). Error bars are standard errors computed over subjects. Chance performance is 0.5. (c-d) Geometric measures during the stimulus period. Only context is shown as a named dichotomy for visual clarity. (c) CCGP (context, red, p One  = 0.27, p Two  = 0.046, p RS  = 1.4 × 10 −31 ) and (d) PS (context, red, p One  = 0.029, p Two  = 3.5 × 10 −6 , p Two(error) = 0.0028). (e-g) Geometric measures during the baseline period. (e) Decoding accuracy (context, red, p One  = 0.35, p Two  = 0.0014, p Two(error)  = 0.55, p RS  = 1.4 × 10 −20 ). (f) CCGP (context, red, p One  = 0.33, p Two  = 0.0037, p RS  = 3.0 × 10 −34 ). (g) PS (context, red, p One  = 0.017, p Two  = 7.5 × 10 −8 , p Two(error)  = 0.40). (h-n) Same as (a-g) , but for inference not-exhibited group. (j-k) Geometric measures during the stimulus period. (j) CCGP (context, red, p One  = 0.56, p Two  = 0.39, p RS  = 0.004). (k) PS (context, red, p One  = 0.81, p Two  = 0.95). (l-n) Geometric measures during the baseline period. (l) Decoding accuracy (context, red, p One  = 0.45, p Two  = 0.45, p RS  = 0.68). (m) CCGP (context, red, p One  = 0.45, p Two  = 0.47, p RS  = 0.15). (n) PS (context, red, p One  = 0.93, p Two  = 0.30) for the. (o-u) Same as (a-g) , but for the pre-instruction inference group. (q-r) Geometric measures during the stimulus period. (q) CCGP (context, red, p One  = 0.23, p Two  = 0.19, p RS  = 0.0045). (r) Parallelism Score (context, red, p One  = 6.3 × 10 −8 , p Two  = 4.5 × 10 −7 ). (s-u) Geometric measures during the baseline period. (s) Decoding accuracy (context, red, p One  = 0.37, p Two  = 0.47, p RS  = 0.036), (t) CCGP (context, red, p One  = 0.30, p Two  = 0.50, p RS  = 5.9 × 10 −7 ), and (u) PS (context, red, p One  = 1.7 × 10 −5 , p Two  = 0.029). (v) Changes in hippocampal firing rates for the 3 different sub-groups of session pairs. Firing rate changes are computed during the stimulus presentation period (0.2 s to 1.2 s after stim onset) from consecutive sessions. Points are average changes in condition-averaged firing rates (8 unique conditions). Changes in firing rate that significantly differed from zero (two-sided t-test, p < 0.05/3, boneferroni corrected) are indicated with a “*” (p = 1.5 × 10 −4 , 1.2 × 10 −4 , and 0.088). Post-instruction inference group alone exhibited significant decrease in firing rate. Inference not-exhibited group exhibited an increase in firing rate. In all panels stated p-values denoted as p One and p Two are estimated empirically based on the null distribution shown. All p RS values stated are a two-way ranksum test.

Supplementary information

Supplementary information.

Supplementary Tables 1–4, Notes 1–6, Results, Discussion, Methods and references.

Reporting Summary

Peer review file, supplementary video 1.

Transformation of hippocampal geometry shown with MDS of real data. Visualization of the transformation of the representational geometry in the hippocampus shown using MDS of condition-averaged responses for all recorded hippocampal neurons during the stimulus period. Plotting conventions and data are identical to those used in Fig. 3i. Here we use linear interpolation between the starting and ending geometry shown in the video, which correspond to inference absent and inference present sessions, respectively. The video is meant to provide intuition for how the task conditions are represented differently in neural state space in the presence and absence of inference behaviour.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

Reprints and permissions

About this article

Cite this article.

Courellis, H.S., Minxha, J., Cardenas, A.R. et al. Abstract representations emerge in human hippocampal neurons during inference. Nature 632 , 841–849 (2024). https://doi.org/10.1038/s41586-024-07799-x

Download citation

Received : 30 November 2023

Accepted : 09 July 2024

Published : 14 August 2024

Issue Date : 22 August 2024

DOI : https://doi.org/10.1038/s41586-024-07799-x

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

This article is cited by

How the human brain creates cognitive maps of related concepts.

  • Mitchell Ostrow

Nature (2024)

By submitting a comment you agree to abide by our Terms and Community Guidelines . If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Quick links

  • Explore articles by subject
  • Guide to authors
  • Editorial policies

Sign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily.

representation of knowledge in psychology

American Psychological Association Logo

Educational Psychology Promotes Teaching and Learning

Educational Psychology Promotes Teaching and Learning

Understanding Educational Psychology

Today’s educational system is highly complex. There is no single learning approach that works for everyone.

That’s why psychologists working in the field of education are focused on identifying and studying learning methods to better understand how people absorb and retain new information.

Educational psychologists apply theories of human development to understand individual learning and inform the instructional process. While interaction with teachers and students in school settings is an important part of their work, it isn’t the only facet of the job. Learning is a lifelong endeavor. People don’t only learn at school, they learn at work, in social situations and even doing simple tasks like household chores or running errands. Psychologists working in this subfield examine how people learn in a variety of settings to identify approaches and strategies to make learning more effective.

Educational Psychology Applied

Psychologists working in education study the social, emotional and cognitive processes involved in learning and apply their findings to improve the learning process. Some specialize in the educational development of a specific group of people such as children, adolescents or adults, while others focus on specific learning challenges such as attention deficit hyperactivity disorder (ADHD) or dyslexia. 

No matter the population they are studying, these professionals are interested in teaching methods, the instructional process and different learning outcomes.

How much does the time of day when new information is introduced influence whether a person retains that information? What does culture have to do with how we process new ideas? How does age affect our ability to develop new skills, like language? How is in-person learning different from remote learning using technology? How does the choice of a media platform make a difference in learning? 

These are all questions that educational psychologists are asking — and answering — in settings as diverse as government research centers, schools, community organizations and learning centers.

Pursuing a Career in the Psychology of Teaching and Learning

Related books

More Activities for Teaching Positive Psychology

Essentials of Conditioning and Learning 5th

Study Like a Champ

Affirming LGBTQ+ Students in Higher Education

Psychology subfields

arXiv's Accessibility Forum starts next month!

Help | Advanced Search

Computer Science > Computation and Language

Title: beneath the surface of consistency: exploring cross-lingual knowledge representation sharing in llms.

Abstract: The veracity of a factoid is largely independent of the language it is written in. However, language models are inconsistent in their ability to answer the same factual question across languages. This raises questions about how LLMs represent a given fact across languages. We explore multilingual factual knowledge through two aspects: the model's ability to answer a query consistently across languages, and the ability to ''store'' answers in a shared representation for several languages. We propose a methodology to measure the extent of representation sharing across languages by repurposing knowledge editing methods. We examine LLMs with various multilingual configurations using a new multilingual dataset. We reveal that high consistency does not necessarily imply shared representation, particularly for languages with different scripts. Moreover, we find that script similarity is a dominant factor in representation sharing. Finally, we observe that if LLMs could fully share knowledge across languages, their accuracy in their best-performing language could benefit an increase of up to 150\% on average. These findings highlight the need for improved multilingual knowledge representation in LLMs and suggest a path for the development of more robust and consistent multilingual LLMs.
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as: [cs.CL]
  (or [cs.CL] for this version)
  Focus to learn more arXiv-issued DOI via DataCite

Submission history

Access paper:.

  • HTML (experimental)
  • Other Formats

license icon

References & Citations

  • Google Scholar
  • Semantic Scholar

BibTeX formatted citation

BibSonomy logo

Bibliographic and Citation Tools

Code, data and media associated with this article, recommenders and search tools.

  • Institution

arXivLabs: experimental projects with community collaborators

arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.

Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.

Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs .

AIP Publishing Logo

  • Previous Article
  • Next Article

INTRODUCTION

Application of cvae in 1d model data, application of cvae in 2d model data, application of cvae in experimental data, pzt thin films, data analysis, supplementary material, acknowledgments, author declarations, conflict of interest, author contributions, data availability, physical discovery in representation learning via conditioning on prior knowledge.

ORCID logo

  • Split-Screen
  • Article contents
  • Figures & tables
  • Supplementary Data
  • Peer Review
  • Open the PDF for in another window
  • Reprints and Permissions
  • Cite Icon Cite
  • Search Site

Yongtao Liu , Bryan D. Huey , Maxim A. Ziatdinov , Sergei V. Kalinin; Physical discovery in representation learning via conditioning on prior knowledge. J. Appl. Phys. 14 August 2024; 136 (6): 064902. https://doi.org/10.1063/5.0222403

Download citation file:

  • Ris (Zotero)
  • Reference Manager

Recent advances in electron, scanning probe, optical, and chemical imaging and spectroscopy yield bespoke data sets containing the information of structure and functionality of complex systems. In many cases, the resulting data sets are underpinned by low-dimensional simple representations encoding the factors of variability within the data. The representation learning methods seek to discover these factors of variability, ideally further connecting them with relevant physical mechanisms. However, generally, the task of identifying the latent variables corresponding to actual physical mechanisms is extremely complex. Here, we present an empirical study of an approach based on conditioning the data on the known (continuous) physical parameters and systematically compare it with the previously introduced approach based on the invariant variational autoencoders. The conditional variational autoencoder (cVAE) approach does not rely on the existence of the invariant transforms and hence allows for much greater flexibility and applicability. Interestingly, cVAE allows for limited extrapolation outside of the original domain of the conditional variable. However, this extrapolation is limited compared to the cases when true physical mechanisms are known, and the physical factor of variability can be disentangled in full. We further show that introducing the known conditioning results in the simplification of the latent distribution if the conditioning vector is correlated with the factor of variability in the data, thus allowing us to separate relevant physical factors. We initially demonstrate this approach using 1D and 2D examples on a synthetic data set and then extend it to the analysis of experimental data on ferroelectric domain dynamics visualized via piezoresponse force microscopy.

The tremendous success of physical sciences over the last two hundred years has been largely predicated on the search for and discovery of physical mechanisms, meaning simple laws and factors that can explain observations. The paradigmatic example of this, as eloquently summarized by Wigner in his oft-cited opinion, 1 is the discovery of Newton's laws. Similarly, numerous studies of celestial objects since ancient Egypt and Sumer times lead to the constant improvement of the model describing planetary motion and result in the well-known Keplerian model; further studies including Newton's law of universal gravitation and Einstein's general theory of relativity explain the planets’ elliptical motion and the irregular motion of Mercury, respectively. Such patterns are followed in most scientific fields still today, with the experimental observations used to derive correlative relationships that, in turn, underpin the emergence of physical models. These are often linked to symbolic regression, where simplicity and elegance of the mathematical law are considered as a strong indicator that the correct physical model has been identified. Overall, the greatest advantage of known symbolic or computational models (e.g., the lattice Hamiltonian in condensed matter physics 2,3 or force fields in molecular dynamics 4 ) is their capability to extrapolate outside of the original measurement domain, predicting the effect of parameter changes and generally allowing for interventional and counterfactual studies. 5 For example, Newton's laws allow predicting the trajectories of man-made objects, whereas modern calculation methods allow exploring properties and functionalities of not-yet realized molecules and materials. 6,7

The rapid development of deep learning 8,9 methods over the last decade has provided a powerful new tool for physical research capable of building correlative relationships between multidimensional objects. While early applications have relied on purely correlative models, the developments over the last several years include the introduction of physical constraints and symmetries in the neural networks, making the interpolations consistent with prescribed physical models. 10,11 Similarly, the advancement in symbolic regression methods has allowed for the discovery of physical laws from observational data, first implemented in the framework of genetic algorithms 12 and subsequently extended toward deep learning symbolic regression methods, 10,13 physics-enhanced neural networks, 14–16 Koopman operator based methods, 17–19 and Bayesian methods. 20 Much of this effort relied on the presence of robust physical descriptors, such as planetary coordinates in astronomy or atomic nuclei in electron microscopy studies.

However, in many cases, accessible to observation are complex data sets representing static or dynamic fields, as exemplified by video data, atomic evolution movies in electron microscopy, and dynamic materials studies with scanning tunneling microscopy (STM) and scanning probe microscopy. In these cases, the presence of simple underlying physical mechanisms can also be postulated. For example, the contrast in STM is determined by the underlying atomic structure and associated spatial distribution of electronic densities, where the relationship between the two is defined by quantum mechanics. Similarly, the observed distribution of electromechanical activity on the surfaces of ferroelectric and ionic materials visualized by piezoresponse force Micromcopy (PFM) is determined by local variations in materias functionalities. Effects of the image formation mechanisms are often non-negligible, sometimes masking or even inverting the measured parameters, 21 so also must be incorporated. Correspondingly, machine learning methods capable of physical discoveries from such data are of interest, 22–24 including interpolating within and (à la Wigner) extrapolating outside of the original measurement domain.

Especially as experimental data sets continue to grow from a manageable handful to thousands of frames 25 and hundreds of millions of pixels or voxels, 26 recent advances in the generative statistical models such as simple and variational autoencoders (VAEs) 27–29 offer a pathway for addressing these problems. The general premise of the autoencoder is that the observational data set can be encoded via a small (compared to the input space dimensionality of data) number of latent variables, where the relationship between the latent vector and the data object is defined by the encoder and decoder network. The multitude of available studies have illustrated that VAEs allow for the disentanglement of the latent representation, generally referring to the behavior where the variability along the selected latent variable corresponds to easily identifiable trends in data. 22,23,30–36 Naturally, this poses the challenge as to whether latent variables can be identified with specific physical mechanisms, or predefined or controlled. 37 Finally, of particular interest is whether generative models such as VAEs can be used to extrapolate outside the original distribution.

Here, we explore the introduction of known physical mechanisms by conditional variational autoencoders (cVAE), using the conditioning based on known (continuous) descriptors. We use the known or hypothesized physical factors as the condition and explore the unknown factors that can be reflected in the resultant latent distributions. As such, the latent distributions can show the discovery. For example, in the study of ferroelectric domain walls dynamics using piezoresponse force microscopy, it is easy to quantify the distance of domain wall motion; in contrast, it is challenging to quantify some factors, such as the domain wall shape and the effects of surrounding elements (e.g., other interacting walls, defects, strain conditions) on wall motions. In this case, the distance of domain wall motion can serve as a known physical factor, and then, the other factors can be encoded in latent variables and shown in the latent distribution. If these unknown factors are well disentangled in latent variables, we expect the latent distribution to be simple (e.g., shown well-classified distribution); otherwise, a complex latent distribution potentially implicates complex unknown factors. We further analyze the correlation between latent variables and ground truth properties. We further explore the potential of the conditional VAE approach to extrapolate outside of the original range of conditioning parameters. This approach is illustrated for model systems with known factors of variability and further extended to experimental PFM data of ferroelectric domain dynamics.

VAE analysis of the 1D synthetic peak data. (a) An example of a 1D peak with labeled ground truth parameters, where μ is the peak shift, σ is the peak width (full-width-half-max), and the peak maxima is the amplitude (A). (b) 25 randomly sampled examples out of 3000 in the synthetic peak data. (c) Simple VAE of analysis of the peak data shown as latent space colored by ground truth parameters, where it is observed that all peak parameters are encoded into latent variables. (d) Shift-invariant VAE analysis of the peak data shown as latent space colored by ground truth parameters, where there is no correlation between peak shift μ and latent variables because the peak shift is encoded into the shift variable as shown in Fig. S2(d) in the supplementary material.

VAE analysis of the 1D synthetic peak data. (a) An example of a 1D peak with labeled ground truth parameters, where μ is the peak shift, σ is the peak width (full-width-half-max), and the peak maxima is the amplitude (A). (b) 25 randomly sampled examples out of 3000 in the synthetic peak data. (c) Simple VAE of analysis of the peak data shown as latent space colored by ground truth parameters, where it is observed that all peak parameters are encoded into latent variables. (d) Shift-invariant VAE analysis of the peak data shown as latent space colored by ground truth parameters, where there is no correlation between peak shift μ and latent variables because the peak shift is encoded into the shift variable as shown in Fig. S2(d) in the supplementary material .

The simple VAE analysis of this data set is shown in Fig. 1(c) . Here, the latent distributions of the data are plotted with the color overlay corresponding to the ground truth labels. The latter are not available to the algorithm, and hence allow identification of the latent variables in terms of the data set parameters. Of course, the data have three factors of variability, and the latent space is two dimensional; thus, we do not expect the full separation of the factors of variability. Still, examining the results clearly illustrates that the z 1 variable is largely associated with peak shift μ ⁠ , while the variability in the z 2 direction represents the joint effect of the amplitude A and width σ ⁠ .

A similar analysis using shift-invariant VAE encodes the data in terms of separating the shift into a shift variable and the rest of information into standard latent variables. Examination of the data in Fig. 1(d) illustrates that now variability associated with the peak shift μ has disappeared, whereas variability in latent space represents the collective effect of amplitude and width. These findings are equivalent to our previous work and illustrate the capability of physically defined invariances to disentangle them from real data. 42,43 However, these invariances are encoded in the coordinate transform in the invariant VAE (iVAE) framework, as described in depth in earlier works. 23,32,33,44–47 A systematic discussion of these iVAE can be found in our previous work. 24  

Next, we apply the cVAE approach to the same data set. In this case, the VAE receives the data set of the shape N  ×  D (here N is the number of total 1D peak data and D is the dimension of a single 1D peak data) and a conditional vector of the shape N  × 1 describing a known continuous parameter as an input. It is important to note that, unlike in iVAE models, there is no coordinate transform as a part of the model architecture, and the conditioning vector can represent any known salient feature or features. However, the use of the same physical parameters as for the toy model allows for physical comparisons.

The cVAE analysis conditioned on the peak shift μ is shown in Fig. 2(a) . Examination of the ground truth labels illustrates that z 2 is still associated with peak amplitude A and width σ , but not peak shift μ ⁠ . In this manner, the cVAE and shift-invariant VAE each leads to comparable outcomes. We then further explore conditioning on a pair of variables, namely, μ and σ . In this case, the latent manifold is still 1D, but now z 2 is clearly associated with the peak amplitude A . This is a demonstrable improvement over the shift-invariant VAE analysis, which did not allow for the separation of the two ground truth factors. Finally, the conditioning on all three variables results in the complete collapse of the representation, and the data manifold is now zero dimensional.

cVAE analysis of the 1D synthetic peak data. (a) Learned latent manifold of cVAE conditioned on the (known) peak shift μ. In this case, there is no correlation between peak shift and latent variables, whereas both peak width σ and peak amplitude A are correlated with latent variable z2. (b) Learned latent manifold of cVAE conditioned on the peak shift μ and peak width σ. There is still no correlation between peak shift, peak width, and latent variables, while peak amplitude A is now correlated with latent variable z2. (c) Learned latent manifold of cVAE conditioned on the peak shift μ, peak width σ, and peak amplitude A, where the latent manifold is completely collapsed because all peak parameters are added as conditions. (d) The correlation between z2 and ground truth peak amplitude A with different cVAE conditionings. It shows that conditioning simplifies analysis; that is, especially in the second plot when conditioning on peak shift μ and width σ (two of the three data set variabilities), cVAE successfully encodes peak amplitude A (the third variability) into z2, evidenced by the linear correlation between z2 and ground truth peak amplitude A. Noteworthily, increasing the latent dimension of simple VAE does not allow us to encode peak amplitude into latent variables, as shown in Figs. S3 and S4 in the supplementary material. Furthermore, when the peak amplitude A is added as a condition, this correlation disappears, as shown in the third plot. More examples assessing latent variables vs ground truths of cVAE are shown in Figs. S5–S7 in the supplementary material.

cVAE analysis of the 1D synthetic peak data. (a) Learned latent manifold of cVAE conditioned on the (known) peak shift μ . In this case, there is no correlation between peak shift and latent variables, whereas both peak width σ and peak amplitude A are correlated with latent variable z 2 . (b) Learned latent manifold of cVAE conditioned on the peak shift μ and peak width σ. There is still no correlation between peak shift, peak width, and latent variables, while peak amplitude A is now correlated with latent variable z 2 . (c) Learned latent manifold of cVAE conditioned on the peak shift μ , peak width σ, and peak amplitude A, where the latent manifold is completely collapsed because all peak parameters are added as conditions. (d) The correlation between z 2 and ground truth peak amplitude A with different cVAE conditionings. It shows that conditioning simplifies analysis; that is, especially in the second plot when conditioning on peak shift μ and width σ (two of the three data set variabilities), cVAE successfully encodes peak amplitude A (the third variability) into z 2 , evidenced by the linear correlation between z 2 and ground truth peak amplitude A. Noteworthily, increasing the latent dimension of simple VAE does not allow us to encode peak amplitude into latent variables, as shown in Figs. S3 and S4 in the supplementary material . Furthermore, when the peak amplitude A is added as a condition, this correlation disappears, as shown in the third plot. More examples assessing latent variables vs ground truths of cVAE are shown in Figs. S5–S7 in the supplementary material .

This simple 1D example demonstrates that conditioning on known factors of variability allows us to simplify the representations of the data and partially control the physical meaning of the remaining latent variables. As shown in Fig. 2(d) , when conditioning on peak shift μ and width σ (two of three variabilities of the data set), cVAE clearly encoded peak amplitude A (third variability of the data set) into z 2 . The associated changes in the dimensionality of the latent manifold thereby allow the number of intrinsic factors of variability in a data set to be explored.

To continue the discussion, we also note that the trained cVAE model can be used to synthesize data with preserved latent traits, allowing for interpolation and extrapolation along the conditioning parameter. This behavior is illustrated in Fig. 3 , which shows three latent distributions produced by conditioning the trained cVAE's decoder on different peak shifts. For example, Fig. 3(a) shows the manifold produced by conditioning the decoder on a peak shift of −2, resulting in a left-from-the-center shift of all peaks presented in Fig. 3(a) ; in the meantime, other peak parameters (e.g., peak width) still vary as expected. Similarly, manifolds in Figs. 3(b) and 3(c) are conditioned on peak shifts of 0 and 2, respectively. Consequently, peaks in Fig. 3(b) are located at the center, and peaks in Fig. 3(c) are shifted to the right-hand side.

Latent manifold learned by cVAE conditioned on a peak shift. (a)–(c) Latent manifold plots under various peak shift conditions. Note that each curve in the manifold shows a constant peak shift corresponding to the chosen condition (−2, 0, 2).

Latent manifold learned by cVAE conditioned on a peak shift. (a)–(c) Latent manifold plots under various peak shift conditions. Note that each curve in the manifold shows a constant peak shift corresponding to the chosen condition (−2, 0, 2).

We further expand this approach to 2D objects. Here, we use the previously developed cards data set 48 that contains four classical card hands [as shown in Fig. 4(a) ] augmented by rotations, translations, and shear. This data set allows for readily identifiable discrete classes as well as interesting degeneracies (e.g., rotated and deformed diamonds can be identical). The card data set used here includes 4000 cards (1000 per each card suite) with various disorders including random rotations in the range of [−30°, 30°] (here we note that the input in the Jupyter Notebook for rotation is [0, 30°], which performs both clockwise rotation and anticlockwise rotation), shifts in the range of [−4, 4] pixels, and shear in the range of [−0.002, 0.002]. Figure 4(b) shows some example card images in the data set.

Simple VAE, rVAE, shift-VAE, and cVAE analysis of the 2D cards data set. (a) The source card images. (b) Examples of the generated cards data with different shifts, rotations, and shears. (c) Latent space of simple VAE analysis, where clear correlations between (1) latent variables and class, (2) latent variables and x-translation, (3) latent variables and y-translation are observed. (d) Latent space of rVAE analysis, where there is no obvious correlation between latent variables and rotation angle because rotation angle is encoded into rotation latent variable. (e) Latent space of shift-VAE analysis, where there is no obvious correlation between latent variables and x-translation/y-translation because translations are encoded into the translation latent variable. (f) Latent space of cVAE analysis conditioned on rotation and x-translation, where there is no obvious correlation between latent variables and rotation angle as well as between latent variables and x-translation; in addition, (1) four classes are very obvious in latent space, (2) an obvious correlation between latent variables and y-translation is seen; these indicate that cVAE simplifies the disentanglements of class and y-translation with known factor of variabilities, i.e., rotation and x-translation. The plots of the discovered latent variables vs ground truth parameters for the VAE, rVAE, shift-VAE, and cVAE analyses in (c)–(f) are shown in Figs. S8–S11 in the supplementary material, respectively. More analyses of cVAE with other conditions are shown in Figs. S12–S14 in the supplementary material.

Simple VAE, rVAE, shift-VAE, and cVAE analysis of the 2D cards data set. (a) The source card images. (b) Examples of the generated cards data with different shifts, rotations, and shears. (c) Latent space of simple VAE analysis, where clear correlations between (1) latent variables and class, (2) latent variables and x-translation, (3) latent variables and y-translation are observed. (d) Latent space of rVAE analysis, where there is no obvious correlation between latent variables and rotation angle because rotation angle is encoded into rotation latent variable. (e) Latent space of shift-VAE analysis, where there is no obvious correlation between latent variables and x-translation/y-translation because translations are encoded into the translation latent variable. (f) Latent space of cVAE analysis conditioned on rotation and x-translation, where there is no obvious correlation between latent variables and rotation angle as well as between latent variables and x-translation; in addition, (1) four classes are very obvious in latent space, (2) an obvious correlation between latent variables and y-translation is seen; these indicate that cVAE simplifies the disentanglements of class and y-translation with known factor of variabilities, i.e., rotation and x-translation. The plots of the discovered latent variables vs ground truth parameters for the VAE, rVAE, shift-VAE, and cVAE analyses in (c)–(f) are shown in Figs. S8–S11 in the supplementary material , respectively. More analyses of cVAE with other conditions are shown in Figs. S12–S14 in the supplementary material .

We apply simple VAE, rotationally invariant autoencoders (rVAEs), shift-invariant VAE, and cVAE to this card data set. The simple VAE analysis with the ground truth labels is illustrated in Fig. 4(c) . Note that, due to close similarity between different cards after various rotations and deformations, the VAE fails to cluster the data set based on class variability and classes form connected manifolds. Rather, class-specific clusters form complex interpenetrating distributions in the latent space. That said, translation in x and y directions show clear alignment with chosen directions in the latent space.

Shift-invariant VAE and rVAE allow us to separate translation and rotation into specific latent variables, and the rest of the information is encoded into the standard latent variables. Figure 4(d) shows the rVAE analysis results of the cards data sets. The rVAE reveals better performance for clustering the cards images based on class variability when the card rotation is separated into a rotation variable, though there is still a significant interpenetration between some classes. Then, z 1 is associated with both x -translation and y -translation. As expected, the rotation is not associated with z 1 and z 2 because it is encoded into the rotation variable [as shown in Fig. S9(b) in the supplementary material ]. Notably, there is also no correlation between rotation and simple VAE latent, but this is because the competing tendencies of representation disentanglement when several physical factors of variability compete for representation by latent variables. Figure 4(e) shows even better performance on clustering the card images into four classes when implementing shift-VAE. As expected, there is no correlation between translations and standard latent variables as the translations are encoded into the translation variables. In this case, the rotation is associated with the standard latent variables z 1 and z 2 .

Figure 4(f) shows results from the cVAE analysis, where the data set was conditioned on both the rotation and x -translation. In this case, the latent space distribution clearly shows four unique clusters corresponding to the individual cards, with the variation between different classes associated with a selected direction in the latent space. The label distributions corresponding to the rotations and x -translation, i.e., to the variables on which conditioning has been performed, are featureless. At the same time, the translation in the y -direction, i.e., the only remaining factor of variability, becomes clearly associated with another direction in the latent space. Additional examples of cVAE analyses can be found in Figs. S12–S14 in the supplementary material , while the provided Jupyter notebook also allows more analyses to be explored. This illustrates that cVAE analysis with known physical parameters in the training session (e.g., rotation and x -translation) enables improving the disentanglement of latent representation of unknown physical parameters (e.g., y -translation).

It is also important to note that the behaviors of the latent representations of the data provide insight into physically relevant factors of variation within the data (i.e., classes, rotations, and translations). For invariant VAEs (rVAE and shift-invariant VAE), the introduction of invariances leads to the simplification of the latent distributions, which become controlled by the remaining (discrete or continuous) factors of variability; we do not discuss more details about rVAE and shift-VAE and direct the readers to our previous works. 22,23,36 For the cVAE, introducing the known conditioning enables simplification of the latent distribution if the conditioning vector is correlated with the factor of variability in the data, thus allowing us to separate relevant physical factors and experimental artifacts [e.g., scan distortions in experimental scanning probe microscopy (SPM) data].

Similar to the 1D example, we further explore the potential of the cVAE to not just interpolate but also to extrapolate along the conditioning variables. Figure 5 demonstrates card images generated by a trained cVAE with rotation and x -translation as conditions. The training data includes 4000 cards with random rotations in the range of [−30°, 30°] and shifts in the range of [−15, 15]. The extrapolation was performed by adding specified conditions to cVAE (e.g., x -translation = 20, rotation = 60). The extrapolated card images shown in Fig. 5 are consistent with the specified conditions. The card images in each sub-image shift from the left-hand side to the right-hand side of the field of view as the x -translation condition changes from −20 to 20. The tilt of the cards changes in a counterclockwise manner as the rotation condition changes from −60° to 60°. The chosen example also demonstrates the variability of decoded objects within the chosen regions of latent space. The extrapolation process is also available in the provided Jupyter notebooks, allowing readers to explore it.

Learned latent manifolds of cVAE analysis of 2D cards data with rotation angle and x-translation as conditions. Each latent manifold shows constant x-translation and rotation corresponding to the defined conditions.

Learned latent manifolds of cVAE analysis of 2D cards data with rotation angle and x-translation as conditions. Each latent manifold shows constant x-translation and rotation corresponding to the defined conditions.

In Fig. 6 , we summarized the interpolation and extrapolation performance by the mean squared error and structure similarity index 49 between generated card images and the corresponding ground truth card image. The structure similarity index, first introduced in 2004, 49 is widely used as a metric to measure the similarity between two given images from an image formation point of view. We analyzed three models in Fig. 6 : (i) cVAE with rotation as a condition, (ii) cVAE with x -translation and rotation as conditions, and (iii) cVAE with x -translation and y -translation as conditions. We explored the interpolation and extrapolation performance of all three models in Fig. 6 . Generally, interpolation performs extremely well, with almost perfect reconstruction within the training region. Figure 6 suggests that the cVAE allows for limited extrapolation on conditioning parameters. For the data set conditioned on the x -translation and rotation, the structure similarity index (SSID) for the images was reconstructed well outside the original training region, with the ground truth illustrating a clear matching pattern. Interestingly, the regions of good matching have a complex structure, where some directions in parameter space are associated with good interpolation, whereas in other parts of parameter space, the errors accumulate. These are unsurprising given the various local symmetries among the initial 4 cards. A similar behavior is observed for conditioning on x -translation and y -translation.

Interpolation and extrapolation performance of cVAE are shown as mean squared error and structure similarity index between the cVAE generated card data and the ground truth card data as a function of defined conditions. (a) Performance of cVAE conditioned on rotation. (b) Performance of cVAE conditioned on x-translation and rotation. (c) Performance of cVAE conditioned on x-translation and y-translation.

Interpolation and extrapolation performance of cVAE are shown as mean squared error and structure similarity index between the cVAE generated card data and the ground truth card data as a function of defined conditions. (a) Performance of cVAE conditioned on rotation. (b) Performance of cVAE conditioned on x-translation and rotation. (c) Performance of cVAE conditioned on x-translation and y-translation.

With this thorough understanding of VAE based approaches, we extend this analysis to experimental data on ferroelectric domain switching. Previously, we have demonstrated the use of rVAE to explore the ferroelectric domain switching pathway and domain wall dynamics. When applying rVAE to consecutive PFM images revealing the ferroelectric domain switching process, the polarization switching mechanism can be visualized in the latent space. 23 When applying rVAE to stacked ferroelectric and ferroelastic domain wall images (generated based on numerous continuously acquired PFM images during domain switching via electric field poling), it disentangles the factors affecting the ferroelectric domain wall dynamics. This includes how the distribution of ferroelastic domain walls affects the dynamics of ferroelectric domain walls, offering insights into the intrinsic mechanisms of ferroelectric polarization switching and hence approaches to engineer devices with more stable domains, domains that can switch faster, or for lower energy switching. 46 In particular, we probed the ferroelectric domain wall pinning mechanisms by translating the latent space to physical descriptors. 46 However, these analyses were enabled by the rotational invariances inherent to rVAE, but the physical interpretation of the latent variables was based exclusively on the analysis of the latent spaces. Here, we expand this analysis toward elucidation of the relevant latent mechanisms when the input data are conditioned on the a priori known physical descriptors.

As a model system, we explore the ferroelectric polarization switching dynamics in a 150 nm thick lead zirconate titanate (PZT) thin film grown on a SrTiO 3 (001) substrate by pulsed laser deposition (PLD), with a heteroepitaxial intermediate conducting oxide electrode (SRO). 50,51 We explore the domain switching dynamics as a function of time using PFM by applying a constant tip bias that just surpasses the coercive field. Consecutive PFM images (Fig. S15 in the supplementary material ) show the ferroelectric switching from the (001) to the (00−1) states. Consequently, domain switching can be excited and observed at the same time. 52,53 These PFM data were used in our earlier publication; 23 here, we just reuse the PFM data to demonstrate the application of cVAE.

In the cVAE analysis, we introduced a time delay (d t ). That means, the domain wall location is determined by a Canny filter 54 at time t , the sub-image centered at the domain wall location is created at time t and t  + d t . This leads to a comparison of domains at time t and t +  d t in the sub-image data sets and hence the domain switching and wall dynamics are encoded as dependents of time. Figure 7(a) shows a comparison of domains at d t  =  0 and d t = 5 ; in the first image (d t = 0 ) the domain wall is located at the center, in the second image (d t = 5 ) the domain wall moves away from the center. Figure 7(b) shows example sub-images used for cVAE analysis.

Simple VAE and rVAE analysis with conditions of experimental PFM data. (a) Examples of PFM image patches (window size = 30) with different time delays. (b) Examples of the generated PFM image patches. The color in the PFM image represents the polarization magnitude. (c)–(e) VAE analyses: (c) latent space of VAE analysis without conditions colored by switch degree and time delay, where a correlation between latent variables and switch degree and time delay is observed; (d) latent space of cVAE with switch degree as a condition; in this case, no correlation between ground truth parameters (switch degree and time delay) and latent variables is observed; (e) latent space of cVAE with time delay as a condition; in this case, the correlations between ground truth parameters (switch degree and time delay) and latent variables are modified but do not disappear. It is seen that adding time delay as a condition is not functioning; this is probably because there is conflict of information included in time delay and switch degree, as seen in (a), that is, a larger time delay generally corresponds to a larger switch degree. The latent variables vs ground truth parameter plots of these VAE analyses are shown in Fig. S16 in the supplementary material.

Simple VAE and rVAE analysis with conditions of experimental PFM data. (a) Examples of PFM image patches (window size = 30) with different time delays. (b) Examples of the generated PFM image patches. The color in the PFM image represents the polarization magnitude. (c)–(e) VAE analyses: (c) latent space of VAE analysis without conditions colored by switch degree and time delay, where a correlation between latent variables and switch degree and time delay is observed; (d) latent space of cVAE with switch degree as a condition; in this case, no correlation between ground truth parameters (switch degree and time delay) and latent variables is observed; (e) latent space of cVAE with time delay as a condition; in this case, the correlations between ground truth parameters (switch degree and time delay) and latent variables are modified but do not disappear. It is seen that adding time delay as a condition is not functioning; this is probably because there is conflict of information included in time delay and switch degree, as seen in (a), that is, a larger time delay generally corresponds to a larger switch degree. The latent variables vs ground truth parameter plots of these VAE analyses are shown in Fig. S16 in the supplementary material .

In the cVAE analysis, we used switch degree and time delay as conditions. The switch degree represents the ratio of yellow (switched) domain area and blue (unswitched) domain area, and the time delay is explained above. Shown in Fig. 7(c) is a simple VAE analysis without conditions, where latent variables are colored by ground truth values. Just as with the 1D and 2D model systems explored above, the color gradient reveals that both switch degree and time delay are encoded into latent variables. Figures 7(d) and 7(e) then show cVAE analyses conditioned by switch degree and time delay, respectively. Figure 7(d) confirms that the switch degree is featureless in the latent space when it is performed as a condition, indicating that this prior physical knowledge effectively affects the cVAE analysis. However, the time delay is still visible as a correlation within the latent space when it is performed as a condition [ Fig. 7(e) ]. Note that conditioning on switch degree not only leads to switch degree featureless in latent space but also leads to time delay featureless in latent space. This is possibly because of the intimate connection between switch degree and time delay, as shown in Fig. 7(a) ; when a time delay is added (e.g., d t = 5 ), the switch degree also changes simultaneously.

We also explored the reconstruction of PFM image patches by cVAE with predefined parameters. Shown in Fig. 8 are the analyses of two cVAE conditioned on time delay and switch degree, respectively. cVAE manifolds show the reconstruction of PFM image patches with different conditions. More reconstruction by cVAE is shown in Fig. S17 in the supplementary material . Such cVAE manifolds as a function of defined conditions allow us to perform extrapolation into the future as well, as demonstrated in the final column in Fig. 8 . The analyses in Figs. 7 and 8 are extended to PFM domain wall images generated from raw PFM images via application of a Canny filter (results are shown in Figs. S18–S21 in the supplementary material ).

Latent manifolds with predefined conditions of cVAE analyses of experimental PFM data based on image patches with window size = 20. (a) Latent space distribution and latent manifold of VAE analysis with time-delay as a condition. (b) Latent space distribution and latent manifold of VAE analysis with switch-degree as a condition.

Latent manifolds with predefined conditions of cVAE analyses of experimental PFM data based on image patches with window size = 20. (a) Latent space distribution and latent manifold of VAE analysis with time-delay as a condition. (b) Latent space distribution and latent manifold of VAE analysis with switch-degree as a condition.

In conclusion, we demonstrate the use of conditional variational autoencoders (cVAEs) to explore physical information by conditioning prior known physical parameters, and we compare the cVAE with the previous invariant VAE (iVAE) approach. Given that cVAE does not rely on the specific invariant transform, it allows for much greater flexibility. We showed the application of this approach using modeled 1D spectrum and 2D image data sets, revealing that the conditioned parameters become featureless in the latent representation. Then, we extended this approach to experimental PFM data on ferroelectric domain switching and domain wall dynamics. While the latent distribution of the experimental data set shows more complexity, we argue that the cVAE-based physics discovery can be performed in iterative and hypothesis testing modes, allowing for simple and low-dimensional latent distributions when the relevant physical factors are correctly identified.

The PZT film is grown on a SrTiO 3 (001) substrate by pulsed laser deposition (PLD), with an intermediate conducting oxide electrode (SRO). The PLD deposition is conducted at 650 °C with 100 mTorr oxygen partial pressure, and then, the samples are cooled to room temperature.

For cVAE training, we used a simple cVAE neural network architecture for the encoder and decoder; both encoder and decoder consist of two fully connected layers with 128 neurons in each layer (i.e. , hidden_dim_e = [128, 128], hidden_dim_d = [128, 128]), activated by the hyperbolic tangent function. The number of layers and neurons can be increased depending on the complexity of the input features. The available activations also include ReLU, leaky ReLU, softplus, and GELU. The latent dimension of cVAE in this work is two (i.e., latent_dim = 2), and the conditioning dimension is one, two, or three (e.g., c_dim = 1) depending on the available conditions for each analysis. The loss functions are the sum of reconstruction loss and Kulback–Leibler loss. The conditioning vector was concatenated with the vector sampled from the latent space before passing into the decoder neural network. The training procedure is the same as an autoencoder with an additional term “condition” passed to the model. The 500 epochs training was performed in Google Colab using T4GPU. The detailed methodologies of analysis are also shown in Jupyter notebooks that are available from https://github.com/yongtaoliu/Physics–cVAE , which include all data and model parameters for the results presented in this manuscript, allowing readers to reproduce the investigation. These notebooks also allow readers to adapt it for their own investigations.

The details of shift-VAE and rVAE architectures are described elsewhere. 22,23,36

VAEs were implemented using open-source package pyroVED from https://pyroved.readthedocs.io/en/latest/models.html .

See the supplementary material for figures of VAE, rVAE, and cVAE analyses of 1D spectrum, 2D card, and PFM data.

This work (ML analysis) was supported (Y.L., S.V.K., and M.A.Z.) by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences as part of the Energy Frontier Research Centers program: CSSAS—The Center for the Science of Synthesis Across Scales under Award No. DE-SC0019288. This manuscript has been authored by UT-Battelle, LLC, under Contract No. DE-AC0500OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for the United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan ( http://energy.gov/downloads/doe-public-access-plan ).

The authors declare no conflict of interest.

Yongtao Liu: Formal analysis (lead); Investigation (lead); Writing – original draft (lead); Writing – review & editing (lead). Bryan D. Huey: Resources (supporting). Maxim A. Ziatdinov: Methodology (lead). Sergei V. Kalinin: Supervision (lead); Writing – original draft (lead); Writing – review & editing (lead).

The interactive Jupyter notebooks that reproduce this paper's results are available at https://git.io/JD28J .

Citing articles via

Submit your article.

representation of knowledge in psychology

Sign up for alerts

representation of knowledge in psychology

  • Online ISSN 1089-7550
  • Print ISSN 0021-8979
  • For Researchers
  • For Librarians
  • For Advertisers
  • Our Publishing Partners  
  • Physics Today
  • Conference Proceedings
  • Special Topics

pubs.aip.org

  • Privacy Policy
  • Terms of Use

Connect with AIP Publishing

This feature is available to subscribers only.

Sign In or Create an Account

glow blue particles on blue background are hanging in air for bright festive presentation with depth of field and light bokeh effects

5 career opportunities you could pursue after a 3-year psychology degree

Exploring employment opportunities

Recent research by the University of Canberra, Demand for Australia’s psychology graduates suggests that about a quarter of job listings may align with the skill set of individuals holding a 3-year psychology degree , offering a broad spectrum of employment prospects. Recent UWA graduates have pursued roles spanning human resources, social and welfare services, and research, showcasing the versatility and wide applicability of a psychology degree across various industries.

Career opportunities for psychology graduates

Graduates of UWA's psychology courses are well-prepared for various career pathways and opportunities. Here are just a few of the roles that our graduates can pursue:

  • Mental Health Support Worker: Undertake work such as supporting other mental health professionals and providing behavioural support to people with disabilities by helping them develop coping strategies and facilitating therapeutic activities and group sessions.
  • Community Services Worker : You can make a difference in the lives of individuals and communities by working in areas such as counselling, youth worker, and community outreach programs.
  • Human Resources Specialist : With a deep understanding of human behaviour and motivation, psychology graduates excel in roles related to recruitment, training, and employee development.
  • Case Manager : You can work as a case manager, supporting individuals with mental health issues, disabilities, or substance abuse problems by coordinating services, advocating for their rights, and helping them navigate the healthcare system.
  • Market Research Analyst : Understanding consumer behaviour is crucial in marketing. A 3-year psychology degree will give you valuable skills in data analysis, research methodologies, and understanding consumer motivations, making you an ideal candidate for market research roles.

Why study psychology at UWA?

Studying psychology at UWA offers you a comprehensive education that prepares you for success in the workforce. Our range of undergraduate courses includes the Bachelor of Psychology which can also be combined with degrees in Arts, Science, and Commerce allowing you to graduate with two degrees in four years. This allows you to pair your Psychology degree with majors such as Criminology and Criminal Justice, Management, Finance, and more, giving you the flexibility to tailor your studies to your interests and career goals. Another fourth-year option is the Bachelor of Philosophy (Honours), which has a research-intensive focus that equips you with the skills to identify and solve complex problems, independently manage and lead projects, and communicate effectively across various platforms.

Postgraduate options

For those interested in further study, UWA offers a range of postgraduate courses in psychology . You can extend your 3-year degree studies with an honours year, which prepares you for postgraduate courses that lead to registration as a psychologist such as: Master of Industrial and Organisational Psychology, Doctor of Philosophy and Master of Clinical Psychology, Doctor of Philosophy and Master of Clinical Neuropsychology. These postgraduate courses provide you with specialised knowledge and advanced training, preparing you for roles in clinical practice, organisational consulting, research, and more. Pursuing postgraduate study in psychology opens up additional career opportunities and allows you to deepen your expertise in specific areas of interest.

Find out more about our psychology courses!

Explore more article topics

  • Courses and Careers

Don't miss a moment

Stay up-to-date with the latest information about studying at UWA including, our courses and career pathways, event and webinars, key dates, what uni life is really like, tips for studying plus, so much more!

Related stories

Continue seeking wisdom.

representation of knowledge in psychology

PRKG: Pre-Training Representation and Knowledge-Graph-Enhanced Web Service Recommendation for Mashup Creation

New citation alert added.

This alert has been successfully added and will be sent to:

You will be notified whenever a record that you have chosen has been cited.

To manage your alert preferences, click on the button below.

New Citation Alert!

Please log in to your account

Information & Contributors

Bibliometrics & citations, view options, recommendations, web service recommendation for mashup creation based on graph network.

In recent years, the world has witnessed the increased maturity of service-oriented computing. The mashup, as one of the typical service-based applications, aggregates contents from more than one source into a single user interface. Facing the ...

Web API service recommendation for Mashup creation

Mashup refers to a sort of web application developed by reusing or combining web API services, which are very popular software components for building distributed applications. As the number of open web APIs increases, to find suitable web APIs for Mashup ...

A Social-Aware Service Recommendation Approach for Mashup Creation

Mashup is a user-centric approach to create value-added new services by utilizing and recombining existing service components. However, as services become increasingly more spontaneous and prevalent on the Internet, finding suitable services from which ...

Information

Published in, publication history.

  • Research-article

Contributors

Other metrics, bibliometrics, article metrics.

  • 0 Total Citations
  • 0 Total Downloads
  • Downloads (Last 12 months) 0
  • Downloads (Last 6 weeks) 0

View options

Login options.

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Share this publication link.

Copying failed.

Share on social media

Affiliations, export citations.

  • Please download or close your previous search result export first before starting a new bulk export. Preview is not available. By clicking download, a status dialog will open to start the export process. The process may take a few minutes but once it finishes a file will be downloadable from your browser. You may continue to browse the DL while the export process is in progress. Download
  • Download citation
  • Copy citation

We are preparing your search results for download ...

We will inform you here when the file is ready.

Your file of search results citations is now ready.

Your search export query has expired. Please try again.

IMAGES

  1. PPT

    representation of knowledge in psychology

  2. Schematic representation of the three knowledge levels within mental

    representation of knowledge in psychology

  3. Knowledge and Human Mind

    representation of knowledge in psychology

  4. Knowledge sharing, psychology, psychology, flower, brain png

    representation of knowledge in psychology

  5. What Is Knowledge? A Brief Primer

    representation of knowledge in psychology

  6. Perception Psychology and How We Understand Our World

    representation of knowledge in psychology

COMMENTS

  1. Knowledge Representation

    Knowledge representation is a key concept in cognitive science and psychology. To understand this theoretical term one has to distinguish between "knowledge" and its "representation.". Intelligent behaviors of a system, natural or artificial, are usually explained by referring to the system's knowledge.

  2. Psychology of knowledge representation

    For example, structured representations are more complex and computationally demanding, but are good at representing relational information. Researchers interested in human psychology would benefit from considering how knowledge is represented in their domain of inquiry. For further resources related to this article, please visit the WIREs website.

  3. Knowledge representation

    At the core of all theories in psychology is a set of assumptions about the information people use to carry out the task being modeled and about the way that information is stored. The format in which information is stored and used in psychological processing is called knowledge representation. This chapter explores what it means for something to be a representation and what types of ...

  4. Representation of Knowledge

    The topic of knowledge representation concerns the various ways in which the collection of such information may be organized and processed. The important point is that the notions of organization and processing are intertwined. The way information is organized has an effect on the processes or operations, which can be used to manipulate ...

  5. Knowledge Representation

    Representation as related to psychology contains at least two meanings. At the ordinary level and in everyday context, representation indicates the process in which human mentally and symbolically refers to, depicts, and mirrors reality; it constitutes one subject matter of psychology (Martin, 2007).At the meta-theoretical level, representation refers to the process in which psychological ...

  6. Knowledge Representation

    The modern notion of a mental representation emerged during the cognitive revolution of the 1950s, when the computational view of the mind ascended. The behaviorist approach to psychology that played a significant role in American psychology explicitly denied that the form and content of people's knowledge were legitimate objects of scientific study.

  7. Knowledge representation

    Theories in psychology make implicit or explicit assumptions about the way people store and use information. The choice of a format for knowledge representation is crucial, because it influences what processes are easy or hard for a system to accomplish. In this chapter, I define the concept of a representation. Then, I review three broad types of representations that have been incorporated ...

  8. Knowledge Representation

    Psychological theories make assumptions about how knowledge is represented and processed. In this chapter, I begin by defining the concept of representation and discussing the ways that mental representations come to have meaning. Then, I present four different types of representations: spatial representations, featural representations ...

  9. Psychology of knowledge representation

    For example, structured representations are more complex and computationally demanding, but are good at representing relational information. Researchers interested in human psychology would benefit from considering how knowledge is represented in their domain of inquiry. This article is categorized under: Psychology > Memory

  10. Representation of Knowledge

    Knowledge representation is a task that has existed since language was written down. Various stages of scientific development lead to the fact that statements can be converted into numbers or formulas that can be read by machines. This allows artificial systems to be used to accomplish complex tasks. In AI, various techniques are used to ...

  11. Knowledge Representation

    Knowledge representation is fundamental to the study of mind. All theories of psychological processing are rooted in assumptions about how information is stored. These assumptions, in turn, influence the explanatory power of theories. This book fills a gap in the existing literature by providing an overview of types of knowledge representation ...

  12. Conceptual knowledge representation: A cross-section of current

    Abstract. How is conceptual knowledge encoded in the brain? This special issue of Cognitive Neuropsychology takes stock of current efforts to answer this question through a variety of methods and perspectives. Across this work, three questions recur, each fundamental to knowledge representation in the mind and brain.

  13. PDF Individual Representation in a Community of Knowledge

    Individual Representation in a Community of Knowledge. Nathaniel Rabb,1,* Philip M. Fernbach,2 and Steven A. Sloman3. An individual's knowledge is collective in at least two senses: it often comes from other people's testimony, and its deployment in reasoning and action requires accuracy underwritten by other people's knowledge.

  14. PDF A Framework for Representing Knowledge

    A Framework for Representing Knowledge Marvin Minsky MIT-AI Laboratory Memo 306, June, 1974. Reprinted in The Psychology of Computer Vision, P. Winston (Ed.), McGraw-Hill, 1975. Shorter versions in J. Haugeland, Ed., Mind Design, MIT Press, 1981, and in Cognitive Science, Collins, Allan and Edward E. Smith (eds.) Morgan-Kaufmann, 1992 ISBN 55860-013-2]

  15. PDF COGNITION

    PROTOTYPE. The best or "most typical" member of a category. Objects most similar to the prototype are the "best" exemplars beyond which there is a gradient of membership. the category of bird is represented by an ideal bird (or birds), which constitute the exemplars for the category.

  16. Knowledge Representation

    Knowledge representation is fundamental to the study of mind. All theories of psychological processing are rooted in assumptions about how information is stored. These assumptions, in turn, influence the explanatory power of theories. This book fills a gap in the existing literature by providing an overview of types of knowledge representation techniques and their use in cognitive models ...

  17. Two Forms of Knowledge Representations in the Human Brain

    These overlaps and divergences indicate that there are two distinct types of knowledge representation in a typically developed human brain, even for sensory-related properties ( Figure 5 B): one based on sensory-derived codes (seeing the colors of roses) and one based on language- (and cognitively) derived codes.

  18. Cognitive Psychology

    HOME STUDIES READING NOTES Cognitive Psychology Chapter 7. Chapter 7: Images and Propositions. Recall from the previous chapter that sense data is not stored in its raw form, but is organized into a mental representation that can be compared to present sensation to recognize that it has been encountered previously, but knowledge is a separate ...

  19. What is a Knowledge Representation?

    A knowledge representation (KR) is most fundamentally a surrogate, a substitute for the thing itself, used to enable an entity to determine consequences by thinking rather than acting, i.e., by reasoning about the world rather than taking action in it.

  20. PDF Representations and Models in Psychology

    pends. in pan. on the capability of the knowledge representatIOn to capture eXIsting structures. The desire for a simple and effective knowledge representation can some­ times lead to oversimplification. For example, a concept of question difficulty can be used to construct a unidimensional representation of knowledge. The

  21. A Survey on Knowledge Graphs: Representation, Acquisition, and

    Human knowledge provides a formal understanding of the world. Knowledge graphs that represent structural relations between entities have become an increasingly popular research direction toward cognition and human-level intelligence. In this survey, we provide a comprehensive review of the knowledge graph covering overall research topics about: 1) knowledge graph representation learning; 2 ...

  22. Cognition and Representation

    The role of representations in cognitive systems and metaphors for cognition are discussed. A representation-theoretical approach to knowledge representation is proposed. The notion of a representation system is applied as a framework for specifying, comparing, and transforming representations. We show how this framework can be used to study ...

  23. Abstract representations emerge in human hippocampal neurons

    A task in which participants learned to perform inference led to the formation of hippocampal representations whose geometric properties&nbsp;reflected the latent structure of the task, indicating ...

  24. Educational Psychology Promotes Teaching and Learning

    Psychologists study how people learn and retain knowledge, applying psychological science to improve the learning process and promote educational success for all. ... Topics in Psychology. Explore how scientific research by psychologists can inform our professional lives, family and community relationships, emotional wellness, and more.

  25. Beneath the Surface of Consistency: Exploring Cross-lingual Knowledge

    We explore multilingual factual knowledge through two aspects: the model's ability to answer a query consistently across languages, and the ability to ''store'' answers in a shared representation for several languages. We propose a methodology to measure the extent of representation sharing across languages by repurposing knowledge editing methods.

  26. Two Forms of Knowledge Representations in the Human Brain

    These overlaps and divergences indicate that there are two distinct types of knowledge representation in a typically developed human brain, even for sensory-related properties ( Figure 5 B): one based on sensory-derived codes (seeing the colors of roses) and one based on language- (and cognitively) derived codes.

  27. HJE: Joint Convolutional Representation Learning for Knowledge

    Learning the representations of a knowledge graph has attracted significant research interest in the field of intelligent Web. By regarding each relation as one translation from head entity to tail entity, translation-based methods including TransE, ...

  28. Physical discovery in representation learning via conditioning on prior

    The rapid development of deep learning 8,9 methods over the last decade has provided a powerful new tool for physical research capable of building correlative relationships between multidimensional objects. While early applications have relied on purely correlative models, the developments over the last several years include the introduction of physical constraints and symmetries in the neural ...

  29. 5 career opportunities you could pursue after a 3-year psychology degree

    A 3-year psychology degree will give you valuable skills in data analysis, research methodologies, and understanding consumer motivations, making you an ideal candidate for market research roles. Why study psychology at UWA? Studying psychology at UWA offers you a comprehensive education that prepares you for success in the workforce.

  30. PRKG: Pre-Training Representation and Knowledge-Graph-Enhanced Web

    To this end, this paper proposes a Web service recommendation method for Mashup creation that exploits pre-training representation and knowledge graphs as auxiliary information. Firstly, it uses the neural factorization machines and Doc2Vec to obtain the text semantic representation and the discrete attribute representation of Web services ...