Oxford Research Encyclopedias, Neuroscience

date: 25 February 2024

Understanding How Humans Learn and Adapt to Changing Environments

  • Daphne Bavelier, University of Geneva
  • Aaron Cochrane, University of Geneva

Summary

Compared to other animals or to artificial agents, humans are unique in the extent of their abilities to learn and adapt to changing environments. When focusing on skill learning and model-based approaches, learning can be conceived as a progression of increasing, then decreasing, dimensions of representing knowledge. First, initial learning demands exploration of the learning space and the identification of the relevant dimensions for the novel task at hand. Second, intermediate learning requires a refinement of these relevant dimensions of knowledge and behavior to continue improving performance while increasing efficiency. Such improvements utilize chunking or other forms of dimensionality reduction to diminish task complexity. Finally, late learning ensures automatization of behavior through habit formation and expertise development, thereby reducing the need to effortfully control behavior. While automatization greatly increases efficiency, there is also a trade-off with the ability to generalize, with late learning tending to be highly specific to the learned features and contexts. In each of these phases a variety of interacting factors are relevant: Declarative instructions, prior knowledge, attentional deployment, and cognitive fitness have unique roles to play. Neural contributions to the processes involved also shift from earlier to later points in learning as effortfulness initially increases and then gives way to automaticity. Interestingly, video games excel at providing uniquely supportive environments to guide the learner through each of these learning stages. This makes video games a useful tool both for studying learning, owing to their engaging nature and dynamic range of complexity, and for engendering learning in domains such as education or cognitive training.

Subjects

  • Cognitive Neuroscience
  • Computational Neuroscience
  • Sensory Systems

Introduction

Humans have an outstanding capacity to learn and adapt. They benefit from a long, protracted period of development that ensures great plasticity and thus adaptability, with young ones exploring and discovering the world they will have to face as adults. As a consequence, human learning has been approached from many vantage points, as exemplified by the diversity of disciplines it encompasses. For example, the journal Science of Learning states its scope as “all research areas related to how the brain learns,” whereas the Journal of the Learning Sciences describes itself as a “multidisciplinary forum for research on education” (each website accessed December 2021). This article will focus exclusively on learning how to face novel tasks or environments in adulthood. Learning to face novel tasks, such as driving a car, will be framed in terms of a multiphase process in which phases are differentially affected by the ability to communicate, make use of prior knowledge, and pay attention. While the first two phases are conducive to knowledge transfer and generalization, after extensive learning, automatization and expertise develop, prioritizing efficiency over flexibility. Consequences for best learning strategies are discussed.

An early formulation of multiphase learning was described by Fitts and Posner (1967), who framed three cascading and qualitatively distinct learning phases: an initial cognitive phase followed by an associative, and then an autonomous, phase. These phases were characterized on multiple dimensions, such as the apparent rapidity of behavioral improvements (with speed decreasing across phases). Fitts and Posner (1967), as well as subsequent work (Tenison & Anderson, 2016), describe learning phases defined most importantly in terms of the skills and information being used by learners: first, the use of previous learning to complete the task; then, the use of declarative memories in the learned domain; and finally, the use of relatively automatic stimulus-response mappings.

Similar multiphase approaches to learning have been proposed in perception and attention (LaBerge, 1976) and in motor learning (Lövdén et al., 2020; Makino et al., 2016). LaBerge (1976) describes initial discovery of perceptual relevance and parsing sensory information into invariant features (that define sets of stimuli) distinct from varying features (that define dimensions along which items within a set must be distinguished). In a second phase, LaBerge describes a process of dimensionality reduction, in which the relevant dimensions are reduced into simpler codes that are more easily manipulated. Both of the first two phases of learning require active attention to sensory information, for the purposes of differentiating or integrating features. Finally, in a third phase, attention becomes less necessary for the continued refinement of perceptual codes. This sequence has parallels in the theory of expansion, exploration, selection, and refinement (Lövdén et al., 2020; Wenger et al., 2017). In this view, initial experience with a challenging or novel task leads to an expansion of the solution space (both behaviorally and neurally exploring novel behaviors). Further experience causes a reduction (i.e., selection) of the utilized behavioral patterns and neural circuits, which leads to increasingly specific patterns of activity and a goal-directed refinement of activity (Makino et al., 2016; see Figure 1).

Figure 1. Expanding, then reducing, dimensionality in learning. (A) Expected progression of behavioral performance during skill learning. As learning progresses, behavioral performance at first exhibits large variability as the dimensionality of the learning space is explored, then dimensions most relevant for the task at hand are selected, facilitating the refinement of behavior which is typically associated with less variability in performance. (B) Neural changes during skill learning. Learning of a novel task leads to an initial expansion of recruited circuits as the learner explores the space of possibilities for the novel task at hand; selective activation of only the more-efficient subset of circuits, a form of dimensionality reduction, permits a more targeted progression in learning, leading to a differentially refined set of networks mediating behavioral performance. (C) Although multiple dimensions may be needed initially to perform a novel task (left panel), an additional dimension may be learned (center panels), which allows a lower-dimensional embedding to mediate task performance, allowing eventually for low remaining variation and only the one necessary dimension supporting task performance (right panel). Deriving a more efficient representation is a foundational property of learning, such as when trying to understand a system. For example, when learning what determines the predation rate of fish, examining the fish’s color (Dimension 1) and size (Dimension 2) may be helpful. However, it may be the case that another factor such as swimming speed (learned Dimension 3) explains most of the predation, making this single-dimensional knowledge a good representation of the reasons for predation.

Source: (A) and (B) adapted from Lövdén et al. (2020).

While the specific qualitative changes in underlying processes across the trajectories of learning may be differently characterized by various theoretical positions as well as the particular learning contexts (e.g., verbal vs. motor learning; Rosenbaum et al., 2001; Schmidt & Bjork, 1992), the existence of such qualitative changes is likely in all but the most constrained learning contexts. Such qualitative shifts need not indicate strictly sequential processes, however, and it is likely that the cognitive and neural mechanisms supporting each phase of learning are fundamentally active throughout the learning experience. Instead, learning phases can be thought of as the sequence in which various learning processes are increasingly facilitated by prior experience (i.e., previous learning) and subsequently saturate while simultaneously facilitating subsequent stages (Ackerman & Cianciolo, 2000). In other words, this article is not intended to describe discrete phases of learning as much as how learning processes facilitate each other and how they saturate.

More specifically, the idea of representational dimensionality will be used as a heuristic label to indicate how the learner engages with the environment (e.g., treating a stimulus as having more or fewer dimensions; see Recanatesi et al., 2021) as well as the resources that the learner recruits. This is similar to saying that learning is “the art of creating the most suitable representation, given the constraints” of the system (Edelman & Intrator, 2002, p. 349). Edelman and Intrator (2002) draw an apt analogy between human learning and the support vector machine approach to machine learning, in which a learning problem may be approached by representing the information in a high-dimensional space for the purpose of reducing the resulting decision into a lower-dimensional space. Additional complexity is present in human learning because humans’ goals and behaviors are dynamically defined in a changing world. For example, while certain brain regions (e.g., orbitofrontal cortex) have been associated with representing the state space of tasks during learning (Wilson et al., 2014), real-world learning often does not have a tractable number of discrete states to learn (van Opheusden & Ma, 2019). The implementation-agnostic concept of representational dimensionality allows for a high-level description of learning that is compatible with a variety of domains and theories. In practice there are many constraints on such dimensionality, ranging from sensory effectors and neural architectures to time pressures on real-time actions. Indeed, the very architecture of the human prefrontal cortex in some ways seems optimized for representing tasks, goals, and predictions (Soltani & Koechlin, 2022), but every brain is limited in capacity and fidelity.
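The expand-then-reduce idea behind the support vector machine analogy can be illustrated with a toy sketch. This is not an SVM solver and is not taken from Edelman and Intrator (2002); the four data points, the `expand` feature map, and the `decide` rule are hypothetical and chosen only for illustration. The two classes cannot be separated by any straight line in the original two dimensions, but once a third dimension (the product of the first two) is added, the decision can be read off that single new dimension.

```python
# XOR-like problem: the class depends on whether the two coordinates share a
# sign, so no straight line in the original 2-D plane separates the classes.
points = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
labels = [1, -1, -1, 1]


def expand(point):
    """Hypothetical feature map: append the product x1 * x2 as a third dimension."""
    x1, x2 = point
    return (x1, x2, x1 * x2)


def decide(point):
    """After expansion, the decision collapses onto one dimension (the product)."""
    return 1 if expand(point)[2] > 0 else -1


predictions = [decide(p) for p in points]
print(predictions == labels)
```

The representation first grows from two to three dimensions, and the final decision then depends on only one of them, which loosely mirrors the expansion and reduction of representational dimensionality described in the text.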

Reviews of Learning—Specificity Along Time and Domains

With learning understood as comprising interacting subprocesses which may be differentially determinant of outcomes at different points in the learning trajectory, it is easy to see that much of learning research is placed squarely within one or another of these time-evolving processes. Methodological constraints lead learning to be studied in ways that make direct comparisons between processes difficult. As noted by van Opheusden and Ma (2019), comparing the processes that sustain initial exploration to those that facilitate extensive expertise poses serious methodological challenges.

In addition, much of learning research tends to be placed squarely within one or another specific field, such as motor learning, perceptual learning, or language learning, to cite a few. To the extent that interplays or contrasts are considered, it may then be challenging to approach such research from the more abstract perspective that this article takes. Indeed, within learning disciplines, copious excellent reviews have been published on such topics as learning in automatic habits (Ashby et al., 2010), science and math education (Fyfe et al., 2014), motor learning (Dayan & Cohen, 2011), category learning (Markman & Ross, 2003; Richler & Palmeri, 2014), statistical learning (Fiser & Lengyel, 2019), and perceptual learning (Seitz & Dinse, 2007; Watanabe & Sasaki, 2015). In parallel, a variety of neural systems and/or mechanisms have been considered, such as striato-cortical loops (Cataldi et al., 2021; Liljeholm & O’Doherty, 2012; Shohamy, 2011), neuromodulators (Roelfsema & Holtmaat, 2018), and medial temporal structures (Squire, 1992; Zeithamova & Bowman, 2020), as well as more theoretical predictive coding (Aitchison & Lengyel, 2017) or reinforcement learning (Cazé et al., 2018) models.

This article draws examples from several disciplines, but primarily addresses skill learning, rather than declarative knowledge learning or episode-related learning, which can be achieved with only a single exposure and is therefore somewhat independent of the broad conceptual framework outlined here. In a similar vein, this article focuses on learning that is model-based, as opposed to model-free. While this terminology comes largely from the literature on reinforcement learning, parallel distinctions can be drawn from the literature on behavioral control (e.g., “reflective” vs. “reflexive”), computational approaches to decision-making (e.g., “caching” vs. “tree search”), and related fields (Daw et al., 2005). Briefly, a basic distinction exists between learning that integrates knowledge and behavior as an efficient and unitary entity (i.e., caching a reflexive action without using a model) as contrasted with learning that involves predicting future states given possible actions (i.e., learning a model to reflect on previous experiences and search for the best current action). While it is true that real-world learning situations may often include both model-based and model-free learning, here only model-based learning is directly addressed.
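The contrast between “caching” and “tree search” can be made concrete with a minimal sketch. Everything here is an assumption made for illustration: the two-step task, the `MODEL` table, the reward values, and the function names are hypothetical, and the cached learner uses a simple Q-learning-style update without discounting rather than any specific model from Daw et al. (2005).

```python
ACTIONS = ("left", "right")

# Hypothetical world model: (state, action) -> (next_state, reward).
MODEL = {
    ("start", "left"): ("A", 0.0),
    ("start", "right"): ("B", 0.0),
    ("A", "left"): ("end", 1.0),
    ("A", "right"): ("end", 0.0),
    ("B", "left"): ("end", 0.0),
    ("B", "right"): ("end", 5.0),
}


def lookahead_value(state, action):
    """Model-based: predict the outcome of an action, then search onward."""
    next_state, reward = MODEL[(state, action)]
    if next_state == "end":
        return reward
    return reward + max(lookahead_value(next_state, a) for a in ACTIONS)


def model_based_choice(state):
    """Pick the action whose predicted future reward is largest."""
    return max(ACTIONS, key=lambda a: lookahead_value(state, a))


def cache_update(cache, state, action, reward, next_state, rate=0.5):
    """Model-free: nudge a cached value toward the experienced outcome."""
    best_next = max(cache.get((next_state, a), 0.0) for a in ACTIONS)
    old = cache.get((state, action), 0.0)
    cache[(state, action)] = old + rate * (reward + best_next - old)


def model_free_choice(cache, state):
    """Pick the action with the largest cached value; no prediction involved."""
    return max(ACTIONS, key=lambda a: cache.get((state, a), 0.0))


# The cached learner has only ever experienced going left, twice.
cache = {}
for _ in range(2):
    cache_update(cache, "start", "left", 0.0, "A")
    cache_update(cache, "A", "left", 1.0, "end")

print(model_based_choice("start"))
print(model_free_choice(cache, "start"))
```

In this toy task the lookahead agent chooses "right" because its model predicts the larger reward behind that branch, whereas the cached agent chooses "left" because that is the only path whose value it has stored from experience.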

Early Learning: Identifying the Learning Space

The earliest points in learning are defined by the need to understand the goals and structures relevant to the task at hand. Initially this involves sufficiently broadening the learner’s representational dimensionality beyond what was previously known. This expansion is likely to be a key part of human intelligence, with humans’ enhanced ability to learn as adults being predicated on rapidly exploring and mapping new task sets (Domenech et al., 2020). Initial experience with a to-be-learned domain necessitates an expansion of the recruited resources commensurate with the expanded dimensionality of potentially relevant information or actions. Expanded resources for the purpose of exploring novel dimensionality may be considered in terms of neural circuits (Lövdén et al., 2020), behavior (“exploration” behavior), or the recruitment of the attentional control and working memory necessary to explore and encode possibly relevant dimensions of the environment.

For instance, when first learning to drive a car, a multitude of features become important that are irrelevant to passengers. The transition from only being a passenger to being a driver involves expanding one’s consideration of relevant dimensions such as explicit signals from the environment (e.g., other drivers’ turn signals), one’s own motor movements (e.g., coordinating precise foot movements), as well as proprioceptive signals due to the car’s motion. A new driver must expand their consideration and integration of sources of information available to them beyond what they would have learned as a passenger. Much of this learning may be done through explicit communication, but much of it must also be learned through attempting novel actions and experiencing the consequences. Thus, while the overarching goal of “driving to my brother’s house” may be explicitly known by the learner, the structure of the task and its myriad subgoals (e.g., shifting into a gear) must be addressed and integrated into the learner’s skillset.

Early learning is therefore characterized by defining the space in which actions can be selected. Given the high dimensionality of the real world, learners combine features and possibilities to understand a structure or lower-order dimensionality. For example, in learning to distinguish two categories of sine wave sounds that may vary on two dimensions, Roark and Holt (2019) found that learners appeared to treat stimuli as if they arose from a single dimension rather than the true two orthogonal stimulus dimensions. Surprisingly, however, this single dimension was similar regardless of the true category discriminant that was supposed to be learned. That is, when the category boundaries were compatible with a latent dimension in which sound properties were positively correlated, participants learned quite well. In contrast, participants fared poorly in a condition where the category boundary assumed other relations between sound properties. The authors speculated that participants were utilizing their prior experience to inform their choices and were unable to unlearn these priors during the task. More broadly, to make sense of the world, learners must develop a sense of its relevant dimensionality and structure, a process crucially informed by prior experience.

Communication and Prior Knowledge

The expansion of resources and the exploration of a newly increased dimensionality facilitates an understanding of the important (discriminative and varying) dimensions while learning to discount unimportant (nondiscriminative and invariant) dimensions (Kellman & Massey, 2013; Makino et al., 2016). In most humans, this may often be accomplished through language and other symbolic communications, while other animals, very young humans, and many machine learning algorithms must often proceed through this phase without the assistance of such externally directed search.

In the case of learning to drive, certain aspects of the learning are accomplished through explicit instructions. A teacher can explain the meaning of signs, road markings, and safe distances to remain from other vehicles. In this case the features of the environment that are newly relevant to the learner are guided by another person’s prior learning; this is in essence generalization of learning from one person to another, which humans tend to excel at (Nowak & Komarova, 2001).

On the other hand, certain aspects of learning to drive are difficult or impossible to convey with language. Some important sources of information, such as proprioceptive signals, are difficult to verbalize. Many other aspects of driving are simply too complex or combinatorically numerous to be enumerated by a teacher. While basic principles can be explained, no complete explanation could identify the infinite variations of road layouts, traffic, pedestrians, weather, differences between cars, and other factors. Instead, each time the learner encounters a novel situation or combination thereof, an effortful process is necessary to attend to the immediately relevant features of the environment.

In an example from the fields of psychology and neuroscience, consider a spatial span task. Participants see a set of unevenly placed blocks and their attention is directed to the blocks one at a time (e.g., by changing color on a computer screen, or by an experimenter tapping them in the classic “Corsi” task; Berch et al., 1998; Cochrane & Green, 2021). The experimenter may communicate explicit instructions prior to starting the task, thereby directing the participant’s attention to the fact that they will need to tap on the blocks in the same order that the experimenter tapped on them. Even in the absence of explicit communications, however, some behavioral policy would arise (Traulsen et al., 2010). Here it is likely that most people would be able to quickly “figure out” that they should repeat the sequence that they see, although this “figuring out” would simply be a reliance on prior knowledge of some sort. But what if the true task was something besides simple imitation? What if the participant was supposed to, say, point at the third-to-last block that was tapped (a “running span” or “updating” task)? Or perhaps the participant was supposed to repeat the sequence in the reverse order (a “backward span” task)?

How would participants decide to imitate the sequences they see? In principle there are an overwhelmingly large number of options; the participant could point to only the first block that was highlighted, tap out a tune as if each of the blocks represented a key on a piano, or even refuse to do anything. Presumably the actual choice to imitate instead would be influenced by prior experience or even by evolutionarily conserved instincts to imitate. Many behaviors in the world, ranging from games to modes of formal instruction, involve one person repeating the actions of another. The choice of the participant to repeat the sequence would therefore be a generalization of this previous learning that is combined with possibly even more basic instincts, namely, that imitation tends to be an appropriate strategy.

The example of the block-tapping task can be instructive to a more abstract set of principles. In the absence of explicit instructions, participants’ understanding of the appropriateness of any pattern of behavior must rely on other signals to reduce ambiguity. Signals reinforcing or diminishing behaviors may be largely absent, in which case the participant would rely on their prior knowledge, or alternatively feedback signals may be present. In this latter case (e.g., after tapping a series of blocks, either a red light or a green light is illuminated), the combinations of stimuli, behaviors, and feedback would be integrated in order to better develop an understanding of what possible behaviors would be valid (the structure or “rules” of the context, e.g., “tap blocks, but do not tap any block more than once in a single trial”). Within the dimensions of valid behaviors within the context, the participant would also identify which behaviors would lead to success, that is, how to choose a valid behavior that also satisfies the goals of the task.

Prior experience may therefore influence early learning in a variety of ways. To the extent that prior experience conflicts with relevant dimensions, it may be difficult for learners to complete even the initial phases of learning (Roark & Holt, 2019; Wimer, 1964). In contrast, various types of previous experience may facilitate novel learning. For example, superficially novel learning may be facilitated by having a shared structure with prior learning. In one study of learning and transfer of perceptual discrimination abilities, consistent features of sequentially trained task structures (e.g., the timing of stimuli and the fact that the perceptual discriminant was in the middle of the distribution of presented stimuli) led to faster subsequent learning on novel tasks. This was the case even though direct transfer of perceptual abilities (i.e., initially improved performance) on the novel tasks was not observed (Kattner et al., 2017). Importantly, greater variety in previous perceptual learning tasks (e.g., training on a color lightness task followed by an orientation discrimination task, and so on) accelerated subsequent generalization relative to generalization after an equivalent amount of training on a single task with the same structural features. That is, variety in training appeared to improve participants’ abilities to learn on novel tasks independently of their perceptual abilities on the particular discriminations required by those tasks. Prior knowledge and verbal communication may facilitate the identification of the learning space, and so can attentional and cognitive control skills.

More broadly, early learning has been observed to generalize more than later learning (Jeter et al., 2010). Explanations for this pattern of effects emphasize that the learning necessary early on may share more features with a wider variety of other tasks (Ahissar & Hochstein, 2004) or similarly that the skills and decisions refined by early learning are those that are shared by future tasks (e.g., motor mappings or more generally “learning how to do the task” within the context of computerized experiments). The route to generalization may be seen to be the result of increased processing, possibly in terms of resources recruited (e.g., more widespread attentional allocation or neural activity in previously less-active areas; Bavelier et al., 2018) and possibly in terms of the task or stimulus space being considered (i.e., “exploration” behavior). While later learning utilizes inhibitory processes to increase efficiency and reduce representational dimensionality, these processes are less involved very early in learning. The shift from an excitatory (i.e., glutamate-dominant) to an inhibitory (i.e., GABA-dominant) state has in this sense been understood as gating excitation-associated plasticity or inhibition-associated stabilization (Barron, 2021; Shibata et al., 2017), wherein an inhibition-dominant state is associated with a more restricted representational dimensionality (i.e., efficiency, robustness, and likelihood of interfering with future learning).

In practice, the generalization of early learning is difficult to disentangle from other task features such as difficulty or overall amount learned. For instance, Schmidt and Bjork (1992) demonstrated several cases of verbal learning and motor learning in which generalization was increased when initial learning appeared to have been less successful. One example is that of interleaved training compared to blocked training, in which variety in task demands (e.g., stimuli) during training often inhibits initial learning yet enhances generalization when compared to training in which task demands are consistent within blocks of trials (Baddeley & Longman, 1978; Dunlosky et al., 2013; Hussain et al., 2012). The initially poorer learning in interleaved training can be understood as keeping learners in an earlier phase of learning, in which there is an expansion of resource recruitment and representational dimensionality.

Attention and Cognition

Attention may take several forms (Chun et al., 2011), and it is instructive to identify the most relevant forms of attention at different points in learning. The first phase of learning is characterized and constrained by the attentional ability to define the task features relevant to the learning context. Specifically, the learner has to sustain attention to possibly relevant features of the task in order to develop knowledge regarding the useful dimensions to use for task completion. In the absence of prior knowledge guiding attention and prediction, sustained effort is necessary to maintain goal-orientedness in the unknown space. Given some goal and a lack of knowledge regarding the relevance of features in the environment, very little information is available to inform selection; rather, a broad-based and relatively resource-intensive increase in attentiveness (i.e., sustained attention) must be recruited and tuned to features likely to be rewarded in the environment. This is even the case in contexts in which a specific goal itself is not explicitly recognized, for example, in the “no instruction” Corsi task. The participant would not be given a proximal goal (e.g., “tap the blocks in order”) but they would still presumably have some more general goal (e.g., “I’m supposed to do something”) that would drive their learning. Selective attention may also play some role in early learning, to the extent that feature discovery may be facilitated by internal boosting of certain signals (LaBerge, 1976). A primary goal of early learning, then, is to identify what signals are necessary to boost, with attention and reward serving as potent boosting mechanisms (Watanabe & Sasaki, 2015).

Learning in this initial phase tends to proceed very quickly. Relevant dimensions may be widely explored, but due to the remaining complexity, only a short prediction horizon may be maintained. That is, the temporal scale on which learning occurs is short in two interrelated senses: Because predictions and confirmations (or disconfirmations) are being iterated rapidly, improvements in performance occur quickly even while the improvements are grounded in relatively superficial acquisition of knowledge (Newell & Rosenbloom, 1981). In such cases, ambiguity leads to broad-based attention and large learning steps increasing the learning space considered (Behrens et al., 2007). Such short time horizons are compatible with Bayesian accounts of learning and neural computations, in which high environmental variability induces a fast learning rate (in this case, larger magnitudes of updating in response to evidence; Behrens et al., 2007). Under this view, environmental variability corresponds to the extent to which the relevant dimensionality is poorly understood and under these conditions, the anterior cingulate reflects an on-line estimate of variability and a corresponding large learning rate (i.e., allocation of “salience” or assessment of relevance of stimuli, in conjunction with striatal prediction error signals; Behrens et al., 2007).
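The link between environmental variability and learning rate can be sketched with a scalar Kalman filter. This is a deliberately simplified stand-in and not the hierarchical Bayesian learner used by Behrens et al. (2007); the function name, the parameter values, and the fixed observation noise are assumptions made for illustration. The gain computed here plays the role of the learning rate, that is, the fraction of each prediction error that is used to update the current estimate.

```python
def steady_state_learning_rate(volatility, noise, steps=200):
    """Iterate the scalar Kalman variance update and return the final gain.

    volatility: assumed variance of trial-to-trial drift in the tracked quantity.
    noise: assumed variance of the observation noise.
    """
    variance = 1.0  # initial uncertainty about the tracked quantity
    gain = 0.0
    for _ in range(steps):
        prior = variance + volatility   # uncertainty grows: the world may have changed
        gain = prior / (prior + noise)  # learning rate applied to the prediction error
        variance = (1.0 - gain) * prior  # uncertainty shrinks after each observation
    return gain


stable = steady_state_learning_rate(volatility=0.01, noise=1.0)
volatile = steady_state_learning_rate(volatility=1.0, noise=1.0)
print(stable < volatile)
```

Under these assumptions the learner in the more variable environment settles on a larger gain, so each new observation moves its estimate further, which corresponds to the fast early learning described in the text.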

From a broader perspective on the neuroscience of early learning, such early phases of learning are likely to involve the recruitment of novel neural circuits commensurate with the novelty of the learned task or knowledge (e.g., expansion into relatively undifferentiated candidate circuits in motor cortices; Lövdén et al., 2020). Expansion of processes is accompanied by the possibility of exploring novel combinations (e.g., of neural circuits, perceptual features, motor movements, or concepts) and may be associated with increases in neural activity (e.g., in primary and secondary motor regions during early motor learning; Wymbs & Grafton, 2015; in visual cortices in early perceptual learning; Yotsumoto et al., 2008). This increase is in contrast to decreases in cortical activity often observed during later learning.

Early Learning in Video Games

One environment that provides a structure for understanding early-phase learning is video games. Games are more limited than the real world and therefore provide constrained settings to understand learning, yet within these constraints they can be quite complex and provide insights into what players are learning. During the earliest phase of learning, video games use a variety of methods to teach the player the relevant dimensionality. Interestingly, the best-designed video games make limited use of written or verbal instructions, but rather excel at employing nonverbal cues relying on salience or reward to guide the learner’s knowledge of the task at hand, whether it is stacking colored candies or evading enemy starships. The constructed environment of the game also relies on structural components that are designed to align with the trajectory of learning, such as introducing elements of play one at a time; the player has to learn to walk before they can run, or drive, or shoot a bow and arrow. By introducing dimensions (e.g., items and goals) gradually, the game reduces the possibility that the learner will find themselves at a point without knowing what the relevant actions would be. Such scaffolding has a long history in other fields as well, most notably in education with the seminal work of Vygotsky on the zone of proximal development (Wertsch, 1984). Yet, entertainment video games are quite unique in their seamless use of nonverbal scaffolding cues, whether they enforce acquisition of prior knowledge, properly orient the player’s attentional resources to the to-be-learnt content, or astutely distribute rewards. Such subtleties compose a savoir-faire that many educational video games have struggled to master, as top-down instructional design is often at odds with a more effective approach of bottom-up guided learning (Gee, 2005; but see, e.g., Hess & Gunter, 2013; Parong & Mayer, 2018; Weitze, 2014).

Intermediate Learning: Carving Out the Learning Space

Once early learning has established the relevant dimensionality of the task at hand, continued learning acts to refine the learner’s knowledge of task structure, feature relevance, and predictive models. Early learning may involve expansion of recruited resources and exploration of novel dimensionality, but such an increased recruitment is inefficient and therefore unsustainable (Makino et al., 2016). The space of possibilities must be reduced to improve efficiency and free mental resources for other tasks. Further learning involves the development and evaluation of models of the environment, or equivalently, the ways in which the features explored in the earlier phase may be combined or be pruned and the representational dimensionality may be reduced and refined. By testing and evaluating the representational patterns that may link sensory stimuli with responses, internal models of the relevant task are compared and superior models for the task at hand are retained, while other, previous models may begin a process of forgetting. Such a selection process additionally leads to the rejection of the nonselected representational combinations.

Communication and Prior Knowledge

Explicit communication is less critical during an intermediate phase of learning compared to earlier learning, due to the ongoing utility of prior knowledge developed at earlier points of learning. An intuitive example of the shift from early-phase learning to intermediate-phase learning occurs in the observed effects of stimulus labels in perceptual discrimination (Ahissar & Hochstein, 2004; Boutonnet & Lupyan, 2015; Lupyan, 2008b; Thierry et al., 2009). Namely, language-mediated reductions in ambiguity provide benefits (e.g., easier and faster processing) as well as potential disadvantages (e.g., being locked in the wrong part of the learning space), such as warping of perception by category knowledge (Lupyan, 2008a). For example, while a reliable (i.e., matched) cue word speeds up visual processing of a subsequent stimulus, an unreliable (i.e., mismatched) cue word actually slows the visual processing of that stimulus relative to a nonlinguistic cue (Boutonnet & Lupyan, 2015). Such language-mediated changes effectively let the learner skip past the expansion of representational dimensionality involved in early-phase learning and instead work within the structure provided by the language, while accepting the constraints of that externally provided structure.

Each of these outcomes has a direct bearing upon the expected generalization of learning as well. While generalization that leverages reduced task dimensionality may occur if the demands of the original and those of the subsequent task overlap, “negative transfer” may also occur if later task demands require processing of dimensions that were down-weighted in earlier learning (Wimer, 1964) or in opposition to those that earlier learning enhanced (i.e., “proactive interference”; Cothros et al., 2006). In this sense, learning a second language presents an interesting test case for the implications of reduced dimensionality due to prior learning. It appears to be the case that adults may find second language learning more difficult than children do in part due to their extensive previous experience of a particular language (i.e., due to proactive interference; Brooks & Kempe, 2019).

Overall, instead of communication in the form of explicit instructions found in earlier learning, intermediate learning is largely guided by the generalization of prior knowledge that has been accumulated. Of course, as learning progresses, this accumulated knowledge is subsequently applied; a learning event combines prior knowledge and external evidence to update the learner’s knowledge, but this new estimate of the state of the world is then used as the prior knowledge in subsequent learning events. This cycle of updating prior knowledge to inform ongoing learning provides a continuous refinement of the dimensions being represented as relevant to a given task context (Knill & Pouget, 2004).
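The cycle described above, in which each updated estimate serves as the prior for the next learning event, can be illustrated with a minimal sketch. The example assumes a deliberately simple case that is not taken from the cited work: a learner estimating the probability that an action succeeds, with a Beta distribution as the belief. The class name `BetaBelief`, the function `learn`, and the outcome sequence are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Hypothetical belief about a success probability, stored as Beta(a, b)."""
    a: float = 1.0  # pseudo-count of successes (1.0 gives a flat prior)
    b: float = 1.0  # pseudo-count of failures

    @property
    def mean(self) -> float:
        return self.a / (self.a + self.b)

    def update(self, success: bool) -> "BetaBelief":
        # Combine prior knowledge with one piece of external evidence.
        return BetaBelief(self.a + success, self.b + (not success))


def learn(outcomes, prior=BetaBelief()):
    """Run the cycle: each posterior is reused as the prior for the next event."""
    belief = prior
    history = [belief.mean]
    for outcome in outcomes:
        belief = belief.update(outcome)  # posterior becomes the new prior
        history.append(belief.mean)
    return belief, history


# Illustrative (made-up) sequence of successes and failures.
outcomes = [True, True, False, True, True, True, False, True]
final_belief, trajectory = learn(outcomes)
```

In this sketch the estimate moves in progressively smaller steps as evidence accumulates, which is one simple way to picture the continuous refinement described in the text.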

Attention and Cognition

As learning progresses, the absolute rate of improvement (e.g., expected increase in accuracy on a task for each training trial) tends to decrease. While such a decrease over time may be characterized by a single continuous learning curve (Heathcote et al., 2000; Newell & Rosenbloom, 1981), several qualitative changes in timescale are likely occurring. For instance, in learning tasks that attempt to recreate the types of exploration and exploitation necessary in the real world, there is evidence that the time horizon for predictions increases over learning (van Opheusden & Ma, 2019).
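As a rough illustration of what a single continuous learning curve implies, the following sketch compares two commonly discussed functional forms, an exponential and a power function. All parameter values (starting error, asymptote, rate, exponent) are arbitrary assumptions chosen for illustration and are not estimates from the cited studies. In both forms the absolute improvement per trial shrinks over training; the relative improvement stays constant for the exponential but declines for the power function.

```python
import math

START_ERROR = 0.50  # assumed error rate before any training
FLOOR_ERROR = 0.05  # assumed asymptotic error rate


def exponential_error(trial, rate=0.05):
    """Exponential learning curve: error decays toward the floor."""
    return FLOOR_ERROR + (START_ERROR - FLOOR_ERROR) * math.exp(-rate * trial)


def power_error(trial, exponent=0.5):
    """Power-function learning curve: error decays toward the floor."""
    return FLOOR_ERROR + (START_ERROR - FLOOR_ERROR) * (trial + 1) ** (-exponent)


def absolute_improvement(curve, trial):
    """Expected drop in error produced by one additional training trial."""
    return curve(trial) - curve(trial + 1)


def relative_improvement(curve, trial):
    """Improvement as a fraction of the error that remains to be learned."""
    return absolute_improvement(curve, trial) / (curve(trial) - FLOOR_ERROR)


for trial in (0, 10, 100):
    print(
        trial,
        round(absolute_improvement(exponential_error, trial), 5),
        round(absolute_improvement(power_error, trial), 5),
    )
```

Comparing `relative_improvement` across trials is one simple way to tell the two forms apart in this toy setting.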

An increased time horizon may imply more effortful prediction. This is likely to be only transitory, as continued learning tends to be associated with a decrease in the effortful processing necessary. Each of these features can be explained by a reduction in the representational dimensionality, which allows both a reduction in resource-demanding exploration (Lövdén et al., 2020) and, in turn, a longer temporal integration and prediction window, thanks to the freed resources. This process effectively normalizes neural activity to enhance the learned signal and suppress sources of noise (Dosher & Lu, 1998; Reynolds & Heeger, 2009). In conjunction, prior learning guides the allocation of attention (Hutchinson & Turk-Browne, 2012) such that internally generated feedback signals can inform subsequent learning. Indeed, the very nature of these “internally generated” feedback signals is predicated on error monitoring that must be present. Such monitoring as well as effortful attentional control are recruited less as learning progresses and gradually shifts to the last phase of learning (Bavelier et al., 2018; Fitts & Posner, 1967; Radulescu et al., 2019).

Together these effects indicate an increase in efficiency of processing as learning proceeds. Neuroimaging studies have likewise supported efficiency-based mechanisms of learning, with training-related decreases in activity observed in various domains (often interpreted as selectivity; Bassett et al., 2015; Chen et al., 2015; Lövdén et al., 2020; Makino et al., 2016; Yotsumoto et al., 2008). Selectivity of activity and increased efficiency are accomplished largely through increased inhibitory activity, which simultaneously increases the robustness to future learning (i.e., decreased retroactive interference) while decreasing generalization (i.e., increasing specificity and proactive interference; Shibata et al., 2017).

More specifically, decreases in overall activity are accompanied by the development of more specific networks, often linking lower-level and higher-level areas of processing. In perceptual-motor tasks these include connections with sensory or motor cortices (Chen et al., 2015; Makino et al., 2016), while model-based reinforcement learning has been linked to activity in dorsomedial striatum and prefrontal cortex (Daw et al., 2005; Shan et al., 2015). Striatal structures are of particular importance in prediction and feedback signals that refine learned models of the task and decision space (Shohamy, 2011). Such signals shape online adjustments in attentional allocation that guide the learning in a self-supervising process that is closely linked to prior learning and generalization (Roelfsema & van Ooyen, 2005). Then, as novel evidence is observed regarding the trade-offs between increased dimensionality (exploration) and reduced complexity (exploitation), top-down evaluations of models and actions allow the learner to refine their models for successful behaviors (Domenech et al., 2020). The ongoing process of this refinement continues through the intermediate phase of learning.

Intermediate Learning in Video Games

Various examples of intermediate learning are present in video games which are often designed, at least implicitly, with extended intermediate-phase learning in mind. Well-designed games reduce the amount of explicit instruction as games progress, with counterexamples to this principle that range from frustrating to comical—after a hair-raising high-speed chase, no player really wants to have the game distract them by offering a tutorial on how to steer a car. More implicit forms of reward and feedback are retained, however, and these continue to guide the player’s skillset and playing experience. For instance, in many games that allow crafting or modifying items, the core mechanics are explicitly taught early on (e.g., upgrading armor in action role-playing games (RPGs) such as The Elder Scrolls). However, it is only through hours of gameplay, including repetition, exploring new options, and learning what strategies have led to successes, that players are able to learn the more complex and intricate world of possible modifications and how they can create items that fit their personal playing preferences. Item modifications are also likely to be just a small part of the overall game; the player learns how to modify items to enhance their playing experience in other aspects of the game (e.g., armor that best serves their playstyle in completing certain quests) that they are likewise continuing to learn.

Late Learning: Optimizing and Automatizing Task Performance

Following extensive learning, behaviors become automatic and low-dimensional, which is a hallmark of expertise in a given domain. Experts are able to complete complex tasks without recruiting controlled attention, thereby allowing cognitive resources to be allocated to other thoughts or tasks (Bavelier et al., 2018; Kellman & Massey, 2013). Such automaticity may be problematic, however, when features of the present environment or task change in ways that are incompatible with the low-dimensional representations that have been learned. For example, regular users of keyboards have difficulty effectively using keyboards with changed layouts (i.e., motor mappings), despite the fact that the same symbols and motor effectors are used regardless of the keyboard layout. More strikingly, typing is also disrupted when a keyboard retains its layout but does not have any tactile feedback (i.e., laser keyboard), even though the motor mappings between symbols and actions remain constant (Crump & Logan, 2010). This is a case in which initial learning fails to fully transfer to a new context even when the differences between contexts are seemingly irrelevant to task performance.

The relative automaticity of stimulus–action pairings with extensive learning is the basis of habit formation (Graybiel, 2008; noting that powerful neuromodulators, such as those associated with traumatic events or psychoactive substances, may trigger habit-like learning with minimal experience—unlike the cases considered here). Alongside a reduction in the role of controlled attention (and its associated frontal and parietal cortices; Bassett et al., 2015), such overlearned patterns of knowledge and decision-making are likely to be mediated by a shift to dorsolateral striatal control signals (Shan et al., 2015; Yin & Knowlton, 2006). This shift to an efficient decision process is associated with model-free learning (i.e., utilization of learned associations and a lack of mental simulation of possible outcomes; Daw et al., 2005) and an insensitivity to the devaluation of rewards, a hallmark of habitual behaviors.
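The contrast between cached, model-free control and outcome-sensitive, model-based control can be made concrete with a toy sketch of a devaluation test. This is an illustrative simplification under stated assumptions, not the model of Daw et al. (2005): the action names, outcome names, reward values, learning rate, and number of trials are all hypothetical.

```python
def train_model_free(action_outcome, outcome_value, trials=200, alpha=0.1):
    """Cache one value per action using a simple delta-rule update."""
    q = {action: 0.0 for action in action_outcome}
    for _ in range(trials):
        for action, outcome in action_outcome.items():
            reward = outcome_value[outcome]
            q[action] += alpha * (reward - q[action])
    return q


def model_based_values(action_outcome, outcome_value):
    """Evaluate each action by looking up the current value of its outcome."""
    return {action: outcome_value[outcome] for action, outcome in action_outcome.items()}


def greedy(values):
    """Pick the action with the highest value."""
    return max(values, key=values.get)


# Hypothetical task: two actions, each reliably producing one outcome.
action_outcome = {"press_left": "pellet", "press_right": "sucrose"}
outcome_value = {"pellet": 1.0, "sucrose": 0.5}

# Extensive training builds cached (habit-like) action values.
cached_q = train_model_free(action_outcome, outcome_value)

# Devaluation: the pellet loses its value, with no further action-outcome experience.
outcome_value["pellet"] = 0.0

habitual_choice = greedy(cached_q)  # still relies on the stale cached values
goal_directed_choice = greedy(model_based_values(action_outcome, outcome_value))
```

In this sketch the cached values are never revisited after devaluation, so the habitual choice keeps selecting the action that led to the now worthless outcome, whereas the model-based evaluation switches immediately.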

Communication and Prior Knowledge

Late learning is necessarily the product of a great deal of previous experience (relative to the complexity of the domain). There is thus a ubiquitous need to adapt to local changes when developing expertise (i.e., “near transfer”), but this stands in contrast with the commonly observed lack of generalization to more novel features or tasks (i.e., “far transfer”; Barnett & Ceci, 2002; Fahle, 2005; Klahr & Chen, 2011). The apparent contradiction may in part be a side effect of training until low-dimensional and inflexible representations have been learned. For example, when considering typists’ reliance on consistency in keyboards, it can be seen that initially learning to type involves a representation of clearly task-relevant features (i.e., mappings between symbols and actions) as well as features that are less clearly task-relevant (i.e., tactile feedback). The resulting low-dimensional representations can be surprisingly sensitive to changes in the task or environment, in the sense that they are unable to adapt to those changes. The sequenced-task perceptual learning example highlights this phenomenon (Kattner et al., 2017); while typical perceptual learning was not expected to generalize across tasks with different features and goals, the inclusion of superficially goal-irrelevant structural features shared by the tasks led to learning that was able to generalize in the form of accelerating learning. That is, the lack of generalization after extensive amounts of learning is likely to be due to the continued reduction of the representational dimensionality that begins earlier in the trajectory of learning. As learning progresses, then, reliance on attentional control is expected to continually decrease, which may in fact lead to even greater insensitivity to changing environments and reliance on (possibly no-longer-relevant) knowledge (e.g., insensitivity to devaluation; Daw et al., 2005; Seabrooke et al., 2019).

Though a lack of generalization tends to be expected from a late learning phase, such a case is not inevitable. The implications for generalization of normalization and reduced dimensionality may be seen in the case of double-training of visual perception. When trained on a visual discrimination task in one location in the visual field, there is often a lack of generalization to other locations. Such generalization has been observed, though, when during training an unrelated task was also completed at the location of the future generalization test (Xiao et al., 2008). This has been interpreted as evidence that attention to the irrelevant-task location prevented spatial inhibition to that location and thereby facilitated generalization (Watanabe & Sasaki, 2015). Said differently, training at a specific location would be expected to lead to normalization and suppression of visual locations that were deemed to be irrelevant (i.e., in a reduction of the representational dimensionality of the visual inputs), while another task at a different visual location prevented the suppression of that location in the context of the task’s dimensionality.

Further, the superficially unchanging nature of expertise does not mean that an expert does not continue learning. The timescale of change may instead be extremely long, so much so that performance may seem relatively stable. Such stability should not be treated as complete, however. Incremental improvements continue to occur and often involve rare events or information that only occurs when long timescales and prediction horizons are considered. Putatively stable performance in any task setting may therefore be treated as stable for practical purposes. In reality, though, various factors contribute to ongoing improvements. In activities such as music performance or chess playing, stable plateaus in performance levels can be overcome through effortful practice and, ideally, explicit feedback from another person with a large amount of relevant domain knowledge (Ericsson et al., 1993; Macnamara et al., 2014). In other areas, the changing nature of the environment may lead learners to consider an increased representational dimensionality (Behrens et al., 2007). In effect, this would be the renewal of an earlier-learning increase in exploration and representational dimensionality. Such a change away from the narrow space considered in late learning is difficult, however, due to the specificity of the represented dimensions at that time. Bayesian learning theories describe this phenomenon as difficulty learning when learned priors only allow for a consideration of a restricted range of the overall task space; in such cases, updating knowledge or belief away from those narrow priors requires a large amount of contradicting evidence and effort. Indeed, anyone who has tried to “unlearn a bad habit” has likely found that it was very difficult due to the very automaticity of the habit. Habits are of course even more difficult to unlearn when they are linked to primary reinforcers (e.g., psychoactive substances or foods), as this situation means that the habit provides a “reward” with minimal required effort while being insensitive to subsequent removal of that reward association.
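The Bayesian point about narrow priors can be sketched with a standard conjugate Gaussian update. This is a minimal illustration under stated assumptions, not a model from the cited literature: the belief concerns a single scalar quantity, observation noise is Gaussian with a known standard deviation, and every new observation contradicts the prior in the same way. The function names and all numerical values are hypothetical.

```python
def posterior_mean(prior_mean, prior_sd, observations, noise_sd):
    """Posterior mean for a Gaussian prior combined with Gaussian observations."""
    prior_precision = 1.0 / prior_sd ** 2
    obs_precision = 1.0 / noise_sd ** 2
    total_precision = prior_precision + len(observations) * obs_precision
    weighted_sum = prior_precision * prior_mean + obs_precision * sum(observations)
    return weighted_sum / total_precision


def observations_needed(prior_sd, prior_mean=0.0, new_value=1.0, noise_sd=1.0, limit=10_000):
    """Count the contradicting observations needed to move the belief halfway."""
    halfway = (prior_mean + new_value) / 2
    for n in range(1, limit + 1):
        estimate = posterior_mean(prior_mean, prior_sd, [new_value] * n, noise_sd)
        if estimate >= halfway:
            return n
    return None  # belief did not move halfway within the limit


# A broad prior (early learning) versus a narrow prior (late learning).
broad_prior_count = observations_needed(prior_sd=1.0)
narrow_prior_count = observations_needed(prior_sd=0.25)
```

With these assumed values, the narrow prior requires many more contradicting observations than the broad prior before the estimate moves halfway toward the new value, which mirrors the difficulty of updating highly specific late-learning representations.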

Attention and Cognition

Late learning is characterized by a reduction in explicit allocation of cognitive effort and gradual refinements of the low-dimensional representations of task and stimuli and apparent automatization of behaviors (Logan, 1988; Shiffrin & Schneider, 1977). In this phase of learning, improvements are progressively smaller, leading to relatively stable performance. Such smaller improvements may be due to progressively accurate tuning of representational structures to the physical, temporal, or statistical regularities of the environment.

As learning transitions from its intermediate phase to its later phase, attended representations are more likely to be at abstract levels (e.g., attention to task representations rather than feature representations; Wilson et al., 2014) than at a feature level, as sensory or motor processing becomes more automatic and thus less linked to attentional control. In sum, once the dimensionality of a task context has been sufficiently reduced, each new stimulus can be mapped to a response choice without effortful processing, allowing in turn efficiency increases and the release of previously necessary resources. For instance, Bassett et al. (2015) found that over 30 sessions of training on simple motor sequences, neural activity in both visual and motor cortices remained high during sequence execution. However, these two areas became more independent over time. Even more strikingly, by the end of the training there was a drastic reduction in the recruitment of higher cortical areas associated with planning or effortful control. The learned motor sequences had instead been learned to the point of automatization and modularization of response patterns in the necessary low-level cortices.

Late Learning in Video Games

The development of expertise in the context of video games is particularly interesting; as with the previous phases of learning, video games provide a rich test-bed for examining the effects of both simple manipulations (e.g., repeating a given stimulus–response mapping over and over again) and more complex sensorimotor manipulations (e.g., achieving more and more complex sequences of moves as performance improves). A noted advantage for games as a way to promote expertise is the incredible time on task they deliver, in line with the role of deliberate practice in achieving expert levels (Ericsson et al., 1993; Macnamara et al., 2014; Röhlcke et al., 2018). Indeed, certain genres of video games are designed to deliver extended periods of play, from tens of hours to thousands of hours of play. Achieving such time on task is not an easy feat in terms of game design. While video game play is often cited as organically delivering engagement and augmenting motivation, maintaining such engagement for tens or thousands of hours requires exquisite game design skills (Mayer, 2016; Rigby & Ryan, 2011). Indeed, promoting time on task comes from properly aligned combinations of many levels of game design, from a rich story that maintains interest, to clear goals at many different timescales, or the possibility of updating one’s social status in the game world.

The expertise developed through video game play, like most other expertise, tends to be highly specific to the very game played (Stafford & Dewar, 2014). Angry Birds experts do not display outstanding skills at evaluating the intercept position of a ball in an in-person soccer game; similarly, Tetris experts are not outstanding at mental rotation in general. Rather, even brain teaser games, often advertised as cognitive enhancers, lead to rather narrow expertise, limited to the very training conditions experienced within the mini-game (Owen et al., 2010; Stojanoski et al., 2021). Again, such results are perfectly in line with the efficiency gain on the very trained task that the expertise stage of learning facilitates.

There are two ways in which video games uniquely inform the field of expertise. First, many of today’s video games consist of dynamically evolving worlds, in which the rules of play and the overall learning challenges change from one season to another. Imagine a game of chess which every year allows for a new chessboard configuration and new rules for pieces’ movement! Expert video game players, at least for genres such as shooter games, multiplayer online battle arenas (MOBAs), or real-time strategy games, are regularly asked to face entirely novel situations in a way that strikingly departs from most other fields of expertise such as music, sports, or chess play. Exposing the player to a novel environment that builds on past experience is akin to moving back the player to an intermediate learning stage where some novel exploration of the learning space is needed and again proper dimensionality reduction needs to operate, while simultaneously exploiting previous knowledge. Such back and forth between use of known routines (i.e., low-dimensional representations) and the learning of novel contingencies (i.e., new dimensions) is likely to be highly beneficial to cognition. Note that brain teaser games lack such complexity and so would not be expected to deliver such broad cognitive benefits. This represents an interesting field of research for future work; additional evidence is necessary to understand the mechanisms of effective video game training and the reasons for certain patterns of specificity (Bediou et al., 2018; Dale & Shawn Green, 2017; Green et al., 2014).

Second, and in line with different video games having different cognitive impacts, video games that contain action-like mechanics appear to enhance cognition. Action-like video games include first- and third-person shooter games as well as action-RPG, real-time strategy, and racing sports games. What these games all have in common is that they require responses under time pressure, rely heavily on divided attention, and call upon focused attention, as when having to aim at enemies in shooter games (Cardoso-Leite et al., 2020). Such flexible shifts in attentional modes between divided and focused attention are likely at the source of the heightened attentional control found in action video game players (Bavelier & Green, 2019; Bediou et al., 2018). Crucially, such gameplay also seems to facilitate generalization of learning through learning to learn (Bavelier et al., 2012; Bejjanki et al., 2014). In a set of studies in which participants trained on action video games were compared to those trained on nonaction video games, subsequent learning in both perceptual and working memory tasks was faster in the group trained on action video games (Zhang et al., 2021). While generalization of learning is a phenomenon studied for many decades and in domains ranging from explicit associations (Harlow, 1949) to perceptual decision-making (Kattner et al., 2017), the underlying mechanisms are still poorly characterized. In particular, two rather different views still need to be adjudicated: Direct transfer leading to immediately enhanced performance on novel tasks, versus learning to learn (e.g., improved domain-general resources involved in learning may speed up subsequent learning processes; Bavelier et al., 2012).

Discussion

Humans have remarkable abilities to learn, and these experience-dependent changes are essential for typical behaviors ranging from low-level perceptual discriminations to complex reasoning and decision-making. In this article, human learning has been considered from a broad perspective in an attempt to conceptually link various approaches to understanding how learning progresses from initial experiences to expert performance. Learning is conceived as progressing across three main “phases,” using the concept of representational dimensionality to organize the progression of learning over time. Each of these is a convenient abstraction, and real-world learning is unlikely to have either discretely identifiable phases or a clear trajectory of representational dimensionality. The utility of these heuristics is evident, however, in their ability to bridge domains of learning research that may otherwise be isolated in separate disciplines.

Across these disciplines, a distinctive feature of adult skill learning is the development of expertise and its concomitant automatization of behavior, an asset when it comes to being efficient at known tasks, but a possible weakness when it comes to gracefully adapting to novel environments. Such a trade-off has been proposed to be one source of the pattern of changes in cognitive aging. A reliance on automatic patterns of behavior or learned strategies, in response to a decrease in certain cognitive or perceptual abilities, may lead to some functioning being preserved while also being less flexible than in younger adulthood in the face of unexpected circumstances (Goh & Park, 2009; Morcom & Johnson, 2015; Riediger et al., 2006).

The implications of successful learning are obvious and ubiquitous; adaptive social behaviors, occupational success, and countless other activities rely on effective learning and generalization of the learning to relevant contexts. However, given the discouraging lack of generalization in many empirical studies (Barnett & Ceci, 2002; Fahle, 2005), it is often worth stepping back and considering approaches to learning that explicitly encourage generalization and subsequent learning. In a practical sense, learning for the sake of generalization should often be the goal, whether in education or in applied workforce training. In such cases, superficially fast learning (e.g., rapid reduction of errors) is unlikely to be the best goal (Schmidt & Bjork, 1992). Retaining flexibility for novel future learning is instead desirable, and this retention of the ability to work with a larger representational dimensionality may be facilitated through such strategies as training cognitive flexibility (Bavelier et al., 2018), intentionally continuing to engage effortful cognition (Dunlosky et al., 2013), or training with a larger set of features or actions (Berniker et al., 2014; Kattner et al., 2017; Schmidt & Bjork, 1992; Xiao et al., 2008). The ecosystem of video games provides ecologically valid and rich environments where the balance between automatization and flexibility as learning progresses may be studied in a carefully controlled way, despite being highly complex.

Different approaches to learning may be broadly understood as an expansion and then refinement of representational dimensionality (Lövdén et al., 2020) associated with an expansion and then a selection of cognitive and neural resources brought to bear (Bavelier et al., 2018; Makino et al., 2016). Such a broad heuristic cannot provide precise descriptions or predictions of specific areas of learning, but it does call for experimental methods that seek to characterize learning at a range of complexities so as to identify the features of experience-dependent change both earlier and later in learning trajectories. That is, implementing tasks that are neither too simple nor too high-dimensional allows for a formal characterization of the decision processes as learners progress from naïve to expert (van Opheusden & Ma, 2019). Again, video games provide an appealing path to do so by permitting relatively rich, yet tightly controlled worlds. Research at different levels of complexity and targeting different timescales can then be theoretically linked rather than remaining in isolated disciplines.

In a further attempt to link separate fields of inquiry, a heuristic description of representational dimensionality allows for a conceptual link between domains of learning research that are concerned with disparate topics. Neural activity in various loci, attentional allocation, working memory representations, stimulus feature complexity or chunking, levels of concreteness or abstractness, or other aspects of learning may be of interest to different research domains, but each can be addressed with this heuristic approach. While a sharp increase and a gradual decrease in representational dimensionality does not alone capture empirical effects on any of these characteristics of learning, it may provide a common language for connecting research across disciplines (Whittington et al., 2018).

Further Reading

  • Bavelier, D., Bediou, B., & Green, C. S. (2018). Expertise and generalization: Lessons from action video games. Current Opinion in Behavioral Sciences, 20, 169–173.
  • Behrens, T. E. J., Woolrich, M. W., Walton, M. E., & Rushworth, M. F. S. (2007). Learning the value of information in an uncertain world. Nature Neuroscience, 10(9), 1214–1221.
  • Fiser, J., & Lengyel, G. (2019). A common probabilistic framework for perceptual and statistical learning. Current Opinion in Neurobiology, 58, 218–228.
  • Hutchinson, J. B., & Turk-Browne, N. B. (2012). Memory-guided attention: Control from multiple memory systems. Trends in Cognitive Sciences, 16(12), 576–579.
  • Lövdén, M., Garzón, B., & Lindenberger, U. (2020). Human skill learning: Expansion, exploration, selection, and refinement. Current Opinion in Behavioral Sciences, 36, 163–168.
  • Roelfsema, P. R., & van Ooyen, A. (2005). Attention-gated reinforcement learning of internal representations for classification. Neural Computation, 17(10), 2176–2214.
  • Rosenbaum, D. A., Carlson, R. A., & Gilmore, R. O. (2001). Acquisition of intellectual and perceptual-motor skills. Annual Review of Psychology, 52(1), 453–470.
  • Schmidt, R. A., & Bjork, R. A. (1992). New conceptualizations of practice: Common principles in three paradigms suggest new concepts for training. Psychological Science, 3(4), 207–217.
  • Seitz, A., & Dinse, H. R. (2007). A common framework for perceptual learning. Current Opinion in Neurobiology, 17(2), 148–153.
  • van Opheusden, B., & Ma, W. J. (2019). Tasks for aligning human and machine planning. Current Opinion in Behavioral Sciences, 29, 127–133.

References