Using ChatGPT Makes Your Brain Lazy? New Challenges in the AI Era Revealed by MIT's Shocking Research

Hi everyone, I'm Tak@, a system integrator. In my free time, I enjoy developing web services using generative AI.

In today's column, I'll share some fascinating research findings on how using AI assistants might affect our brains.

In recent years, large language models (LLMs) like ChatGPT have become deeply integrated into our daily lives, and their convenience makes them useful for many tasks.

But what changes are occurring in our cognitive functions as we use these convenient tools?

A study by MIT researchers, "Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task", offers surprising insights into this very question.

How AI Assistants Change the Writing Experience: Research Background

Essay Writing and Cognitive Load

Essay writing is more than just putting words on paper; it's a highly complex cognitive task. From big-picture work like organizing ideas and building arguments to minute details like word choice and grammar, you need to manage multiple mental processes simultaneously.

This task places a significant burden on our working memory.

This is where Cognitive Load Theory (CLT) comes in handy. CLT is a framework for understanding the mental effort required for learning and problem-solving, and it's divided into three elements:

  • Intrinsic Cognitive Load (ICL): The load related to the complexity of the learning material itself and the learner's prior knowledge.
  • Extraneous Cognitive Load (ECL): Irrelevant mental effort caused by the way information is presented.
  • Germane Cognitive Load (GCL): The mental effort spent on constructing and automating mental frameworks (schemas) that support learning.

Excessive extraneous load, in particular, can hinder the acquisition of new knowledge and reduce learning efficiency.

LLMs have been shown to reduce this cognitive load.

Compared to traditional search methods, LLMs make information easier to understand and retrieve, reportedly reducing users' cognitive load by as much as 32%.

The largest difference was in germane cognitive load. LLMs streamline how information is presented and integrated, which reduces the cognitive effort needed to build mental frameworks.

This is said to make us more willing to stay on a task for longer and to boost productivity.

However, this reduction in cognitive load doesn't always lead to better learning outcomes.

While lower cognitive load increases productivity, users tend to engage less deeply with the content, which can compromise the germane cognitive load needed to build robust schemas.

It's also suggested that using LLMs can shift the focus of thought from active critical thinking to passive content verification.

Three Groups and Four Sessions

To uncover these cognitive costs, MIT's study divided participants into three groups for an essay writing task:

  • LLM Group: Used only ChatGPT (OpenAI's GPT-4o).
  • Search Engine Group: Used only websites like Google Search (AI features prohibited).
  • Brain-Only Group: Wrote essays using only their own knowledge, without any tools.

A total of 54 participants each completed three sessions, staying in the same group throughout. In a fourth session, 18 of these participants had their group assignments switched:

  • LLM to Brain-Only Group: Switched from using the LLM to writing without tools.
  • Brain-Only to LLM Group: Switched from writing without tools to using the LLM.

The study recorded participants' brain activity with an electroencephalograph (EEG) to assess cognitive engagement and load.

Essays were also analyzed using Natural Language Processing (NLP) and evaluated by both human teachers and AI. After each session, participants were interviewed to gather their subjective experiences.

Unpacking Cognitive Strategies from Brain Activity: Surprising EEG Results

More Active Brains with Less External Support

One of the study's clearest findings was that brain connectivity patterns differed significantly across the LLM, search engine, and brain-only groups.

The less external assistance provided, the wider the range of brain activity and the stronger the connections tended to be.

  • Brain-Only Group: Showed the strongest and most widespread network.
  • Search Engine Group: Had an intermediate level of activity.
  • LLM Group: Showed the weakest overall connectivity.

This was supported by the fact that the LLM group had up to a 55% reduction in total dDTF (dynamic Directed Transfer Function) connectivity strength in the lower frequency bands (alpha, theta, and delta) compared to the brain-only group.

dDTF is a method for analyzing the "effective connectivity" of different brain regions in the frequency domain, showing how they influence each other.

Simply put, by observing how much information one brain region sends to and influences another, we can gain a deeper understanding of brain coordination patterns during cognitive tasks.

The study used a 32-electrode EEG to measure information transfer between pairs of electrodes placed on the scalp.
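The idea of measuring how strongly one channel drives another can be sketched with the plain normalized Directed Transfer Function, the family of measures that dDTF belongs to. This is a simplified illustration under stated assumptions, not the study's pipeline: the two-channel autoregressive coefficients, the 128 Hz sampling rate, and the helper names `invert` and `dtf` are all hypothetical, and the refinements that distinguish dDTF from plain DTF are not modeled.

```python
import cmath


def invert(matrix):
    """Invert a small complex square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def dtf(coeffs, freq, fs):
    """Normalized DTF at one frequency; result[i][j] is the flow j -> i.

    coeffs[k][i][j] is the autoregressive weight of channel j at lag k+1
    on channel i (a multivariate autoregressive model fitted elsewhere).
    """
    n = len(coeffs[0])
    # A(f) = I - sum_k A_k * exp(-2*pi*i*f*k/fs)
    a_f = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k, lag in enumerate(coeffs, start=1):
        phase = cmath.exp(-2j * cmath.pi * freq * k / fs)
        for i in range(n):
            for j in range(n):
                a_f[i][j] -= lag[i][j] * phase
    transfer = invert(a_f)  # H(f) = A(f)^-1
    power = [[abs(v) ** 2 for v in row] for row in transfer]
    # Normalize each row so the inflows to one channel sum to 1.
    return [[p / sum(row) for p in row] for row in power]


# Hypothetical model: channel 0 drives channel 1, never the reverse.
coeffs = [[[0.5, 0.0],
           [0.4, 0.5]]]
flow = dtf(coeffs, freq=10.0, fs=128.0)  # 10 Hz lies in the alpha band
```

Here `flow[1][0]` (from channel 0 into channel 1) is positive while `flow[0][1]` is zero, which is the kind of directional asymmetry that effective-connectivity measures are meant to expose.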

Alpha Waves: Key to Creative Thinking

Alpha waves (8-12 Hz) are strongly associated with internal attention, semantic processing, and creative thinking.

  • Brain-Only Group: Showed significantly stronger alpha wave connectivity. Notably, a very strong connection from the left parietal lobe (P7) to the right temporal lobe (T8) was observed, along with strengthened connections from the parieto-occipital region to the prefrontal cortex (PO4→AF3). This suggests that the brain was more deeply involved in internal processing to generate ideas and retrieve information from memory without external help.
  • LLM Group: Lower alpha wave connectivity suggests that the LLM shouldered some of the creative burden, meaning participants didn't need to rely as much on purely internal semantic generation.
  • Search Engine Group: Tended to have lower alpha wave connectivity, which might be consistent with the "Google effect," where the availability of online information reduces reliance on internal memory.

Beta Waves: Concentration and Executive Function

Beta waves (13-30 Hz) are linked to active cognitive processing, focused attention, and sensorimotor integration.

  • Brain-Only Group: Showed a slight dominance in low beta waves (13-20 Hz), particularly with stronger connections from the temporal to frontal regions. This suggests sustained cognitive and motor engagement when structuring essays without external tools.
  • LLM Group: Did not show an increase in beta wave connectivity.
  • Search Engine Group: While overall beta wave strength was slightly lower than the brain-only group, they showed dominance in many beta wave connections and many significant inputs to the central parietal region (Pz). This suggests that the brain was more focused on integrating visual information and motor aspects like scrolling from the search engine.

Theta and Delta Waves: Deep Memory and Integration

Theta waves (4-8 Hz) are deeply associated with working memory load and executive control, while delta waves (0.5-4 Hz) are linked to attention, motivation, and the coordination of large-scale brain networks.

  • Brain-Only Group: Showed remarkably higher values, with theta wave connectivity more than double that of the LLM group, and delta wave connectivity more than double that of the search engine group. This strongly suggests that writing without tools placed a greater cognitive load on participants, who were coordinating multiple cognitive elements in real-time, such as generating ideas, retrieving information from memory, and adjusting linguistic structures.
  • LLM Group: Had significantly lower theta wave connectivity, suggesting that the LLM provided external cognitive support (text suggestions, information, structure, etc.), which reduced the burden on working memory.
  • Search Engine Group: Had much weaker theta and delta wave connectivity, suggesting that the availability of the internet reduced the need for deep internal coordination. Their attention was directed externally (browsing information), and tasks like internal memory search and idea linking decreased.
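To make the band boundaries above concrete, here is a minimal sketch that splits a signal into the delta, theta, alpha, and beta ranges quoted in this column and sums the spectral power in each. It is only an illustration: the study analyzed connectivity between electrodes rather than raw band power, and the synthetic signal, the 64 Hz sampling rate, and the function name `band_powers` are assumptions made for the example.

```python
import cmath
import math

# Band edges in Hz as quoted in this column; intervals are half-open
# [low, high), so 8 Hz counts as alpha. The 12-13 Hz gap is left unassigned.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 12.0),
    "beta": (13.0, 30.0),
}


def band_powers(samples, fs):
    """Sum squared DFT magnitudes inside each band (naive O(n^2) DFT)."""
    n = len(samples)
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):
        freq = k * fs / n
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(samples))
        for name, (low, high) in BANDS.items():
            if low <= freq < high:
                powers[name] += abs(coeff) ** 2
    return powers


# Synthetic two-second signal: a strong 10 Hz rhythm plus a weaker 20 Hz one.
fs = 64
samples = [math.sin(2 * math.pi * 10 * t / fs)
           + 0.3 * math.sin(2 * math.pi * 20 * t / fs)
           for t in range(2 * fs)]
powers = band_powers(samples, fs)
dominant = max(powers, key=powers.get)
```

With this made-up signal, `dominant` comes out as the alpha band, since the 10 Hz component carries most of the energy. Real EEG work would use a proper FFT library and windowing instead of this naive transform.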

Differences in Information Flow

EEG dDTF analysis also provided interesting insights into the direction of information flow within the brain.

  • Brain-Only Group: During essay writing, there was more "bottom-up" flow (from lower to higher) from semantic and sensory regions like the temporal and parietal lobes to the frontal executive system. This suggests a state where the brain internally generates content, and the frontal lobe integrates and makes decisions for expression.
  • LLM Group: Given external input from the LLM, they likely experienced more "top-down" connections (from higher to lower), such as high beta waves from the frontal to occipital regions. Their frontal cortex likely played a role in integrating the tool's contributions (external sources) and reflecting them in the overall writing.

These results suggest a potential trade-off: while using LLMs makes tasks feel "easier" for the brain and reduces cognitive load, it may weaken the internal connections necessary for creativity and deep thinking.

Serious Implications of "Cognitive Debt": Impact on Memory and Creativity

Decreased Essay Recall Ability

The most consistently noticeable behavioral difference was in the ability to recall what they had written in their essays.

  • LLM Group: In Session 1, 83% of participants (15 out of 18) struggled to recall their essays, and none could do so correctly. This problem persisted in subsequent sessions, with 6 out of 18 still unable to recall correctly in Session 3. This suggests that LLM use may bypass deep memory encoding processes, leading participants to simply read, select, and transcribe suggestions generated by the tool without integrating them into their own memory networks.
  • Search Engine and Brain-Only Groups: Did not show a decrease in recall ability. By Session 2, both groups achieved near-perfect recall, and by Session 3, 100% of participants reported being able to recall their essays. This is associated with the strong parietal-frontal and temporal-frontal connectivity observed in the brain-only group, suggesting strong integration of content into memory systems.

Changes in Sense of Ownership and Satisfaction

The "sense of ownership" over the essays also differed significantly depending on tool usage.

  • LLM Group: In Session 1, half felt "complete ownership," but three reported "no ownership," and the rest reported partial ownership. While "no ownership" responses disappeared in later sessions, more participants reported partial ownership. Some participants felt ChatGPT's output was "robotic" and felt the need to add a personal tone. Some even expressed ethical discomfort, feeling they were "cheating" by using ChatGPT.
  • Search Engine and Brain-Only Groups: Most participants reported "complete ownership." Brain-only group participants highly valued the independence and credibility of writing on their own, stating they could "focus on their own thoughts" and "share unique experiences."

The Shocking Results of Session 4

The most interesting and significant discovery in this study came in Session 4.

In this session, participants were reassigned to the opposite group from their previous three sessions, without prior notification.

  • LLM to Brain-Only Group:
    • When participants who previously relied on AI wrote essays without tools, their brain activity did not reach the peaks of Sessions 2 or 3, remaining at an intermediate level similar to Session 1. This suggests that the cognitive load previously reduced by LLM-provided suggestions and content weakened the neural activity needed for content planning and generation when writing without AI.
    • Participants in this group showed a significant decrease in their ability to recall their essays, with 78% unable to recall and only 11% recalling correctly. This suggests that reliance on AI impaired the strong fronto-parietal synchronization needed for deep semantic encoding and source memory retrieval.
    • Furthermore, participants in this group tended to focus repeatedly on limited ideas, as shown by N-gram analysis. This suggests they may not have deeply engaged with or critically examined the content provided by the LLM. The study points out that this pattern reflects the accumulation of "cognitive debt." Cognitive debt is a state where short-term deferral of mental effort leads to long-term costs such as reduced critical thinking, increased vulnerability to manipulation, and decreased creativity.
    • As someone who struggles with studying for the AWS Certified Solutions Architect exam, I understand the convenience of AI tools firsthand. However, this cognitive cost might be the flip side of that convenience.
  • Brain-Only to LLM Group:
    • When participants who previously wrote without tools were allowed to use an LLM, their brain activity showed a significant increase across all brainwave bands. This suggests that re-engaging with the writing task with AI support facilitated higher levels of cognitive integration, memory reactivation, and top-down control. The study suggests that strategically introducing AI tools after initial self-directed effort could enhance engagement and neural integration, and may be a neurocognitively better sequence than using AI tools from the start.

Word Choice and AI's "Habits": What NLP Analysis Reveals

Essay "Homogenization"

NLP analysis of essay content revealed noticeable linguistic characteristics for each group.

  • Brain-Only Group: Showed strong diversity in essay writing, with each participant demonstrating unique perspectives and word choices.
  • LLM Group: In contrast, produced statistically homogeneous essays, with significantly less variability than the other groups. This suggests that LLMs are biased toward certain expressions and structures, so users who rely on them end up with similar essays.
  • Search Engine Group: Fell between the other two groups in its use of named entities, as identified by Named Entity Recognition (NER). The LLM group used the most (171 in total), with names of people and works being particularly frequent; the search engine group used fewer (104 in total); and the brain-only group used the fewest (81 in total).
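One way to picture what "homogeneous" means here is to compare essays pairwise and average the similarity within a group. The sketch below uses bag-of-words cosine similarity as a stand-in metric; this column does not describe the study's actual NLP pipeline, so the metric, the function names, and the one-line sample "essays" are all invented for illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations


def bag_of_words(text):
    """Lowercase word counts for one essay."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def mean_pairwise_similarity(essays):
    """Average similarity over every pair of essays in a group."""
    if len(essays) < 2:
        raise ValueError("need at least two essays to compare")
    bags = [bag_of_words(e) for e in essays]
    pairs = list(combinations(bags, 2))
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


# Made-up one-line "essays", purely for illustration.
similar_group = [
    "success comes from choosing a career you love",
    "success comes from choosing a career that fits you",
    "real success comes from choosing a career you love",
]
varied_group = [
    "my grandmother taught me that happiness is shared bread",
    "success comes from choosing a career you love",
    "helping strangers on the night bus changed how I see money",
]
similar_score = mean_pairwise_similarity(similar_group)
varied_score = mean_pairwise_similarity(varied_group)
```

A group whose essays reuse the same wording scores close to 1, while a group with distinct vocabulary scores close to 0. A lower spread of scores within a group is one plausible reading of "less variability."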

Cognitive Bias from N-Grams

N-gram analysis (sequences of consecutive words) also revealed interesting biases.

  • LLM Group: Frequently used N-grams related to "career" (e.g., "choose career," "person success"), tending towards general success stories and objective descriptions. Data from the Google Ngram Viewer also pointed to a tendency to use "third-person perspective" expressions (e.g., "he," "she") common in LLM training data.
  • Search Engine Group: For certain topics, N-grams potentially influenced by Google Search's ad optimization stood out. For example, in the topic of "PHILANTHROPY," N-grams like "homeless person" were frequently used. This suggests that because search engines display promoted information higher for specific keywords, users are more likely to be influenced by that information.
  • Brain-Only Group: Characterized by more introspective and value-based expressions such as "true happi" (true happiness) and "benefit other" (benefit others).

In Session 4's N-gram analysis, participants in the LLM to Brain-Only group tended to reuse N-grams that had appeared frequently when they were using the LLM (e.g., "before speaking"). This suggests that prior AI use can leave a bias in a user's vocabulary and thought patterns.
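The N-gram comparison described above can be sketched in a few lines: count word pairs in an earlier text, then check which of the most frequent ones reappear in a later text. This is a rough approximation under stated assumptions. The stopword list, the sample sentences, and the helper names `bigrams` and `reused` are hypothetical, no stemming is applied (so it would not produce forms like "true happi"), and word pairs are allowed to span sentence boundaries.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption; real lists are longer).
STOPWORDS = {"a", "an", "the", "to", "of", "and", "is", "in",
             "that", "for", "you", "your", "i", "my"}


def bigrams(text):
    """Count consecutive word pairs after lowercasing and stopword removal."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower())
              if t not in STOPWORDS]
    return Counter(zip(tokens, tokens[1:]))


def reused(earlier, later, top=5):
    """Return the most frequent bigrams of `earlier` that recur in `later`."""
    frequent = [gram for gram, _ in bigrams(earlier).most_common(top)]
    later_grams = bigrams(later)
    return [gram for gram in frequent if gram in later_grams]


# Invented sample texts standing in for an AI-assisted and an unaided essay.
earlier = "Think before speaking. Pausing before speaking helps you choose a career."
later = "I learned to breathe before speaking in meetings."
carried_over = reused(earlier, later)
```

For these sample texts, `carried_over` contains only the pair ("before", "speaking"), mirroring the kind of vocabulary carry-over the study reports.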

The Gap Between Human Teacher and AI Evaluation

In essay evaluation, there was an interesting disconnect between human teachers and AI assessments.

  • Human Teacher Evaluation:
    • Teachers felt that AI-generated essays "lacked soul" and lacked personal nuances or clear arguments.
    • They gave lower scores for originality and content but highly rated language, structure, and accuracy.
    • They recognized a "distinct writing style" and "homogeneous structure" in the LLM group's essays, regardless of topic, and could even identify the writing style of specific participants.
  • AI Evaluation:
    • AI evaluations tended to rate most essays highly, averaging "4 points (good)."
    • There were significant disagreements between human teachers and AI evaluations regarding originality and content quality. AI evaluations sometimes rated essays that human teachers scored 1 or 2 points as 4 points or higher.
    • Surprisingly, AI evaluations could not identify the unique writing style of each participant, even after multi-stage tuning.

These results indicate that while AI excels at evaluating based on objective criteria, it has limitations in fully capturing human creativity and individuality.

Human teachers prioritize "depth of thought" and "personal perspective" behind the writing, highlighting the reality that these aspects are often hard to discern in AI-generated text.
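The gap between human and AI scores described above can be quantified with a simple comparison. The sketch below computes the average score gap and flags essays that a human scored low but the AI scored high. The scores are invented for illustration, and the 1-to-5 scale, the thresholds, and the function name `disagreement` are assumptions rather than details taken from the study.

```python
def disagreement(human, ai, low=2, high=4):
    """Return (mean absolute gap, indices where human <= low and AI >= high)."""
    if len(human) != len(ai) or not human:
        raise ValueError("score lists must be non-empty and equally long")
    gaps = [abs(h - a) for h, a in zip(human, ai)]
    flagged = [i for i, (h, a) in enumerate(zip(human, ai))
               if h <= low and a >= high]
    return sum(gaps) / len(gaps), flagged


# Invented scores for five essays on an assumed 1-to-5 scale.
human_scores = [2, 4, 1, 3, 5]
ai_scores = [4, 4, 4, 4, 5]
mean_gap, flagged = disagreement(human_scores, ai_scores)
```

In this made-up data, the first and third essays are flagged: the human grader gave them 2 and 1, while the AI grader gave both a 4.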

Our Future and AI: Towards Better Coexistence

The Challenge of Cognitive "Training"

What this research reveals is how complex the effects of AI tool usage on our brain's cognitive processes are.

In particular, the possibility that prolonged LLM use could lead to "cognitive debt," where the brain slackens on "higher-order thinking processes" it should ideally perform, serves as a warning.

The study showed that repeatedly writing essays without tools strengthens brain networks related to planning, language, and attentional control, mobilizing a wide range of brain regions to improve writing ability.

However, with AI assistance, this "brain training" might not function adequately. If AI takes over high-level planning like organizing ideas, the brain won't need to allocate resources to that function, potentially leading to those circuits not being sufficiently strengthened.

Balancing AI Use

This research suggests the need to balance convenience with long-term skill development when using AI in education and learning settings.

AI is a highly effective tool for routine tasks and information organization, but creative idea generation, critical thinking, and the ability to express oneself in one's own words remain crucial cognitive processes that humans should actively engage in.

We need to see AI not as a "silver bullet" but as a "tool that extends our capabilities."

This also applies to AI as a programming tool. While AI can generate code snippets, it's merely a support tool for development, and I believe humans must always review and test the generated code.

As the study suggests, it might be crucial to refrain from using AI tools in the early stages of learning to encourage the brain's "full neural activity" in integrating information and thinking for itself.

Then, once a certain level of skill is acquired, strategically introducing AI for specific tasks (e.g., proofreading or diversifying ideas) could reduce extraneous cognitive load and potentially improve learning efficiency.

Protecting Our "Thinking Ability"

While this study's results are in the specific context of essay writing, it provides a profound opportunity to consider the impact of AI on our learning, work, and overall thought processes.

While mastering convenient AI tools is essential in modern society, it's equally important to make conscious efforts to prevent the decline of our inherent "thinking ability" and "creativity."

AI is not a "magic wand" for our thoughts but a "tool" to unlock our creativity. How we use this tool and how we engage with our brains is entirely in our hands. How do you want to interact with AI?

Please consider this question.
