License No. 4978/GP-TTĐT issued by the Hanoi Department of Information and Communications on October 14, 2019 / Amended and supplemented ICP License No. 2107/GP-TTĐT issued by the Hanoi Department of Information and Communications on July 13, 2022.
© 2026 Index.vn
In 2024, a simple question circulated widely on social media: “How many letters r are in the word strawberry?” The correct answer is 3. ChatGPT consistently replied 2, and did so with confidence—sparking mockery in the tech community and raising a broader question about how large language models process text.
The issue is tied to how large language models read text. Unlike humans, who can process characters one by one, these models break text into tokens: chunks that are often larger than a single character, such as word fragments, prefixes, suffixes, or common syllables.
With GPT-4, the word “strawberry” is not treated as 10 individual letters. Instead, it is represented as 3 tokens: str, aw, and berry. Only str and berry contain the letter “r,” so the model’s token-based representation leads it to count 2.
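The mismatch can be shown with a short sketch. The split into str, aw, and berry is taken from the article, not from running a real tokenizer; the point is that a tally over whole tokens (one "r" per token that contains any "r") reproduces the undercount, while a character-level scan finds all 3.

```python
# Illustrative sketch only: the token split below comes from the article,
# not from an actual tokenizer.
tokens = ["str", "aw", "berry"]

# A character-level scan sees every "r" individually.
char_count = "".join(tokens).count("r")

# A tally over whole tokens, crediting one "r" per token that contains
# the letter at all, mirrors the undercount described in the article.
token_count = sum(1 for t in tokens if "r" in t)

print(char_count, token_count)  # 3 2
```

The gap between the two numbers is exactly the model's error: "berry" contains two r's, but a representation that only knows which tokens contain an "r" cannot see that.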
This behavior is described as a consequence of model architecture, which is built to understand meaning rather than to parse spelling at the character level, not as a simple bug that can be patched.
The story gained additional context in late 2023 and 2024. In November 2023, Sam Altman was fired from OpenAI, and rumors followed about a secret project named Q* that allegedly achieved a breakthrough in mathematical reasoning.
Eight months later, in July 2024, Reuters reported that OpenAI was developing a model with the internal codename Strawberry. The report indicated that the codenames Q* and Strawberry referred to the same effort.
OpenAI later described the “Strawberry” name as an internal declaration: the team was building a final model intended to do what previous models could not—count the number of “r” letters in “strawberry.” When o1 was released in September 2024, OpenAI included the strawberry question as a public challenge in the interface for its older product.
According to the account, the improvement was not achieved by changing tokenization. Instead, the model was trained to “think aloud” before answering, using chain-of-thought reasoning. Rather than responding immediately, it was described as counting letters step by step, then checking again—producing the correct answer of 3.
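The "count step by step, then check again" procedure the article describes can be sketched in plain code. This is a hypothetical illustration of the idea, not OpenAI's actual mechanism: each character is examined in turn with a running count spoken aloud, and the result is verified against an independent recount.

```python
# Hypothetical illustration of step-by-step counting; not OpenAI's internals.
def count_letter_stepwise(word: str, letter: str) -> int:
    """Count occurrences of `letter` by 'thinking aloud' one character at a time."""
    count = 0
    for i, ch in enumerate(word, start=1):
        if ch == letter:
            count += 1
        print(f"step {i}: '{ch}' -> running count {count}")
    # "Check again": verify against an independent recount.
    assert count == word.count(letter)
    return count

print(count_letter_stepwise("strawberry", "r"))  # 3
```

The trade-off the article goes on to describe falls out of this structure: walking through every character and double-checking is more reliable than answering in one shot, but it is also strictly slower.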
However, this approach has a performance trade-off. The article states that it takes 20–30 seconds for o1 to answer, compared with under 3 seconds for GPT-4o. It also notes that the API cost for o1 is “many times higher” than standard models. OpenAI characterizes o1 as a model for “complex tasks,” not a replacement for GPT-4o.
The pattern returned with newer releases. In December 2025, OpenAI released GPT-5.2. When asked the same question, the newest model again answered 2.
The article attributes this to tokenization and the model’s trade-offs. It says GPT-5.x uses a newer tokenization scheme named o200k_harmony, but “strawberry” still breaks into str + aw + berry. It also states that the chain-of-thought reasoning used by o1 was not carried over into GPT-5.x, because regular users typically do not want to wait 20–30 seconds for every answer.
As of April 2026, the article says GPT-5.3 and GPT-5.4—the latest models—partly address the issue. It also claims that other systems, including Claude, Gemini, Grok, and Perplexity, “rarely” answer incorrectly, attributing this to different tokenization approaches.
It adds that o1-preview and o1-mini have been deprecated since April 2025, but the "Strawberry" naming joke has continued.
The article concludes that the most notable aspect may be the choice of name itself: a reminder of a specific limitation the team aimed to overcome. In an industry where new products are often marketed with broad, promotional language, “Strawberry” is presented as an unusually direct and candid label for a technical challenge.
By Thế Duyệt