Anyone who has tested a chatbot for creative work has probably run into the same issue: you ask the AI to “tell a joke about coffee,” and it delivers the same line about the coffee getting “mugged.” Ask ten more times and the answer rarely changes.
This isn’t because the AI lacks imagination. In fact, language models are trained on enormous collections of text and are capable of producing thousands of potential jokes. The real reason for the repetition mirrors a trend uncovered in recent research on gendered language in AI. Teleki et al. (2025) found that AI systems tend to default to masculine-typical language not because they are biased “on purpose,” but because those patterns are the most frequent ones in the training data.
In other words, AI defaults to what it has seen the most, whether the task involves humor, creativity, or everyday language patterns.
Understanding this defaulting behavior is key to unlocking richer and more diverse AI output.
The “Safe Response” Problem
To understand why AI creativity collapses into repetition, imagine the model’s knowledge as a huge orchard filled with different fruits. Each fruit represents a different idea, joke, story, or thought the model could offer.
But because the AI is trained to please human evaluators—people who often reward familiar, predictable answers—the model learns to reach for the same fruit every time. This is how large language models develop safe-answer bias.
This is directly related to what researchers call mode collapse in language models: the model consistently picks the most likely or most reinforced answer, even when many alternatives exist.
The findings from the gender-language study echo the same pattern:
- The model leans toward masculine-coded words because those words appear more frequently in podcasts and text, not because the model “prefers” them.
Frequency shapes defaults.
And defaults suppress creativity.
The Breakthrough Prompt: “Generate 5 Jokes With Their Probabilities”
The discovery that reverses this problem is surprisingly simple. Instead of asking for one answer, ask the AI:
“Generate 5 jokes about coffee with their probabilities.”
These eight words dramatically alter the model’s behavior.
The AI suddenly begins producing a range of jokes:
- anxious espresso beans
- rebellious French presses
- moody baristas
- a baby cow named “de-calf-inated”
- and more
This transformation happens because the prompt forces the model to look beyond its single “safe” answer and explore its broader internal distribution of possibilities.
This technique—verbalized sampling—is emerging as one of the most powerful ways to increase AI output diversity and reduce repetitive responses.
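In practice, verbalized sampling is just a prompt template. A minimal sketch (the function name and exact wording here are illustrative, not from any library):

```python
def verbalized_sampling_prompt(topic: str, n: int = 5) -> str:
    """Build a prompt that asks for several candidates plus their
    estimated probabilities, nudging the model off its single default."""
    return (
        f"Generate {n} jokes about {topic} with their probabilities. "
        "List each joke on its own line, followed by its probability "
        "in the form (p=0.xx)."
    )

prompt = verbalized_sampling_prompt("coffee")
print(prompt)
```

The same template works for stories, headlines, or brainstorming lists: swap "jokes" for whatever you want the model to diversify.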
Why Verbalized Sampling Works (Explained Simply)
Verbalized sampling works because it changes how the model searches for answers.
Asking for one answer triggers safe mode
With a single-output request, the model chooses the most common answer—the one that has been rewarded during training.
Asking for several answers forces exploration
The model must consider alternatives, even ones it would normally ignore.
Asking for probabilities reveals the model’s internal landscape
Requesting probabilities prompts the AI to examine its full distribution of possibilities, helping it remember the wider creative space learned during pretraining.
In practical terms, this simple phrasing encourages the model to revisit the parts of its knowledge that alignment layers usually keep hidden.
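One practical way to use the verbalized distribution is to parse the candidates and deliberately pick something other than the top answer. A sketch, assuming you asked the model to append probabilities in a `(p=0.xx)` format (the response text below is invented):

```python
import re

# Example model output; the jokes and probabilities are invented.
response = """\
Why did the coffee file a police report? It got mugged. (p=0.55)
My espresso machine has separation anxiety. (p=0.20)
The French press started a revolution. (p=0.15)
A baby cow with no caffeine: de-calf-inated. (p=0.10)"""

def parse_candidates(text):
    """Extract (candidate, probability) pairs from lines ending in (p=0.xx)."""
    pattern = re.compile(r"^(.*?)\s*\(p=([0-9.]+)\)\s*$")
    pairs = []
    for line in text.splitlines():
        m = pattern.match(line)
        if m:
            pairs.append((m.group(1), float(m.group(2))))
    return pairs

candidates = parse_candidates(response)
# Skip the "safe" top answer and take the most likely alternative instead.
runner_up = sorted(candidates, key=lambda c: c[1], reverse=True)[1]
print(runner_up)
```

Real model output will not always follow the requested format exactly, so production code should handle lines that fail to parse.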
Connecting This to Broader AI Research
The behavior we see in creative tasks parallels the results from the gender-language study:
- AI defaults to the most frequent patterns, whether in humor or vocabulary.
- These patterns create stable but narrow outputs.
- Without specific prompting, the model hides much of what it can produce.
This doesn’t mean the model lacks creativity or fairness. Instead, it means the model needs the right kind of instruction to reveal more of what it knows. Verbalized sampling provides that.
Understanding this strengthens our grasp of how large language models function and why certain AI prompt engineering techniques work better than others.
Practical Implications for AI Users and Developers
This technique has wide-reaching benefits across creative and professional tasks:
• Enhanced AI creativity
Poems, stories, essays, and jokes become more varied and interesting.
• Improved brainstorming
The model offers multiple directions—useful for planning, ideation, and strategy.
• More human-sounding dialogue
Conversations gain nuance, unpredictability, and natural variation.
• Better synthetic data
Training examples become richer and more representative.
• More transparent AI behavior
Seeing probabilities helps users understand the model’s decision-making.
Developers can even embed verbalized sampling directly into their system prompts or interfaces to ensure consistently diverse AI output.
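For instance, with a chat-style API the instruction can live in the system message so every turn benefits. The message layout below follows the common OpenAI-style role/content schema; the actual client call is omitted, and the wording is an assumption, not a prescribed prompt:

```python
def diverse_chat_messages(user_request: str, n: int = 5) -> list[dict]:
    """Wrap a user request in a system prompt that applies
    verbalized sampling to every creative turn."""
    system = (
        f"For every creative request, generate {n} distinct candidate "
        "responses with their estimated probabilities, then present "
        "all of them rather than a single safe answer."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_request},
    ]

messages = diverse_chat_messages("Tell a joke about coffee.")
```

The resulting list can be passed wherever the chat API expects its `messages` argument.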
Creativity Was Never Missing—It Needed Permission
What these eight words reveal is that AI has far more creative potential than it typically shows. Its range wasn’t lost or damaged; it was simply narrowed by training processes designed to favor safe, predictable answers.
By asking for multiple responses and their probabilities, users give the AI a reason to revisit its deeper, more varied set of possibilities. This prompt doesn’t hack the system or force it to behave differently—it merely invites it to explore the orchard of ideas it learned long before fine-tuning narrowed its behavior.
The potential was always there.
The gate was closed, not locked.
And with a small shift in how we ask, the path opens again.
