Nine cross-cuisine praise patterns
from 142,816 reviews, zero labels.
142,816 Google Maps reviews. 3,917 restaurants. 270 cuisines. The same pipeline we ran on weather stations and jet engines found nine cross-cuisine praise patterns from customer language alone.
Disclosure: Benchmark study on public Google Maps data. No customer data was used. Reviewed by Drew Wasem, Founder, Coherany. Methodology available on request.
THE METHODOLOGY
Cluster the language, not the label.
Multi-location operators cannot see what great looks like across stores. Single-location feedback tools summarize one location at a time. Traditional sentiment analysis requires labeled training data and bakes in an analyst's assumptions about what matters. Neither tells a regional manager "here is what your best customers sound like when you are doing it right" in the customer's own vocabulary.
We took a stratified pull of public Google Maps restaurant reviews: up to three reviews per restaurant per star rating, so every cuisine and every star level had a voice in the clustering. Real customer language, real star ratings preserved, no sentiment labels attached. The resulting distribution was more balanced than a typical Google scrape, which let the pipeline see real complaint language at scale instead of just the happy majority.
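A minimal sketch of that kind of stratified pull, assuming each review is a dict with `restaurant_id`, `stars`, and `text` keys. Those field names, the `stratified_pull` function, and the sample data are hypothetical illustrations, not the actual pipeline code.

```python
import random
from collections import defaultdict

def stratified_pull(reviews, per_cell=3, seed=0):
    """Keep at most `per_cell` reviews per (restaurant, star rating) pair."""
    rng = random.Random(seed)  # fixed seed so the pull is repeatable
    cells = defaultdict(list)
    for review in reviews:
        cells[(review["restaurant_id"], review["stars"])].append(review)

    sample = []
    for key in sorted(cells):
        group = cells[key]
        if len(group) > per_cell:
            # Over-represented cell: draw a random subset.
            group = rng.sample(group, per_cell)
        sample.extend(group)
    return sample

# Made-up input: one restaurant with many 5-star reviews and one 1-star review.
reviews = (
    [{"restaurant_id": "r1", "stars": 5, "text": f"great {i}"} for i in range(10)]
    + [{"restaurant_id": "r1", "stars": 1, "text": "cold food"}]
    + [{"restaurant_id": "r2", "stars": 3, "text": f"fine {i}"} for i in range(4)]
)
sample = stratified_pull(reviews)
print(len(sample))
```

The cap trims the ten 5-star reviews to three while the lone 1-star review survives intact, which is how a rare complaint keeps its voice next to the happy majority.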
Then we ran the same pipeline we ran on LA weather stations and NASA turbofan engines. Same pattern-discovery code path. The signature extraction step changes per domain (one text field for reviews, five weather measurements for wildfire, twenty-one sensor groups for engines), but the clustering and the insight workflow do not.
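One way to picture that split between a per-domain signature step and a shared code path is sketched below. Everything here is an assumption made for illustration: the function names, the record fields, the toy text features, and especially `toy_cluster`, which is a placeholder and not the clustering method used in the study.

```python
from typing import Callable, Iterable, List, Sequence, TypeVar

R = TypeVar("R")
Signature = Sequence[float]

def run_pipeline(
    records: Iterable[R],
    extract_signature: Callable[[R], Signature],
    cluster: Callable[[List[Signature]], List[int]],
) -> List[int]:
    """Domain-specific extraction first, then the shared clustering step."""
    signatures = [extract_signature(r) for r in records]
    return cluster(signatures)

def review_signature(review: dict) -> Signature:
    # Hypothetical stand-in for one text field: word count and "!" count.
    text = review["text"]
    return (float(len(text.split())), float(text.count("!")))

def weather_signature(reading: dict) -> Signature:
    # Hypothetical five measurements; the field names are made up.
    keys = ("temp", "humidity", "wind", "pressure", "rain")
    return tuple(float(reading[k]) for k in keys)

def toy_cluster(signatures: List[Signature]) -> List[int]:
    # Placeholder only: label each signature by its largest component.
    return [max(range(len(s)), key=lambda i: s[i]) for s in signatures]

review_labels = run_pipeline(
    [{"text": "Great food!!!"},
     {"text": "The service was slow and the food was cold"}],
    review_signature,
    toy_cluster,
)
weather_labels = run_pipeline(
    [{"temp": 30, "humidity": 10, "wind": 5, "pressure": 1013, "rain": 0}],
    weather_signature,
    toy_cluster,
)
```

Only the `extract_signature` argument changes between the two calls; `run_pipeline` and the clustering callable are untouched, which is the property the paragraph above describes.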
THE DATA
Every cuisine, every star level.
THE FINDINGS
What excellence sounds like across 270 cuisines
Three results from the run, each one actionable for a multi-location operator. None of them required a sentiment model, a labeled training set, or a single human rule.
Nine cross-cuisine praise patterns emerged without sentiment training
The biggest praise pattern spans 5,979 reviews across 609 restaurants in 64 cuisine categories: American, bar and grill, barbecue, steak house, Mediterranean, Greek, French, Lebanese. Different food. Same way customers describe the experience. Eight more specialty praise patterns sit alongside it: the French bistro axis, the healthy lifestyle axis, the cocktail program cluster, the dessert destination cluster. Every one of them crosses cuisine boundaries. Every one of them formed from language alone.
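Footprint figures like "reviews, restaurants, cuisine categories" can be tallied per pattern once every review carries a pattern assignment. The sketch below assumes a hypothetical list of `(pattern_id, restaurant_id, cuisine)` tuples; the identifiers and data are invented for illustration.

```python
from collections import defaultdict

def pattern_footprints(assignments):
    """Return {pattern_id: (review_count, restaurant_count, cuisine_count)}."""
    stats = defaultdict(
        lambda: {"reviews": 0, "restaurants": set(), "cuisines": set()}
    )
    for pattern_id, restaurant_id, cuisine in assignments:
        entry = stats[pattern_id]
        entry["reviews"] += 1
        entry["restaurants"].add(restaurant_id)  # sets deduplicate repeats
        entry["cuisines"].add(cuisine)
    return {
        pid: (e["reviews"], len(e["restaurants"]), len(e["cuisines"]))
        for pid, e in stats.items()
    }

# Made-up assignments: pattern "p1" crosses three cuisines, "p2" stays in one.
assignments = [
    ("p1", "r1", "Greek"),
    ("p1", "r1", "Greek"),
    ("p1", "r2", "French"),
    ("p1", "r3", "Barbecue"),
    ("p2", "r4", "Thai"),
]
footprints = pattern_footprints(assignments)
cross_cuisine = [pid for pid, (_, _, n_cuisines) in footprints.items() if n_cuisines > 1]
```

A pattern whose cuisine count is greater than one is cross-cuisine in the sense used here; a count of one would mark a plain topic cluster.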
Why previous runs missed this. Why this one found it.
On 6,000 reviews the pipeline only found topic clusters: Italian, Thai, burger. On 53,000 it found the same thing, plus one big undifferentiated "positive" blob. At 142,816 the sentiment axis separated from the topic axis on its own, because every cuisine finally had enough praise voices AND enough complaint voices for the shape to form. Praise patterns are real. They just need scale. In these three runs, the topic vocabulary dominated at 53,000 reviews and below; by 142,816 the sentiment vocabulary had emerged alongside it.
The noise bucket is where service failures live
81,899 reviews did not fit any of the 221 patterns. That is about 57% of the corpus, and it is not leftover. It is where specific complaint language lives, because each bad experience is unique to one moment at one store with one employee, and unique things do not cluster. Noise-bucket reviews were 1.31× more likely to be 1–2 stars than clustered reviews. For a multi-location operator, the noise bucket is a ranked reading list: the most specific, most vivid complaints, surfaced automatically, ready to route to the right store manager before the star average starts to slide.
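The "more likely to be 1–2 stars" figure is a ratio of two rates: the share of low-star reviews inside the noise bucket divided by the share among clustered reviews. A minimal sketch, assuming noise is marked with the label `-1` (a common convention, not confirmed by the study) and using made-up labels and star ratings:

```python
def low_star_rate(stars):
    """Share of reviews rated 1 or 2 stars."""
    return sum(1 for s in stars if s <= 2) / len(stars)

def noise_premium(labels, stars, noise_label=-1):
    """Low-star rate in the noise bucket divided by the rate among clustered reviews."""
    noise = [s for lab, s in zip(labels, stars) if lab == noise_label]
    clustered = [s for lab, s in zip(labels, stars) if lab != noise_label]
    # Undefined (ZeroDivisionError) if a bucket is empty or no clustered
    # review is low-star.
    return low_star_rate(noise) / low_star_rate(clustered)

# Made-up example: 4 noise reviews (2 low-star), 6 clustered reviews (2 low-star).
labels = [-1, -1, -1, -1, 0, 0, 0, 0, 1, 1]
stars = [1, 2, 5, 4, 1, 5, 5, 4, 5, 2]
premium = noise_premium(labels, stars)

# Noise share of the corpus, from the figures reported above.
noise_share = 81_899 / 142_816
print(round(premium, 2), round(noise_share, 3))
```

In the toy data the premium is 0.5 divided by one third, or 1.5; the benchmark's reported value is 1.31.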
THE RESULT
A ranked inspection queue and a screen for "doing it right."
The most actionable output from the run was the "almost-there" cohort: five restaurants rated between 3.50 and 3.63 with multiple reviews landing in the universal complaint pattern and zero reviews landing in any of the nine praise patterns. Not failing. Not yet recognizable as excellence on any axis. Close enough that a focused service program could push them into the casual-excellence cluster within two quarters.
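The cohort definition above is a three-part filter, sketched here under stated assumptions: "multiple" is read as two or more complaint-pattern reviews, the queue is ordered by complaint count, and the `RestaurantSummary` fields and sample rows are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RestaurantSummary:
    name: str
    avg_rating: float
    complaint_hits: int  # reviews landing in the universal complaint pattern
    praise_hits: int     # reviews landing in any of the nine praise patterns

def almost_there(restaurants, lo=3.50, hi=3.63, min_complaints=2):
    """Rating in [lo, hi], multiple complaint hits, zero praise hits."""
    cohort = [
        r for r in restaurants
        if lo <= r.avg_rating <= hi
        and r.complaint_hits >= min_complaints
        and r.praise_hits == 0
    ]
    # Assumed ordering: most complaint-pattern reviews first.
    return sorted(cohort, key=lambda r: r.complaint_hits, reverse=True)

# Made-up rows for illustration.
restaurants = [
    RestaurantSummary("A", 3.55, 4, 0),  # qualifies
    RestaurantSummary("B", 3.60, 2, 0),  # qualifies
    RestaurantSummary("C", 3.58, 3, 1),  # excluded: already has a praise hit
    RestaurantSummary("D", 4.20, 5, 0),  # excluded: rating above the band
    RestaurantSummary("E", 3.52, 1, 0),  # excluded: only one complaint hit
]
queue = [r.name for r in almost_there(restaurants)]
```

The band edges and the complaint threshold are parameters, so an operator could widen or narrow the queue without changing the logic.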
The atlas doesn't tell the operator what to do. It tells them which five restaurants have the most to gain from doing something, and it points at the exact complaint language showing up in their reviews. A 30-day rating slide becomes a 30-minute response loop.
“Same pipeline we ran on weather stations and jet engines. Strip the labels, cluster the language, read the shape of what comes out. The data changed. The code didn't.”
Drew Wasem, Founder, Coherany
HONEST LIMITS
What this does and doesn't prove.
The review pull was stratified to balance star levels. A real Google Maps distribution skews heavily positive, which narrows the noise premium vs an unstratified pull. There is also no ground-truth list of "these restaurants are objectively excellent" to validate the clusters against. The patterns look right to domain intuition, but domain intuition is not proof. What the benchmark does prove: the same primitive that worked on weather and engine telemetry works on customer language, and at sufficient scale the sentiment axis separates from the topic axis on its own. Real validation happens when an operator runs it on their own reviews and tells us which clusters they agree with.
This isn't really about restaurants.
Product reviews, app store feedback, support tickets, NPS comments, patient-experience surveys. Any domain with large-scale customer language has the same structure waiting inside it. Request the methodology or run it on your own corpus.