
About Tennis Scorigami
Discovering the undiscovered in professional tennis
The Project
Exploring some data, some math, and the beauty of tennis
What is Scorigami?
Scorigami represents the occurrence of a final score that has never happened before in a sport's history.
It's also unironically what our dear friend Sebastian would bring up anytime there was a SNIFF of a scoreline that hadn't occurred in the NFL. So it's something Henry and I heard a lot about while sitting on the couch watching Bengals games.
Jon Bois originated the concept for American football, and we've done our best to adapt it to tennis, tracking every unique match score combination across professional tournaments.
With tennis's unique scoring system, it gets... dare we say... a bit more interesting?! (Please don't come for me, football, soccer, basketball, <insert other sport> superfans.) There are 735 possible final scores in best-of-3 matches and over 108,000 in best-of-5 matches. Sure, NFL scores are also technically unbounded, but if you include tiebreak outcomes as unique identifiers, then so are tennis scores.
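For the curious, here's one way to arrive at those two numbers. This is a minimal sketch, assuming every set is recorded only by its game score and that whoever wins a set does so by one of seven scores (6-0, 6-1, 6-2, 6-3, 6-4, 7-5, or 7-6), with no advantage sets and no tiebreak points counted. The name `count_final_scores` is made up for illustration and is not taken from the project's code.

```python
from math import comb

# Assumption: 6-0, 6-1, 6-2, 6-3, 6-4, 7-5, 7-6 are the only set scores.
SET_SCORES = 7


def count_final_scores(best_of: int) -> int:
    """Count distinct final scorelines for a best-of-N match."""
    sets_to_win = best_of // 2 + 1
    total = 0
    for sets_lost in range(sets_to_win):
        # The match winner always takes the last set, so the sets they
        # lost can fall anywhere among the earlier sets.
        orderings = comb(sets_to_win - 1 + sets_lost, sets_lost)
        sets_played = sets_to_win + sets_lost
        total += orderings * SET_SCORES ** sets_played
    return total


if __name__ == "__main__":
    print(count_final_scores(3))
    print(count_final_scores(5))
```

Under those assumptions, `count_final_scores(3)` gives 735 and `count_final_scores(5)` gives 108,388, which lines up with the figures above.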
Data Collection & Analysis
From tennis history to comprehensive database
Open Era begins
Professional tennis enters the modern era, allowing pros to compete in Grand Slams.
Welcome to civilization.
ATP Rankings launch
Official computer rankings system established for men's tennis...
How insane is that? Before this, it was just Lance Tingay and the like ranking tennis players for seeds.
Digital scorekeeping
Finally some tech in the mix. Electronic line calling and digital match tracking begin at major tournaments.
Potential scorigami?
Henry and John play a best-of-five on the (fake green) clay in Cincinnati. Who knows, coulda been a scorigami, if only this project had been around.
Tennis Scorigami launch
Interactive visualization platform goes live to explore score patterns
The Challenge of Tennis Data
In a more meta sense, data quality was (and is) one of the biggest challenges of this project. It's borderline impossible to find free, high-quality tennis data. I signed up for numerous free trials to pull as much data as I could from SportRadar, SportsDataIO, SportsDev, and RapidAPI. We actually had a good amount of success with RapidAPI, a truly free (but highly rate-limited) platform. SportRadar (sadly) is preposterously expensive but has some of the highest quality data.
Data is fragmented across multiple sources, often incomplete, and requires significant cleaning and normalization. The bulk of this data comes from Jeff Sackmann's comprehensive tennis databases, but... even with that, there are numerous issues: redundant player IDs, incorrect scores, etc.
We have plans to build a more sophisticated data collection system and make this data freely available through an API and downloadable files.
That being said, per usual, we're stronger together, so please join the Discord, file a Canny issue, or email us (whatever you want) if you HAVE tennis data or you see any issues with the data.
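To give a flavor of the cleanup mentioned above, here is a rough sketch of how redundant player IDs could be flagged. It is not the project's actual pipeline. It assumes player records arrive as `(player_id, name)` pairs and that two IDs are suspicious when their names match after stripping accents, case, and extra whitespace. The helpers `normalize_name` and `find_redundant_ids` are hypothetical names.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Strip accents, lowercase, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def find_redundant_ids(players):
    """Map each normalized name to its IDs when more than one ID shares it.

    `players` is an iterable of (player_id, name) pairs (assumed shape).
    """
    ids_by_name = defaultdict(set)
    for player_id, name in players:
        ids_by_name[normalize_name(name)].add(player_id)
    return {
        name: sorted(ids)
        for name, ids in ids_by_name.items()
        if len(ids) > 1
    }
```

A name match is only a hint, since two different players can share a name, so anything this flags would still need a manual look (birth date, country, and so on) before merging.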
Special Thanks
Once again, we owe a tremendous debt of gratitude to Jeff Sackmann, whose comprehensive tennis databases form the foundation of our historical data. Jeff has painstakingly compiled match results, player information, and detailed statistics for ATP and WTA tours going back decades.
You can find his invaluable open-source repositories at ATP Data and WTA Data. He has put in hundreds of hours of work to get the data to where it is (and I think he's sandbagging; it's probably more time than that).
A Note on Data Collection
For our graph visualizations, we only include matches that are completed and exclude matches with win-by-2-games rules in the fifth set of Grand Slams. This ensures consistency in our score pattern analysis and provides cleaner visualization of traditional tennis scoring systems.
As of right now, we're only including Grand Slam matches and all non-Challenger, non-Futures ATP/WTA matches.
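The completed-match filter described above could look something like the sketch below. It assumes scores are space-separated strings such as `6-4 3-6 7-6(5)`, that unfinished matches carry a marker like `RET` or `W/O`, and that an advantage set shows up as a set where someone wins more than 7 games. The marker list and the function names are assumptions, not the project's actual code. It also checks every set rather than only the fifth, which is a simplification, and it cannot detect an advantage-rule set that happened to finish before 6-6.

```python
import re

# Assumed markers for matches that did not finish normally.
INCOMPLETE_MARKERS = {"RET", "W/O", "DEF", "ABD"}

# One set, e.g. "6-4" or "7-6(5)" (tiebreak points in parentheses).
SET_PATTERN = re.compile(r"^(\d+)-(\d+)(?:\(\d+\))?$")


def parse_sets(score: str):
    """Return [(games_a, games_b), ...], or None if the score is
    incomplete or cannot be parsed."""
    sets = []
    for token in score.split():
        if token.upper() in INCOMPLETE_MARKERS:
            return None
        match = SET_PATTERN.match(token)
        if match is None:
            return None
        sets.append((int(match.group(1)), int(match.group(2))))
    return sets or None


def is_included(score: str) -> bool:
    """True for completed matches with no set longer than 7 games."""
    sets = parse_sets(score)
    if sets is None:
        return False
    return all(max(a, b) <= 7 for a, b in sets)
```

With this sketch, `is_included("6-4 3-6 7-6(5)")` is `True`, while `"6-4 2-1 RET"` and `"6-4 3-6 6-3 4-6 9-7"` are both filtered out.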
Comprehensive Coverage
Data from 1968 onwards, covering ATP, WTA, and Grand Slam tournaments
Real-time Updates (Coming Soon)
Continuous monitoring of ongoing tournaments to identify new scorigami moments
Data Integrity
Rigorous validation and cross-referencing to ensure accuracy across all matches
Technology Stack
Built with modern tools (...and optimizing for cost)
NextJS
Turbopack with NextJS 15
D3.js, Sigma.js, and react-force-graph
Kudos to Vasco Asturiano for react-force-graph!
TypeScript
Type-safe development
Python
Used for data ETL (cleaning, preparing, ingestion) from various sources (ATP, WTA, ITF, etc.)
PostHog
Open-source analytics platform for tracking user behavior and product insights
Meet Our Team
Three friends from Cincinnati united by data and tennis
Get in Touch
Contact us by email or tweet at us on X! We're always looking to improve Tennis Scorigami and your feedback helps us build a better experience.
Tweet at Us
Follow and tweet @TennisScorigami on X for quick updates and discussions.
Tweet @TennisScorigami
Ready to Explore?
Dive into our interactive visualization and discover which tennis scores have never been played in professional history.
Explore the Data