SQL Challenge: Cross-Country Scoring
There are a lot of tutorials on the net that provide a basic introduction to SQL, but few that get into advanced techniques. I’m a fan of Cross-Country running and I’ve worked as a database programmer for the past 4 years. Here’s a problem that combines two of my interests into a puzzle that uses advanced joins and subqueries.
What is Cross Country?
Cross Country is a team distance-running sport. Unlike track, it is run on grass or dirt. Each course is different: some are hilly, others flat.
How is it scored?
The places of each team’s first 5 runners are added together. A team’s next 2 runners (the 6th and 7th) do not add to its own score, but they can “displace” the other team’s runners, raising the other team’s score. The lowest score wins. In the event of a tie, the team with the faster 6th runner wins.
In a meet with 3 teams, each team is scored against each of the other teams as if every pairing were a separate dual meet.
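To make the rules concrete, here is a minimal Python sketch of scoring a single dual meet. It is an illustration only, not the SQL solution to the challenge. The function name `score_dual_meet` and the input format (a list of team names in finishing order) are my own assumptions, as are two rules the description above does not cover: a team needs 5 finishers to score, and a team with no 6th runner loses the tie-break.

```python
def score_dual_meet(finish_order, team_a, team_b):
    """Score two teams head to head.

    finish_order lists the team name of every finisher, first place first.
    Returns (score_a, score_b, winner); winner is None for a tie.
    """
    # Keep only these two teams, and at most 7 runners from each.
    # Places are re-numbered 1, 2, 3, ... among the runners that are kept.
    places = {team_a: [], team_b: []}
    next_place = 1
    for team in finish_order:
        if team in places and len(places[team]) < 7:
            places[team].append(next_place)
            next_place += 1

    # Assumption: a team needs 5 finishers to post a score.
    for team, team_places in places.items():
        if len(team_places) < 5:
            raise ValueError(f"{team} has fewer than 5 finishers")

    # Only the first 5 runners score; runners 6 and 7 just push others back.
    score_a = sum(places[team_a][:5])
    score_b = sum(places[team_b][:5])
    if score_a != score_b:
        winner = team_a if score_a < score_b else team_b
        return score_a, score_b, winner

    # Tie-break: the team whose 6th runner placed better wins.
    # Assumption: a missing 6th runner loses the tie-break.
    sixth_a = places[team_a][5] if len(places[team_a]) > 5 else None
    sixth_b = places[team_b][5] if len(places[team_b]) > 5 else None
    if sixth_a is None and sixth_b is None:
        return score_a, score_b, None
    if sixth_b is None or (sixth_a is not None and sixth_a < sixth_b):
        return score_a, score_b, team_a
    return score_a, score_b, team_b


# Team "C" is in the race but is ignored when scoring A against B.
order = ["A", "C", "B", "A", "B", "C", "A", "B", "A", "B",
         "A", "B", "A", "B", "A", "B"]
print(score_dual_meet(order, "A", "B"))  # (25, 30, 'A')
```

In the example, each team's 8th finisher is dropped before places are assigned, so A scores 1+3+5+7+9 and B scores 2+4+6+8+10.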
Large Meet Scoring
When the meets get large, the scoring method is changed. Scorers no longer separate out the teams, because doing so would be too much work and would be more likely to produce ties.
The result is that the scoring of the 4th and 5th runners becomes especially important. In a meet scored as a dual meet, teams are often able to win on the strength of their first 3 or 4 runners. A poor 5th runner is a limited liability because the maximum number of points a weak 5th runner can score is capped at 12 (7 opposing runners + 5).
But in a large meet, a poor 5th runner could score 200 points, effectively eliminating even the best team from medal contention.
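The arithmetic behind that cap is easy to check. The short Python sketch below works through the worst case for a team that takes the first 3 places in a dual meet, then compares a 12th-place 5th runner with a 200th-place one. The finishing places are hypothetical numbers chosen for illustration, not results from a real meet.

```python
# Worst dual-meet place for a 5th runner: all 7 opposing runners
# plus the runner's own 4 teammates finish ahead.
worst_fifth_place = 7 + 5
print(worst_fifth_place)  # 12

# A team sweeps places 1-3, then all 7 opponents finish (places 4-10),
# so its 4th and 5th runners land in places 11 and 12.
sweep_score = sum([1, 2, 3, 11, 12])
opponent_score = sum([4, 5, 6, 7, 8])  # the opponent's best possible score
print(sweep_score, opponent_score)  # 29 30

# The same top four with a capped 5th runner versus a 200th-place 5th runner.
dual_meet_total = sum([1, 2, 3, 4, 12])
large_meet_total = sum([1, 2, 3, 4, 200])
print(dual_meet_total, large_meet_total)  # 22 210
```

Even in the worst case, the team that sweeps the top 3 wins 29 to 30 under dual-meet scoring, while a single 200-point runner swamps everything the first four did.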
The challenge is to take a large meet, separate out each team and score it the same way that 2-way meets are scored. I’ve chosen the 2003 Pennsylvania District 3 meet for the sample data. There are 55 teams, resulting in 1456 pairs of matches.
I’ve included SQL Server table definitions, data, and a few hints.
There’s more than one way to solve the problem and I’d be interested in hearing from people who have non-SQL solutions as well (Perl, Python, Lisp, etc).
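As a rough idea of what a non-SQL approach might look like, here is a small Python sketch that scores every pair of teams from one finish list and tallies wins, losses and ties. It runs on made-up toy data, not the District 3 results, and it is not the posted solution. The helper names `dual_winner` and `round_robin` are hypothetical, and two rules are my assumptions: teams with fewer than 5 finishers are left out, and a team without a 6th runner loses a tie-break.

```python
from collections import Counter
from itertools import combinations


def dual_winner(finish_order, team_a, team_b):
    """Return the winner of one pairing, or None for a tie."""
    places = {team_a: [], team_b: []}
    kept = 0
    for team in finish_order:
        # Ignore other teams and any runner past a team's 7th.
        if team in places and len(places[team]) < 7:
            kept += 1
            places[team].append(kept)

    def key(team):
        # Compare on (sum of first 5 places, place of the 6th runner).
        # Assumption: no 6th runner counts as the worst possible 6th place.
        p = places[team]
        sixth = p[5] if len(p) > 5 else float("inf")
        return (sum(p[:5]), sixth)

    key_a, key_b = key(team_a), key(team_b)
    if key_a == key_b:
        return None
    return team_a if key_a < key_b else team_b


def round_robin(finish_order):
    """Tally wins, losses and ties over every pair of complete teams."""
    runners = Counter(finish_order)
    # Assumption: teams with fewer than 5 finishers are left out.
    teams = sorted(t for t, n in runners.items() if n >= 5)
    record = {t: {"wins": 0, "losses": 0, "ties": 0} for t in teams}
    for a, b in combinations(teams, 2):
        winner = dual_winner(finish_order, a, b)
        if winner is None:
            record[a]["ties"] += 1
            record[b]["ties"] += 1
        else:
            loser = b if winner == a else a
            record[winner]["wins"] += 1
            record[loser]["losses"] += 1
    return record


# Toy data: teams A, B, C have 5 finishers each; D has only 4.
finish = list("ABCABCABCABCABC") + ["D"] * 4
for team, tally in round_robin(finish).items():
    print(team, tally)
```

Every pairing re-scans the full finish list, which is fine for a sketch but is exactly the kind of repeated work the performance criterion is meant to expose.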
In evaluating a solution, I consider:
- Simplicity – is it easy to read and understand?
- Performance – does it run in under 30 seconds? Faster is better.
- Portability – does it use vendor extensions to “standard” SQL?
Just so you don’t think that you’re doing my homework for me, I’ve posted a solution. I’ll eventually open the solution section up, but for now, you have to demonstrate that you’ve solved the problem yourself by answering this question:
How many wins, losses and ties did Conestoga Valley have?
If you’re looking for other similar challenges, check out the “Yak Challenge”.
Note: photos were taken by the author at several PIAA District and State meets.
- I lost my original solution from 2005, but I wrote a post with a few parts of the solution.