In this SQL project, I worked on a 120-year Olympic history dataset, where I tried to find some insight by writing SQL queries. I used MySQL Workbench to analyze the dataset for this project. I have analyzed the dataset in the form of questions and answers by writing SQL queries where i used joins, common table expression, subqueries, window function, etc.
athlete_events table:
noc_regions table:
SQL Queries:
I am writing SQL Queries using this data. For each of these queries, you would find the problem statement, SQL Queries and then the screen shot of the output(some screen shot contains few lines of output).
1. How many olympics games have been held?
Problem Statement: Write a SQL query to find the total no of Olympic Games held as per the dataset.
select count(distinct(Games)) as olympics_games
from athlete_events;
Output:
2. List down all Olympics games held so far.Problem Statement: Write a SQL query to list down all the Olympic Games held so far.
select distinct year, season, city
from athlete_events
order by year;
Output:
3. Mention the total no of nations who participated in each olympics game?Problem Statement: SQL query to fetch total no of countries participated in each olympic games.
with all_countries as
(select games, nr.region
from athlete_events oh
join noc_regions nr ON nr.noc = oh.noc
group by games, nr.region)
select games, count(1) as total_countries
from all_countries
group by games
order by games;
Output:
4. Which year saw the highest and lowest no of countries participating in OlympicsProblem Statement: Write a SQL query to return the Olympic Games which had the highest participating countries and the lowest participating countries.
with all_countries as
(select games, nr.region
from athlete_events oh
join noc_regions nr ON nr.noc=oh.noc
group by games, nr.region),
tot_countries as
(select games, count(1) as total_countries
from all_countries
group by games)
select distinct
concat(first_value(games) over(order by total_countries)
, ' - '
, first_value(total_countries) over(order by total_countries)) as Lowest_Countries,
concat(first_value(games) over(order by total_countries desc)
, ' - '
, first_value(total_countries) over(order by total_countries desc)) as Highest_Countries
from tot_countries
order by 1;
Output:
5. Which nation has participated in all of the olympic games?Problem Statement: SQL query to return the list of countries who have been part of every Olympics games.
with tot_games as
(select count(distinct games) as total_games
from athlete_events),
countries as
(select games, nr.region as country
from athlete_events oh
join noc_regions nr ON nr.noc=oh.noc
group by games, nr.region),
countries_participated as
(select country, count(1) as total_participated_games
from countries
group by country)
select cp.*
from countries_participated cp
join tot_games tg on tg.total_games = cp.total_participated_games
order by 1;
Output:
6. Identify the sport which was played in all summer olympics.Problem Statement: SQL query to fetch the list of all sports which have been part of every olympics.
with tot_games as
(select count(distinct(games)) as total_games
from athlete_events where season = 'Summer'),
sports_played as
(select games, sport
from athlete_events
group by games, sport),
game_participated as
(select sport, count(1) as total_participated_games
from sports_played
group by sport)
select cp.*
from game_participated cp
join tot_games tg on tg.total_games = cp.total_participated_games
order by 1;
Output:
7. Which Sports were just played only once in the olympics.Problem Statement: Using SQL query, Identify the sport which were just played once in all of olympics.
with t1 as
(select distinct games, sport
from athlete_events),
t2 as
(select sport, count(1) as no_of_games
from t1
group by sport)
select t2.*, t1.games
from t2
join t1 on t1.sport = t2.sport
where t2.no_of_games = 1
order by t1.sport;
Output:
8. Fetch the total no of sports played in each olympic games.Problem Statement: Write SQL query to fetch the total no of sports played in each olympics.
select Games, count(distinct(Sport)) as total_sports
from athlete_events
group by Games
order by total_sports desc, Games;
Output:
9. Fetch oldest athletes to win a gold medalProblem Statement: SQL Query to fetch the details of the oldest athletes to win a gold medal at the olympics.
select * from athlete_events
where Medal = 'Gold' and
Age = (select max(Age) from athlete_events where Medal = 'Gold');
Output:
10. Fetch the top 5 athletes who have won the most gold medals.
Problem Statement: SQL query to fetch the top 5 athletes who have won the most gold medals.
with m1 as(
select Name, Team, count(Medal) as tot
from athlete_events
where Medal ='Gold'
group by Name, Team
order by tot desc),
m2 as
(select Name, Team, tot, dense_rank() over(order by tot desc) as rnk
from m1)
select name, team, tot
from m2
where rnk <= 5;
Output:
11. Fetch the top 5 athletes who have won the most medals (gold/silver/bronze).Problem Statement: SQL Query to fetch the top 5 athletes who have won the most medals (Medals include gold, silver and bronze).
with m1 as(
select Name, Team, count(Medal) as tot
from athlete_events
where Medal <> 'NA'
group by Name, Team
order by tot desc, Team, Name),
m2 as
(select Name, Team, tot, dense_rank() over(order by tot desc) as rnk
from m1)
select name, team, tot
from m2
where rnk <= 5;
Output:
12. Fetch the top 5 most successful countries in olympics. Success is defined by no of medals won.Problem Statement: Write a SQL query to fetch the top 5 most successful countries in olympics. (Success is defined by no of medals won).