Support: [email protected]
This repository is a Python package for streaming StatsBomb data into Python, using either your login credentials for the API or the free data from our GitHub page. API access is for paying customers only.
git clone https://github.com/statsbomb/statsbombpy.git
cd statsbombpy
pip install .
nose2 -v --pretty-assert
Authentication can be done by setting environment variables named SB_USERNAME and SB_PASSWORD to your login credentials.
Alternatively, if you don't want to use environment variables, all functions accept a creds argument that takes your login credentials in the format {"user": "", "passwd": ""}.
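As a minimal sketch of the second option, the helper below builds a dictionary in the creds format from the same two environment variable names. The function name build_creds and the placeholder values are hypothetical and not part of statsbombpy; only the variable names and the dictionary format come from the text above.

```python
import os
from typing import Mapping, Optional


def build_creds(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return credentials in the {"user": ..., "passwd": ...} format.

    Reads SB_USERNAME and SB_PASSWORD from the given mapping
    (os.environ by default) and raises if either is missing or empty.
    """
    env = os.environ if env is None else env
    user = env.get("SB_USERNAME", "")
    passwd = env.get("SB_PASSWORD", "")
    if not user or not passwd:
        raise ValueError("SB_USERNAME and SB_PASSWORD must both be set")
    return {"user": user, "passwd": passwd}


# Example with placeholder values (not real credentials).
creds = build_creds({"SB_USERNAME": "me@example.com", "SB_PASSWORD": "secret"})

# The resulting dict can then be passed to any function, for example:
# sb.matches(competition_id=9, season_id=42, creds=creds)
```

Failing early with a clear error when a variable is missing is a design choice of this sketch, not documented library behaviour.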
StatsBomb's open data can be accessed without authentication.
StatsBomb are committed to sharing new data and research publicly to enhance understanding of the game of Football. We want to actively encourage new research and analysis at all levels. Therefore we have made certain leagues of StatsBomb Data freely available for public use for research projects and genuine interest in football analytics.
StatsBomb are hoping that by making data freely available, we will extend the wider football analytics community and attract new talent to the industry. We would like to collect some basic personal information about users of our data. If you give us your email address, we will let you know when we make more data, tutorials and research available. We will store the information in accordance with our Privacy Policy and the GDPR.
Whilst we are keen to share data and facilitate research, we also urge you to be responsible with the data. Please register your details on https://www.statsbomb.com/resource-centre and read our User Agreement carefully. By using this repository, you are agreeing to the User Agreement. If you publish, share or distribute any research, analysis or insights based on this data, please state the data source as StatsBomb and use our logo.
from statsbombpy import sb
sb.competitions()
| | competition_id | season_id | country_name | competition_name | competition_gender | season_name | match_updated | match_available |
---|---|---|---|---|---|---|---|---|
0 | 9 | 42 | Germany | 1. Bundesliga | male | 2019/2020 | 2019-12-29T07:47:45.981 | 2019-12-29T07:47:45.981 |
1 | 9 | 4 | Germany | 1. Bundesliga | male | 2018/2019 | 2019-12-16T23:09:16.168756 | 2019-12-16T23:09:16.168756 |
2 | 9 | 1 | Germany | 1. Bundesliga | male | 2017/2018 | 2019-12-16T23:09:16.168756 | 2019-12-16T23:09:16.168756 |
3 | 78 | 42 | Croatia | 1. HNL | male | 2019/2020 | 2020-01-02T10:35:49.065 | 2020-01-02T10:35:49.065 |
4 | 10 | 42 | Germany | 2. Bundesliga | male | 2019/2020 | 2019-12-27T00:36:37.498 | 2019-12-27T00:36:37.498 |
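The competition_id and season_id columns are the values passed to the other functions. As a rough sketch, the lookup below finds that pair by name, using three rows copied from the table above as plain dictionaries. The helper find_ids is hypothetical and not part of statsbombpy; with a real dataframe, `sb.competitions().to_dict("records")` should give rows of the same shape.

```python
# Rows copied from the table above, reduced to the columns used here.
competitions = [
    {"competition_id": 9, "season_id": 42,
     "competition_name": "1. Bundesliga", "season_name": "2019/2020"},
    {"competition_id": 9, "season_id": 4,
     "competition_name": "1. Bundesliga", "season_name": "2018/2019"},
    {"competition_id": 78, "season_id": 42,
     "competition_name": "1. HNL", "season_name": "2019/2020"},
]


def find_ids(rows, competition_name, season_name):
    """Return (competition_id, season_id) for the first matching row."""
    for row in rows:
        if (row["competition_name"] == competition_name
                and row["season_name"] == season_name):
            return row["competition_id"], row["season_id"]
    raise LookupError(f"no entry for {competition_name} {season_name}")


competition_id, season_id = find_ids(competitions, "1. Bundesliga", "2019/2020")
print(competition_id, season_id)
```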
sb.matches(competition_id=9, season_id=42)
| | match_id | match_date | kick_off | competition | season | home_team | away_team | home_score | away_score | match_status | last_updated | match_week | competition_stage | stadium | referee | data_version | shot_fidelity_version | xy_fidelity_version |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 303299 | 2019-12-15 | 18:00:00.000 | Germany - 1. Bundesliga | 2019/2020 | Schalke 04 | Eintracht Frankfurt | 1 | 0 | available | 2019-12-17T09:50:17.558 | 15 | Regular Season | VELTINS-Arena | F. Zwayer | 1.1.0 | 2 | 2 |
1 | 303223 | 2019-09-01 | 18:00:00.000 | Germany - 1. Bundesliga | 2019/2020 | Eintracht Frankfurt | Fortuna Düsseldorf | 2 | 1 | available | 2019-12-16T23:09:16.168756 | 3 | Regular Season | Commerzbank-Arena | F. Willenborg | 1.1.0 | 2 | 2 |
2 | 303083 | 2019-12-15 | 15:30:00.000 | Germany - 1. Bundesliga | 2019/2020 | Wolfsburg | Borussia Mönchengladbach | 2 | 1 | available | 2019-12-17T15:52:17.843 | 15 | Regular Season | VOLKSWAGEN ARENA | F. Brych | 1.1.0 | 2 | 2 |
3 | 303266 | 2019-12-14 | 15:30:00.000 | Germany - 1. Bundesliga | 2019/2020 | Hertha Berlin | Freiburg | 1 | 0 | available | 2019-12-17T17:43:18.285 | 15 | Regular Season | Olympiastadion Berlin | F. Willenborg | 1.1.0 | 2 | 2 |
4 | 303073 | 2019-12-21 | 15:30:00.000 | Germany - 1. Bundesliga | 2019/2020 | Bayern Munich | Wolfsburg | 2 | 0 | available | 2019-12-23T18:02:36.454 | 17 | Regular Season | Allianz Arena | C. Dingert | 1.1.0 | 2 | 2 |
sb.lineups(match_id=303299)["Eintracht Frankfurt"]
| | player_id | player_name | player_nickname | birth_date | player_gender | player_height | player_weight | jersey_number | country |
---|---|---|---|---|---|---|---|---|---|
0 | 3204 | Almamy Touré | None | 1996-04-28 | male | 182.0 | 72.0 | 18 | Mali |
1 | 5591 | Filip Kostić | None | 1992-11-01 | male | 184.0 | 82.0 | 10 | Serbia |
2 | 7713 | Obite Evan N"Dicka | Evan N'Dicka | 1999-08-20 | male | 190.0 | NaN | 2 | France |
3 | 8307 | Martin Hinteregger | None | 1992-09-07 | male | 184.0 | 83.0 | 13 | Austria |
4 | 8669 | Mijat Gaćinović | None | 1995-02-08 | male | 175.0 | 66.0 | 11 | Serbia |
events = sb.events(match_id=303299) # if you want to store all events in a given match on a single dataframe
grouped_events = sb.events(match_id=303299, split=True)
grouped_events["dribbles"]
| | id | index | period | timestamp | minute | second | type | possession | possession_team | play_pattern | team | player | position | location | duration | under_pressure | related_events | dribble | match_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | b190c01f-ad24-468c-8241-f955b91d996c | 131 | 1 | 00:02:08.032 | 2 | 8 | Dribble | 4 | Schalke 04 | Regular Play | Schalke 04 | Daniel Caligiuri | Right Wing | [110.2, 62.9] | 0.000000 | True | [60f822df-5747-4787-b0f9-45bf5217eb8a] | {'outcome': {'id': 8, 'name': 'Complete'}} | 303299 |
1 | 4d773c92-f89f-491e-b3e0-3a1d2e863148 | 399 | 1 | 00:08:48.623 | 8 | 48 | Dribble | 18 | Schalke 04 | Regular Play | Schalke 04 | Amine Harit | Center Attacking Midfield | [88.9, 22.7] | 0.000000 | True | [93d829df-eea7-416b-95aa-7593828cfade] | {'outcome': {'id': 8, 'name': 'Complete'}} | 303299 |
2 | 8a78dce4-998a-4e81-902c-9f3957cebc9d | 460 | 1 | 00:13:30.202 | 13 | 30 | Dribble | 23 | Schalke 04 | Regular Play | Schalke 04 | Daniel Caligiuri | Right Wing | [99.5, 68.1] | 0.007309 | True | [772c5aae-e34e-4364-8a98-7caf7636c90b] | {'outcome': {'id': 9, 'name': 'Incomplete'}} | 303299 |
3 | e44d0122-2f2e-4771-820d-cc326a8b0379 | 496 | 1 | 00:14:10.135 | 14 | 10 | Dribble | 24 | Schalke 04 | From Throw In | Schalke 04 | Suat Serdar | Left Defensive Midfield | [41.2, 31.7] | 0.000000 | True | [4de4039f-7efc-461b-b7d6-27c32ec2cd2a] | {'outcome': {'id': 8, 'name': 'Complete'}} | 303299 |
4 | 9555afbd-d838-42c9-8f80-be3cd09e4c4a | 793 | 1 | 00:20:18.409 | 20 | 18 | Dribble | 33 | Eintracht Frankfurt | Regular Play | Eintracht Frankfurt | Timothy Chandler | Right Wing Back | [81.8, 75.7] | 0.000000 | True | [a5c88cee-6319-4c25-91cd-8a028d8dbfbf] | {'outcome': {'id': 9, 'name': 'Incomplete'}} | 303299 |
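The dribble column holds a nested dictionary and the timestamp column is a string, so both usually need a little unpacking before analysis. The sketch below works on three rows copied from the table above as plain dictionaries, assuming the columns look exactly as printed there. The helper names outcome_name and timestamp_to_seconds are hypothetical, not statsbombpy functions.

```python
from collections import Counter

# Three rows copied from the table above, reduced to the fields used here.
dribbles = [
    {"player": "Daniel Caligiuri", "timestamp": "00:02:08.032",
     "dribble": {"outcome": {"id": 8, "name": "Complete"}}},
    {"player": "Amine Harit", "timestamp": "00:08:48.623",
     "dribble": {"outcome": {"id": 8, "name": "Complete"}}},
    {"player": "Daniel Caligiuri", "timestamp": "00:13:30.202",
     "dribble": {"outcome": {"id": 9, "name": "Incomplete"}}},
]


def outcome_name(row):
    """Pull the outcome label out of the nested dribble dictionary."""
    return row["dribble"]["outcome"]["name"]


def timestamp_to_seconds(timestamp):
    """Convert an "HH:MM:SS.mmm" string into seconds as a float."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Count completed dribbles per player.
completed = Counter(
    row["player"] for row in dribbles if outcome_name(row) == "Complete"
)
print(completed)
print(timestamp_to_seconds(dribbles[0]["timestamp"]))
```

With a real dataframe, `grouped_events["dribbles"].to_dict("records")` should produce rows in this shape, provided the dribble column is still nested as shown.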
# if you want to store all events in a given competition on a single non-tidy dataframe
events = sb.competition_events(
    country="Germany",
    division="1. Bundesliga",
    season="2019/2020",
    gender="male"
)
# with split=True, events are grouped by type and each group is accessed by key
grouped_events = sb.competition_events(
    country="Germany",
    division="1. Bundesliga",
    season="2019/2020",
    split=True
)
grouped_events["dribbles"]
| | id | index | period | timestamp | minute | second | type | possession | possession_team | play_pattern | team | player | position | location | duration | under_pressure | related_events | dribble | match_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | b190c01f-ad24-468c-8241-f955b91d996c | 131 | 1 | 00:02:08.032 | 2 | 8 | Dribble | 4 | Schalke 04 | Regular Play | Schalke 04 | Daniel Caligiuri | Right Wing | [110.2, 62.9] | 0.000000 | True | [60f822df-5747-4787-b0f9-45bf5217eb8a] | {'outcome': {'id': 8, 'name': 'Complete'}} | 303299 |
1 | 4d773c92-f89f-491e-b3e0-3a1d2e863148 | 399 | 1 | 00:08:48.623 | 8 | 48 | Dribble | 18 | Schalke 04 | Regular Play | Schalke 04 | Amine Harit | Center Attacking Midfield | [88.9, 22.7] | 0.000000 | True | [93d829df-eea7-416b-95aa-7593828cfade] | {'outcome': {'id': 8, 'name': 'Complete'}} | 303299 |
2 | 8a78dce4-998a-4e81-902c-9f3957cebc9d | 460 | 1 | 00:13:30.202 | 13 | 30 | Dribble | 23 | Schalke 04 | Regular Play | Schalke 04 | Daniel Caligiuri | Right Wing | [99.5, 68.1] | 0.007309 | True | [772c5aae-e34e-4364-8a98-7caf7636c90b] | {'outcome': {'id': 9, 'name': 'Incomplete'}} | 303299 |
3 | e44d0122-2f2e-4771-820d-cc326a8b0379 | 496 | 1 | 00:14:10.135 | 14 | 10 | Dribble | 24 | Schalke 04 | From Throw In | Schalke 04 | Suat Serdar | Left Defensive Midfield | [41.2, 31.7] | 0.000000 | True | [4de4039f-7efc-461b-b7d6-27c32ec2cd2a] | {'outcome': {'id': 8, 'name': 'Complete'}} | 303299 |
4 | 9555afbd-d838-42c9-8f80-be3cd09e4c4a | 793 | 1 | 00:20:18.409 | 20 | 18 | Dribble | 33 | Eintracht Frankfurt | Regular Play | Eintracht Frankfurt | Timothy Chandler | Right Wing Back | [81.8, 75.7] | 0.000000 | True | [a5c88cee-6319-4c25-91cd-8a028d8dbfbf] | {'outcome': {'id': 9, 'name': 'Incomplete'}} | 303299 |
# alternatively, entities can be accessed as Python dictionaries that expose the raw JSON without any preprocessing
sb.competitions(fmt="dict")
sb.matches(competition_id=9, season_id=42, fmt="dict")
sb.lineups(match_id=303299, fmt="dict")
sb.events(303299, fmt="dict")
sb.competition_events(
    country="Germany",
    division="1. Bundesliga",
    season="2019/2020",
    gender="male",
    fmt="dict"
)