How to download the FPL top 10k teams
18 May 2024
Top 10k is the gold standard in FPL. You're considered an elite manager if you can get your team into that bracket. These are the teams that are ahead of the trends, setting the template from the front. Because they're so good, it has become the norm to analyze their picks and benchmark your own team against them. There are multiple websites for this, like livefpl.net and fpl.team. They do great analysis, but what if you wanted the raw data? How are these websites downloading the raw 10k picks? This post will explain…
FPL don’t make it easy to download all teams. They don’t provide an Excel file download or anything like that. The data is only available through the API that serves their web app. The API doesn’t have any sort of batching, ranking or paging, so it takes a ton of calls to get team picks.
This all makes it very hard. You’re going to need a lot of technical know-how to get the data from the FPL source. That’s why I’ve made an alternative option: FPLTemplate.com. There you can download the top 100k teams for any game week at the click of a button. But if you want to continue with the FPL API option, I’ve listed more details below 🙂
Option 1: Download the top 100k from FPLTemplate.com
As you will see in option 2, getting the teams from FPL is quite painful. Fortunately I’ve done the hard work for you. I’ve set up an automated process to download the top 100k teams every week.
They’re uploaded as a CSV to FPLTemplate.com. I want to make a full analysis website at some point when I have the time, but for now you can download the raw teams and run your own analysis 🙂
Option 2: Download all teams from FPL’s API
FPL has an open API that serves all the data for the web app. We can use the API to retrieve all FPL teams for our analysis.
There is no single endpoint to get all players’ teams. We need to make an HTTP request for every game week of every player. FPL currently has around 10 million players, so to get 10 game weeks of teams we need to send 100 million HTTP requests.
Individually each request only takes about 50 milliseconds, but 100 million of them one after another would take about 1,388 hours (57 days). There are a few optimization tricks for completing them in a reasonable time, which I will list below, but they require you to be proficient in writing code.
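To sanity-check those numbers, here is a quick back-of-the-envelope calculation. The 500 requests/second concurrent rate is just an assumed target (in line with the "hundreds of requests per second" goal discussed later), not a measured figure:

```csharp
using System;

static class ThroughputEstimate
{
    // Rough wall-clock estimates for fetching every team's picks.
    public static (double SequentialHours, double ConcurrentHours) Estimate(
        long totalRequests, double secondsPerRequest, double requestsPerSecond)
    {
        double sequentialHours = totalRequests * secondsPerRequest / 3600;
        double concurrentHours = totalRequests / requestsPerSecond / 3600;
        return (sequentialHours, concurrentHours);
    }

    public static void Main()
    {
        // ~10 million players x 10 game weeks, ~50 ms per request.
        var (seq, conc) = Estimate(100_000_000, 0.050, 500);
        Console.WriteLine($"Sequential: {seq:F0} hours (~{seq / 24:F0} days)");
        Console.WriteLine($"At 500 req/s: {conc:F1} hours (~{conc / 24:F1} days)");
    }
}
```

Even at a sustained 500 requests per second, the full crawl is a multi-day job, which is why every optimization below matters.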
API Terminology
The API uses different terminology from the web app which can get confusing. Here are the mappings of the terms we care about:
Gameweek = Event
FPL Manager = Entry
Premier League player = Element (e.g. Rashford has an element id of 396)
Endpoints
There are 3 endpoints that give all the data needed to get a complete picture.
General data
This endpoint returns general game information. It’s all interesting, but the main data we want from here is the element names. We will need them later because the team picks endpoint only returns the element id. This endpoint only needs to be called once.
https://fantasy.premierleague.com/api/bootstrap-static/
{
"events": [...],
"game_settings": {...},
"phases": [...],
"teams": [...],
"total_players": 10750843,
"elements": [...],
"element_stats": [...],
"element_types": [...]
}
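As a sketch of how you might pull the names out of that response with System.Text.Json: the `elements` array is visible in the truncated response above, and I’m assuming each object in it carries `id` and `web_name` fields (verify against a live response):

```csharp
using System.Collections.Generic;
using System.Text.Json;

static class BootstrapParser
{
    // Build a lookup of element id -> display name from the
    // bootstrap-static JSON. Assumes each object in "elements"
    // has "id" and "web_name" fields.
    public static Dictionary<int, string> ElementNames(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var names = new Dictionary<int, string>();
        foreach (var element in doc.RootElement.GetProperty("elements").EnumerateArray())
        {
            names[element.GetProperty("id").GetInt32()] =
                element.GetProperty("web_name").GetString()!;
        }
        return names;
    }
}
```

Fetch the endpoint once with `HttpClient.GetStringAsync`, parse it, and cache the dictionary for the rest of the crawl.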
Element summary
Use this endpoint to get the statistics of each element:
https://fantasy.premierleague.com/api/element-summary/353/
{
"fixtures": [...],
"history": [...],
"history_past": [...]
}
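A similar parse works here if you want per-game-week statistics. I’m assuming each item in `history` has `round` and `total_points` fields, which is the usual shape of this endpoint, but check a live response before relying on it:

```csharp
using System.Collections.Generic;
using System.Text.Json;

static class ElementSummaryParser
{
    // Extract (game week, points) pairs from an element-summary response.
    // Assumes "history" items carry "round" and "total_points" fields.
    public static List<(int Round, int Points)> PointsByGameWeek(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var result = new List<(int, int)>();
        foreach (var gw in doc.RootElement.GetProperty("history").EnumerateArray())
        {
            result.Add((gw.GetProperty("round").GetInt32(),
                        gw.GetProperty("total_points").GetInt32()));
        }
        return result;
    }
}
```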
Entry game week stats
This is the endpoint we’re going to call a lot. It returns entry information for a single game week (team picks, ranking, etc.):
https://fantasy.premierleague.com/api/entry/1/event/1/picks/
{
"active_chip": null,
"automatic_subs": [],
"entry_history": {
"event": 1,
"points": 78,
"total_points": 78,
"rank": 1020807,
"rank_sort": 1020807,
"overall_rank": 1020807,
"percentile_rank": null,
"bank": 5,
"value": 1000,
"event_transfers": 0,
"event_transfers_cost": 0,
"points_on_bench": 11
},
"picks": [...]
}
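The `overall_rank` field shown above is what tells you whether an entry is inside the top 10k; entry ids themselves just run upwards from 1, and there is no endpoint mapping ranks to ids, so in practice you fetch broadly and filter. A sketch of extracting the parts we care about, using the field names from the response above and assuming each item in `picks` carries an `element` field (the picked player’s element id):

```csharp
using System.Collections.Generic;
using System.Text.Json;

static class PicksParser
{
    // Pull the overall rank and picked element ids out of a
    // /entry/{id}/event/{gw}/picks/ response.
    public static (int OverallRank, List<int> Elements) Parse(string json)
    {
        using var doc = JsonDocument.Parse(json);
        int rank = doc.RootElement.GetProperty("entry_history")
                                  .GetProperty("overall_rank").GetInt32();
        var elements = new List<int>();
        foreach (var pick in doc.RootElement.GetProperty("picks").EnumerateArray())
        {
            elements.Add(pick.GetProperty("element").GetInt32());
        }
        return (rank, elements);
    }
}
```

Join the element ids against the name lookup from bootstrap-static to get readable team sheets.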
Optimization
As mentioned above, it’s going to take a lot of HTTP requests to get all the data we want. Below is a list of things we need to do to download it all in a reasonable time. I will describe the issues generally, but you may have to work out how to implement them in your own programming language. My examples will be in the language I know best: C#.
Proxies
We need to hit the FPL API hard, with hundreds of requests per second. The API is protected by Fastly’s CDN, so it can handle practically any load we throw at it with good response times. However, it will start blocking us when we go above a couple of requests per second, and we will see the 429 Too Many Requests status code.
It will cost some money, but we can buy proxies to fix this issue. By spreading our requests over multiple proxies, no single IP will be blocked, and we will be able to hit our target of hundreds of requests per second.
For proxies I recommend using WebShare. They have reliable data center proxies at a reasonable price. They also provide an auto rotating proxy endpoint which makes using them easier.
To use their auto-rotating proxy endpoint in C#, create an HttpClient with a WebProxy:
// Route every request through Webshare's rotating proxy endpoint.
var proxy = new WebProxy
{
    Address = new Uri("http://p.webshare.io:80"),
    BypassProxyOnLocal = false,
    UseDefaultCredentials = false,
    Credentials = new NetworkCredential(
        userName: "bla bla bla",
        password: "bla bla bla")
};

var httpClientHandler = new HttpClientHandler
{
    Proxy = proxy
};

// Warning: this disables TLS certificate validation, which some data
// center proxies require. Only do this if you understand the risk.
httpClientHandler.ServerCertificateCustomValidationCallback =
    HttpClientHandler.DangerousAcceptAnyServerCertificateValidator;

var client = new HttpClient(handler: httpClientHandler, disposeHandler: true);

await client.GetAsync("https://fantasy.premierleague.com/api/entry/1/event/1/picks/");
Http compression
By using the Accept-Encoding HTTP header we can ask the server to compress the data before it is sent. This makes every API response smaller, so it’s quicker to download and uses less proxy bandwidth. A request to get a player’s picks returns about 2.4KB without compression; with compression it’s 1.2KB. That’s a lot of data saved over 100 million requests. Remember that in your code you will now have to decompress the response before deserializing it.
The most popular compression algorithm is Gzip. You can find examples of what it looks like in an Accept-Encoding header in the Mozilla documentation.
In C# we can add a default Accept-Encoding header to the HttpClient so every request asks for compression, and enable automatic decompression on the HttpClientHandler:
var httpClientHandler = new HttpClientHandler
{
    // Transparently decompress gzip/deflate response bodies.
    AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
};

var client = new HttpClient(handler: httpClientHandler, disposeHandler: true);

// Ask the server to gzip every response.
client.DefaultRequestHeaders.AcceptEncoding.Add(
    new System.Net.Http.Headers.StringWithQualityHeaderValue("gzip"));
Other optimization
Other things you will need to do for optimization are: threading, bulk-saving the data, avoiding port exhaustion and limiting memory allocations. These are all big topics, so I won’t cover them here, but they are necessary and you will need to learn how to implement them in your programming language.
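To give a flavor of the threading and port-exhaustion points, here is a sketch using `Parallel.ForEachAsync` (.NET 6+) with a bounded degree of parallelism. In the real crawler each job would be an HTTP call on a single shared HttpClient: reusing one client keeps connections pooled, which is what avoids port exhaustion. The concurrency level and the generic job delegate are assumptions you would tune for your own setup:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class BoundedFetcher
{
    // Run one async job per work item with at most maxConcurrency
    // jobs in flight at once.
    public static async Task RunAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> job)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxConcurrency };
        await Parallel.ForEachAsync(items, options, async (item, _) => await job(item));
    }
}
```

You would call it with something like `RunAsync(entryIds, 500, id => FetchPicksAsync(client, id, gameWeek))`, where `FetchPicksAsync` is your own wrapper around the picks endpoint; batch the parsed results and bulk-insert them into your store rather than writing row by row.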