Publish AI, ML & data-science insights to a global community of data professionals.

How Many Pokemon Fit?

Finding the best Pokemon team by modeling and solving a knapsack problem with PokeAPI and PuLP optimization Python library

Photo by Emmanuelle Magnenat on Unsplash
Photo by Emmanuelle Magnenat on Unsplash

In this post, I ‘ll show you how to model and solve an optimization problem in Python using the PuLP library and utilizing Pokémon data to illustrate the process. If you want to learn more about solving optimization problems in Python or just love Pokémon, this tutorial will walk you through the steps in an easy-to-understand way.

So let’s go! 🚀

A little bit of Pokémon history

The original Generation I of Pokémon consists of 151 Pokémon that were introduced in the Pokémon Red and Green games, released in Japan in 1996. Since then multiple additional generations and versions of Pokémon have been released, resulting on the creation of more than 1,000 Pokémon, featured in electronic games, anime, trading cards, and merchandise, rendering Pokémon a cultural staple. Nevertheless, in this tutorial I will just use data for the original 151 Pokémon of Generation I.

In the original Pokémon game, there are two primary goals – becoming the Pokémon Champion and completing the Pokédex. In order to become Pokémon Champion, you have to defeat the eight Gym Leaders in Pokémon battles __ and earn the respective eight Gym badges. Then, you have to also defeat the _Elite Fou_r and the current Pokémon Champion – and then you are the Pokémon Champion! In order to complete the Pokédex, you have to capture all 151 Pokémon, by capturing them in the wild, winning them in a battle, or trading.

A primary constraint of the game is that you can only have 6 Pokémon in your active team during a duel. So, a relevant question is that assuming that you have caught all 151 Pokémon, which 6 Pokémon you should choose to include in your team? In order to decide which 6 Pokémon to select, we should first choose a measure (that is, an objective function), in order to evaluate the effectiveness of each Pokémon in battle. Then, our goal would be to select the Pokémon team that maximizes the defined objective function.

Each Pokémon has several distinctive attributes that essentially form the gameplay mechanics, allowing us to play the Pokémon game and define the winner in each Pokémon duel on varying environments. There are 6 base stats for each Pokémon: HP (Hit Points), Attack, Defense, Special Attack, Special Defense, and Speed. Apart from the base stats, there are various other characteristics contributing to the effectiveness of a Pokémon, such as Pokémon Type (such as Fire, Water, Electric, etc.), Pokémon Moves, the current Level of the Pokémon, and many more. Nevertheless, a popular metric in the Pokémon community for assessing the effectiveness of a Pokémon is the Base Stat Total (BST), which as you may have guessed, is the total of a Pokémon’s base stats.

BST = HP + Attack + Defense + Special Attack + Special Defense + Speed

In the case that our only constraint is that we should only select 6 Pokémon, our problem is relatively easy and has a straightforward solution. That is, calculating the BST of each candidate Pokémon, and then just selecting the top 6 most effective Pokémon to include in our team.

In order to explore a more nuanced version of the optimization problem in this post, we are going to assume an additional constraint. More specifically, I am going to assume that there is a limitation on the total weight of the Pokémon – the total weight of the Pokémon we select should be equal or less to 1,000 kg. This extra constraint allows as to form a hypothetical knapsack problem using the Pokémon data.

What about the knapsack problem?

The knapsack problem is a classic combinatorial optimization problem that can be expressed as follows:

Given a set of items, each with a weight and a value, determine which items to include in the collection so that the total weight is less than or equal to a given limit and the total value is as large as possible.

In other words, in the knapsack problem, we need to pack a set of items with certain values and sizes into a knapsack of fixed size. The problem is to find the subset of the total items that results in the maximum total value and fits in the knapsack.

The most common instance of the knapsack problem is the 0/1 knapsack problem, in which there is only one instance of each item. To put it differently, in the 0/1 knapsack problem, each item is unique and is either selected to be included in the knapsack once or not. Similarly, in our Pokémon problem, we have 151 unique Pokémon, each of which will either be included once in our active team or not at all.

Image by author created with Dall-E 3
Image by author created with Dall-E 3

The 0/1 knapsack problem can be formulated as a Binary Integer Programming (BIP) problem as follows, if we are being fancy:

Image by author
Image by author

More specifically:

  • υi is the value of item i
  • wi is the weight of item i
  • W is the total weight capacity of the knapsack, and
  • i is the decision variable indicating whether item i is included in the knapsack or not, by being 1 or 0 respectively.

There are also other variations of the knapsack problem like the bounded knapsack problem (where we can select multiple repetitions of each item up to a given number), as well as, the unbounded knapsack problem (where we can select as many repetitions of each item as we please).

The knapsack problem is known to be NP-hard, nonetheless, small instances can be efficiently solved with exact algorithms. Thus, in this post, I model and solve a 0/1 knapsack problem using the PokeAPI data with PuLP Python library.

Catching ‘Em All!

I will be using a Jupyter Lab notebook to import the PokeAPI data, as well as model and solve the optimization problem with PuLP and make a simple Plotly visualization of the data.

Importing the data from PokéAPI

In order to acquire the various Pokémon characteristics needed for the analysis, I will be using the data provided by PokeAPI. More specifically, PokeAPI can be accessed in Python by using the requests library. We can easily get the 151 Pokémon of Generation 1 along with their URLs in PokeAPI, by setting ‘limit = 151’, since they are the first 151 Pokémon in the Pokédex.

import requests

url = "https://pokeapi.co/api/v2/pokemon?limit=151"
response = requests.get(url)
data = response.json()
pokemon_urls = [pokemon['url'] for pokemon in data['results']]

Then, by iterating over the URLs and calling the API specifically for each item, we can also acquire the attributes of each one of the 151 Pokémon. For each Pokémon I import the base stats needed for the calculation of BST, as well as, the Pokémon type. We store all 151 Pokémon names along with their respective attributes in a dictionary named pokemon_data.

counter = 0
pokemon_data = []
for url in pokemon_urls:
    response = requests.get(url)
    pokemon = response.json()
    base_stats = {stat['stat']['name']: stat['base_stat'] for stat in pokemon['stats']}
    pokemon_data.append({
        'name': pokemon['name'],
        'weight': pokemon['weight'],
        'hp': base_stats.get('hp', 0),
        'attack': base_stats.get('attack', 0),
        'defense': base_stats.get('defense', 0),
        'special_attack': base_stats.get('special-attack', 0),
        'special_defense': base_stats.get('special-defense', 0),
        'speed': base_stats.get('speed', 0),
        'type': [t['type']['name'] for t in pokemon['types']],
    })
    counter += 1

We can also take a glimpse of our dataset in dataframe format:

import pandas as pd 
df = pd.DataFrame(pokemon_data)
df
Image by author
Image by author

And now we are ready to model our knapsack problem with PuLP!

Modeling the problem with PuLP

The very first step for structuring the optimization problem is to define our decision variable. In our case, what we are trying to decide is which Pokémon to include in our team. In other words, we are trying to decide for each of the 151 Pokémon if it is going to be included in the team (thus the decision variable will be equal to 1 for this Pokémon) or not (thus, the decision variable will be 0). As a result, the decision variable for this problem is a binary decision variable x, indicating whether a Pokémon is included in our team or not.

After having defined the decision variable x, we can also formulate the objective function of our problem – that is, what is it that we are trying to optimize. Here, we are trying to optimize the total BST of our Pokémon team, and more specifically we aim to maximize it. Thus, the objective function of our problem can be expressed as following:

Image by author
Image by author

Lastly, we also need to formulate the constraints of the problem. These are:

  1. The decision variable x is binary (0 or 1)
  2. The total weight of our Pokémon team should be equal or less to 1,000 kg
  3. The total number of the selected Pokémon should be exactly 6
Image by author
Image by author

To write this in Python, initially we import the PuLP library and initiate an instance of a maximization problem.

import pulp

prob = pulp.LpProblem("Pokemon Team Optimization", pulp.LpMaximize)

Then, we define the decision variable. The binary variable constraint is directly incorporated here…

x = pulp.LpVariable.dicts("x", range(len(pokemon_data)), cat='Binary')

… and the objective function…

prob += pulp.lpSum(
    (pokemon['hp'] + pokemon['attack'] + pokemon['defense'] + pokemon['special_attack'] + pokemon['special_defense'] + 
     pokemon['speed']) * x[i] for i, pokemon in enumerate(pokemon_data)
), "Total Combat Effectiveness"

… as well as, the problem constraints.

# total weight constraint
max_weight_capacity = 1000
prob += pulp.lpSum(pokemon['weight'] * x[i] for i, pokemon in enumerate(pokemon_data)) <= max_weight_capacity, "Weight Capacity"

# total number of Pokemon constraint
prob += pulp.lpSum(x[i] for i in range(len(pokemon_data))) == 6, "Team Size"

In this way, we have completely defined the model of our optimization problem in Python, and we are now ready to solve it.

The Pokémon dream team

The defined problem can be solved simply by:

prob.solve()

… which returns ‘1’, meaning that the status of our problem is ‘LpStatusOptimal’, meaning the problem has an optimal solution. prob.solve() may also return other outputs, as for instance ‘-1’ (LpStatusInfeasible), meaning that there is no feasible solution for the problem given the constraints, or ‘-2’ (LpStatusUnbounded), indicating that the solution is unbounded.

In any case, given that our problem has an optimal solution, we can display it by:

selected_pokemon = [pokemon_data[i]['name'] for i in range(len(pokemon_data)) if pulp.value(x[i]) == 1]
print("Selected Pokémon:", selected_pokemon)
Image by author
Image by author

And ✨Voila✨

We get our Pokémon dream team!

Apparently, we can get different teams by changing the constraints of the problem (the allowed weight and/or number of Pokémon, or even add additional constraints, such as the Pokémon type), as well as, defining a different objective function or assigning different coefficients on each one of the base stats.

In the context of this problem, it is also interesting to check how BST generally relates with the weight of a Pokémon, as well as, have a visual representation of the recommended Pokémon team along the entire pool of 151 Pokémon we could choose from. We can do this with a Plotly scatter chart:

# Calculate BST for all Pokémon
for pokemon in pokemon_data:
    pokemon['BST'] = (pokemon['hp'] + pokemon['attack'] + pokemon['defense'] +
                                             pokemon['special_attack'] + pokemon['special_defense'] + 
                                             pokemon['speed'])

all_pokemon_df = pd.DataFrame(pokemon_data)
selected_df = pd.DataFrame(selected_pokemon_data)

# Create the scatter plot
fig = go.Figure()

# Add all Pokémon
fig.add_trace(go.Scatter(
    x=all_pokemon_df['weight'],
    y=all_pokemon_df['BST'],
    mode='markers',
    name='All Pokémon',
    marker=dict(size=10, color='blue', opacity=0.6),
    text=all_pokemon_df['name']
))

# Add selected Pokémon
fig.add_trace(go.Scatter(
    x=selected_df['weight'],
    y=selected_df['total_combat_effectiveness'],
    mode='markers',
    name='Selected Pokémon',
    marker=dict(size=12, color='red', opacity=0.9),
    text=selected_df['name']
))

# Add title and labels, resize
fig.update_layout(
    title="Pokémon Selection Optimization",
    xaxis_title="Weight (KGs)",
    yaxis_title="BST",
    legend_title="Legend",
    width=1000, 
    height=600,
    showlegend=True
)

# Show the plot
fig.show()
Image by author
Image by author

On my mind

Apart from forming our team for the Pokémon championship, knapsack problems emerge in real-world decision-making processes in several different fields. Such real world instances include budget allocation, cargo/truck loading, investment portfolio optimization, and drug formulation, or even diet planning. Recognizing, modeling and solving such problems programmatically can provide significant insight for any organization. PuLP is a powerful library that allows us to model and solve optimization problems efficiently in the Python environment, further demonstrating Python’s versatility as a full-stack language.


Attribution

Data sourced from PokéAPI. © 2013–2023 Paul Hallett and PokéAPI contributors, released under the 3-Clause BSD License. For full license details, please visit the PokéAPI GitHub page.


✨Thank you for reading!✨


Loved this post? Let’s be friends!

📰Substack 💌 Medium 💼LinkedIn Buy me a coffee!

or, take a look at my other data science posts:

From Data to Dashboard: Visualizing the Ancient Maritime Silk Road with Dash Leaflet and SeaRoute…

The Price of Gold: Is Olympic Success Reserved for the Wealthy?🥇An_alyzing 30 years of Olympic Games medals distribution and national wealth indicatorsto_wardsdatascience.com


Towards Data Science is a community publication. Submit your insights to reach our global audience and earn through the TDS Author Payment Program.

Write for TDS

Related Articles