MTGJSON
MTGJSON is an open-source project that catalogs all Magic: The Gathering cards in a portable format. A dedicated group of fans maintains and supplies data for a variety of projects and sites in the community. Using an aggregation process we fetch data from multiple resources and approved partners, and combine all this data into various JSON files that you can learn about and download from this website.
mtgjson.com
from pathlib import Path
import pandas as pd
from dvc.repo import Repo  # Repo lives in dvc.repo, not dvc.api
from IPython.display import Code, HTML
import hvplot.pandas  # registers the .hvplot accessor on pandas objects
import seaborn as sns
import pandera as pa
# Locate the DVC project root, then the MTG data directory beneath it
data_dir = Path(Repo.find_root())/'resources'/'data'/'mtg'
df = pd.read_feather(data_dir/'mtg.feather')
We’ve done some work to extract a useful tabular form from the original (nested) JSON format. It is now stored as a feather file to speed up read times.
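To make the flattening step concrete, here is a minimal sketch using pandas' `json_normalize`. The nested layout (a top-level "data" mapping of set codes to sets, each holding a "cards" list), the toy records, and the helper name `flatten_cards` are all assumptions for illustration, not the actual extraction code behind `mtg.feather`.

```python
import pandas as pd

# Toy stand-in for the nested JSON (assumed layout, made-up cards)
raw = {
    "data": {
        "AAA": {
            "code": "AAA",
            "releaseDate": "2001-01-01",
            "cards": [
                {"name": "Toy Knight", "text": "First strike",
                 "flavorText": "A toy.", "colors": ["W"]},
                {"name": "Toy Drake", "text": "Flying", "colors": ["U"]},
            ],
        },
        "BBB": {
            "code": "BBB",
            "releaseDate": "2002-06-01",
            "cards": [
                {"name": "Toy Golem", "text": "Vigilance",
                 "flavorText": "Another toy.", "colors": []},
            ],
        },
    }
}


def flatten_cards(raw):
    """Hypothetical helper: one row per card, with set-level fields attached."""
    sets = list(raw["data"].values())
    cards = pd.json_normalize(sets, record_path="cards",
                              meta=["code", "releaseDate"])
    cards = cards.rename(columns={"flavorText": "flavor_text",
                                  "releaseDate": "release_date",
                                  "code": "set_code"})
    cards["release_date"] = pd.to_datetime(cards["release_date"])
    return cards


cards = flatten_cards(raw)
```

Cards that lack a field (here, a missing flavor text) simply get a null in that column. A frame like this could then be written out with `cards.to_feather(...)`, which needs pyarrow installed.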
Validation is done using the following pandera schema:
from tlp.data import mtg, styleprops_longtext
from inspect import getsourcelines
Code(''.join(getsourcelines(mtg.MTGSchema)[0]), language='python')
Several text columns will be of particular use in this course:
- name: the name of the card
- text: the rules text displayed on the main “body” of the card face
- flavor_text: the “story” and “fantasy” bit, usually prose, which may not always be present
- keywords: special, meaningful terms that appear in the “text” and have gameplay impacts
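# Styler.hide_index() was removed in pandas 2.0; hide(axis='index') replaces it (pandas >= 1.4)
(df[['name', 'text','flavor_text']]
.sample(10, random_state=2).fillna('').style
.set_properties(**styleprops_longtext(['text','flavor_text']))
.hide(axis='index')
)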
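The relationship between keywords and text can be explored with a small sketch like the one below. It assumes that `keywords` holds a Python list per card (as the color data does) and uses made-up toy cards rather than the real `df`.

```python
import pandas as pd

# Toy cards (made up); keywords is assumed to be a list per card
toy = pd.DataFrame({
    "name": ["Toy Knight", "Toy Drake", "Toy Golem"],
    "text": ["First strike", "Flying, vigilance", "Vigilance"],
    "keywords": [["First strike"], ["Flying", "Vigilance"], ["Vigilance"]],
})

# One row per (card, keyword) pair
long = toy.explode("keywords")

# How often does each keyword occur across cards?
keyword_counts = long["keywords"].value_counts()

# Does each keyword actually appear in that card's rules text? (case-insensitive)
long["in_text"] = [
    kw.lower() in txt.lower()
    for kw, txt in zip(long["keywords"], long["text"])
]
```

On the real data, cards with no keywords would need handling first (for example by dropping null rows after `explode`), since this sketch assumes every card has at least one.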
There are a number of other potential sources of “fortuitous data”, as well:
%%HTML
<link href="//cdn.jsdelivr.net/npm/mana-font@latest/css/mana.min.css" rel="stylesheet" type="text/css" />
mtg.style_table(df.sample(10, random_state=2),
hide_columns=['text','flavor_text'])
Symbols are for visualization only; the original data consists of lists of letters, e.g. ['W', 'U'].
“Mana font” is made by Andrew Gioia
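As a rough idea of how letters can become symbols, the sketch below turns a list of color letters into Mana-font icon tags. The helper `colors_to_html` is hypothetical (the implementation of `mtg.style_table` is not shown here), and the `ms ms-<letter> ms-cost` class pattern is an assumption based on the Mana font's usual usage.

```python
def colors_to_html(colors):
    """Hypothetical helper: render letters such as ['W', 'U'] as icon tags."""
    # Assumed class pattern: "ms" base class, "ms-<letter>" icon, "ms-cost" styling
    return "".join(
        f'<i class="ms ms-{letter.lower()} ms-cost"></i>' for letter in colors
    )


white_blue = colors_to_html(["W", "U"])
colorless = colors_to_html([])  # empty list gives an empty string
```

The tags only render as icons when the Mana stylesheet linked above has been loaded on the page.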
# Yearly share of cards that have flavor text
(df
.set_index('release_date')
.sort_index()
.resample('Y')
# mean of a boolean mask = fraction of True values
.apply(lambda grp: grp.flavor_text.notna().mean())
).hvplot(rot=45, title='What fraction of cards have Flavor Text each year?')
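The same yearly fraction can be checked by hand on a tiny example. This sketch uses made-up toy rows and groups on the calendar year directly instead of resampling, which sidesteps the plotting dependency; it is an alternative formulation, not the cell above.

```python
import pandas as pd

# Toy rows (made up): two cards in 2001, one in 2002
toy = pd.DataFrame({
    "release_date": pd.to_datetime(["2001-03-01", "2001-09-01", "2002-05-01"]),
    "flavor_text": ["A toy.", None, "Another toy."],
})

# Boolean mask: True where a card has flavor text
has_flavor = toy["flavor_text"].notna()

# Mean of the mask per release year = fraction of cards with flavor text
fraction_by_year = has_flavor.groupby(toy["release_date"].dt.year).mean()
# 2001 -> 0.5, 2002 -> 1.0
```

Unlike `resample`, grouping on the year produces no rows for years without any cards, so gaps in the release history would not show up as missing values.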