Explanations
Pokémon card names
encoded or non-standard characters in text
Negative Logits
Bourbon   -0.76
Franklin  -0.74
Toledo    -0.64
Johns     -0.62
SEC       -0.62
yth       -0.61
whis      -0.61
Sha       -0.60
Cord      -0.60
Ole       -0.59
Positive Logits
ãĥ¼ (ー)       4.41
ãĥ³ (ン)       3.23
ãĥ¼ãĥ (ー + incomplete UTF-8 bytes E3 83)  2.80
ãĥĥãĤ¯ (ック)  2.52
ãĥĥ (ッ)       2.48
ãĥ« (ル)       2.41
ãĥ¼ãĥ« (ール)  2.34
ãĥ¼ãĤ¯ (ーク)  2.32
ãĤ¤ (イ)       2.22
ãĤ¹ (ス)       2.19
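The positive-logit tokens above look like raw byte-level BPE vocabulary strings rather than readable text. The sketch below shows one way to turn them back into characters. It assumes the tokenizer uses the GPT-2-style byte-to-unicode table, which this page does not confirm. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of any tool referenced here.

```python
def bytes_to_unicode():
    """Byte -> printable character table, as used by GPT-2-style byte-level BPE.

    Assumption: the tokens on this page come from a tokenizer that uses this table.
    """
    # Bytes that are already printable map to themselves.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map each displayed character back to its byte, then decode as UTF-8."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Tokens that end mid-character decode with a replacement character.
    return raw.decode("utf-8", errors="replace")


for tok in ["ãĥ¼", "ãĥ³", "ãĥĥãĤ¯", "ãĥ«", "ãĤ¤", "ãĤ¹"]:
    print(tok, "->", decode_token(tok))
```

Under that assumption the tokens decode to katakana, which would fit the "Pokémon card names" explanation. A token such as ãĥ¼ãĥ ends partway through a multi-byte character, so it cannot be fully decoded on its own.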
Activations Density 0.020%