INDEX
Explanations
references to the Pokémon franchise
references to Pokémon
Negative Logits
perial     -0.77
ubb        -0.71
pres       -0.70
iffe       -0.70
rikes      -0.69
ndra       -0.68
pressed    -0.68
ijk        -0.67
outer      -0.67
akin       -0.66
Positive Logits
Pokémon    0.99
Dex        0.96
pokemon    0.96
Pokémon    0.92
Pokemon    0.92
Pokemon    0.90
Poké       0.80
Gy         0.80
Pikachu    0.79
Trainer    0.78
Activations Density 0.015%