Explanations
references to football and related terms
Negative Logits
opl         -0.16
æľĽ         -0.14
ile         -0.14
enced       -0.14
opus        -0.13
prov        -0.13
repid       -0.13
elian       -0.13
eturn       -0.13
-produced   -0.13
Positive Logits
s           0.27
er          0.18
å¤Ł         0.17
erb         0.17
istique     0.16
sÃŃ         0.16
vertiser    0.15
ëį°ìĿ´íĬ¸   0.15
erman       0.15
erland      0.15
Activations Density: 0.027%
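
Several token strings in the logit lists (for example æľĽ, å¤Ł, sÃŃ) look like the printable stand-ins that byte-level BPE vocabularies use for raw bytes. The sketch below assumes the GPT-2 style byte-to-character mapping, which this page does not confirm, and shows how such strings could be turned back into readable text. The function names `byte_level_table` and `decode_token` are hypothetical helpers, not part of any dashboard API.

```python
def byte_level_table():
    """Build a char -> byte map, assuming the GPT-2 style byte-level scheme."""
    # Bytes that are already printable are represented by themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    table = {chr(b): b for b in printable}
    # Every remaining byte is shifted to a code point starting at 256.
    offset = 0
    for b in range(256):
        if b not in printable:
            table[chr(256 + offset)] = b
            offset += 1
    return table


def decode_token(token):
    """Map each displayed character back to its byte, then read as UTF-8."""
    table = byte_level_table()
    raw = bytes(table[ch] for ch in token)
    # A token may hold only part of a multi-byte character, so replace
    # undecodable fragments instead of raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["sÃŃ", "å¤Ł", "erland"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, sÃŃ decodes to "sí" and plain ASCII tokens such as erland come back unchanged. If the underlying model uses a different tokenizer, the mapping would need to be replaced with that tokenizer's own.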