Explanations
mentions of football and its related terms
Negative Logits
ewe          -0.16
owitz        -0.15
blk          -0.14
вос          -0.14
uckles       -0.14
inery        -0.14
Activation   -0.14
Yaş          -0.14
ighborhood   -0.13
Trap         -0.13
Positive Logits
rosse        0.17
nat          0.15
avier        0.15
follower     0.14
275          0.14
غر           0.14
据           0.14
isco         0.14
ARSE         0.14
ัต           0.14
Activations Density: 0.016%