Explanations
mentions of sports teams
Negative Logits
tml       -0.86
drip      -0.72
tain      -0.70
forward   -0.69
dit       -0.68
ibrary    -0.68
fect      -0.67
alam      -0.66
ously     -0.66
pire      -0.65
Positive Logits
mates     1.19
mates     1.14
members   1.09
rons      0.95
liquid    0.91
mate      0.89
peak      0.88
members   0.88
sters     0.85
motto     0.84
Activations Density: 0.447%