INDEX
Explanations
mentions of specific sports figures or entities
Negative Logits
ingle       -0.19
ken         -0.16
vil         -0.16
legacy      -0.16
rig         -0.16
enze        -0.16
âĸĪ         -0.15
æĶ¿         -0.15
aggression  -0.15
iggins      -0.14
Positive Logits
ging        0.30
gy          0.30
ged         0.30
gers        0.28
hetto       0.26
ues         0.25
gin         0.24
gle         0.24
gings       0.23
gie         0.22
Activations Density: 0.390%