INDEX
Explanations
mentions of winning or victory
Negative Logits
Factor       -0.66
senal        -0.62
bian         -0.60
uras         -0.60
ossus        -0.57
rouch        -0.57
periphery    -0.57
erity        -0.56
ria          -0.56
opian        -0.54
Positive Logits
now          1.09
nings        1.05
throp        1.00
't           0.98
cest         0.87
ced          0.87
cing         0.81
ipeg         0.80
ests         0.80
hardt        0.79
Activations Density 1.287%