INDEX
Explanations
references to fighting and competition
Negative Logits
]={↵      -0.16
itan      -0.15
maj       -0.15
sson      -0.14
ivr       -0.14
518       -0.14
Vent      -0.14
ONGL      -0.14
wort      -0.14
ISIBLE    -0.14
Positive Logits
olland    0.17
utow      0.16
LN        0.16
Glover    0.15
Mix       0.15
lemn      0.15
emax      0.15
ecies     0.14
allis     0.14
attles    0.14
Activations Density 0.068%