INDEX
Explanations
terms related to dominance and control in various contexts
Negative Logits
uman       -0.19
tract      -0.17
ź          -0.17
ses        -0.17
ocation    -0.17
sl         -0.16
dozen      -0.16
burn       -0.15
enia       -0.15
riott      -0.15
Positive Logits
собой        0.19
estic        0.17
ERRU         0.17
/dom         0.16
proceedings  0.16
headlines    0.16
собою        0.15
antly        0.15
ascar        0.15
/control     0.15
Activations Density 0.035%