INDEX
Explanations
references to statements and group dynamics, especially in contexts involving support or opposition
Negative Logits
sidemargin    -0.63
Référence     -0.61
himself       -0.58
Havolalar     -0.56
antMatchers   -0.55
Tracce        -0.55
wife          -0.54
cèse          -0.53
وتسجيلات      -0.53
felf          -0.52
Positive Logits
themselves    1.60
Their         1.39
collectively  1.35
themselves    1.34
their         1.32
their         1.25
Their         1.25
各自          1.24
yourselves    1.18
それぞれ      1.11
Activations Density: 0.596%