Explanations
elements related to social dynamics and problematic behavior
Negative Logits
either       -0.59
Either       -0.54
either       -0.52
Either       -0.50
EITHER       -0.48
либо         -0.28
soit         -0.25
ither        -0.20
OTHERWISE    -0.18
ichever      -0.18
Positive Logits
nor          0.94
nor          0.66
Nor          0.63
NOR          0.58
Nor          0.57
nors         0.35
нор          0.29
neither      0.29
né           0.29
Norris       0.28
Activations Density 0.021%
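
To make the density figure concrete, here is a minimal sketch of one way such a number can be computed. It assumes that density means the fraction of sampled token positions on which the feature's activation is above zero; the page does not state its definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to reproduce a value of 0.021%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" is the share of token positions on which the
    feature fires at all. The source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 21 of 100,000 token positions.
sample = [1.5] * 21 + [0.0] * 99_979
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.021%
```

Under this reading, a density of 0.021% corresponds to roughly 2 active positions per 10,000 tokens, which would make this a sparse feature.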