INDEX
Explanations
phrases related to choices and decision-making
Negative Logits
/Dk               -0.16
osto              -0.16
InBackground      -0.15
اب                -0.15
éļľ               -0.14
andan             -0.14
.scalablytyped    -0.14
Ïģοι              -0.14
aille             -0.13
undy              -0.13
Positive Logits
either            0.80
EITHER            0.74
either            0.71
Either            0.70
Either            0.64
либо              0.50
soit              0.41
εί                0.33
ither             0.29
ITHER             0.27
Activations Density 0.148%
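
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is computed: the fraction of token positions on which the feature's activation is above zero. This is an assumption about the metric, not a statement of how this dashboard calculates it; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumed definition (not confirmed by the source): a position counts as
    "active" when its activation is strictly greater than the threshold.
    """
    if len(activations) == 0:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 15 of which fire.
sample = [0.0] * 9985 + [1.2] * 15
density = activation_density(sample)
print(f"{density:.3%}")
```

A density well under 1%, like the one reported here, would mean the feature fires on only a small share of tokens under this definition.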