INDEX
Explanations
expressions of emotional conflict and situations involving moral dilemmas
Negative Logits
wy          -0.16
imm         -0.15
ipur        -0.15
ahir        -0.15
posables    -0.14
amat        -0.14
Åĵ          -0.14
oras        -0.14
MMdd        -0.14
ync         -0.14
Positive Logits
ears        0.15
ÑĩеÑĢ      0.15
Bret        0.14
McGr        0.14
mort        0.13
Ctl         0.13
ÑħÑĥд      0.13
.Support    0.13
ãģĭãĤı      0.13
Surname     0.13
Activations Density 0.148%