INDEX
Explanations
words related to emotion or feelings
Negative Logits
dro      -0.15
amat     -0.15
란       -0.14
az       -0.14
Dob      -0.14
stor     -0.14
chim     -0.14
Fab      -0.14
ure      -0.13
.n       -0.13
Positive Logits
ppard      0.18
Před       0.17
esktop     0.16
isclosed   0.16
阅读次数   0.15
uitka      0.14
ummy       0.14
reesome    0.14
evice      0.14
ãĤ         0.14
Activations Density 0.127%