Explanations
expressions conveying feelings of disappointment or negativity
Negative Logits
Token               Logit
DDD                 -0.16
opath               -0.16
active              -0.15
stab                -0.14
ालय                 -0.14
ique                -0.14
Ñĩа                 -0.14
ovsky               -0.14
atham               -0.14
itis                -0.14
Positive Logits
Token               Logit
fully               0.22
FUL                 0.20
fulness             0.19
akening             0.19
aw                  0.18
akens               0.17
arde                0.17
Aw                  0.17
ful                 0.17
.githubusercontent  0.17
Activations Density: 0.014%