Explanations
expressions of feelings and emotional reactions
Negative Logits
ucus        -0.15
ous         -0.15
Jad         -0.14
ium         -0.14
plain       -0.14
dem         -0.14
ST          -0.14
лади        -0.13
consec      -0.13
_literal    -0.13
Positive Logits
RIA         0.18
ombat       0.18
.onSubmit   0.16
ebek        0.15
hea         0.14
reature     0.14
adero       0.14
İ·          0.14
iyel        0.14
ınıza       0.14
Activations Density: 0.015%