INDEX
Explanations
expressions of emotional responses and sentiments regarding situations or experiences
Negative Logits
ίÏĦ  -0.17
ields  -0.16
758  -0.16
ç´¯  -0.15
acob  -0.14
ellan  -0.14
singleton  -0.14
IELD  -0.14
اÙĦا  -0.14
enti  -0.13
Positive Logits
utto  0.15
considering  0.15
Fol  0.15
aupt  0.15
.gz  0.14
andel  0.14
igel  0.14
/debug  0.14
ưa  0.14
227  0.13
Activations Density 0.493%