INDEX
Explanations
phrases expressing emotional experiences and reflections
Negative Logits
Token               Logit
NB                  -0.14
oso                 -0.14
amak                -0.14
(token not shown)   -0.14
ë¶Ģ                 -0.13
hey                 -0.13
oho                 -0.13
lis                 -0.13
oust                -0.13
elay                -0.13
Positive Logits
Token    Logit
aira     0.17
Tru      0.15
atur     0.14
atar     0.14
bet      0.14
Ñħод     0.14
sew      0.14
neon     0.14
abet     0.13
ilet     0.13
Activations Density 0.340%