INDEX
Explanations
references to personal experiences and relationships involving emotional connection and impact
Negative Logits
vana         -0.18
urge         -0.16
nish         -0.16
osy          -0.15
uga          -0.15
empo         -0.15
urger        -0.15
/goto        -0.15
urgeon       -0.14
itoris       -0.14
Positive Logits
among        0.21
amongst      0.20
individuals  0.19
someone      0.18
d            0.17
ของผ         0.17
minds        0.16
male         0.15
among        0.15
Among        0.15
Activations Density 0.306%