INDEX
Explanations
instances of direct speech actions like saying, asking, or telling
Negative Logits
Token        Logit
xtap         -0.88
cv           -0.83
wx           -0.75
osponsors    -0.74
irtual       -0.71
unal         -0.68
mania        -0.65
ugal         -0.65
=~=~         -0.64
£ı           -0.64
Positive Logits
Token        Logit
bye          1.15
goodbye      1.14
hello        1.09
aloud        0.93
hi           0.83
hey          0.81
nothing      0.80
Amen         0.78
loudly       0.77
yes          0.76
Activations Density: 0.112%
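
As a rough illustration of how a density figure like this one could be read, the sketch below assumes that density means the fraction of token positions where the feature's activation is above zero. That definition, the function name activation_density, and the sample data are assumptions made for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 11 of which activate the feature.
sample = [0.0] * 9989 + [1.5] * 11
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumed definition, a density of 0.112% would correspond to the feature firing on roughly 1 in every 900 tokens.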