INDEX
Explanations
statements expressing gratitude or appreciation
Negative Logits
Token       Logit
ensical     -0.41
ħĭ          -0.40
awa         -0.40
©¶æ         -0.39
inctions    -0.37
ģ«          -0.34
coffin      -0.34
vortex      -0.33
Cooldown    -0.33
psey        -0.33
Positive Logits
Token         Logit
cause         0.58
then          0.46
said          0.44
whose         0.42
especially    0.42
clud          0.42
thus          0.41
nor           0.41
yet           0.41
today         0.41
Activations Density 11.383%