INDEX
Explanations
mentions or references to the word "mute."
the word "ute" in various contexts
Negative Logits
kell        -0.81
izoph       -0.73
challeng    -0.72
etheless    -0.68
orescence   -0.64
iston       -0.64
awaru       -0.64
thirsty     -0.63
mith        -0.62
abeth       -0.62
Positive Logits
anamo       0.91
anu         0.84
culosis     0.83
vich        0.80
xt          0.75
cki         0.73
tle         0.71
leness      0.70
lled        0.69
ously       0.69
Activations Density 0.030%