Explanations
words that start with a lowercase letter and have more than three characters
instances of the word "words"
Negative Logits
olyn      -0.72
ño        -0.71
DERR      -0.70
©¶æ¥µ     -0.69
psey      -0.67
etheus    -0.66
ameron    -0.66
ntil      -0.65
roxy      -0.64
yss       -0.64
Positive Logits
mith      1.33
words     1.15
terday    1.05
words     1.03
sworth    0.98
aloud     0.88
Words     0.85
uttered   0.83
Words     0.82
speak     0.79
Activations Density: 0.017%