INDEX
Explanations
proper nouns, particularly names and titles featuring 'ee' vowel combinations
Negative Logits
st       -0.22
stem     -0.19
thal     -0.19
tt       -0.18
ynos     -0.17
ness     -0.17
tn       -0.17
td       -0.17
nya      -0.17
icken    -0.16
POSITIVE LOGITS
zers     0.24
uw       0.23
eee      0.22
bles     0.21
zy       0.20
pee      0.20
zer      0.20
ple      0.19
pest     0.18
hive     0.17
Activations Density 0.027%
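
The density figure can be read as the share of token positions on which the feature fires. The sketch below assumes that reading (activation strictly above a threshold of zero); the function name `activation_density`, the threshold, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active; the dashboard's exact definition is not stated.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 3 of which activate the feature.
sample = [0.0] * 9997 + [1.2, 0.4, 3.1]
density = activation_density(sample)

# Format as a percentage, the same style the dashboard uses.
print(f"{density:.3%}")
```

Under this reading, a density of 0.027% would mean the feature is active on roughly 27 of every 100,000 token positions.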