INDEX
Explanations
words that are part of the pattern "wn" followed by a number
occurrences of the word "own" and its various forms in different contexts
Negative Logits
uate        -0.91
itol        -0.72
ptive       -0.71
infl        -0.67
rehab       -0.64
orescent    -0.64
recapt      -0.64
ttle        -0.63
uated       -0.62
ions        -0.61
Positive Logits
wn          1.09
iak         0.96
nesday      0.96
estern      0.96
elcome      0.95
erness      0.94
akening     0.92
sylvania    0.91
itness      0.88
akens       0.85
Activations Density: 0.015%