INDEX
Explanations
suffixes and endings that suggest abstract or conceptual themes
Negative Logits
st       -0.36
sto      -0.29
studio   -0.24
sta      -0.24
ston     -0.23
sti      -0.23
sth      -0.22
stat     -0.22
sten     -0.22
story    -0.22
Positive Logits
yyyy      0.35
yyy       0.32
tics      0.29
town      0.28
esterday  0.28
lation    0.27
outube    0.26
ielding   0.26
ields     0.25
yy        0.24
Activations Density 0.997%