INDEX
Explanations
Occurrences of the substring "sp" in various contexts
Negative Logits
Token     Logit
mente     -0.15
IVITY     -0.15
letes     -0.15
nette     -0.14
ém        -0.14
spare     -0.14
ey        -0.14
IDAD      -0.14
anford    -0.14
estro     -0.14
Positive Logits
Token     Logit
okes      0.36
atial     0.32
reading   0.30
ouse      0.29
oke       0.29
oken      0.28
ouses     0.28
awning    0.27
aced      0.27
arsity    0.26
Activations Density: 0.021%