Explanations
words that are homophones or closely related to "so"
Negative Logits
innocence      -0.65
theless        -0.62
misconception  -0.62
envy           -0.60
royalty        -0.59
DERR           -0.59
resemblance    -0.58
aristocracy    -0.57
FAA            -0.57
appeal         -0.57
Positive Logits
ppy     1.19
bered   1.15
vere    1.14
oths    1.13
pp      1.09
fter    1.09
pping   1.09
FTWARE  1.09
aring   1.08
aked    1.07
Activations Density: 0.042%