Explanations
phrases related to denials or exclusions
Negative Logits
srf         -0.72
tein        -0.68
Presence    -0.56
ancest      -0.55
assorted    -0.54
Lanka       -0.52
è¦ļéĨĴ      -0.51
odium       -0.50
smack       -0.50
widening    -0.50
Positive Logits
whatsoever   1.04
necessarily  0.98
ever         0.91
theless      0.90
EVER         0.89
conom        0.88
bothered     0.85
anymore      0.85
dime         0.85
ĸļ           0.83
Activations Density 0.568%