INDEX
Explanations
proper nouns or names starting with "The"
the definite article "The" in various contexts
Negative Logits
beware        -0.78
virtually     -0.75
peers         -0.75
anyway        -0.75
stopping      -0.75
behalf        -0.73
bolstered     -0.73
according     -0.72
regardless    -0.72
delivered     -0.72
Positive Logits
odor          1.28
Hague         1.21
orem          1.21
mis           1.13
oret          1.11
resa          1.08
Simpsons      1.08
odore         1.05
sis           1.04
atre          1.04
Activations Density 0.090%