INDEX
Explanations
pronouns followed by an action or a statement
occurrences of the word "It."
Negative Logits
Token        Logit
ILCS         -0.78
Priv         -0.62
idious       -0.61
Spending     -0.60
dstg         -0.59
aversion     -0.59
hips         -0.59
pring        -0.58
å½           -0.58
Resistance   -0.57
Positive Logits
Token        Logit
includes     1.28
's           1.23
'll          1.22
contains     1.18
involves     1.18
reads        1.18
consists     1.17
revolves     1.13
sounds       1.12
spans        1.11
Activations Density: 0.203%