Explanations
references to individuals' personal backgrounds and qualifications
Negative Logits
Token        Logit
Wendell      -0.61
logical      -0.60
Parms        -0.58
scary        -0.58
connotation  -0.58
resumption   -0.58
reversal     -0.58
Gub          -0.58
schlaf       -0.58
forgiven     -0.58
Positive Logits
Token        Logit
has          1.14
was          1.11
had          1.08
will         1.07
is           1.06
have         0.97
could        0.91
would        0.87
must         0.86
can          0.85
Activations Density 0.324%