Explanations
pronouns and specific names
the presence of end-of-text markers
Negative Logits
itaire      -0.65
Royale      -0.57
Outside     -0.54
Outside     -0.54
CCC         -0.50
Redd        -0.50
Guest       -0.50
Pixie       -0.49
--------    -0.49
Keeper      -0.49
Positive Logits
zbollah      0.86
'll          0.83
cannot       0.80
could        0.75
certainly    0.75
must         0.75
may          0.74
CVE          0.74
ain          0.74
sailed       0.72
Activations Density 0.337%