INDEX
Explanations
instances of punctuation, particularly periods
Negative Logits
monetary        -0.64
achievement     -0.62
behavi          -0.61
brilliance      -0.60
intangible      -0.60
recol           -0.59
intellectual    -0.58
longevity       -0.58
genius          -0.58
spoiler         -0.57
Positive Logits
28              0.83
31              0.83
29              0.83
ruary           0.82
ools            0.80
Madness         0.79
26              0.78
furt            0.78
27              0.78
eteenth         0.76
Activations Density: 0.021%