Explanations
punctuation marks, particularly periods
Negative Logits
advis              -0.73
chopping           -0.68
levers             -0.68
wise               -0.68
mant               -0.61
microphone         -0.60
provoking          -0.59
confidentiality    -0.59
homework           -0.59
purse              -0.58
Positive Logits
Va                 0.97
COM                0.86
com                0.85
AX                 0.84
NET                0.83
A                  0.83
E                  0.82
MX                 0.81
rex                0.80
MO                 0.79
Activations Density: 0.034%