Explanations
punctuation and sentence endings
citations containing numbers
Negative Logits
myſelf        -0.77
ſeveral       -0.74
perſon        -0.69
themſelves    -0.68
paſſ          -0.68
ſever         -0.66
againſt       -0.66
anſ           -0.65
Anſ           -0.64
ſtre          -0.63
Positive Logits
("$.          0.66
+".           0.62
.$.           0.61
$.            0.60
+'.           0.59
\.            0.59
.             0.59
.             0.59
"*.           0.57
/\.           0.57
Activations Density 0.105%
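
The density figure can be read as the percentage of token positions on which the feature is active. The sketch below assumes that reading, with "active" meaning an activation strictly above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: density = (active token positions) / (all token positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 2,000 token positions, 2 of them active.
sample = [0.0] * 1998 + [0.8, 1.3]
print(f"{activation_density(sample):.3f}%")  # prints 0.100%
```

Under this reading, a density of 0.105% would correspond to roughly one active position per 950 tokens.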