INDEX
Explanations
numbers at the beginning of phrases
structured numerical identifiers or listings
Negative Logits
tremend    -0.84
ileaks     -0.80
Ô          -0.78
behavi     -0.78
ifully     -0.77
chwitz     -0.77
anooga     -0.77
ierrez     -0.76
merce      -0.76
icist      -0.75
Positive Logits
91         1.01
34         0.99
92         0.96
87         0.96
650        0.96
Apostles   0.94
94         0.93
71         0.93
noon       0.92
02         0.91
Activations Density: 0.046%