INDEX
Explanations
punctuation marks, particularly colons
Negative Logits
Token        Logit
Dah          -0.17
876          -0.17
adle         -0.16
dera         -0.15
mah          -0.15
226          -0.15
alon         -0.14
ên           -0.14
mart         -0.14
Ùĭا          -0.14
Positive Logits
Token        Logit
usz          0.14
Crush        0.14
fold         0.14
.notice      0.14
Arbor        0.13
.WinForms    0.13
ADM          0.13
orch         0.13
ellipt       0.13
uff          0.13
Activations Density: 0.096%