INDEX
Explanations
punctuation marks, particularly semicolons
Negative Logits
Token        Logit
fran         -0.73
alan         -0.69
of           -0.67
ondy         -0.67
widetilde    -0.66
vol          -0.62
nungs        -0.62
P            -0.61
NED          -0.61
olo          -0.60
POSITIVE LOGITS
Token        Logit
$;           1.46
;;;          1.30
}$;          1.25
_;           1.24
AndEndTag    1.23
+;           1.23
;;;;         1.22
.;           1.19
,:);         1.16
icolon       1.15
Activations Density: 0.163%