INDEX
Explanations
punctuation marks, especially semicolons
Negative Logits
iſt      -1.06
Anſ      -0.94
ſelves   -0.91
―――――    -0.90
neſs     -0.89
Theſe    -0.85
Efq      -0.85
ſy       -0.82
་་       -0.81
Reſ      -0.81
Positive Logits
;        1.36
.;       1.07
;        1.06
);       0.99
?;       0.95
;        0.93
));      0.93
>;       0.92
;">      0.91
!;       0.90
Activations Density: 0.065%