INDEX
Explanations
punctuation marks and formatting symbols
Negative Logits
myſelf      -1.71
itſelf      -1.63
Efq         -1.62
Jefus       -1.61
―――――       -1.60
doubtnut    -1.60
ſelves      -1.59
ſelf        -1.59
་་          -1.55
Anſ         -1.53
Positive Logits
.           1.32
,           1.12
;           0.99
            0.98
<eos>       0.97
(           0.94
            0.92
)           0.92
↵↵          0.92
↵           0.91
Activations Density 0.183%