INDEX
Explanations
references to significant numerical data or values in a document
Negative Logits
Мексичка    -1.42
Efq         -1.41
themſelves  -1.40
myſelf      -1.36
raiſ        -1.35
^(@)        -1.30
itſelf      -1.30
ſeveral     -1.28
ſelves      -1.26
purpoſe     -1.26
Positive Logits
(           0.71
ing         0.70
[           0.64
/           0.63
↵↵          0.61
-           0.58
<eos>       0.58
–           0.58
nos         0.57
ent         0.55
Activations Density 0.944%