Explanations
references to measurement or metrics related to performance or evaluation
Negative Logits
mathrm          -0.69
stretchr        -0.68
<eos>           -0.67
and             -0.61
saites          -0.60
[…]             -0.60
</em>           -0.59
mathvariant     -0.57
…               -0.55
@"/             -0.54
Positive Logits
pleaſure        0.94
raiſ            0.94
poffe           0.88
purpoſe         0.87
étoient         0.85
houſe           0.84
出版年          0.83
auffi           0.80
Efq             0.79
feroit          0.78
Activations Density: 0.470%