Negative Logits
Token           Logit
houſe           -1.07
themſelves      -1.06
myſelf          -1.05
Majefty         -1.02
cauſe           -0.97
ſtate           -0.96
Reſ             -0.94
itſelf          -0.93
uſe             -0.92
Houſe           -0.91
Positive Logits
Token           Logit
'               0.61
                0.59
’               0.57
</table>        0.56
UnitTesting     0.56
↵               0.55
</blockquote>   0.54
I               0.52
<eos>           0.51
M               0.50
Activations Density 0.654%