INDEX
Explanations
numerical values or statistics related to scientific findings
Negative Logits
myſelf      -2.48
itſelf      -2.44
Efq         -2.15
pleaſure    -2.14
purpoſe     -2.10
houſe       -2.07
Jefus       -2.06
Monfieur    -2.01
himſelf     -2.00
ſelf        -2.00
Positive Logits
<eos>       2.30
↵↵          1.51
[blank]     1.20
,           1.14
(           1.08
'           1.03
-           1.00
-           0.99
:           0.97
"           0.93
Activations Density 0.304%