Explanations
the word "succeed"
Negative Logits
Token                  Logit
(token not rendered)   -1.03
(                      -0.99
‘                      -0.97
[                      -0.93
<eos>                  -0.92
(token not rendered)   -0.90
↵                      -0.85
↵↵                     -0.84
r                      -0.84
des                    -0.84
Positive Logits
Token      Logit
Efq        2.27
Monfieur   2.14
Theſe      2.06
auroit     2.03
étoit      2.02
feroit     2.02
enfans     2.00
Jefus      2.00
ainfi      2.00
purpoſe    1.99
Activations Density 1.360%