INDEX
Explanations
lines of code that include increment operations
Negative Logits
als        -0.74
de         -0.68
-          -0.68
pe         -0.66
</i>       -0.63
so         -0.62
n          -0.61
t          -0.61
(blank)    -0.61
von        -0.60
Positive Logits
Monfieur   1.40
pleaſure   1.40
myſelf     1.36
purpoſe    1.35
Efq        1.34
Jefus      1.24
ſeveral    1.22
itſelf     1.21
raiſ       1.21
poffible   1.20
Activations Density 0.027%
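
The page does not say how the density figure is computed. A minimal sketch, assuming the common convention that density is the share of sampled tokens on which the feature activation is above zero, might look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the feature fires; the source page does not define it.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 27 of 100,000 tokens.
sample = [1.5] * 27 + [0.0] * 99_973
print(f"{activation_density(sample):.3f}%")  # prints "0.027%"
```

Under this reading, a density of 0.027% corresponds to roughly 27 active tokens per 100,000 sampled.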