INDEX
Explanations
instances of the word "exact."
Negative Logits
Token                   Logit
I                       -0.54
(token not rendered)    -0.51
a                       -0.49
↵↵                      -0.49
En                      -0.48
p                       -0.47
pecting                 -0.47
kid                     -0.47
For                     -0.46
et                      -0.46
Positive Logits
Token         Logit
itſelf        1.21
myſelf        1.02
himſelf       1.00
</thead>      0.99
Jefus         0.98
pleaſure      0.97
Efq           0.97
Roskov        0.94
themſelves    0.92
EndProject    0.92
Activations Density: 0.114%