Explanations
the presence of various document formatting and structural elements
Negative Logits
Token              Logit
Портали            -0.93
<?                 -0.88
ivelany            -0.87
#+#                -0.81
ſelves             -0.80
bezeichneter       -0.79
ſelf               -0.79
كومونز             -0.79
principalColumn    -0.78
BeginContext       -0.78
Positive Logits
Token              Logit
...                0.51
↵                  0.51
--                 0.50
,"                 0.50
<eos>              0.48
O                  0.47
↵↵                 0.46
(                  0.46
You                0.45
mk                 0.44
Activations Density: 0.010%