INDEX
Explanations
instances of backslashes or escape characters
Negative Logits
Token            Logit
betweenstory     -0.96
queſta           -0.96
aarrggbb         -0.91
iſten            -0.89
httphttps        -0.88
Geiſt            -0.85
Personendaten    -0.85
يتيمه            -0.85
<unused8>        -0.85
<unused14>       -0.85
Positive Logits
Token            Logit
\                0.88
\                0.73
1                0.60
The              0.60
$\               0.59
<b>              0.59
$\               0.57
I                0.56
(no visible token)  0.54
<h2>             0.54
Activations Density 0.006%