Explanations
URLs and web links
website urls
Negative Logits
Token         Logit
ſte           -1.02
<unused79>    -1.02
<unused14>    -1.02
[@BOS@]       -1.01
<unused41>    -1.01
<unused16>    -1.01
<unused28>    -1.01
<unused23>    -1.01
<unused3>     -1.01
<unused8>     -1.01
Positive Logits
Token         Logit
://           0.88
website       0.57
www           0.53
www           0.51
website       0.43
the           0.42
The           0.41
.             0.41
Website       0.39
@             0.36
Activations Density: 0.010%