INDEX
Explanations
symbols and formatting used in mathematical or programming notations
Negative Logits
[toxicity=0]        -0.98
(no visible token)  -0.92
also                -0.88
is                  -0.85
?                   -0.85
...                 -0.84
(no visible token)  -0.83
be                  -0.80
;                   -0.77
.                   -0.75
Positive Logits
<bos>               1.23
ThemeOverlay        1.17
Савезне             1.17
kaarangay           1.17
sizeCache           1.16
Paglinawan          1.08
Drapeau             1.07
resourceCulture     1.05
myſelf              1.00
windowFixed         1.00
Activations Density 5.170%