INDEX
Explanations
expressions of disagreement or contradiction
Follows a digit
numbers and measurements
Negative Logits
MigrationBuilder    -0.96
الحياه    -0.90
sizeCache           -0.89
setof               -0.84
Билгалдахарш        -0.80
جغرافيا    -0.78
Portail             -0.78
balleur             -0.78
Wikispecies         -0.77
Datuak              -0.77
Positive Logits
</blockquote>       0.50
</td>               0.45
</h4>               0.42
فريبيس    0.41
</h1>               0.41
</h6>               0.41
[toxicity=0]        0.40
<<<<<<<<<<<<<<      0.38
徊                  0.38
↵                   0.37
Activations Density: 0.779%