INDEX
Explanations
colloquial expressions of uncertainty or reluctance
Negative Logits
Παραπομπές    -0.75
betweenstory    -0.73
ьаж    -0.73
Portail    -0.72
Хьажоргаш    -0.71
OMITBAD    -0.70
īgs    -0.69
Πηγές    -0.69
يكب    -0.68
Vidite    -0.67
Positive Logits
</blockquote>    0.63
[toxicity=0]    0.58
<sup>    0.57
’    0.56
</h1>    0.56
</td>    0.53
includegraphics    0.53
<strong>    0.52
’.    0.52
</h6>    0.52
Activations Density: 1.225%