Explanations
contractions indicating negation
Apostrophes at the end of words
follows punctuation
Negative Logits
Референце        -0.72
beginnetje       -0.72
lenker           -0.71
Πηγές            -0.71
)++;             -0.69
зулта            -0.68
setof            -0.66
Personensuche    -0.66
sizeCache        -0.65
^(@)             -0.64
Positive Logits
[toxicity=0]     0.77
<strong>         0.72
</blockquote>    0.67
</h6>            0.62
<h2>             0.62
例文帳に追加       0.62
</td>            0.60
<sup>            0.59
↵↵               0.58
<b>              0.56
Activations Density 0.671%