    Explanations

    colloquial expressions of uncertainty or reluctance

    Negative Logits
    -0.75  Παραπομπές
    -0.73   betweenstory
    -0.73  ьаж
    -0.72  Portail
    -0.71  Хьажоргаш
    -0.70   OMITBAD
    -0.69  īgs
    -0.69  Πηγές
    -0.68  يكب
    -0.67  Vidite
    Positive Logits
    0.63  </blockquote>
    0.58  [toxicity=0]
    0.57  <sup>
    0.56
    0.56  </h1>
    0.53  </td>
    0.53  includegraphics
    0.52  <strong>
    0.52  ’.
    0.52  </h6>
    Activation Density: 1.225%

    No Known Activations