    Explanations

    words expressing strong affirmation or certainty

    Negative Logits
    "ModelForm"        -0.70
    " Darío"           -0.67
    "ագրություններ"    -0.67
    " Пен"             -0.65
    " Guen"            -0.65
    "EndTag"           -0.64
    " Haw"             -0.64
    " dgv"             -0.64
    "ęku"              -0.64
    " vien"            -0.63
    Positive Logits
    " Absol"           1.25
    " absolue"         1.13
    " ABSOL"           1.09
    " Abs"             1.00
    " absol"           1.00
    " Absolute"        0.99
    " absolutely"      0.98
    "ABSOL"            0.95
    " absolut"         0.94
    " Absolutely"      0.93
    Activation Density: 0.063%

    No Known Activations