INDEX
    Explanations

    indicators of emotional intensity or impactful phrases

    Negative Logits
    бо               -0.16
    ast              -0.14
    atts             -0.14
    _DIM             -0.14
    iba              -0.14
    aa               -0.14
    aN               -0.14
    iena             -0.14
    ÑģÑĤав           -0.14
    æĥł              -0.13

    Positive Logits
    ahun             0.16
    SystemService    0.16
    581              0.15
    uya              0.15
    zyst             0.15
    earch            0.14
    olean            0.14
    YD               0.14
    abled            0.14
    ansk             0.13
    Activation Density: 0.548%

    No Known Activations