INDEX
    Explanations

    words and phrases that convey certainty or emphasis

    Negative Logits
    "Prostit"           -0.14
    "_tF"               -0.13
    "ustry"             -0.13
    "γρά"               -0.13
    "itol"              -0.12
    "nt"                -0.12
    "uchos"             -0.12
    " профессиональ"    -0.12
    "anca"              -0.12
    "oppable"           -0.12
    Positive Logits
    "uma"               0.17
    " has"              0.17
    "celik"             0.17
    "LY"                0.15
    "nger"              0.15
    "ANNOT"             0.14
    "theless"           0.14
    "-таки"             0.14
    " had"              0.14
    " have"             0.14
    Act Density 0.324%

    No Known Activations