    Explanations
    Negative Logits
     PERSON      -0.07
     orient      -0.06
                 -0.06
    _lift        -0.06
    ewhere       -0.06
     Karma       -0.06
    Vi           -0.06
     gelir       -0.06
     sabe        -0.06
    curity       -0.06
    Positive Logits
     значения    0.07
    PTS          0.06
     якого       0.06
     Sandy       0.06
    MK           0.06
     Reporter    0.06
    dik          0.06
     poder       0.06
     Content     0.06
     crucial     0.06
    Act Density 0.019%

    No Known Activations